[1] CHI Ronghua, HUANG Shaobin, LYU Tianyang. Query processing on uncertain data stream based on frequency density distribution pattern[J]. Journal of Harbin Engineering University, 2018, 39(06): 1052-1058. [doi:10.11990/jheu.201707094]

Query processing on uncertain data stream based on frequency density distribution pattern

Journal of Harbin Engineering University [ISSN:1006-6977/CN:61-1281/TN]

Volume:
39
Issue:
2018, No. 06
Pages:
1052-1058
Column:
Publication date:
2018-06-05

Article Info

Title:
Query processing on uncertain data stream based on frequency density distribution pattern
Author(s):
CHI Ronghua (迟荣华)1, HUANG Shaobin (黄少滨)1, LYU Tianyang (吕天阳)2
1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China;
2. Audit Research Institute of Chinese National Audit Office, Beijing 100073, China
Keywords:
uncertainty; data stream; similarity query; non-parametric estimation; data mining; Markov
CLC number:
TP274
DOI:
10.11990/jheu.201707094
Document code:
A
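Abstract:
To address the inaccurate modeling of uncertain objects in current similarity queries over uncertain data streams, this paper proposes HB-UTS, a similarity query method for uncertain data streams. Objects in an uncertain data stream are modeled with non-parametric estimation, which yields a density function for each uncertain object. Spectral clustering is then used to mine frequent patterns among these density functions, and the mined patterns are abstracted into a semantically represented sequence for the uncertain data stream. In the similarity query stage, an index structure for the uncertain data stream is built from the state transition matrix of a high-order Markov model. The index records the storage addresses of the uncertain data stream together with the storage probabilities of the sequence elements, which improves query efficiency when the data stream is input step by step. The evaluation combines real and simulated data: HB-UTS is tested on real datasets after randomization and compared with other similarity query methods, and the results indicate that it handles large-scale uncertain data streams well.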
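
The two modeling ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' HB-UTS implementation: the function names `gaussian_kde` and `build_markov_index`, the Gaussian kernel, the bandwidth value, and the sample data are all assumptions made for illustration. The spectral clustering step is skipped, and the pattern labels are simply given as input. List positions stand in for storage addresses.

```python
import math
from collections import defaultdict


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from samples with a Gaussian kernel."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def build_markov_index(symbols, order):
    """Map each length-`order` context to next-symbol probabilities and positions."""
    counts = defaultdict(lambda: defaultdict(int))
    positions = defaultdict(list)
    for i in range(len(symbols) - order):
        context = tuple(symbols[i:i + order])
        counts[context][symbols[i + order]] += 1
        positions[context].append(i)  # stand-in for a storage address
    index = {}
    for context, successors in counts.items():
        total = sum(successors.values())
        index[context] = {
            "probs": {s: c / total for s, c in successors.items()},
            "positions": positions[context],
        }
    return index


# Hypothetical uncertain object: repeated noisy readings of one value.
density = gaussian_kde([1.0, 1.2, 0.9, 1.1], bandwidth=0.2)

# Hypothetical pattern labels, as if produced by a clustering step.
stream = ["A", "B", "A", "B", "C", "A", "B", "A"]
index = build_markov_index(stream, order=2)
```

With this layout, a query that arrives one symbol at a time can look up its most recent `order` symbols in `index` to get both the candidate positions in the stream and the transition probabilities for the next symbol.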

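Memo

Received: 2017-07-24.
Funding: Major Research Plan of the National Natural Science Foundation of China (91546110).
Author biographies: CHI Ronghua (b. 1981), male, PhD candidate; HUANG Shaobin (b. 1965), male, professor, doctoral supervisor.
Corresponding author: CHI Ronghua, E-mail: chironghua@hrbeu.edu.cn
Last Update: 2018-06-01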