[1] AN Xianxi, TIAN Yingxin, GUO Ziyang, et al. Entity identification based on trade name in e-commerce-based Web2.0[J]. Journal of Harbin Engineering University, 2019, 40(07):1334-1339. [doi:10.11990/jheu.201903065]

Entity identification based on trade name in e-commerce-based Web2.0

Journal of Harbin Engineering University [ISSN:1006-6977/CN:61-1281/TN]

Volume: 40
Issue: 2019, No. 07
Pages: 1334-1339
Column:
Publication date: 2019-07-05

Article Info

Title:
Entity identification based on trade name in e-commerce-based Web2.0
Author(s):
AN Xianxi1 TIAN Yingxin2 GUO Ziyang2 SHI Shengfei2
1. School of Economics and Management, Harbin Engineering University, Harbin 150001, Heilongjiang, China;
2. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, Heilongjiang, China
Keywords:
entity; entity identification; e-commerce; algorithm; database; semantics; trade description; data model
CLC number:
TP31
DOI:
10.11990/jheu.201903065
Document code:
A
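Abstract:
With the emergence of Web 2.0, e-commerce data are often entered by different websites and different users, so the same product ends up with multiple descriptions. This makes it difficult for users to search for and compare products. To address this, the paper classifies products using their product-name information so that each class describes a single real-world product. The proposed system splits each product name into a set of keywords and classifies products by the similarity of their keyword sets. The paper studies keyword splitting methods, set-based classification methods, keyword weight-setting methods, and relevance feedback. Experimental results show that the proposed method classifies products quickly and effectively, and that the weight-setting and relevance feedback strategies effectively improve the accuracy of entity identification.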
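The pipeline outlined in the abstract (split names into keyword sets, weight the keywords, group by set similarity) can be illustrated with a minimal sketch. The abstract does not name a tokenizer, a similarity measure, or a threshold, so everything below is an assumption made for illustration: the functions `split_keywords`, `weighted_jaccard`, and `group_products` are hypothetical, weighted Jaccard stands in for the paper's unspecified similarity, and the regular-expression tokenizer only handles Latin letters and digits (Chinese product names would need a word segmenter). Relevance feedback is not modeled.

```python
import re


def split_keywords(name):
    """Split a product name into a set of lowercase keywords.

    Assumed tokenizer: runs of ASCII letters and digits only.
    """
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def weighted_jaccard(a, b, weights, default=1.0):
    """Weighted Jaccard similarity: weight of shared keywords / weight of all keywords."""
    union = a | b
    if not union:
        return 0.0
    shared = sum(weights.get(k, default) for k in a & b)
    total = sum(weights.get(k, default) for k in union)
    return shared / total


def group_products(names, weights=None, threshold=0.5):
    """Greedily assign each name to the most similar existing group.

    Each group is represented by the keyword set of its first member.
    A name that matches no group at or above the threshold starts a new group.
    """
    weights = weights or {}
    groups = []  # list of (representative keyword set, member names)
    for name in names:
        keywords = split_keywords(name)
        best, best_sim = None, threshold
        for group in groups:
            sim = weighted_jaccard(keywords, group[0], weights)
            if sim >= best_sim:
                best, best_sim = group, sim
        if best is None:
            groups.append((keywords, [name]))
        else:
            best[1].append(name)
    return [members for _, members in groups]


if __name__ == "__main__":
    listings = [
        "Apple iPhone 8 64GB",
        "iphone 8 apple 64gb black",
        "Samsung Galaxy S9",
    ]
    # Down-weighting a low-information keyword such as a color is one
    # possible weighting choice, used here purely as an example.
    print(group_products(listings, weights={"black": 0.1}))
```

In this sketch, lowering the weight of a keyword that rarely distinguishes products raises the similarity of listings that differ only in that keyword, which is one way weight setting could affect grouping accuracy.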

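Memo:
Received: 2019-03-19.
Funding: National Key Research and Development Program of China (2016YFB1000703); National Natural Science Foundation of China (U1509216, U1866602, 61472099, 61602129).
About the authors: AN Xianxi, male, Ph.D. candidate; SHI Shengfei, male, associate professor.
Corresponding author: AN Xianxi, E-mail: 57729540@qq.com
Last Update: 2019-07-04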