Title: Enrich Web Entity Schema Based on Integrated Annotation
Authors: Zhang, Yan; Li, Qingzhong; Zhang, Yan
Corresponding author: Zhang, Y
Author affiliations: [Zhang, Yan; Li, Qingzhong] Shandong Univ, Sch Comp Sci & Technol, Jinan 250100, Peoples R China.; [Zhang, Yan] Shandong Univ Finance & Econ, Sch Co ...
Conference name: 10th Web Information System and Application Conference (WISA)
Conference date: NOV 01-03, 2013
Source: 2013 10TH WEB INFORMATION SYSTEM AND APPLICATION CONFERENCE (WISA 2013)
Publication year: 2013
Pages: 153-+
DOI: 10.1109/WISA.2013.37
Keywords: web entity; web entity schema; conditional random fields; web entity; annotation
Abstract: Web integration systems (WIS) need to effectively collect web objects belonging to a specific domain from different websites. Most WIS rely on entity schemas defined beforehand by domain experts. Because the web is inherently diverse and variable, it is hard to model a web entity comprehensively in advance; furthermore, wrong annotations occur when object values from different websites are aligned into the WIS. To overcome these limitations, we propose an integrated annotation method that combines a matching strategy with machine learning to dynamically discover synonyms for predefined attribute labels and new attribute labels for a specified type of web entity. Experimental results on real-world data in the book and job domains show that the proposed approach is effective in enriching the web entity schema and thereby enhances the performance of the data collection process in a WIS.
Indexed in: CPCI-S
Resource type: Conference paper