Title: Extracting Dimensions for OLAP on Multidimensional Text Databases
Authors: Zhang, Chao; Wang, Xinjun; Peng, Zhaohui
Author affiliations: [Zhang, Chao; Wang, Xinjun; Peng, Zhaohui] Shandong Univ Jinan, Sch Comp Sci & Technol, Jinan, Peoples R China.
Conference: International Conference on Web Information Systems and Mining (WISM 2011)
Conference date: SEP 24-25, 2011
Source: WEB INFORMATION SYSTEMS AND MINING, PT II
Keywords: OLAP; unstructured data; extracting algorithm
Abstract: As the volume of textual information in business systems and on the Internet grows rapidly, there is increasing demand for analyzing both structured data and unstructured text data. Online Analytical Processing (OLAP) is effective for analyzing and mining structured data, but it is poorly suited to unstructured data. After working on several information integration and data analysis applications, we recognized this shortcoming of OLAP for text data analysis and set out to address it. In this paper, we propose a semi-supervised algorithm that extracts dimensions and their members from textual information, with the goal of analyzing very large sets of textual data. We use straightforward measures to express the analysis results. Experimental results show that the extraction algorithm is valid and that our approach is highly scalable and flexible.