Title: Mining frequent co-occurrence patterns across multiple data streams
Authors: Yu, Ziqiang; Yu, Xiaohui; Liu, Yang; Li, Wenzhu; Pei, Jian
Author affiliations: [Yu, Ziqiang; Yu, Xiaohui; Liu, Yang; Li, Wenzhu] School of Computer Science and Technology, Shandong University, Jinan, China; [Pei, Jian] School of …
Conference name: 18th International Conference on Extending Database Technology, EDBT 2015
Conference date: 23 March 2015 through 27 March 2015
Source: EDBT 2015 - 18th International Conference on Extending Database Technology, Proceedings
Abstract: This paper studies the problem of mining frequent co-occurrence patterns across multiple data streams, a problem that existing work has not addressed. In this context, a co-occurrence pattern refers to the case in which the same group of objects appears consecutively in multiple streams over a short time span, signaling tight correlations among these objects. The need to mine such patterns in real time arises in a variety of applications, ranging from crime prevention to location-based services to event discovery in social media. Since data streams are usually fast, continuous, and unbounded, existing frequent-pattern mining methods that require more than one pass over the data cannot be applied directly. Therefore, we propose DIMine and CooMine, two algorithms for discovering frequent co-occurrence patterns across multiple data streams. DIMine is an Apriori-style algorithm based on an inverted index, while CooMine uses an in-memory data structure called the Seg-tree to compactly index data that have been seen but have not yet expired. CooMine employs a one-pass algorithm that uses a filter-and-refine strategy to obtain the co-occurrence patterns from the Seg-tree as updates to the streams arrive. Extensive experiments on two real datasets demonstrate the superiority of the proposed approaches over a baseline method and show their respective applicability in different scenarios. © 2015, Copyright is with the authors.