Title: Popularity Prediction in Microblogging Network
Authors: Gao, Shuai; Ma, Jun; Chen, Zhumin
Author affiliations: [Gao, Shuai; Ma, Jun; Chen, Zhumin] Shandong Univ, Sch Comp Sci & Technol, Jinan 250100, Peoples R China.
Conference: 16th Asia-Pacific Web Conference (APWeb)
Conference date: SEP 05-07, 2014
Source: WEB TECHNOLOGIES AND APPLICATIONS, APWEB 2014
Keywords: Popularity Prediction; Social Media; Classification; Information diffusion
Abstract: Popularity prediction in microblogging networks aims to predict the future popularity of a tweet from observations made in its early stages. Existing studies have investigated many features for prediction. However, features of the users who have the potential to retweet a tweet have not been fully explored for this problem. Also, the impact of a tweet's post time on its early-stage popularity has been neglected. To address these issues, we study two prediction tasks in this paper: predicting the popularity of a tweet from observations in the first hour after it is posted (PP1H), or from its first k retweets (PPkR). We investigate a wide spectrum of features to identify the effective ones for each task. We extract structural features, including retweet network features and border network features, from the underlying user network, and temporal features from the observed retweets. To mitigate the impact of a tweet's post time on its early-stage popularity, we introduce the notion of tweet time and use it to measure the temporal features. We treat both prediction tasks as classification problems and apply five standard classifiers (naive Bayes, k-nearest neighbors, support vector machine, logistic regression, and bagging decision trees). Experiments on Sina Weibo show that for the PP1H task, bagging decision trees with all features yield the best performance, and border network features outperform the other feature groups. For the PPkR task, we find that satisfactory prediction performance can be obtained from only the temporal features of the first 10 retweets. Further, introducing tweet time significantly improves the prediction performance of temporal features.
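As a rough illustration of the PPkR setting, the sketch below derives a few temporal features from the first k retweets of a tweet. The function name `first_k_temporal_features` and the four features it returns are hypothetical choices made for this example; the abstract does not list the paper's actual features. Timestamps here are plain wall-clock seconds, not the paper's "tweet time", whose definition the abstract does not give.

```python
from statistics import mean


def first_k_temporal_features(post_time, retweet_times, k=10):
    """Return illustrative temporal features of the first k retweets.

    Hypothetical feature set: the abstract does not specify the
    features used in the paper. All times are in seconds.
    """
    first_k = sorted(retweet_times)[:k]
    if len(first_k) < k:
        raise ValueError("fewer than k retweets observed")

    # Gap from the post to the first retweet, then between
    # each pair of consecutive retweets.
    previous = [post_time] + first_k[:-1]
    gaps = [later - earlier for earlier, later in zip(previous, first_k)]

    return {
        "time_to_first": gaps[0],
        "time_to_kth": first_k[-1] - post_time,
        "mean_gap": mean(gaps),
        "max_gap": max(gaps),
    }


if __name__ == "__main__":
    # Toy data: a tweet posted at t=0 with three observed retweets.
    print(first_k_temporal_features(0, [30, 10, 20], k=3))
```

A feature vector of this kind, computed per tweet, could then be passed to any of the five classifiers named in the abstract.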