Title: Big datasets for research: A survey on flagship conferences
Authors: Wei, Yi; Liu, Shijun; Sun, Jiao; Cui, Lizhen; Pan, Li; Wu, Lei
Author affiliations: [Wei, Yi; Liu, Shijun; Sun, Jiao; Cui, Lizhen; Pan, Li; Wu, Lei] Shandong Univ, Sch Comp Sci & Technol, Jinan 250101, Peoples R China.
Conference name: IEEE International Congress on Big Data (BigData Congress)
Conference date: JUN 27-JUL 02, 2016
Source: 2016 IEEE INTERNATIONAL CONGRESS ON BIG DATA - BIGDATA CONGRESS 2016
Keywords: big data; datasets; survey
Abstract: Big data brings new opportunities to discover valuable information. Big datasets are correspondingly powerful tools for scholars, connecting theoretical studies to reality: they help scholars evaluate their results and identify new problems. In recent years, research data repositories and registries have grown significantly. However, these infrastructures are fragmented across institutions, countries, and research domains, so finding research datasets is not a trivial task for many researchers. We therefore investigated 195 big data papers from several notable international conferences over the past 3 years and gathered the 285 datasets mentioned in them. In this paper, we present and analyze our survey results on the status quo of big data research and datasets from different aspects. In particular, we propose two different taxonomies of big datasets and classify the surveyed datasets under them. We also briefly introduce 7 widely accepted online data collections. Finally, we give some basic principles for scholars choosing and using big datasets.