Title: A Market-Oriented Heuristic Algorithm for Scheduling Parallel Applications in Big Data Service Platform
Authors: Shao, Qingshi; Liu, Shijun; Pan, Li; Yang, Chenglei; Niu, Tingting
Author affiliations: [Shao, Qingshi; Liu, Shijun; Pan, Li; Yang, Chenglei] School of Computer Science and Technology, Shandong University, Jinan; 250101, China; [Niu, Ting ...
Conference name: 42nd IEEE Computer Software and Applications Conference, COMPSAC 2018
Conference date: 23 July 2018 through 27 July 2018
Source: Proceedings - International Computer Software and Applications Conference
Keywords: Big Data analytics; Cloud service; Heuristic job scheduling; Market-oriented
Abstract: A Big Data analytics service platform delivers a new type of public cloud offering, through which end users can outsource their job executions to a group of professional Big Data processing services in a pay-per-use way. Unlike other types of cloud services, data processing services are dominated by parallel jobs, whose execution time can vary greatly with different runtime configurations, such as different degrees of parallelism. In such a market-oriented environment, scheduling end users' jobs efficiently to optimize the platform's revenue is a challenging task. In this paper, we propose a market-oriented heuristic algorithm, with admission control, for scheduling parallel jobs in a Big Data analytics service platform to optimize the platform operator's revenue. The proposed scheduling heuristic takes into account not only the dynamic revenue gained from completing a job within a specific runtime and the resources consumed to achieve that runtime, but also the potential loss incurred by running this job instead of other jobs currently waiting in the system. We also propose a collaborative filtering based approach to quickly and accurately predict the execution time of parallel jobs running in a Big Data analytics service platform. We have conducted extensive experiments and simulations based on workload data derived from a real-world data analytics service platform and parallel applications. The results show that our scheduler outperforms the scheduling algorithms used for comparison, which are based on classical heuristics from the literature, demonstrating the effectiveness of our market-oriented heuristic scheduling algorithm. © 2018 IEEE.
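Illustrative note: the abstract names three ingredients of the scheduling heuristic (revenue tied to the achieved runtime, the cost of the resources used, and the potential loss imposed on other waiting jobs) plus admission control, but it gives no formulas. The Python sketch below is one possible reading of that idea, not the authors' algorithm. All names (`Option`, `Job`, `net_gain`, `best_gain`, `pick_next`), the linear cost model `unit_cost * parallelism * runtime`, and the way the loss is estimated are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One hypothetical runtime configuration of a job (assumed model)."""
    parallelism: int   # number of resource slots the job would occupy
    runtime: float     # predicted execution time at that parallelism
    revenue: float     # payment earned if the job finishes within `runtime`


@dataclass(frozen=True)
class Job:
    name: str
    options: tuple     # tuple of Option


def net_gain(option, unit_cost):
    """Revenue minus an assumed linear resource cost (slots * time * price)."""
    return option.revenue - unit_cost * option.parallelism * option.runtime


def best_gain(job, slots, unit_cost):
    """Best net gain the job could reach within `slots`; 0.0 if none is profitable."""
    gains = [net_gain(o, unit_cost) for o in job.options if o.parallelism <= slots]
    return max(0.0, max(gains, default=0.0))


def pick_next(waiting, free_slots, unit_cost):
    """Return (job, option, score) with the highest score, or None if nothing is admitted.

    score = net gain of the option minus the largest gain reduction it causes
    for any other waiting job by shrinking the slots left over (assumed loss model).
    """
    best = None
    for job in waiting:
        others = [j for j in waiting if j is not job]
        for opt in job.options:
            if opt.parallelism > free_slots:
                continue                      # does not fit right now
            gain = net_gain(opt, unit_cost)
            if gain <= 0:
                continue                      # admission control: reject unprofitable runs
            remaining = free_slots - opt.parallelism
            loss = max(
                (best_gain(j, free_slots, unit_cost) - best_gain(j, remaining, unit_cost)
                 for j in others),
                default=0.0,
            )
            score = gain - loss
            if best is None or score > best[2]:
                best = (job, opt, score)
    return best
```

Under this assumed model, a job may be started at a lower degree of parallelism than its individually most profitable one when doing so leaves room for another waiting job, which is the trade-off the abstract describes in words.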