Title: GCMR: A GPU Cluster-based MapReduce Framework for Large-scale Data Processing
Authors: Guo, Yiru; Liu, Weiguo; Gong, Bin; Voss, Gerrit; Mueller-Wittig, Wolfgang
Corresponding author: Liu, WG
Author affiliations: [Guo, Yiru; Liu, Weiguo; Gong, Bin] Shandong Univ, Sch Comp Sci & Technol, Jinan 250100, Peoples R China.; [Voss, Gerrit; Mueller-Wittig, Wolfgang] …
Conference: 15th IEEE International Conference on High Performance Computing and Communications (HPCC) / 11th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC)
Conference date: Nov 13-15, 2013
Source: 2013 IEEE 15TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2013 IEEE INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (HPCC_EUC)
Publication year: 2013
Pages: 580-586
DOI: 10.1109/HPCC.and.EUC.2013.88
Keywords: MapReduce; CUDA; MPI; GPU Cluster
Abstract: MapReduce is a popular programming model for parallel and distributed large-scale data processing. There have been many efforts to implement this model on commodity GPU-based systems. However, most of these implementations work only on a single GPU and cannot be used to process large-scale datasets. In this paper, we present a new approach to designing a MapReduce framework on GPU clusters for large-scale data processing. We use the Compute Unified Device Architecture (CUDA) and MPI parallel programming models to implement this framework. To derive an efficient mapping onto GPU clusters, we introduce a two-level parallelization approach: inter-node and intra-node parallelization. Furthermore, to improve overall MapReduce efficiency, a multi-threading scheme is used to overlap communication and computation on a multi-GPU node. Compared to previous GPU-based MapReduce implementations, our implementation, called GCMR, achieves speedups of up to 2.6 on a single node and up to 9.1 on 4 nodes of a Tesla S1060 quad-GPU cluster system when processing small datasets. It also shows very good scalability when processing large-scale datasets on the cluster system.
Indexed in: CPCI-S
WOS Core Collection citation count: 1
Resource type: Conference paper