Title: Redesigning LAMMPS for peta-scale and hundred-billion-atom simulation on Sunway TaihuLight
Authors: Duan, Xiaohui; Gao, Ping; Zhang, Tingjian; Zhang, Meng; Liu, Weiguo; Zhang, Wusheng; Xue, Wei; Fu, Haohuan; Gan, Lin; Chen, Dexun; Meng, Xiangxu; Yang …
Author affiliations: [Duan, Xiaohui; Gao, Ping; Zhang, Tingjian; Zhang, Meng; Liu, Weiguo; Meng, Xiangxu] School of Software, Shandong University, Jinan, China; [Fu, Haohu …
Conference name: 2018 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018
Conference date: 11 November 2018 through 16 November 2018
Source: Proceedings - International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018
Abstract: Large-scale molecular dynamics (MD) simulations on supercomputers play an increasingly important role in many research areas. In this paper, we present our efforts to redesign the widely used LAMMPS MD simulator for the Sunway TaihuLight supercomputer and its ShenWei many-core architecture (SW26010). The memory constraints of the SW26010 pose a number of new challenges for an efficient MD implementation. To overcome these constraints, we employ four levels of optimization: (1) a hybrid memory update strategy; (2) a software cache strategy; (3) customized transcendental math functions; and (4) full pipeline acceleration. Furthermore, we redesign the code to enable all possible vectorization. Experiments show that our redesigned software on a single SW26010 processor can outperform over 100 E5-2650 v2 cores running the latest stable release (11Aug17) of LAMMPS. We also achieve a performance of over 2.43 PFlops for a Tersoff simulation when using 16,384 nodes on Sunway TaihuLight. © 2018 IEEE.