parallel algorithms, Weighted Essentially Non-Oscillatory scheme (WENO), optimization, many-core, Sunway TaihuLight
A Weighted Essentially Non-Oscillatory scheme (WENO) is a solution to hyperbolic conservation laws, suitable for solving high-density fluid interface instability with strong intermittency. These problems have a large and complex flow structure. To fully utilize the computing power of High Performance Computing (HPC) systems, it is necessary to develop specific methodologies to optimize the performance of applications based on the particular system’s architecture. The Sunway TaihuLight supercomputer is currently ranked as the fastest supercomputer in the world. This article presents a heterogeneous parallel algorithm design and performance optimization of a high-order WENO on Sunway TaihuLight. We analyzed characteristics of kernel functions, and proposed an appropriate heterogeneous parallel model. We also figured out the best division strategy for computing tasks, and implemented the parallel algorithm on Sunway TaihuLight. By using access optimization, data dependency elimination, and vectorization optimization, our parallel algorithm can achieve up to 172× speedup on one single node, and additional 58× speedup on 64 nodes, with nearly linear scalability.
Tsinghua University Press
Jianqiang Huang, Wentao Han, Xiaoying Wang et al. Heterogeneous Parallel Algorithm Design and Performance Optimization for WENO on the Sunway TaihuLight Supercomputer. Tsinghua Science and Technology 2020, 25(1): 56-67.