Empirical Analysis of HPC Using Different Programming Models

Full Text (PDF, 992KB), PP.27-34

Author(s)

Muhammad Usman Ashraf 1,* Fadi Fouz 1 Fathy Alboraei Eassa 1

1. Department of Computer Science, King Abdulaziz University, Jeddah, Saudi Arabia

* Corresponding author.

DOI: https://doi.org/10.5815/ijmecs.2016.06.04

Received: 12 Feb. 2016 / Revised: 5 Mar. 2016 / Accepted: 28 Apr. 2016 / Published: 8 Jun. 2016

Index Terms

HPC, GPU, DDM, CUDA, OpenMP, MPI, Parallel programming

Abstract

During the last decade, heterogeneous systems have emerged for high performance computing (HPC) [1]. To achieve HPC, existing technologies and programming models are growing rapidly toward intra-node parallelism [2]. Current high-performance systems and applications demand a massive level of computation power. In the last few years, the graphics processing unit (GPU) has been introduced as an alternative to the conventional CPU for highly parallel computing applications, both general purpose and graphics processing. Rather than coding algorithms serially for a single CPU, many multithreading programming models, such as CUDA, OpenMP, and MPI, have been introduced to enable parallel processing on multiple cores. These parallel programming models support the data-driven multithreading (DDM) principle [3]. In this paper, we present a preliminary performance-based evaluation of these programming models and compare them with conventional single-CPU serial processing. For the evaluation, we implemented a computationally massive operation, a complex matrix multiplication. We ran the evaluation on a data-driven multithreaded HPC system and present the results with a comprehensive analysis of these parallel programming models for HPC parallelism.
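To make the serial-versus-parallel comparison concrete, the sketch below contrasts a triple-loop matrix multiplication with a version that splits the output rows into independent blocks, which is the kind of data-parallel decomposition that OpenMP, CUDA, and MPI implementations of this operation typically rely on. It is an illustration only, not the code evaluated in the paper: the function names (`matmul_serial`, `matmul_row_parallel`), the `workers` parameter, and the use of a Python thread pool as a stand-in for those programming models are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul_serial(a, b):
    """Triple-loop product of two matrices stored as lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            total = 0.0
            for k in range(m):
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


def _row_block(a, b, start, stop):
    """Compute output rows start..stop-1; blocks do not depend on each other."""
    return matmul_serial(a[start:stop], b)


def matmul_row_parallel(a, b, workers=4):
    """Hypothetical helper: one task per contiguous block of output rows."""
    n = len(a)
    step = max(1, -(-n // workers))  # ceiling division, at least one row
    bounds = [(s, min(s + step, n)) for s in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        blocks = pool.map(lambda se: _row_block(a, b, se[0], se[1]), bounds)
        # pool.map preserves task order, so rows come back in order.
        return [row for block in blocks for row in block]


if __name__ == "__main__":
    size = 8
    a = [[float(i + j) for j in range(size)] for i in range(size)]
    b = [[float(i * j + 1) for j in range(size)] for i in range(size)]
    print(matmul_serial(a, b) == matmul_row_parallel(a, b, workers=3))
```

In standard CPython builds the global interpreter lock prevents threads from speeding up this pure-Python arithmetic, so the sketch shows only how the work is partitioned. Measuring real speedups requires a native implementation in one of the models named in the abstract.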

Cite This Paper

Muhammad Usman Ashraf, Fadi Fouz, Fathy Alboraei Eassa, "Empirical Analysis of HPC Using Different Programming Models", International Journal of Modern Education and Computer Science (IJMECS), Vol.8, No.6, pp.27-34, 2016. DOI:10.5815/ijmecs.2016.06.04

References

[1]Jia, Xun, Peter Ziegenhein, and Steve B. Jiang. "GPU-based high-performance computing for radiation therapy." Physics in medicine and biology 59.4 (2014): R151.

[2]Brooks, Alex, et al. "PPL: An abstract runtime system for hybrid parallel programming." Proceedings of the First International Workshop on Extreme Scale Programming Models and Middleware. ACM, 2015.

[3]Allan, Robert John, et al., eds. High-performance computing. Springer Science & Business Media, 2012.

[4]Brodman, James, and Peng Tu, eds. Languages and Compilers for Parallel Computing: 27th International Workshop, LCPC 2014, Hillsboro, OR, USA, September 15-17, 2014, Revised Selected Papers. Vol. 8967. Springer, 2015.

[5]Yang, Chao-Tung, Chih-Lin Huang, and Cheng-Fang Lin. "Hybrid CUDA, OpenMP, and MPI parallel programming on multicore GPU clusters." Computer Physics Communications 182.1 (2011): 266-269.

[6]Navarro, Cristobal A., Nancy Hitschfeld-Kahler, and Luis Mateu. "A survey on parallel computing and its applications in data-parallel problems using GPU architectures." Communications in Computational Physics 15.02 (2014): 285-329.

[7]Kirk, David B., and Wen-mei W. Hwu. Programming massively parallel processors: a hands-on approach. Newnes, 2012.

[8]Christofi, Constantinos, et al. "Exploring HPC parallelism with data-driven multithreading." Data-Flow Execution Models for Extreme Scale Computing (DFM), 2012. IEEE, 2012.

[9]E. Agullo, J. Dongarra, B. Hadri, J. Kurzak, J. Langou, J. Langou, and H. Ltaief. PLASMA Users Guide. Technical report, ICL, UTK, 2009.

[10]Diavastos, Andreas, Giannos Stylianou, and Pedro Trancoso. "TFluxSCC: Exploiting Performance on Future Many-Core Systems through Data-Flow." 2015 23rd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP). IEEE, 2015.

[11]Perez J.M., Badia R.M., Labarta J.: A dependency-aware task-based programming environment for multi-core architectures. In Proceedings of 2008 IEEE International Conference on Cluster Computing, 2008.

[12]Yang, Chao-Tung, Chih-Lin Huang, and Cheng-Fang Lin. "Hybrid CUDA, OpenMP, and MPI parallel programming on multicore GPU clusters." Computer Physics Communications 182.1 (2011): 266-269.

[13]Ashraf, Muhammad Usman, and Fathy Elbouraey Eassa. "Hybrid Model Based Testing Tool Architecture for Exascale Computing System." International Journal of Computer Science and Security (IJCSS) 9.5 (2015): 245.

[14]Zotos, Kostas, et al. "Object-Oriented Analysis of Fibonacci Series Regarding Energy Consumption."

[15]Diaz, Javier, Camelia Munoz-Caro, and Alfonso Nino. "A survey of parallel programming models and tools in the multi and many-core era." Parallel and Distributed Systems, IEEE Transactions on 23.8 (2012): 1369-1386.

[16]Goodrich, Michael T., and Roberto Tamassia. Algorithm design and applications. Wiley Publishing, 2014.

[17]Coppersmith, Don, and Shmuel Winograd. "Matrix multiplication via arithmetic progressions." Proceedings of the nineteenth annual ACM symposium on Theory of computing. ACM, 1987.

[18]"NVIDIA" http://www.nvidia.com/tesla, Mar 2014 [Nov. 25, 2015].

[19]Visual Studio. "Debugging DirectX Graphics." (2013).

[20]Thomas, W., and R.D. Daruwala. "Performance comparison of CPU and GPU on a discrete heterogeneous architecture." 2014 International Conference on Circuits, Systems, Communication and Information Technology Applications (CSCITA), pp.271-276, 4-5 April 2014.

[21]Patterson, David A., and John L. Hennessy. Computer organization and design: the hardware/software interface. Newnes, 2013.

[22]T.G. Mattson, B.A. Sanders, and B. Massingill, Patterns for Parallel Programming. Addison-Wesley Professional, 2005.

[23]Da Costa, Georges, et al. "Exascale Machines Require New Programming Paradigms and Runtimes." Supercomputing Frontiers and Innovations 2 (2015): 6-27.

[24]Chung, T. J. Computational fluid dynamics. Cambridge university press, 2010.

[25]Bosilca, George, et al. "DAGuE: A generic distributed DAG engine for high performance computing." Parallel Computing 38.1 (2012): 37-51.

[26]Goff, Stephen A., et al. "The iPlant collaborative: cyberinfrastructure for plant biology." Frontiers in plant science 2 (2011).

[27]Su, Mehmet F., et al. "A novel FDTD application featuring OpenMP-MPI hybrid parallelization." Parallel Processing, 2004. ICPP 2004. International Conference on. IEEE, 2004.