Sanjay H. A

Work place: Advance computing research, Nitte Meenakshi Institute of Technology, Bangalore-560064, India

E-mail: sanjay.ha@nmit.ac.in

Research Interests: Parallel Computing, Distributed Computing, Solid Modeling, Computer systems and computational processes

Biography

H. A. Sanjay is a professor and head of the department at Nitte Meenakshi Institute of Technology. He received the BE degree in Electrical and engineering from Kuvempu University, India, in 1996, and the M.Tech degree in computer science and engineering from Visvesvaraya Technological University, India, in 2001. He obtained a PhD at the Supercomputer Education and Research Centre at IISc Bangalore, India, in 2008. His research interests include grid computing, parallel and distributed systems, and performance modeling of parallel applications. He has published papers in peer-reviewed journals and conference proceedings.

Author Articles
Performance Framework for HPC Applications on Homogeneous Computing Platform

By Chandrashekhar B. N, Sanjay H. A

DOI: https://doi.org/10.5815/ijigsp.2019.08.03, Pub. Date: 8 Aug. 2019

In scientific fields, central processing units (CPUs) alone cannot meet the computational requirements of large and complex problems. In this work we consider a homogeneous cluster in which every node has a CPU and a graphics processing unit (GPU) of the same capability. Normally the CPU is used only to control the GPU and to transfer data from the CPU to the GPUs. Here we combine the computing power of the CPU with that of the GPU to run high performance computing (HPC) applications. The framework adopts the pinned memory technique to overcome the overhead of data transfer between CPU and GPU. To enable the homogeneous platform we adopt a hybrid programming model that combines Message Passing Interface (MPI), OpenMP (Open Multi-Processing), and Compute Unified Device Architecture (CUDA). The key challenge on the homogeneous platform is the allocation of workload among CPU and GPU cores. To address this challenge we propose a novel analytical workload division strategy that predicts an effective workload division between the CPU and GPU. Using our hybrid programming model and workload division strategy, we observed an average performance improvement of 76.06% and 84.11% in giga floating point operations per second (GFLOPS) on an NVIDIA TESLA M2075 cluster and on the NVIDIA QUADRO K 2000 nodes of a cluster, respectively, for N-dynamic vector addition when compared with the performance models of Simplice Donfack et al. [5]. In addition, using the pinned memory technique with the hybrid programming model, we observed an average performance improvement of 33.83% and 39.00% on NVIDIA TESLA M2075 and NVIDIA QUADRO K 2000, respectively, for saxpy applications when compared with the pageable memory technique.
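To make the idea of dividing work between CPU and GPU concrete, the sketch below splits a vector addition in proportion to assumed throughput figures. It is not the analytical strategy proposed in the article, which the abstract does not spell out. The names `split_workload` and `hybrid_vector_add` and the `cpu_rate` and `gpu_rate` parameters are hypothetical, and both halves run as plain Python here as stand-ins for OpenMP threads and a CUDA kernel.

```python
def split_workload(n_items, cpu_rate, gpu_rate):
    """Split n_items between CPU and GPU in proportion to throughput.

    cpu_rate and gpu_rate are assumed, pre-measured throughputs
    (for example, elements per second); they are illustrative inputs,
    not values from the article.
    """
    if n_items < 0:
        raise ValueError("n_items must be non-negative")
    if cpu_rate < 0 or gpu_rate < 0 or cpu_rate + gpu_rate <= 0:
        raise ValueError("rates must be non-negative and not both zero")
    cpu_share = cpu_rate / (cpu_rate + gpu_rate)
    cpu_items = round(n_items * cpu_share)
    # Whatever the CPU does not take goes to the GPU.
    return cpu_items, n_items - cpu_items


def hybrid_vector_add(a, b, cpu_rate, gpu_rate):
    """Add two equal-length vectors, processing each share separately."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    cpu_items, _ = split_workload(len(a), cpu_rate, gpu_rate)
    # Stand-in for the CPU share (OpenMP threads in a real hybrid code).
    cpu_part = [x + y for x, y in zip(a[:cpu_items], b[:cpu_items])]
    # Stand-in for the GPU share (a CUDA kernel in a real hybrid code).
    gpu_part = [x + y for x, y in zip(a[cpu_items:], b[cpu_items:])]
    return cpu_part + gpu_part
```

With a GPU assumed to be four times faster than the CPU, `split_workload(1000, 1.0, 4.0)` assigns 200 elements to the CPU and 800 to the GPU. A real implementation would also have to account for data transfer cost, which is where the pinned memory technique mentioned in the abstract comes in.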
