TY - GEN
T1 - Efficient Matricization of n-D Array with CUDA and Its Evaluation
AU - Shaikh, Md Abu Hanif
AU - Hasan, K. M. Azharul
AU - Ali, G. G. Md Nawaz
AU - Chafii, Marwa
AU - Chong, Peter Han Joo
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2017/7/14
Y1 - 2017/7/14
N2 - Scientific and engineering computing requires operations on very large volumes of data with a very high number of dimensions. The traditional multidimensional array is widely used for implementing higher-dimensional data, but its performance diminishes as the number of dimensions increases. On the other hand, the traditional row-column view is easy to implement, imagine and visualize. This paper details a representation scheme for higher-dimensional arrays with a row-column abstraction in a parallel environment. Odd dimensions contribute along the row direction and even dimensions along the column direction, which gives a lower cost of index computation, higher data locality and parallelism. Each 2-D block of size blockIdx.x × threadIdx.x is independent of the others. Theoretically, the scheme has no limitation on the number of dimensions, and the mapping algorithm is unique for any number of dimensions. Performance of the proposed matricization is measured with matrix-matrix addition, subtraction and multiplication operations. Experimental results show promising performance improvement over the Traditional Multidimensional Array (TMA) and the Extended Karnaugh Map Representation (EKMR). Thus, the scheme can be used for implementing higher-dimensional arrays in both general-purpose and scientific computing on GPUs.
AB - Scientific and engineering computing requires operations on very large volumes of data with a very high number of dimensions. The traditional multidimensional array is widely used for implementing higher-dimensional data, but its performance diminishes as the number of dimensions increases. On the other hand, the traditional row-column view is easy to implement, imagine and visualize. This paper details a representation scheme for higher-dimensional arrays with a row-column abstraction in a parallel environment. Odd dimensions contribute along the row direction and even dimensions along the column direction, which gives a lower cost of index computation, higher data locality and parallelism. Each 2-D block of size blockIdx.x × threadIdx.x is independent of the others. Theoretically, the scheme has no limitation on the number of dimensions, and the mapping algorithm is unique for any number of dimensions. Performance of the proposed matricization is measured with matrix-matrix addition, subtraction and multiplication operations. Experimental results show promising performance improvement over the Traditional Multidimensional Array (TMA) and the Extended Karnaugh Map Representation (EKMR). Thus, the scheme can be used for implementing higher-dimensional arrays in both general-purpose and scientific computing on GPUs.
KW - Array operations
KW - CUDA
KW - GPU
KW - High Performance Computing
KW - Matrix Operation
KW - Matrix-Matrix Multiplication
KW - Multidimensional Array
KW - Parallel Computing
UR - http://www.scopus.com/inward/record.url?scp=85026664890&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85026664890&partnerID=8YFLogxK
U2 - 10.1109/CSE-EUC-DCABES.2016.192
DO - 10.1109/CSE-EUC-DCABES.2016.192
M3 - Conference contribution
AN - SCOPUS:85026664890
T3 - Proceedings - 19th IEEE International Conference on Computational Science and Engineering, 14th IEEE International Conference on Embedded and Ubiquitous Computing and 15th International Symposium on Distributed Computing and Applications to Business, Engineering and Science, CSE-EUC-DCABES 2016
SP - 246
EP - 252
BT - Proceedings - 19th IEEE International Conference on Computational Science and Engineering, 14th IEEE International Conference on Embedded and Ubiquitous Computing and 15th International Symposium on Distributed Computing and Applications to Business, Engineering and Science, CSE-EUC-DCABES 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 19th IEEE International Conference on Computational Science and Engineering, 14th IEEE International Conference on Embedded and Ubiquitous Computing and 15th International Symposium on Distributed Computing and Applications to Business, Engineering and Science, CSE-EUC-DCABES 2016
Y2 - 24 August 2016 through 26 August 2016
ER -