Generative Adversarial Networks (GANs) have gained prominence because of their powerful unsupervised learning capability and their wide range of applications in data generation, for example, text-to-image synthesis, synthetic medical data generation, video generation, and artwork generation. Hardware acceleration for GANs becomes challenging due to their intrinsically complex computational phases, which require efficient data management during both training and inference. In this work, we propose a distributed on-chip memory architecture that aims to efficiently handle the data for the complex computations involved in GANs, such as strided convolution and transposed convolution. We also propose a controller that improves computational efficiency by pre-arranging the data from either the off-chip memory or the computational units before storing it in the on-chip memory. Our architectural enhancement achieves a 3.65x performance improvement over the state of the art, and reduces the number of read accesses and write accesses by 85% and 75%, respectively.
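To illustrate why transposed convolution stresses memory access patterns, the sketch below implements it via the standard zero-insertion view: zeros are interleaved between input elements before an ordinary convolution, producing the scattered, non-contiguous accesses that the proposed data pre-arrangement targets. This is a generic NumPy illustration of the operation itself, not the paper's hardware design; the function name and shapes are illustrative assumptions.

```python
import numpy as np

def transposed_conv2d(x, k, stride=2):
    # Zero-insertion view of transposed (fractionally strided) convolution:
    # interleave (stride - 1) zeros between input elements, pad, then slide
    # the kernel as in a plain convolution. The strided writes into `up`
    # make the irregular memory-access pattern explicit.
    h, w = x.shape
    up = np.zeros((h + (h - 1) * (stride - 1),
                   w + (w - 1) * (stride - 1)))
    up[::stride, ::stride] = x          # scattered (strided) writes
    kh, kw = k.shape
    up = np.pad(up, kh - 1)             # full padding around the upsampled map
    oh = up.shape[0] - kh + 1
    ow = up.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(up[i:i + kh, j:j + kw] * k)
    return out

# A 2x2 input with a 3x3 kernel and stride 2 yields a 5x5 output,
# matching the usual size formula (h - 1) * stride + kh.
y = transposed_conv2d(np.ones((2, 2)), np.ones((3, 3)), stride=2)
```

Each input element contributes a full copy of the kernel to an overlapping output region, which is exactly the overlap that makes naive on-chip buffering of these computations inefficient.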