TY - JOUR
T1 - CapsBeam
T2 - Accelerating Capsule Network-Based Beamformer for Ultrasound Nonsteered Plane-Wave Imaging on Field-Programmable Gate Array
AU - Rahoof, Abdul
AU - Chaturvedi, Vivek
AU - Raveendranatha Panicker, Mahesh
AU - Shafique, Muhammad
N1 - Publisher Copyright: © 1993-2012 IEEE.
PY - 2025
Y1 - 2025
AB - In recent years, there has been a growing trend in accelerating computationally complex nonreal-time beamforming algorithms in ultrasound imaging using deep learning models. However, due to their large size and complexity, these state-of-the-art deep learning techniques pose significant challenges when deployed on resource-constrained edge devices. In this work, we propose a novel capsule network-based beamformer called CapsBeam, designed to operate on raw radio frequency data and provide an envelope of beamformed data through nonsteered plane-wave insonification. In experiments on in vivo data, CapsBeam reduced artifacts compared to standard delay-and-sum (DAS) beamforming. For in vitro data, CapsBeam demonstrated a 32.31% increase in contrast, along with gains of 16.54% and 6.7% in axial and lateral resolution, respectively, compared to DAS. Similarly, in silico data showed a 26% enhancement in contrast, along with improvements of 13.6% and 21.5% in axial and lateral resolution, respectively, compared to DAS. To reduce parameter redundancy and enhance computational efficiency, we pruned the model using our multilayer look-ahead kernel pruning (LAKP-ML) methodology, achieving a compression ratio of 85% without affecting image quality. Additionally, the hardware complexity of the proposed model is reduced through quantization, simplification of nonlinear operations, and parallelization of operations. Finally, we propose a specialized accelerator architecture for the pruned and optimized CapsBeam model, implemented on a Xilinx ZU7EV FPGA. The proposed accelerator achieved a throughput of 30 GOPS for the convolution operation and 17.4 GOPS for the dynamic routing operation.
KW - Beamforming
KW - capsule network
KW - field-programmable gate array (FPGA)
KW - hardware accelerator
KW - reconstruction
KW - ultrasound imaging
UR - http://www.scopus.com/inward/record.url?scp=105003654818&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=105003654818&partnerID=8YFLogxK
U2 - 10.1109/TVLSI.2025.3559403
DO - 10.1109/TVLSI.2025.3559403
M3 - Article
AN - SCOPUS:105003654818
SN - 1063-8210
VL - 33
SP - 1934
EP - 1944
JO - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
JF - IEEE Transactions on Very Large Scale Integration (VLSI) Systems
IS - 7
ER -