TY - GEN
T1 - Saturation RRAM leveraging bit-level sparsity resulting from term quantization
AU - McDanel, Bradley
AU - Kung, H. T.
AU - Zhang, Sai Qian
N1 - Publisher Copyright:
© 2021 IEEE
PY - 2021
Y1 - 2021
N2 - The proposed saturation RRAM for in-memory computing of pre-trained Convolutional Neural Network (CNN) inference imposes a limit on the maximum analog value output from each bitline in order to reduce analog-to-digital (A/D) conversion costs. The proposed scheme uses term quantization (TQ) to enable flexible bit annihilation at any position for a value in the context of a group of weight values in RRAM. This enables a drastic reduction in the required ADC resolution while still maintaining CNN model accuracy. Specifically, we show that the A/D conversion errors after TQ have minimal impact on the classification accuracy of the inference task. For instance, for a 64x64 RRAM, reducing the ADC resolution from 6 bits to 4 bits enables a 1.58x reduction in the total system power, without a significant impact on classification accuracy.
AB - The proposed saturation RRAM for in-memory computing of pre-trained Convolutional Neural Network (CNN) inference imposes a limit on the maximum analog value output from each bitline in order to reduce analog-to-digital (A/D) conversion costs. The proposed scheme uses term quantization (TQ) to enable flexible bit annihilation at any position for a value in the context of a group of weight values in RRAM. This enables a drastic reduction in the required ADC resolution while still maintaining CNN model accuracy. Specifically, we show that the A/D conversion errors after TQ have minimal impact on the classification accuracy of the inference task. For instance, for a 64x64 RRAM, reducing the ADC resolution from 6 bits to 4 bits enables a 1.58x reduction in the total system power, without a significant impact on classification accuracy.
KW - Analog computing
KW - Analog-to-digital conversion (A/D conversion)
KW - Analog-to-digital converter (ADC)
KW - Convolutional neural network (CNN)
KW - Dot-product computation
KW - In-memory computing
KW - Noise
KW - Resistive RAM (RRAM)
UR - http://www.scopus.com/inward/record.url?scp=85108986437&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85108986437&partnerID=8YFLogxK
U2 - 10.1109/ISCAS51556.2021.9401293
DO - 10.1109/ISCAS51556.2021.9401293
M3 - Conference contribution
AN - SCOPUS:85108986437
T3 - Proceedings - IEEE International Symposium on Circuits and Systems
BT - 2021 IEEE International Symposium on Circuits and Systems, ISCAS 2021 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 53rd IEEE International Symposium on Circuits and Systems, ISCAS 2021
Y2 - 22 May 2021 through 28 May 2021
ER -