TY - GEN
T1 - Standard Compliant Video Coding Using Low Complexity, Switchable Neural Wrappers
AU - Hu, Yueyu
AU - Zhang, Chenhao
AU - Guleryuz, Onur G.
AU - Mukherjee, Debargha
AU - Wang, Yao
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - The proliferation of high resolution videos puts great storage and bandwidth pressure on cloud video services, driving the development of next-generation video codecs. Despite great progress made in neural video coding, existing approaches are still far from economical deployment considering the complexity and rate-distortion performance tradeoff. To clear the roadblocks for neural video coding, in this paper we propose a new framework featuring standard compatibility, high performance, and low decoding complexity. We employ a set of jointly optimized neural pre- and post-processors, wrapping a standard video codec, to encode videos at different resolutions. The rate-distortion optimal downsampling ratio is signaled to the decoder at the per-sequence level for each target rate. We design a low complexity neural post-processor architecture that can handle different upsampling ratios. The change of resolution exploits the spatial redundancy in high-resolution videos, while the neural wrapper further achieves rate-distortion performance improvement through end-to-end optimization with a codec proxy. Our light-weight post-processor architecture has a complexity of 516 MACs/pixel, and achieves 9.3% BD-Rate reduction over VVC on the UVG dataset, and 6.4% on AOM CTC Class A1. Our approach has the potential to further advance the performance of the latest video coding standards using neural processing with minimal added complexity.
AB - The proliferation of high resolution videos puts great storage and bandwidth pressure on cloud video services, driving the development of next-generation video codecs. Despite great progress made in neural video coding, existing approaches are still far from economical deployment considering the complexity and rate-distortion performance tradeoff. To clear the roadblocks for neural video coding, in this paper we propose a new framework featuring standard compatibility, high performance, and low decoding complexity. We employ a set of jointly optimized neural pre- and post-processors, wrapping a standard video codec, to encode videos at different resolutions. The rate-distortion optimal downsampling ratio is signaled to the decoder at the per-sequence level for each target rate. We design a low complexity neural post-processor architecture that can handle different upsampling ratios. The change of resolution exploits the spatial redundancy in high-resolution videos, while the neural wrapper further achieves rate-distortion performance improvement through end-to-end optimization with a codec proxy. Our light-weight post-processor architecture has a complexity of 516 MACs/pixel, and achieves 9.3% BD-Rate reduction over VVC on the UVG dataset, and 6.4% on AOM CTC Class A1. Our approach has the potential to further advance the performance of the latest video coding standards using neural processing with minimal added complexity.
KW - Efficient Neural Network
KW - Neural Wrapper
KW - Postprocess
KW - Preprocess
KW - Video Coding
UR - http://www.scopus.com/inward/record.url?scp=85216835295&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85216835295&partnerID=8YFLogxK
U2 - 10.1109/ICIP51287.2024.10647690
DO - 10.1109/ICIP51287.2024.10647690
M3 - Conference contribution
AN - SCOPUS:85216835295
T3 - Proceedings - International Conference on Image Processing, ICIP
SP - 1922
EP - 1928
BT - 2024 IEEE International Conference on Image Processing, ICIP 2024 - Proceedings
PB - IEEE Computer Society
T2 - 31st IEEE International Conference on Image Processing, ICIP 2024
Y2 - 27 October 2024 through 30 October 2024
ER -