TY - GEN
T1 - Making DensePose fast and light
AU - Rakhimov, Ruslan
AU - Bogomolov, Emil
AU - Notchenko, Alexandr
AU - Mao, Fung
AU - Artemov, Alexey
AU - Zorin, Denis
AU - Burnaev, Evgeny
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/1
Y1 - 2021/1
N2 - The DensePose estimation task is a significant step forward for enhancing user experience in computer vision applications ranging from augmented reality to cloth fitting. Existing neural network models capable of solving this task are heavily parameterized and a long way from being transferred to an embedded or mobile device. To enable DensePose inference on the end device with current models, one needs to support an expensive server-side infrastructure and have a stable internet connection. To make things worse, mobile and embedded devices do not always have a powerful GPU inside. In this work, we target the problem of redesigning the DensePose R-CNN model's architecture so that the final network retains most of its accuracy but becomes more lightweight and fast. To achieve that, we tested and incorporated many deep learning innovations from recent years, specifically performing an ablation study on 23 efficient backbone architectures, multiple two-stage detection pipeline modifications, and custom model quantization methods. As a result, we achieved 17× model size reduction and 2× latency improvement compared to the baseline model.
UR - http://www.scopus.com/inward/record.url?scp=85116155425&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85116155425&partnerID=8YFLogxK
U2 - 10.1109/WACV48630.2021.00191
DO - 10.1109/WACV48630.2021.00191
M3 - Conference contribution
AN - SCOPUS:85116155425
T3 - Proceedings - 2021 IEEE Winter Conference on Applications of Computer Vision, WACV 2021
SP - 1868
EP - 1876
BT - Proceedings - 2021 IEEE Winter Conference on Applications of Computer Vision, WACV 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE Winter Conference on Applications of Computer Vision, WACV 2021
Y2 - 5 January 2021 through 9 January 2021
ER -