Motion Adaptive Pose Estimation from Compressed Videos

Zhipeng Fan, Jun Liu, Yao Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Human pose estimation from videos has many real-world applications. Existing methods focus on applying models with a uniform computation profile to fully decoded frames, ignoring the freely available motion signals and motion-compensation residuals in the compressed stream. We propose a novel model, Motion Adaptive Pose Net, that exploits compressed streams to efficiently decode pose sequences from videos. The model incorporates a Motion Compensated ConvLSTM to propagate spatially aligned features, along with an adaptive gate that dynamically determines, based solely on the residual errors, whether computationally expensive features should be extracted from fully decoded frames to compensate for the motion-warped features. Leveraging these informative yet readily available signals from compressed streams, Motion Adaptive Pose Net propagates latent features efficiently. Our model outperforms state-of-the-art models in pose-estimation accuracy on two widely used datasets with only around half the computational complexity.
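The gating idea in the abstract can be illustrated with a minimal sketch: for each frame, the magnitude of the motion-compensation residual read from the compressed stream decides whether to run the expensive feature extractor or to warp the previous frame's features forward. All names, the threshold rule, and the normalization are illustrative assumptions, not the paper's actual implementation.

```python
def motion_adaptive_schedule(residual_energies, threshold=0.5):
    """Return per-frame decisions: 'extract' (full backbone) or 'warp'.

    residual_energies: per-frame mean magnitude of the motion-compensation
    residual from the compressed stream (assumed normalized to [0, 1]).
    This is a hypothetical sketch of the adaptive gate, not the paper's code.
    """
    decisions = []
    for i, energy in enumerate(residual_energies):
        if i == 0 or energy > threshold:
            # First frame or large residual: motion compensation alone
            # would drift, so pay for full feature extraction.
            decisions.append("extract")
        else:
            # Small residual: warp the previous latent features with the
            # stream's motion vectors instead of re-extracting.
            decisions.append("warp")
    return decisions


# Example: only 2 of 5 frames trigger the expensive path.
print(motion_adaptive_schedule([0.0, 0.1, 0.8, 0.2, 0.05]))
# → ['extract', 'warp', 'extract', 'warp', 'warp']
```

Since most inter-coded frames carry small residuals, a schedule like this runs the backbone on only a fraction of frames, which is the source of the roughly 2x computation saving the abstract reports.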

Original language: English (US)
Title of host publication: Proceedings - 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 10
ISBN (Electronic): 9781665428125
State: Published - 2021
Event: 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021 - Virtual, Online, Canada
Duration: Oct 11, 2021 – Oct 17, 2021

Publication series

Name: Proceedings of the IEEE International Conference on Computer Vision
ISSN (Print): 1550-5499


Conference: 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021
City: Virtual, Online

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
