TY - JOUR
T1 - Deep Graph Pose: A semi-supervised deep graphical model for improved animal pose tracking
T2 - 34th Conference on Neural Information Processing Systems, NeurIPS 2020
AU - The International Brain Laboratory
AU - Wu, Anqi
AU - Buchanan, E. Kelly
AU - Whiteway, Matthew R.
AU - Schartner, Michael
AU - Meijer, Guido
AU - Noel, Jean Paul
AU - Rodriguez, Erica
AU - Everett, Claire
AU - Norovich, Amy
AU - Schaffer, Evan
AU - Mishra, Neeli
AU - Salzman, C. Daniel
AU - Angelaki, Dora
AU - Bendesky, Andrés
AU - Cunningham, John
AU - Paninski, Liam
N1 - Publisher Copyright:
© 2020 Neural information processing systems foundation. All rights reserved.
PY - 2020
Y1 - 2020
AB - Noninvasive behavioral tracking of animals is crucial for many scientific investigations. Recent transfer learning approaches for behavioral tracking have considerably advanced the state of the art. Typically these methods treat each video frame and each object to be tracked independently. In this work, we improve on these methods (particularly in the regime of few training labels) by leveraging the rich spatiotemporal structures pervasive in behavioral video — specifically, the spatial statistics imposed by physical constraints (e.g., paw to elbow distance), and the temporal statistics imposed by smoothness from frame to frame. We propose a probabilistic graphical model built on top of deep neural networks, Deep Graph Pose (DGP), to leverage these useful spatial and temporal constraints, and develop an efficient structured variational approach to perform inference in this model. The resulting semi-supervised model exploits both labeled and unlabeled frames to achieve significantly more accurate and robust tracking while requiring users to label fewer training frames. In turn, these tracking improvements enhance performance on downstream applications, including robust unsupervised segmentation of behavioral “syllables,” and estimation of interpretable “disentangled” low-dimensional representations of the full behavioral video. Open source code is available at https://github.com/paninski-lab/deepgraphpose.
UR - http://www.scopus.com/inward/record.url?scp=85103809953&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85103809953&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85103809953
SN - 1049-5258
VL - 2020-December
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
Y2 - 6 December 2020 through 12 December 2020
ER -