TY - GEN
T1 - Classifying and visualizing motion capture sequences using deep neural networks
AU - Cho, Kyunghyun
AU - Chen, Xi
PY - 2014
Y1 - 2014
N2 - Gesture recognition using motion capture data and depth sensors has recently drawn increasing attention in computer vision. Most current systems classify datasets with only a couple of dozen distinct actions, and feature extraction from the data is often computationally complex. In this paper, we propose a novel system that recognizes actions from skeleton data using simple but effective features and deep neural networks. Features are extracted for each frame based on the relative positions of joints (PO), temporal differences (TD), and normalized trajectories of motion (NT). Given these features, a hybrid multi-layer perceptron is trained that simultaneously classifies and reconstructs the input data. We use a deep autoencoder to visualize the learnt features. The experiments show that deep neural networks can capture more discriminative information than, for instance, principal component analysis. We test our system on a public database with 65 classes and more than 2,000 motion sequences, obtaining an accuracy above 95%, which is, to our knowledge, the state-of-the-art result for such a large dataset.
AB - Gesture recognition using motion capture data and depth sensors has recently drawn increasing attention in computer vision. Most current systems classify datasets with only a couple of dozen distinct actions, and feature extraction from the data is often computationally complex. In this paper, we propose a novel system that recognizes actions from skeleton data using simple but effective features and deep neural networks. Features are extracted for each frame based on the relative positions of joints (PO), temporal differences (TD), and normalized trajectories of motion (NT). Given these features, a hybrid multi-layer perceptron is trained that simultaneously classifies and reconstructs the input data. We use a deep autoencoder to visualize the learnt features. The experiments show that deep neural networks can capture more discriminative information than, for instance, principal component analysis. We test our system on a public database with 65 classes and more than 2,000 motion sequences, obtaining an accuracy above 95%, which is, to our knowledge, the state-of-the-art result for such a large dataset.
KW - Deep Neural Network
KW - Gesture Recognition
KW - Motion Capture
UR - http://www.scopus.com/inward/record.url?scp=84906895958&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84906895958&partnerID=8YFLogxK
U2 - 10.5220/0004718301220130
DO - 10.5220/0004718301220130
M3 - Conference contribution
AN - SCOPUS:84906895958
SN - 9789897580048
T3 - VISAPP 2014 - Proceedings of the 9th International Conference on Computer Vision Theory and Applications
SP - 122
EP - 130
BT - VISAPP 2014 - Proceedings of the 9th International Conference on Computer Vision Theory and Applications
PB - SciTePress
T2 - 9th International Conference on Computer Vision Theory and Applications, VISAPP 2014
Y2 - 5 January 2014 through 8 January 2014
ER -