Abstract
We present techniques for speeding up the test-time evaluation of large convolutional networks designed for object recognition tasks. These models deliver impressive accuracy, but each image evaluation requires millions of floating-point operations, making their deployment on smartphones and Internet-scale clusters problematic. The computation is dominated by the convolution operations in the lower layers of the model. We exploit the redundancy present within the convolutional filters to derive approximations that significantly reduce the required computation. Using large state-of-the-art models, we demonstrate speedups of convolutional layers on both CPU and GPU by a factor of 2×, while keeping the accuracy within 1% of the original model.
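As a rough illustration of why filter redundancy can reduce computation, the sketch below counts multiply-adds for a filter bank applied at one output location, first as a dense matrix and then as a product of two low-rank factors. The abstract does not say which approximation is used, so the low-rank factorization, the function names, and the layer sizes (96 filters, 3 input channels, 7×7 kernels, rank 16) are all assumptions chosen for illustration only.

```python
def dense_cost(num_filters, in_channels, kernel):
    """Multiply-adds to apply a full filter bank at one output location."""
    return num_filters * in_channels * kernel * kernel


def low_rank_cost(num_filters, in_channels, kernel, rank):
    """Multiply-adds when the (num_filters x patch) filter matrix is
    replaced by a product of a (num_filters x rank) and a (rank x patch) factor."""
    patch = in_channels * kernel * kernel
    return rank * patch + num_filters * rank


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


# Toy filter matrix that is exactly rank 1: every filter is a scaled copy
# of one shared pattern, which is the extreme case of redundancy.
u = [1.0, 2.0, 3.0]            # one scale per filter (hypothetical values)
v = [0.5, -1.0, 2.0, 4.0]      # shared pattern over the input patch
W = [[ui * vj for vj in v] for ui in u]
x = [1.0, 2.0, 3.0, 4.0]       # a flattened input patch

full = matvec(W, x)                          # 3 * 4 = 12 multiply-adds
proj = sum(a * b for a, b in zip(v, x))      # 4 multiply-adds
factored = [ui * proj for ui in u]           # 3 more multiplies

# Assumed layer sizes, not taken from the paper.
ratio = dense_cost(96, 3, 7) / low_rank_cost(96, 3, 7, rank=16)
print(full == factored, round(ratio, 2))
```

In this toy case the factored form reproduces the dense output exactly because the matrix is truly rank 1. Real filters are only approximately low rank, so a practical approximation trades some accuracy for the saved operations, which is the trade-off the abstract reports.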
Original language | English (US) |
---|---|
Title of host publication | Advances in Neural Information Processing Systems |
Publisher | Neural information processing systems foundation |
Pages | 1269-1277 |
Number of pages | 9 |
Volume | 2 |
Edition | January |
State | Published - 2014 |
Event | 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014, Montreal, Canada. Duration: Dec 8 2014 → Dec 13 2014 |
Other
Other | 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014 |
---|---|
Country/Territory | Canada |
City | Montreal |
Period | 12/8/14 → 12/13/14 |
ASJC Scopus subject areas
- Computer Networks and Communications
- Information Systems
- Signal Processing