TY - GEN
T1 - CNP
T2 - FPL 09: 19th International Conference on Field Programmable Logic and Applications
AU - Farabet, Clément
AU - Poulet, Cyril
AU - Han, Jefferson Y.
AU - LeCun, Yann
PY - 2009
Y1 - 2009
N2 - Convolutional Networks (ConvNets) are biologically-inspired hierarchical architectures that can be trained to perform a variety of detection, recognition and segmentation tasks. ConvNets have a feed-forward architecture consisting of multiple linear convolution filters interspersed with point-wise non-linear squashing functions. This paper presents an efficient implementation of ConvNets on a low-end DSP-oriented Field Programmable Gate Array (FPGA). The implementation exploits the inherent parallelism of ConvNets and takes full advantage of multiple hardware multiply-accumulate units on the FPGA. The entire system uses a single FPGA with an external memory module, and no extra parts. A network compiler software was implemented, which takes a description of a trained ConvNet and compiles it into a sequence of instructions for the ConvNet Processor (CNP). A ConvNet face detection system was implemented and tested. Face detection on a 512 x 384 frame takes 100 ms (10 frames per second), which corresponds to an average performance of 3.4 x 10^9 connections per second for this 340 million connection network. The design can be used for low-power, lightweight embedded vision systems for micro-UAVs and other small robots.
AB - Convolutional Networks (ConvNets) are biologically-inspired hierarchical architectures that can be trained to perform a variety of detection, recognition and segmentation tasks. ConvNets have a feed-forward architecture consisting of multiple linear convolution filters interspersed with point-wise non-linear squashing functions. This paper presents an efficient implementation of ConvNets on a low-end DSP-oriented Field Programmable Gate Array (FPGA). The implementation exploits the inherent parallelism of ConvNets and takes full advantage of multiple hardware multiply-accumulate units on the FPGA. The entire system uses a single FPGA with an external memory module, and no extra parts. A network compiler software was implemented, which takes a description of a trained ConvNet and compiles it into a sequence of instructions for the ConvNet Processor (CNP). A ConvNet face detection system was implemented and tested. Face detection on a 512 x 384 frame takes 100 ms (10 frames per second), which corresponds to an average performance of 3.4 x 10^9 connections per second for this 340 million connection network. The design can be used for low-power, lightweight embedded vision systems for micro-UAVs and other small robots.
UR - http://www.scopus.com/inward/record.url?scp=70450060046&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70450060046&partnerID=8YFLogxK
U2 - 10.1109/FPL.2009.5272559
DO - 10.1109/FPL.2009.5272559
M3 - Conference contribution
AN - SCOPUS:70450060046
SN - 9781424438921
T3 - FPL 09: 19th International Conference on Field Programmable Logic and Applications
SP - 32
EP - 37
BT - FPL 09
Y2 - 31 August 2009 through 2 September 2009
ER -