Large-scale FPGA-based convolutional networks

Clément Farabet, Yann LeCun, Koray Kavukcuoglu, Berin Martini, Polina Akselrod, Selcuk Talay, Eugenio Culurciello

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Micro-robots, unmanned aerial vehicles, imaging sensor networks, wireless phones, and other embedded vision systems all require low-cost, high-speed implementations of synthetic vision systems capable of recognizing and categorizing objects in a scene. Many successful object recognition systems use dense features extracted on regularly spaced patches over the input image. Most feature extraction systems share a common structure composed of a filter bank (generally based on oriented edge detectors or 2D Gabor functions), a nonlinear operation (quantization, winner-take-all, sparsification, normalization, and/or pointwise saturation), and finally a pooling operation (max, average, or histogramming). For example, the scale-invariant feature transform (SIFT) operator (Lowe, 2004) applies oriented edge filters to a small patch and determines the dominant orientation through a winner-take-all operation. The resulting sparse vectors are then added (pooled) over a larger patch to form a local orientation histogram. Some recognition systems use a single stage of feature extractors (Lazebnik, Schmid, and Ponce, 2006; Dalal and Triggs, 2005; Berg, Berg, and Malik, 2005; Pinto, Cox, and DiCarlo, 2008). Other models, such as HMAX-type models (Serre, Wolf, and Poggio, 2005; Mutch and Lowe, 2006) and convolutional networks, use two or more layers of successive feature extractors. Different training algorithms have been used for learning the parameters of convolutional networks. In LeCun et al. (1998b) and Huang and LeCun (2006), purely supervised learning is used to update the parameters. However, recent work has focused on training with an auxiliary task (Ahmed et al., 2008) or using unsupervised objectives (Ranzato et al., 2007b; Kavukcuoglu et al., 2009; Jarrett et al., 2009; Lee et al., 2009).
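
The three-part structure described in the abstract (filter bank, nonlinear operation, pooling) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the chapter's FPGA implementation: the function names, the two hand-written 3x3 edge filters, the choice of tanh as the pointwise saturation, and the 2x2 max pooling are all assumptions made for the example.

```python
import math

def correlate2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def saturate(fmap):
    """Pointwise saturation; tanh is an assumed choice of nonlinearity."""
    return [[math.tanh(v) for v in row] for row in fmap]

def max_pool(fmap, size):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(fmap) // size
    out_w = len(fmap[0]) // size
    return [[max(fmap[i * size + di][j * size + dj]
                 for di in range(size) for dj in range(size))
             for j in range(out_w)]
            for i in range(out_h)]

def feature_stage(image, filter_bank, pool_size=2):
    """One stage: filter bank -> nonlinearity -> pooling, one output map per filter."""
    return [max_pool(saturate(correlate2d_valid(image, k)), pool_size)
            for k in filter_bank]

# Hypothetical oriented edge filters (stand-ins for a learned or Gabor bank).
HORIZONTAL_EDGE = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]
VERTICAL_EDGE = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

# Toy 6x6 image containing a single vertical step edge.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
features = feature_stage(image, [HORIZONTAL_EDGE, VERTICAL_EDGE])
```

On this toy input the horizontal-edge map stays at zero while the vertical-edge map responds strongly, which is the kind of orientation selectivity the abstract attributes to the filter bank. A multi-stage model of the kind mentioned above would feed such pooled maps into a further filter bank, which this sketch does not attempt.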

Original language: English (US)
Title of host publication: Scaling up Machine Learning
Subtitle of host publication: Parallel and Distributed Approaches
Publisher: Cambridge University Press
Pages: 399-419
Number of pages: 21
Volume: 9780521192248
ISBN (Electronic): 9781139042918
ISBN (Print): 9780521192248
DOI: https://doi.org/10.1017/CBO9781139042918.020
State: Published - Jan 1 2011

ASJC Scopus subject areas

  • Computer Science(all)

  • Cite this

    Farabet, C., LeCun, Y., Kavukcuoglu, K., Martini, B., Akselrod, P., Talay, S., & Culurciello, E. (2011). Large-scale FPGA-based convolutional networks. In Scaling up Machine Learning: Parallel and Distributed Approaches (Vol. 9780521192248, pp. 399-419). Cambridge University Press. https://doi.org/10.1017/CBO9781139042918.020