Abstract
We describe an image compression method, consisting of a nonlinear analysis transformation, a uniform quantizer, and a nonlinear synthesis transformation. The transforms are constructed in three successive stages of convolutional linear filters and nonlinear activation functions. Unlike most convolutional neural networks, the joint nonlinearity is chosen to implement a form of local gain control, inspired by those used to model biological neurons. Using a variant of stochastic gradient descent, we jointly optimize the entire model for rate-distortion performance over a database of training images, introducing a continuous proxy for the discontinuous loss function arising from the quantizer. Under certain conditions, the relaxed loss function may be interpreted as the log likelihood of a generative model, as implemented by a variational autoencoder. Unlike these models, however, the compression model must operate at any given point along the rate-distortion curve, as specified by a trade-off parameter. Across an independent set of test images, we find that the optimized method generally exhibits better rate-distortion performance than the standard JPEG and JPEG 2000 compression methods. More importantly, we observe a dramatic improvement in visual quality for all images at all bit rates, which is supported by objective quality estimates using MS-SSIM.
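Two ingredients named in the abstract, local gain control and a continuous proxy for the quantizer, can be illustrated with a small NumPy sketch. This is an illustration under stated assumptions, not the authors' implementation. The abstract does not give formulas, so the sketch assumes a divisive normalization of the form y_i = x_i / sqrt(beta_i + sum_j gamma_ij * x_j^2) for gain control, and additive uniform noise of unit width as the differentiable stand-in for rounding. The function names `gdn`, `quantize`, and `noisy_proxy` are hypothetical.

```python
import numpy as np


def gdn(x, beta, gamma):
    """Divisive normalization across channels at each spatial location.

    Assumed shapes: x is (C, H, W), beta is (C,) and positive,
    gamma is (C, C) and non-negative.
    """
    # pooled[i] = sum_j gamma[i, j] * x[j] ** 2, computed per pixel
    pooled = np.tensordot(gamma, x ** 2, axes=([1], [0]))
    return x / np.sqrt(beta[:, None, None] + pooled)


def quantize(y):
    """Uniform scalar quantizer with unit step (used at test time)."""
    return np.round(y)


def noisy_proxy(y, rng):
    """Assumed training-time relaxation: add uniform noise in [-0.5, 0.5)."""
    return y + rng.uniform(-0.5, 0.5, size=y.shape)


rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4, 4))        # toy 3-channel feature map
beta = np.ones(3)
gamma = 0.1 * np.ones((3, 3))

y = gdn(x, beta, gamma)
y_hat = quantize(y)                   # discrete codes, zero gradient almost everywhere
y_tilde = noisy_proxy(y, rng)         # continuous surrogate with the same error range
```

Rounding and the noisy surrogate both perturb each coefficient by at most half a quantization step, which is why the noise can stand in for the quantizer while keeping the loss differentiable.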
Original language | English (US) |
---|---|
State | Published - 2017 |
Event | 5th International Conference on Learning Representations, ICLR 2017 - Toulon, France; Duration: Apr 24 2017 → Apr 26 2017 |
Conference
Conference | 5th International Conference on Learning Representations, ICLR 2017 |
---|---|
Country/Territory | France |
City | Toulon |
Period | 4/24/17 → 4/26/17 |
ASJC Scopus subject areas
- Education
- Computer Science Applications
- Linguistics and Language
- Language and Linguistics