OATAO - Open Archive Toulouse Archive Ouverte

Simplified entropy model for reduced-complexity end-to-end variational autoencoder with application to on-board satellite image compression

Alves de Oliveira, Vinicius and Oberlin, Thomas and Chabert, Marie and Poulliat, Charly and Mickael, Bruno and Latry, Christophe and Carlavan, Mikael and Henrot, Simon and Falzon, Frederic and Camarero, Roberto. Simplified entropy model for reduced-complexity end-to-end variational autoencoder with application to on-board satellite image compression. (2020) In: 7th International Workshop on On-Board Payload Data Compression (OBPDC 2020), European Space Agency (ESA); Centre national d’études spatiales (CNES), 21 September 2020 - 23 September 2020 (Virtual, Greece).

(Document in English)

PDF (Author's version)


In recent years, neural networks have emerged as data-driven tools to solve problems that were previously addressed with model-based methods. In particular, image processing has been largely impacted by convolutional neural networks (CNNs). Recently, CNN-based auto-encoders have been successfully employed for lossy image compression [1,2,3,4]. These end-to-end optimized architectures are able to dramatically outperform traditional compression schemes in terms of rate-distortion trade-off. The auto-encoder is composed of an encoder and a decoder, both learned from the data. The encoder is applied to the input data to produce a latent representation with minimum entropy after quantization. The latent representation, derived through several convolutional layers composed of filters and activation functions, is multi-channel (the output of a particular filter is called a channel or a feature) and non-linear. The representation is then quantized to produce a discrete-valued vector. A standard entropy coding method uses the entropy model inferred from the representation to losslessly compress this discrete-valued vector. A key element of these frameworks is the entropy model. In earlier works [1,2,3], the learned representation was assumed independent and identically distributed within each channel and the channels were assumed independent of each other, resulting in a fully-factorized entropy model. Moreover, a fixed entropy model was learned once, from the training set, preventing any adaptation to the input image during the operational phase. The variational auto-encoder proposed in [4] introduces a hyperprior auxiliary network. This network estimates the hyper-parameters of the representation distribution for each input image. Thus, it does not require the assumption of a fully-factorized model, which conflicts with the need for context modeling.
This variational auto-encoder achieves compression performance close to that of BPG (Better Portable Graphics) at the expense of a considerable increase in complexity.
However, in the context of on-board compression, a trade-off between compression performance and complexity has to be considered to take into account the strong computational constraints. For this reason, the CCSDS (Consultative Committee for Space Data Systems) lossy compression standard has been designed as a highly simplified version of JPEG2000. This work follows the same logic, but in the context of learned image compression. The aim of this paper is to design a simplified version of the variational auto-encoder proposed in [4] in order to meet the on-board constraints in terms of complexity while preserving high performance in terms of rate-distortion. Apart from straightforward simplifications of the transform (e.g. reduction of the number of filters in the convolutional layers), we mainly propose a simplified entropy model that preserves the adaptability to the input image.
A preliminary reduction of the number of filters reduces the complexity by 62% in terms of FLOPs with respect to [4]. It also reduces the number of learned parameters, with a positive impact on the memory occupancy. The entropy model simplification exploits a statistical analysis of the learned representation for satellite images, also performed in [5] for natural images. This analysis reveals that most of the features are well fitted by centered Laplacian distributions. The complex hyperprior model of [4], based on a non-parametric distribution, can thus be replaced by a simpler parametric centered Laplacian model. The problem then amounts to a classical and simple estimation of a single parameter referred to as the scale. Our simplified entropy model reduces the complexity of the variational auto-encoder coding part by 22% and outperforms the end-to-end model proposed in [1] at high target rates.
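
To make the centered Laplacian entropy model concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the scale of one feature is estimated by maximum likelihood (the mean absolute value, which is the standard estimator for a zero-mean Laplacian), that quantization is rounding to the nearest integer, and that the probability of each symbol is the Laplacian mass of its unit-width bin. The function names `estimate_scale`, `laplace_pmf` and `estimated_bits` are hypothetical.

```python
import numpy as np


def laplace_cdf(x, b):
    """CDF of a zero-mean Laplacian with scale b (overflow-safe form)."""
    x = np.asarray(x, dtype=np.float64)
    return 0.5 + 0.5 * np.sign(x) * (1.0 - np.exp(-np.abs(x) / b))


def estimate_scale(feature):
    """Maximum-likelihood scale of a centered Laplacian: mean absolute value.

    Assumption: the paper only states that a single scale parameter is
    estimated; the MLE is used here purely for illustration.
    """
    return float(np.mean(np.abs(feature)))


def laplace_pmf(symbols, b):
    """Probability of integer symbols: Laplacian mass on [q - 0.5, q + 0.5]."""
    q = np.asarray(symbols, dtype=np.float64)
    return laplace_cdf(q + 0.5, b) - laplace_cdf(q - 0.5, b)


def estimated_bits(symbols, b):
    """Ideal code length (in bits) of the symbols under the Laplacian model."""
    pmf = np.maximum(laplace_pmf(symbols, b), 1e-12)  # avoid log2(0) in the tails
    return float(-np.sum(np.log2(pmf)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature = rng.laplace(loc=0.0, scale=2.0, size=10_000)  # synthetic stand-in for one channel
    b_hat = estimate_scale(feature)
    quantized = np.round(feature)
    print("estimated scale:", b_hat)
    print("bits per symbol:", estimated_bits(quantized, b_hat) / quantized.size)
```

In this sketch a single scalar per feature fully determines the probability table passed to the entropy coder, which illustrates why a parametric model of this kind is cheaper than a learned non-parametric density.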

Item Type: Conference or Workshop Item (Paper)
Audience (conference): International conference proceedings
Uncontrolled Keywords:
Institution: French research institutions > Centre National d'Études Spatiales - CNES (FRANCE)
French research institutions > Centre National de la Recherche Scientifique - CNRS (FRANCE)
Université de Toulouse > Institut National Polytechnique de Toulouse - Toulouse INP (FRANCE)
Université de Toulouse > Institut Supérieur de l'Aéronautique et de l'Espace - ISAE-SUPAERO (FRANCE)
Other partners > Thales (FRANCE)
Université de Toulouse > Université Toulouse III - Paul Sabatier - UT3 (FRANCE)
Université de Toulouse > Université Toulouse - Jean Jaurès - UT2J (FRANCE)
Université de Toulouse > Université Toulouse 1 Capitole - UT1 (FRANCE)
Other partners > ESA - ESTEC (NETHERLANDS)
Other partners > Laboratoire de recherche en télécommunications spatiales et aéronautiques - TéSA (FRANCE)
Laboratory name:
Deposited On: 31 Aug 2021 08:53
