A curated collection of research papers and resources on learned/neural image compression.
These papers established the core architectures and techniques used in modern learned image compression.
| Paper | Authors | Venue | Links |
|---|---|---|---|
| End-to-end Optimized Image Compression | Ballé, Laparra, Simoncelli | ICLR 2017 | arXiv |
| Variational Image Compression with a Scale Hyperprior | Ballé et al. | ICLR 2018 | arXiv |
| Joint Autoregressive and Hierarchical Priors for Learned Image Compression | Minnen, Ballé, Toderici | NeurIPS 2018 | arXiv · GitHub |
| Learned Image Compression with Discretized Gaussian Mixture Likelihoods and Attention Modules | Cheng et al. | CVPR 2020 | arXiv · GitHub |
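
The thread running through these papers is end-to-end minimization of a rate-distortion Lagrangian over a quantized latent, sketched below in the notation of Ballé et al. (the distortion term and the latent model vary by paper):

```math
\mathcal{L} \;=\; \underbrace{\mathbb{E}\big[-\log_2 p_{\hat{y}}(\hat{y})\big]}_{\text{rate } R} \;+\; \lambda \, \underbrace{\mathbb{E}\big[\lVert x - \hat{x} \rVert_2^2\big]}_{\text{distortion } D}
```

The four entries differ mainly in how $p_{\hat{y}}$ is modeled: a learned factorized prior, a scale hyperprior, an added autoregressive context model, and discretized Gaussian mixtures, respectively.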

These works bring transformer architectures into the nonlinear transforms and the entropy model.

| Paper | Authors | Venue | Links |
|---|---|---|---|
| Transformer-based Transform Coding (SwinT-ChARM) | Zhu et al. | ICLR 2022 | OpenReview · GitHub |
| Entroformer: A Transformer-based Entropy Model for Learned Image Compression | Qian et al. | ICLR 2022 | OpenReview |
| Transformer-based Image Compression | Lu et al. | DCC 2022 | arXiv · GitHub |
| Learned Image Compression with Mixed Transformer-CNN Architectures | Liu et al. | CVPR 2023 | CVF · GitHub |

These works target perceptual quality, pairing compression with generative decoders (GANs, diffusion) that synthesize realistic detail at low bitrates.

| Paper | Authors | Venue | Links |
|---|---|---|---|
| High-Fidelity Generative Image Compression (HiFiC) | Mentzer et al. | NeurIPS 2020 | Project · PDF |
| Lossy Image Compression with Conditional Diffusion Models (CDC) | Yang et al. | NeurIPS 2023 | arXiv · GitHub |

A selection of more recent work, spanning text-guided coding, efficient and real-time models, wavelet and invertible transforms, and new training losses.

| Paper | Venue / Year | Links |
|---|---|---|
| Neural Image Compression with Text-guided Encoding for both Pixel-level and Perceptual Fidelity | 2024 | arXiv |
| On Efficient Neural Network Architectures for Image Compression | 2024 | arXiv |
| Causal Context Adjustment Loss for Learned Image Compression | NeurIPS 2024 | GitHub |
| EVC: Towards Real-Time Neural Image Compression with Mask Decay | ICLR 2023 | arXiv |
| WeConvene: Learned Image Compression with Wavelet-Domain Convolution and Entropy Model | ECCV 2024 | GitHub |
| Enhanced Invertible Encoding for Learned Image Compression (InvCompress) | ACM MM 2021 | GitHub |

Open-source libraries for building, training, and evaluating learned compression models.

| Library | Description | Links |
|---|---|---|
| CompressAI | PyTorch library and evaluation platform for end-to-end compression research (InterDigital) | GitHub · Docs |
| tensorflow/compression | TensorFlow library for learned compression | GitHub |
| CompressAI-Trainer | Training platform for end-to-end compression models | GitHub |
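
As a taste of the workflow, CompressAI's model zoo ships pretrained models with `compress`/`decompress` entry points; a minimal sketch (the model and quality level are arbitrary choices, and the first call downloads weights):

```python
import torch
from compressai.zoo import bmshj2018_factorized

# Pretrained factorized-prior model (Ballé et al. lineage); quality runs 1-8.
net = bmshj2018_factorized(quality=4, pretrained=True).eval()
net.update()  # build entropy-coder CDF tables (no-op if already populated)

x = torch.rand(1, 3, 256, 256)  # stand-in for an RGB image batch in [0, 1]
with torch.no_grad():
    enc = net.compress(x)  # entropy-coded bitstreams plus latent shape
    rec = net.decompress(enc["strings"], enc["shape"])["x_hat"]

num_bytes = sum(len(s[0]) for s in enc["strings"])
print(f"{8 * num_bytes / (256 * 256):.3f} bpp")
```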

Other curated paper collections on deep-learning-based compression.

| Repository | Description |
|---|---|
| Awesome-Deep-Compression | Paper list of deep learning based image/video compression |
| Deep-Learning-Based-Image-Compression | Paper list about deep learning based image compression |
| Learned-Image-Video-Compression | Collection of papers related to data compression |
| Image-Compression-Papers-and-Code | Papers with code implementations |

Surveys and reviews of the broader field.

| Paper | Links |
|---|---|
| Image and Video Compression with Neural Networks: A Review | arXiv |
| Deep Architectures for Image Compression: A Critical Review | ScienceDirect |
| Information Compression in the AI Era: Recent Advances and Future Challenges | arXiv |

We have chosen to use Lightning Fabric over PyTorch Lightning for this project.
- Why Fabric? Fabric allows for a "build-your-own-loop" approach, which is essential for the custom training requirements of neural compression (e.g., separate auxiliary loss optimization steps, custom rate-distortion loss handling). It provides the necessary control without the "magic" or restrictive structure of a full LightningModule.
- Trade-offs: Fabric has no built-in callback system (no `EarlyStopping` or `ModelCheckpoint`). These features must be implemented manually in the training loop, as seen in `tinify/cli/train.py`.
- Decision: Stick with Fabric for the transparency and granular control it offers, which outweigh the convenience of pre-built callbacks for this specific use case. A sketch of what such a loop looks like follows below.
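
As an illustration only (not the actual `tinify/cli/train.py`), here is a minimal sketch of such a Fabric loop, assuming a CompressAI-style model whose forward pass returns `x_hat` and `likelihoods` and which exposes `aux_loss()`; the `.quantiles` parameter split and the loss weighting follow CompressAI conventions:

```python
import math

import torch
from compressai.zoo import bmshj2018_hyperprior
from lightning.fabric import Fabric


def rate_distortion_loss(out, x, lmbda=0.01):
    # Rate: bits-per-pixel estimated from the learned likelihoods.
    num_pixels = x.size(0) * x.size(2) * x.size(3)
    bpp = sum(
        torch.log(lk).sum() / (-math.log(2) * num_pixels)
        for lk in out["likelihoods"].values()
    )
    # Distortion: MSE scaled to the 0-255 range, as in CompressAI's examples.
    mse = torch.nn.functional.mse_loss(out["x_hat"], x)
    return bpp + lmbda * 255**2 * mse


fabric = Fabric(accelerator="auto", devices=1)
fabric.launch()

model = bmshj2018_hyperprior(quality=3)
# Two optimizers: the main one drives the rate-distortion loss, the auxiliary
# one fits the entropy-bottleneck quantiles -- the "separate auxiliary loss
# optimization step" that motivates a hand-written loop.
aux_params = [p for n, p in model.named_parameters() if n.endswith(".quantiles")]
main_params = [p for n, p in model.named_parameters() if not n.endswith(".quantiles")]
model, opt_main, opt_aux = fabric.setup(
    model,
    torch.optim.Adam(main_params, lr=1e-4),
    torch.optim.Adam(aux_params, lr=1e-3),
)

# Toy stand-in data; a real run would load an image dataset here.
loader = torch.utils.data.DataLoader(
    [(torch.rand(3, 256, 256), 0) for _ in range(8)], batch_size=4
)
loader = fabric.setup_dataloaders(loader)

for step, (x, _) in enumerate(loader):
    out = model(x)
    loss = rate_distortion_loss(out, x)
    opt_main.zero_grad()
    fabric.backward(loss)
    opt_main.step()

    # The separate auxiliary step: exactly the custom logic that a
    # prescriptive LightningModule makes awkward.
    aux = model.aux_loss()
    opt_aux.zero_grad()
    fabric.backward(aux)
    opt_aux.step()

# Manual checkpointing: Fabric has no ModelCheckpoint callback.
fabric.save("last.ckpt", {"model": model, "opt_main": opt_main, "opt_aux": opt_aux})
```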