Can ChatGPT compress images losslessly better than PNG?

[10:24 Mon,2.October 2023 by Rudi Schmidts]

What actually happens when you compress an image losslessly? To compress data, you have to find recurring patterns in the data. Then you can combine them to save memory. For example, instead of 10110 10110 10110, you write 3 x 10110. This usually saves quite a bit of storage space.
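The "3 x 10110" idea can be sketched in a few lines of Python. This is only a toy illustration of collapsing repeated blocks, not how PNG or any real codec works; the function names `encode_repeats` and `decode_repeats` are made up for this example.

```python
def encode_repeats(blocks):
    """Collapse consecutive identical blocks into (count, block) pairs."""
    runs = []
    for block in blocks:
        if runs and runs[-1][1] == block:
            runs[-1][0] += 1          # same block again: bump the counter
        else:
            runs.append([1, block])   # new block: start a new run
    return [(count, block) for count, block in runs]


def decode_repeats(runs):
    """Expand (count, block) pairs back into the original list of blocks."""
    out = []
    for count, block in runs:
        out.extend([block] * count)
    return out


blocks = ["10110", "10110", "10110"]
runs = encode_repeats(blocks)
print(runs)                            # [(3, '10110')]
assert decode_repeats(runs) == blocks  # nothing was lost
```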

With lossless compression, the compressed image must match the original image down to the last bit after "unpacking".
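A quick way to see what "down to the last bit" means is a round trip through Python's standard `zlib` module, which implements DEFLATE, the same compression method PNG uses internally. The input bytes below are synthetic and chosen only because they are very repetitive; real image data will compress far less well.

```python
import zlib

# Synthetic, highly repetitive "pixel" data (an assumption for illustration).
raw = bytes([10, 20, 30, 40] * 256)

packed = zlib.compress(raw, 9)        # level 9 = strongest compression
restored = zlib.decompress(packed)

# Lossless means the unpacked data is bit-for-bit identical to the input.
assert restored == raw

rate = 100 * len(packed) / len(raw)
print(f"{len(raw)} bytes -> {len(packed)} bytes ({rate:.1f}% of original size)")
```

The percentage printed at the end is the same kind of figure the article quotes later: compressed size as a share of the raw size, so smaller is better.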

And how does a large language model (LLM) à la ChatGPT work? Such an AI model always tries to guess the next word in a sequence of words. GPT can therefore continue a sentence the way it would most likely continue in an original text. To do this, GPT must also have recognized patterns in the text it was given.

Pattern recognition and guessing how a sequence of data will evolve thus connect the two worlds. But can large language models and effective, lossless image compression really have much to do with each other in practice?
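Why better guessing means better compression can be shown with a small calculation. An entropy coder such as arithmetic coding spends roughly -log2(p) bits on a symbol that the model predicted with probability p, so confident, correct predictions cost very little. The two predictors below (`uniform_model`, `repeat_model`) and their probabilities are hypothetical toy models for illustration, not anything resembling an LLM.

```python
import math


def ideal_code_length_bits(sequence, predict):
    """Sum of -log2(p) over all symbols, given a next-symbol predictor."""
    total = 0.0
    for i, symbol in enumerate(sequence):
        probs = predict(sequence[:i])      # model sees only the past
        total += -math.log2(probs[symbol])
    return total


def uniform_model(history):
    """Knows nothing: both symbols are always equally likely."""
    return {"0": 0.5, "1": 0.5}


def repeat_model(history):
    """Toy assumption: the previous symbol repeats with probability 0.9."""
    if not history:
        return {"0": 0.5, "1": 0.5}
    last = history[-1]
    other = "1" if last == "0" else "0"
    return {last: 0.9, other: 0.1}


data = "0000000011111111"
print(ideal_code_length_bits(data, uniform_model))  # 16.0 bits, no savings
print(ideal_code_length_bits(data, repeat_model))   # clearly fewer bits
```

The model that has "recognized the pattern" needs far fewer bits for the same data, which is the link between prediction and compression in miniature.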

In the arXiv research paper titled "Language Modeling Is Compression", researchers now suggest such a connection. Surprisingly, they have discovered that the DeepMind LLM called Chinchilla 70B can losslessly compress image patches from the ImageNet image database to 43.4 percent of their original size, even outperforming the well-established PNG algorithm, which compressed the same data to "only" 58.5 percent. For audio, Chinchilla compressed samples from the LibriSpeech audio dataset to just 16.4 percent of their raw size, outperforming the widely used FLAC codec, which reached 30.3 percent. In both cases, the compression is lossless.

The really strange thing about these compression results, however, is that Chinchilla 70B was trained primarily to handle text and yet is effective at compressing other types of data. In the two cases considered, it even beats algorithms designed specifically for these tasks.

This suggests that AI models will probably play a bigger role in image and audio compression in the future.

But there are, of course, a few critical comments about this report, which is making big waves, especially in IT and AI circles. First, the paper has not yet been peer-reviewed, so an error could have crept in. It is conceivable that Chinchilla 70B somehow had access to the ImageNet image database and the LibriSpeech audio dataset during its training and thus already knew the data.

In addition, one should not lose sight of the size of the "decoder". To decompress a PNG file, a very small program with a few KB of code is usually sufficient, while a Chinchilla 70B model needs several high-performance GPUs connected in parallel and hundreds of GB of GPU RAM as a decoder.
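A back-of-the-envelope calculation gives a feel for the size of such a "decoder". The parameter count comes from the model name; the bytes per parameter are an assumption (2 bytes for 16-bit weights, 4 bytes for 32-bit weights), and the result covers the weights alone, not the additional memory needed while the model is running.

```python
PARAMS = 70e9  # Chinchilla 70B: roughly 70 billion parameters


def weights_size_gb(num_params, bytes_per_param):
    """Storage needed for the raw weights, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumed storage widths; the article does not say which one was used.
for label, width in [("16-bit", 2), ("32-bit", 4)]:
    print(f"{label} weights: {weights_size_gb(PARAMS, width):.0f} GB")
```

Under these assumptions the weights alone come to 140 GB or 280 GB, compared with a few KB of code for a PNG decoder.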

So such AI compressors are by no means efficient in terms of memory consumption or computing power. And they probably won't be in the foreseeable future.

More information at arstechnica.com
