
LLMZip: Lossless Text Compression using Large Language Models
We show that English text is significantly more compressible than previously believed. Our lossless compression algorithm combines the next-token predictions of modern large language models with arithmetic coding and compresses text to about 0.7 bits/character, outperforming state-of-the-art compressors such as BSC and PAQ.
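To illustrate the core idea, the sketch below shows how a predictive model's probabilities translate into compressed size: an arithmetic coder can encode each symbol in roughly -log2 p bits, where p is the probability the model assigned to the symbol that actually occurred. This is a minimal illustration, not the paper's implementation; a toy adaptive byte-level model stands in for the large language model, and the function name and sample text are assumptions for demonstration only.

```python
# Minimal sketch (not the authors' code): sum the ideal arithmetic-coding cost
# -log2 p(symbol | model) over a text. A toy adaptive order-0 byte model is
# used here as a stand-in for an LLM predictor.
import math
from collections import defaultdict

def ideal_code_length_bits(text: str) -> float:
    """Total ideal code length in bits under a Laplace-smoothed adaptive byte model."""
    counts = defaultdict(lambda: 1)   # pseudo-count of 1 for every byte value
    total = 256                       # 256 byte symbols, each starting at count 1
    bits = 0.0
    for b in text.encode("utf-8"):
        p = counts[b] / total         # model's probability of the observed symbol
        bits += -math.log2(p)         # arithmetic coding cost is about -log2(p)
        counts[b] += 1                # update the model after each symbol
        total += 1
    return bits

if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog " * 50
    bits = ideal_code_length_bits(sample)
    print(f"{bits / len(sample):.3f} bits/character")
```

Replacing the toy model with a strong language model's next-token probabilities is what drives the compression rate down toward the reported 0.7 bits/character, since a well-implemented arithmetic coder comes within a small overhead of this ideal code length.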