
Compression Is Prediction Is Intelligence

Any model that assigns probabilities to sequences can be turned into a compression algorithm (e.g. with arithmetic coding). Symmetrically, any compression algorithm can be interpreted as assigning probabilities to sequences (e.g. by splitting probability mass 1/2, 1/4, 1/8, ... across the compressed-domain sequences of length 0, 1, 2, ..., so that inputs with shorter compressed forms receive more probability). Intelligence is (mostly) predicting what will happen next in the world. This is the thinking behind the Hutter Prize and large language models: minimizing the cross-entropy loss used in pretraining is exactly minimizing the code length of the corresponding optimal (tokenwise) compressor.
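
To make the model-to-compressor direction concrete, here is a minimal sketch in Python using exact arithmetic coding over `Fraction`s. The `encode`/`decode`/`model` names and the toy two-symbol predictor are illustrative assumptions, not anything from the original text: any predictive model can drive the coder, and the better it predicts, the fewer bits come out.

```python
from fractions import Fraction
import math

def encode(symbols, model):
    """Arithmetic-code `symbols`. `model(prefix)` returns a dict of
    next-symbol probabilities (Fractions summing to 1)."""
    low, high = Fraction(0), Fraction(1)
    for i, sym in enumerate(symbols):
        width = high - low
        cum = Fraction(0)
        for s, p in model(symbols[:i]).items():
            if s == sym:
                # Narrow to this symbol's share of the current interval.
                low, high = low + cum * width, low + (cum + p) * width
                break
            cum += p
    # Emit the shortest bit string whose dyadic interval fits in [low, high);
    # its length is at most -log2(high - low) + 2 = -log2 P(symbols) + 2.
    k = 1
    while True:
        m = math.ceil(low * 2 ** k)
        if Fraction(m + 1, 2 ** k) <= high:
            return format(m, f"0{k}b")
        k += 1

def decode(bits, model, n):
    """Invert `encode`: recover `n` symbols by re-running the model."""
    x = Fraction(int(bits, 2), 2 ** len(bits))
    low, high = Fraction(0), Fraction(1)
    out = []
    for _ in range(n):
        width = high - low
        cum = Fraction(0)
        for s, p in model(out).items():
            if low + cum * width <= x < low + (cum + p) * width:
                out.append(s)
                low, high = low + cum * width, low + (cum + p) * width
                break
            cum += p
    return out

def model(prefix):
    """Toy predictor (hypothetical): 'a' is more likely right after 'a'."""
    if prefix and prefix[-1] == "a":
        return {"a": Fraction(3, 4), "b": Fraction(1, 4)}
    return {"a": Fraction(1, 2), "b": Fraction(1, 2)}

msg = list("aaabaa")
code = encode(msg, model)
assert decode(code, model, len(msg)) == msg
# P(msg) = 27/1024, so -log2 P(msg) ~ 5.25; encode emits 6 bits here.
print(code, len(code))
```

The emitted length tracks -log2 P(sequence), which is the model's summed cross-entropy on the data in bits; swapping in a better predictor shortens the code without changing the coder at all.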