GPT-4 is a large language model by OpenAI, trained in 2022 and released in March 2023. Like GPT-3, the model was not released except through an API; unlike GPT-3, almost no technical details such as architecture, dataset, and parameter count were disclosed in the technical report, though credible leaks later provided some of this information.
Architecture
- 1.8 trillion parameters.
- 16-way Mixture of Experts with two used per forward pass.
- ~13T tokens of training data (not all unique; run for multiple epochs).
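To make the "16 experts, two used per forward pass" figure concrete, here is a minimal sketch of top-2 routing in a Mixture of Experts layer. Only the expert count and the number of active experts come from the leaked figures above; the gating scheme (a softmax over the two highest router logits), the toy scalar experts, and all function names are assumptions for illustration, not a description of GPT-4's actual router.

```python
import math
import random

NUM_EXPERTS = 16  # leaked figure quoted above
TOP_K = 2         # experts used per forward pass, per the same leak


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=TOP_K):
    """Pick the k highest-scoring experts (hypothetical gating scheme).

    Returns a list of (expert_index, weight) pairs whose weights are
    renormalized over the selected experts so they sum to 1.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, router_logits):
    """Combine only the selected experts' outputs, weighted by the gate."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits))


# Toy experts: expert i multiplies its scalar input by (i + 1).
experts = [lambda x, s=i: x * (s + 1) for i in range(NUM_EXPERTS)]

# Random router logits stand in for a learned router's output.
rng = random.Random(0)
logits = [rng.gauss(0, 1) for _ in range(NUM_EXPERTS)]

selected = route_top_k(logits)
y = moe_forward(1.0, experts, logits)
print("selected experts and weights:", selected)
print("layer output:", y)
```

The point of the design is that only 2 of the 16 expert networks run for a given input, so the compute per forward pass is a fraction of what the total parameter count would suggest.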
GPT-4-base
Unlike GPT-3, which was released as a base model, GPT-4 was released to the general public only in various instruction-tuned forms, for "safety". GPT-4's base model is accessible under NDA. According to the limited information from those who have used it,