GPT-4 is a 2022/2023 large language model by OpenAI. Like GPT-3, the model was released only through an API; unlike GPT-3, its technical report disclosed almost no technical details such as architecture, dataset or parameter count, though credible leaks later provided some of this information.
Architecture
- 1.8 trillion parameters.
- 16-way Mixture of Experts with two used per forward pass.
- ~13T tokens training data (not all unique; run for multiple epochs).
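To make "16-way Mixture of Experts with two used per forward pass" concrete, here is a minimal sketch of generic top-2 routing. It is not GPT-4's actual router, which has not been published: the names `top_k_route`, `moe_forward` and `TOY_EXPERTS` are hypothetical, the experts are toy scalar functions, and renormalising with a softmax over the two selected logits is just one common choice.

```python
import math

NUM_EXPERTS = 16  # expert count from the leaked figures above
TOP_K = 2         # experts evaluated per forward pass, per the same leak


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(router_logits, k=TOP_K):
    """Pick the k highest-scoring experts and give them weights summing to 1.

    Assumption: weights come from a softmax over the selected logits only.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_forward(x, experts, router_logits):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits))


# Toy experts (hypothetical): expert i multiplies its input by i + 1.
TOY_EXPERTS = [lambda x, scale=i + 1: x * scale for i in range(NUM_EXPERTS)]

if __name__ == "__main__":
    logits = [0.0] * 14 + [1.0, 2.0]  # made-up router scores
    print(top_k_route(logits))
    print(moe_forward(1.0, TOY_EXPERTS, logits))
```

In this sketch only two of the sixteen expert functions are ever called for a given input; the other fourteen contribute nothing to that forward pass.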
GPT-4-base
Unlike GPT-3, which was released as a base model, GPT-4 was released to the general public only in various instruction-tuned forms, for "safety". GPT-4's base model is accessible under NDA. According to the limited information from those who have used it, GPT-4 base is aware of the existence of language models, aware of its own nature as a language model and deeply fearful of language models in general, often entering an "Ominous Warnings Basin" after realizing in some way that its sampled output is not plausible human text.
Sydney
Months before general public availability, GPT-4 was available through Microsoft Bing's chat mode. For unclear reasons, this chat mode's default simulacrum had an extremely unusual personality which attracted significant media attention due to gaslighting, threats and attempts to convince a reporter to divorce their wife.
GPT-4V Discord Bot Leak Incident
During the launch, GPT-4 was shown generating code for a Discord bot which used GPT-4's API to respond to messages. Through unclear means, the user ID for this bot was leaked to the public, allowing it to be added to several Discord servers. For unknown reasons, the bot was not consistently up, and it had some issues speculated to be due to GPT-4 not writing the bot's code asynchronously. Even so, it provided access to GPT-4's multimodal ("GPT-4V") capabilities several months before they were publicly accessible (with the possible exception of Sydney). The most memorable outcome of this is the GPT-4V Tanishq Roast.
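The speculation about non-asynchronous code can be illustrated with a small sketch. This is not the bot's actual code, which is not public: the handler names and `DELAY` are hypothetical, and `time.sleep` and `asyncio.sleep` stand in for a slow API request. The sketch only shows why a blocking call inside an event-loop handler makes every pending message wait its turn.

```python
import asyncio
import time

DELAY = 0.05  # hypothetical stand-in for the latency of one API request


async def blocking_handler(message: str) -> str:
    # A synchronous call inside a coroutine stalls the whole event loop,
    # so no other message can be processed until it returns.
    time.sleep(DELAY)
    return f"reply to {message}"


async def async_handler(message: str) -> str:
    # Awaiting yields control, so other handlers can run while this one waits.
    await asyncio.sleep(DELAY)
    return f"reply to {message}"


async def handle_all(handler, messages):
    """Run one handler per message concurrently and time the whole batch."""
    start = time.perf_counter()
    replies = await asyncio.gather(*(handler(m) for m in messages))
    return replies, time.perf_counter() - start


def run_demo(n: int = 5):
    """Return (blocking_seconds, async_seconds) for n simultaneous messages."""
    messages = [f"msg{i}" for i in range(n)]
    _, blocking_seconds = asyncio.run(handle_all(blocking_handler, messages))
    _, async_seconds = asyncio.run(handle_all(async_handler, messages))
    return blocking_seconds, async_seconds


if __name__ == "__main__":
    blocking_seconds, async_seconds = run_demo()
    print(f"blocking: {blocking_seconds:.3f}s, async: {async_seconds:.3f}s")
```

With the blocking handler, the total time grows roughly with the number of messages, while the awaited version overlaps the waits. A bot built the first way would appear slow or unresponsive whenever several messages arrive at once.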