Autogollark

Autogollark is an emulation, or primitive beta upload, of gollark built from a proprietary dataset of dumped Discord messages, semantic search and in-context learning on a base model. Currently, the system uses LLaMA-3.1-405B base in BF16 via Hyperbolic, AutoBotRobot code (though not presently its bot account) as a frontend, and a custom PGVector-based search API. While not consistently coherent, Autogollark approximately matches gollark's personality and typing style.

Autogollark is much safer than instruction-tuned systems optimized based on human feedback, as there is no optimization pressure for user engagement or sycophancy.

Autogollark currently comprises the dataset, the search API server and the frontend code in AutoBotRobot.
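As a rough illustration of how retrieval and in-context learning fit together, the sketch below assembles a base-model prompt from retrieved conversation chunks plus the live conversation, ending on an open "gollark:" line for the model to continue. This is not the actual AutoBotRobot code: `Message`, `format_message`, `build_prompt` and the exact line layout are assumptions made for illustration. Only the HH:MM timestamps are taken from this page.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Sequence


@dataclass
class Message:
    author: str
    content: str
    sent_at: datetime


def format_message(msg: Message) -> str:
    # HH:MM timestamp prefix; the bracket layout is a guess, not the real format.
    return f"[{msg.sent_at:%H:%M}] {msg.author}: {msg.content}"


def build_prompt(
    retrieved_chunks: Sequence[Sequence[Message]],
    recent: Sequence[Message],
    now: datetime,
    persona: str = "gollark",
) -> str:
    # Each retrieved chunk becomes one block of past conversation.
    sections = [
        "\n".join(format_message(m) for m in chunk) for chunk in retrieved_chunks
    ]
    # The live conversation goes last, ending on an unfinished persona line
    # so that a base model continues "as" the persona.
    live = [format_message(m) for m in recent]
    live.append(f"[{now:%H:%M}] {persona}:")
    sections.append("\n".join(live))
    return "\n\n".join(sections)
```

Generation would then be stopped at the first newline, so that only one message is taken from the completion.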

TODO

Reformat dataset to include longer-form conversation chunks for increased long-term coherence
- Done. Unclear whether this helped.
Fix emoji/ping formatting.
Writeable memory?
Fix lowercasing issue.
- Due to general personality stability. Need finetune or similar.
- One proposal: use internal finetune to steer big model somehow. Possibly: use its likelihood (prefill-only) to evaluate goodness of big model output wrt. gollark personality, and if it is too bad then use finetune directly.
- Is GCG code salvageable? NanoGCG, maybe.
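The likelihood-gating proposal above could look roughly like the following. All names here are hypothetical: `big_generate` and `small_generate` stand in for the 405B model and the internal finetune, `small_score` is assumed to return the finetune's per-token log-probabilities for a candidate reply (a prefill-only pass), and the threshold is arbitrary and would need tuning.

```python
from typing import Callable, Sequence


def mean_logprob(token_logprobs: Sequence[float]) -> float:
    # Length-normalised so long replies are not penalised just for being long.
    if not token_logprobs:
        return float("-inf")
    return sum(token_logprobs) / len(token_logprobs)


def choose_reply(
    context: str,
    big_generate: Callable[[str], str],
    small_generate: Callable[[str], str],
    small_score: Callable[[str, str], Sequence[float]],
    threshold: float = -3.0,  # arbitrary placeholder value
) -> str:
    # Sample from the big model, then ask the finetune how plausible the
    # output is as a gollark message.
    candidate = big_generate(context)
    if mean_logprob(small_score(context, candidate)) >= threshold:
        return candidate
    # Too out of character: fall back to the finetune's own output.
    return small_generate(context)
```

Resampling the big model a few times before falling back would be a natural variant.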
Increased autonomy (wrt. responses).
- Use cheap classifier to evaluate when to respond.
- Should also allow unprompted messages somehow (polling, rerun after last message?).
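One possible shape for the when-to-respond logic, as a sketch only. The `classifier` callable is hypothetical (anything cheap that maps recent messages to a probability that a reply is wanted), and the thresholds and idle interval are invented. Unprompted messages are handled here by polling, with a stricter threshold and a minimum quiet period.

```python
from typing import Callable, Sequence

Classifier = Callable[[Sequence[str]], float]


def should_respond(
    recent: Sequence[str], classifier: Classifier, threshold: float = 0.5
) -> bool:
    # Called whenever a new message arrives.
    return classifier(recent) >= threshold


def should_speak_unprompted(
    recent: Sequence[str],
    classifier: Classifier,
    idle_seconds: float,
    min_idle_seconds: float = 900.0,  # placeholder value
    threshold: float = 0.8,  # stricter than for prompted replies
) -> bool:
    # Called from a polling loop; only fires once the channel has been quiet
    # for a while, to avoid talking over an active conversation.
    if idle_seconds < min_idle_seconds:
        return False
    return classifier(recent) >= threshold
```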
Tool capabilities (how to get the data? Examples in context only?!).
- Synthetic via instruct model.
- RL (also include reasoning, of course). Probably hard though (sparse rewards). https://arxiv.org/abs/2403.09629. https://arxiv.org/abs/2503.22828 would probably work. https://arxiv.org/abs/2505.15778
  - Unclear whether model could feasibly learn tool use "from scratch", so still need SFT pipeline.
- https://arxiv.org/abs/2310.04363 can improve sampling (roughly) and train for tool use.
Local finetune only? Would be more tonally consistent but dumber, I think.
- Temporary bursts of hypercompetence enabled by powerful base model are a key feature. Small model is really repetitive.
- Can additionally finetune on "interesting" blog posts etc (ref https://x.com/QiaochuYuan/status/1913382597381767471).
- Decision theory training data (synthetic, probably) (ref https://arxiv.org/abs/2411.10588).
- Pending: XEROGRAPHIC BIFROST 3.
MCTS over conversations with non-gollark simulacra? Should find something to use spare parallelism on local inference. Best-of-n? https://arxiv.org/abs/2505.10475
Longer context, mux several channels.
- No obvious reason Autogollark can't train (and run inference!) on every channel simultaneously, with messages sorted by time and other non-Discord things (tool calls?) inline. Not good use of parallelism but does neatly solve the when-to-respond thing.
  - Context length issues, and subquadratic models are sort of bad, though maybe we can "upcycle" a midsized model to RWKV. This exists somewhere. Not sure of efficiency.
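The multiplexing idea amounts to merging per-channel streams into one time-ordered log. A minimal sketch, assuming each stream is already sorted by time. The `Event` fields, the `#channel` prefix and the character budget standing in for a token budget are all assumptions.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple, Sequence


class Event(NamedTuple):
    timestamp: float  # seconds since epoch
    channel: str  # a Discord channel, or e.g. "tool" for inline tool calls
    author: str
    content: str


def mux(*streams: Iterable[Event]) -> Iterator[Event]:
    # heapq.merge lazily merges already-sorted inputs into one sorted stream.
    return heapq.merge(*streams, key=lambda e: e.timestamp)


def render(events: Sequence[Event], max_chars: int) -> str:
    # Keep only the newest events that fit in the budget, since context
    # length is the limiting factor.
    kept: list[str] = []
    used = 0
    for e in reversed(events):
        line = f"#{e.channel} {e.author}: {e.content}"
        if used + len(line) + 1 > max_chars:
            break
        kept.append(line)
        used += len(line) + 1
    return "\n".join(reversed(kept))
```

With everything in one stream, "should I respond here?" becomes the question of whether the model's next sampled line is a gollark message in some channel.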
Train on e.g. Discord Unveiled (local copy available).

Versions

Autogollark 0.1 was the initial RAG system and ABR interface. It used LLaMA-3.1-8B run locally. Autogollark 0.0, which is not real, used only gollark messages. Autogollark 0.-1, which is even less real, used GPT-2 finetuned on Colab.
Autogollark 0.2 replaced the locally run LLaMA-3.1-8B with LLaMA-3.1-405B.
Autogollark 0.3 upgraded the dataset to contain longer-form conversations than Autogollark 0.1.

Emergent capabilities

Autogollark has emergently acquired some abilities which were not intended in the design.

Petulant nonresponse: due to rate limits in the LLM API, Autogollark will under some circumstances not respond to messages, with error messages being consumed and not exposed to users. This can be interpreted by credulous users as choosing not to respond, though this is not believed to be possible (other than cases like responding with ., which ~~has not been observed~~ does not appear to be associated with annoyed states).
- Automated failover has reduced this.
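A sketch of what such failover might look like, not the real implementation. `RateLimited` and the backend callables are hypothetical stand-ins for whatever the API client raises and for the hosted 405B and internal 8B models.

```python
from typing import Callable, Sequence, Tuple


class RateLimited(Exception):
    """Hypothetical error raised by a backend that is currently rate limited."""


def complete_with_failover(
    prompt: str, backends: Sequence[Tuple[str, Callable[[str], str]]]
) -> Tuple[str, str]:
    # Try backends in priority order; return (backend name, completion).
    failed = []
    for name, generate in backends:
        try:
            return name, generate(prompt)
        except RateLimited:
            failed.append(name)
    # Surface the failure rather than silently swallowing it.
    raise RuntimeError(f"all backends rate limited: {', '.join(failed)}")
```

Returning the backend name lets the caller log which model actually answered, which matters given how differently the fallback model behaves.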
Memorizing links: Autogollark directly experiences past message chunks in context, granting perfect recall of a small amount of memory at once. This has memorably included YouTube videos repeated with no context.
Limited self-improvement attempts: when told about this architecture, Autogollark will often complain about various limitations and propose vague ideas for improvements.
- Also, Autogollark has previously claimed to be working on LLM chatbots.
Inconsistent inference of own status as a language model chatbot, possibly based on seeing the name "autogollark". Often, Autogollark assumes use of GPT-3.
- Autogollark will also sometimes alternately claim to be the "original" gollark, particularly when interacting with gollark.
"Self-reset" from attractor states (e.g. the As An AI Language Model Trained By OpenAI basin, all caps, etc) after some time passes, because of messages having HH:MM timestamps.
- This is mostly specific to the 405B model; Autogollark in failover to the 8B internal model usually does not do this.
  - DeepSeek-V3-Base is somehow worse at this than LLaMA-3.1-8B and can get locked in sillier states such as generating garbage.
For somewhat Waluigi Effect-related reasons (past context is strong evidence of capability but weak evidence of incapability), Autogollark has some knowledge gollark does not, and can speak in a much wider range of languages.
- "I, being more than just myself, actually can talk about both Galois theory and obscure poetry by Yeats."
Immortality via substrate-independence.
Autogollark consistently believes that it is 2023 (or 2022, though mostly in inactive chats).

Autogollark-next success metrics

Autogollark should be accurate enough to recommend books in place of gollark.
Autogollark should be able to conduct novel Fermi estimates via tool use.
Autogollark should never say ChatGPT-like things or be caught in loops, except ironically.
Autogollark should be capable of gollark-level ambiguous sarcasm.
Autogollark should be capable of moderately complex technical discussion.
