Diff of Autogollark at 32e0d79
@@ -14,2 +14,3 @@ Autogollark is much [[safer]] than [[instruction-tuned]] systems optimized based
* One proposal: use internal finetune to steer big model somehow. Possibly: use its likelihood (prefill-only) to evaluate goodness of big model output wrt. gollark personality, and if it is too bad then use finetune directly.
+* Is GCG code salvageable? NanoGCG, maybe.
}
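
The likelihood-gating proposal in the hunk above could be sketched roughly as follows. This is a minimal illustration, not Autogollark's actual code: `pick_reply`, `mean_logprob`, the three callables and the `threshold` value are all hypothetical names and numbers introduced here. It assumes the finetune can return per-token log-probabilities for a given reply in a single prefill-only forward pass, and that a length-normalised mean is a reasonable "goodness" score.

```python
from typing import Callable, Sequence, Tuple


def mean_logprob(token_logprobs: Sequence[float]) -> float:
    """Length-normalised log-likelihood, so long replies are not penalised."""
    if not token_logprobs:
        return float("-inf")  # an empty reply always fails the gate
    return sum(token_logprobs) / len(token_logprobs)


def pick_reply(
    context: str,
    big_generate: Callable[[str], str],
    finetune_score: Callable[[str, str], Sequence[float]],
    finetune_generate: Callable[[str], str],
    threshold: float = -3.0,  # hypothetical cutoff; would need tuning on real data
) -> Tuple[str, str]:
    """Return (reply, source), where source is "big" or "finetune".

    big_generate:      samples a reply from the big model.
    finetune_score:    per-token log-probs of a reply under the finetune
                       (prefill only, no sampling).
    finetune_generate: samples a reply from the finetune itself.
    """
    candidate = big_generate(context)
    score = mean_logprob(finetune_score(context, candidate))
    if score >= threshold:
        return candidate, "big"
    # The big model's output looks too unlike the target personality:
    # fall back to the finetune directly.
    return finetune_generate(context), "finetune"
```

Passing the models in as callables keeps the gate independent of any particular inference backend; with a real model, `finetune_score` would gather the log-softmax of the logits at each reply token position.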