Diff of Good Ideas at 939cc82

@@ -40,3 +40,6 @@ Next-action-predictor editors/UI.
 Do WiFi sensing but good (with more data (https://b.osmarks.net/o/a0bddc9b742b4efbba18600ae6d51d98)).
-* Weak evidence from internal testing suggests that models can't easily "ICL" locations. We may need really long contexts or big finetunes or some smarter meta-learning algorithm.
+* {
+Weak evidence from internal testing suggests that models can't easily learn locations in-context ("ICL"). We may need really long contexts, big finetunes, or some smarter meta-learning algorithm.
+* (tested on multi-TX single-RX data from an ESP32-C3 receiver, with slightly nonuniform sample times; would need to run more tests)
+}
 }
@@ -74,2 +77,3 @@ Psychological studies using LLMs (https://arxiv.org/abs/2209.06899 addresses thi
 * https://arxiv.org/abs/2410.20268
+* TODO: find the paper that actually did factor-analyze political beliefs (while doing something else)
 }
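As a sanity-check sketch of what such a factor analysis looks like (entirely simulated data, plain Python, no claim about the missing paper's method): simulate survey items driven by one latent ideology factor, then confirm the first principal component of the correlation matrix captures most of the variance.

```python
import random

random.seed(0)

# Hypothetical toy setup: 400 respondents answer 6 political-belief items,
# each driven by one latent "left-right" factor plus noise.
N, loadings = 400, [1.0, 0.9, 0.8, -0.7, -0.9, 0.6]
latent = [random.gauss(0, 1) for _ in range(N)]
data = [[l * f + 0.3 * random.gauss(0, 1) for l in loadings] for f in latent]

def corr_matrix(rows):
    """Correlation matrix of columns (population normalization)."""
    k, n = len(rows[0]), len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    cent = [[r[j] - means[j] for j in range(k)] for r in rows]
    cov = [[sum(a[i] * a[j] for a in cent) / n for j in range(k)] for i in range(k)]
    sd = [cov[i][i] ** 0.5 for i in range(k)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(k)] for i in range(k)]

def top_eigenvalue(m, iters=200):
    """Largest eigenvalue via power iteration (fine for a tiny matrix)."""
    v = [1.0] * len(m)
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(len(m))) for i in range(len(m))]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector
    return sum(v[i] * sum(m[i][j] * v[j] for j in range(len(m))) for i in range(len(m)))

C = corr_matrix(data)
# Trace of a correlation matrix is k, so this is the variance share
# explained by the first factor; should be large for one-factor data.
share = top_eigenvalue(C) / len(C)
```

With a single latent factor and modest noise, `share` comes out well above what k uncorrelated items would give (1/6 each); real survey data would need proper rotation and multiple factors.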
@@ -95,3 +99,7 @@ Semantic search for:
 * Improve AI-based search with methods other than making boring contrastive embedding models very slightly better.
-* Graph-based vector indices do beam search on a graph constructed so that beam search on dot product or PQ dot product "mostly" returns the closest (highest-dot-product) result. Do one of several RL/etc pathfinding schemes instead (like https://arxiv.org/abs/2502.18663)? We only have a budget of 100μs per read node, though, so this is a bit tricky.
+* {
+Graph-based vector indices do beam search on a graph constructed so that beam search on dot product or PQ dot product "mostly" returns the closest (highest-dot-product) result. Do one of several RL/etc. pathfinding schemes instead (like https://arxiv.org/abs/2502.18663)? We only have a budget of ~100μs per node read, though, so this is a bit tricky.
+* We can also pack the graph better (at some retrieval-time cost from multi-page accesses and extra compute): sort and delta-compress node indices, offload metadata again, maybe use general-purpose compressors or QAT/IAA hardware codecs, and quantize the graph vectors a bit.
+* The 100μs/node budget assumes SSD random-read latency, and under that assumption these algorithms still aren't useful when we can throw more time at the problem, because that time is better spent on more reads. With really clever pathfinding, though, HDDs could maybe also work, despite being two orders of magnitude slower.
+}
 * SAEs/dictionary learning algorithms for residual quantization.
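The beam search these indices run can be sketched in a few lines (toy graph and all names here are illustrative, not any particular library's API): keep a bounded frontier of the highest-dot-product nodes, expand their neighbours, and return the best node seen.

```python
import heapq

def beam_search(query, vectors, neighbours, entry, beam_width=4):
    """Beam search over a proximity graph: expand the highest-dot-product
    frontier nodes, prune the frontier to beam_width, stop when no
    unvisited neighbours remain. vectors: id -> vector tuple;
    neighbours: id -> list of adjacent ids."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    visited = {entry}
    frontier = [(-dot(vectors[entry], query), entry)]  # max-heap via negation
    best = list(frontier)
    while frontier:
        _, node = heapq.heappop(frontier)
        for n in neighbours[node]:
            if n in visited:
                continue
            visited.add(n)
            item = (-dot(vectors[n], query), n)
            heapq.heappush(frontier, item)
            best.append(item)
        # Prune: keep only the beam_width most promising candidates.
        frontier = heapq.nsmallest(beam_width, frontier)
        heapq.heapify(frontier)
    return min(best)[1]  # id of the highest-dot-product node seen
```

On a graph built so that greedy expansion reaches the true nearest neighbour "mostly", this returns the right answer while reading only a fraction of the nodes; the RL-pathfinding idea would replace the dot-product expansion order with a learned policy.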
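The sort-and-delta-compress idea for adjacency lists is simple enough to sketch (function names are made up for illustration): sorting makes the gaps between neighbour ids small, and a varint encoding then stores most gaps in one byte.

```python
def encode_neighbours(ids):
    """Sort neighbour ids, delta-encode, and pack gaps as LEB128-style
    varints (7 payload bits per byte, high bit = continuation)."""
    out, prev = bytearray(), 0
    for i in sorted(ids):
        gap, prev = i - prev, i
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)
            gap >>= 7
        out.append(gap)
    return bytes(out)

def decode_neighbours(buf):
    """Inverse of encode_neighbours: unpack varints, undo the deltas."""
    ids, acc, shift, prev = [], 0, 0, 0
    for b in buf:
        acc |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:  # last byte of this varint
            prev += acc
            ids.append(prev)
            acc, shift = 0, 0
    return ids
```

Neighbour order within a node's list usually doesn't matter for search, which is what makes the sorting step free; the decode cost at retrieval time is the "more compute" trade-off noted above.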
@@ -104,2 +112,2 @@ Semantic search for:
 * Neural net graph prettiness heuristic rather than force-based.
-* CA embedding models.
\ No newline at end of file
+* Cellular automaton (CA) embedding models (like maths thing, transition rule proximity).
\ No newline at end of file