G™Vector Indexing

Modern technology makes it possible to convert many things to vectors, so that things related to other things can be found by finding the records with the highest dot product with, highest cosine similarity to, or lowest L2 distance from a query. This can be done exactly through brute force, but that requires comparing the query against every record, which is obviously not particularly efficient. Approximate algorithms allow sublinear runtime scaling with respect to record count, at the cost of some possibility of missing the best (as determined by brute force) match. The main techniques are:

  • graph-based

  • product quantization (lossy compression)

  • inverted lists (split vectors into clusters, search a subset of the clusters)
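
To make the tradeoff concrete, here is a minimal sketch of brute-force search next to the inverted-list approach, in pure Python and using L2 distance. It is illustrative only: the function names (`brute_force`, `build_ivf`, `search_ivf`), the plain k-means clustering step and the parameters `n_clusters`, `n_iters` and `n_probe` are assumptions made for this example, not any particular library's API, and real implementations are considerably more optimized.

```python
import random


def l2_sq(a, b):
    """Squared L2 distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force(records, query, k=1):
    """Exact search: compare the query against every record (linear time)."""
    order = sorted(range(len(records)), key=lambda i: l2_sq(records[i], query))
    return order[:k]


def _assign(records, centroids):
    """Put each record index into the list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, r in enumerate(records):
        nearest = min(range(len(centroids)), key=lambda j: l2_sq(r, centroids[j]))
        lists[nearest].append(i)
    return lists


def build_ivf(records, n_clusters, n_iters=10, seed=0):
    """Cluster the records with plain k-means (an assumption of this sketch)."""
    rng = random.Random(seed)
    centroids = [list(records[i]) for i in rng.sample(range(len(records)), n_clusters)]
    for _ in range(n_iters):
        lists = _assign(records, centroids)
        for j, members in enumerate(lists):
            if members:  # leave the centroid of an empty cluster where it is
                dim = len(centroids[j])
                centroids[j] = [
                    sum(records[i][d] for i in members) / len(members)
                    for d in range(dim)
                ]
    # Final assignment so the lists match the final centroids.
    return centroids, _assign(records, centroids)


def search_ivf(records, centroids, lists, query, k=1, n_probe=1):
    """Approximate search: only scan the n_probe clusters nearest the query."""
    probed = sorted(range(len(centroids)), key=lambda j: l2_sq(query, centroids[j]))
    candidates = [i for j in probed[:n_probe] for i in lists[j]]
    candidates.sort(key=lambda i: l2_sq(records[i], query))
    return candidates[:k]
```

In this sketch, setting `n_probe` equal to `n_clusters` scans every list and so returns the same result as `brute_force`; lowering `n_probe` scans fewer records per query but may miss the true nearest record when it sits in a cluster that was not probed.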