A vector embedding is a fixed-length array of numbers that represents the meaning of a piece of text (or an image, or audio), produced by a neural network so that semantically similar inputs produce numerically close vectors.
Why it matters
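Embeddings are the substrate of modern AI retrieval. OpenAI's text-embedding-3-large produces 3072-dimensional vectors by default; Cohere's embed-v3 models produce 1024-dimensional vectors (384 for the light variants). With some models the dimensionality is a tunable knob: smaller embeddings are cheaper to store and faster to compare, but usually lose some retrieval accuracy.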
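To make the dimensionality trade-off concrete, here is a minimal sketch of shortening a vector by keeping its leading components and rescaling it to unit length, plus the storage arithmetic behind the "smaller is cheaper" claim. The helper names (`shorten_embedding`, `storage_bytes`) and the 4-bytes-per-float figure are assumptions for illustration. Truncation only preserves quality for models trained to support it, such as text-embedding-3 through its `dimensions` parameter; for other models, treat it as something to test rather than rely on.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so cosine and dot product agree."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def shorten_embedding(vec, dims):
    """Keep the first `dims` components, then renormalize (hypothetical helper)."""
    if not 0 < dims <= len(vec):
        raise ValueError("dims must be between 1 and len(vec)")
    return l2_normalize(vec[:dims])

def storage_bytes(dims, count, bytes_per_float=4):
    """Raw storage for `count` vectors, assuming 32-bit floats and no index overhead."""
    return dims * count * bytes_per_float

# Toy 4-dimensional vector standing in for a real embedding.
full = [0.5, 0.5, 0.5, 0.5]
short = shorten_embedding(full, 2)

print(len(short))                    # number of components kept
print(storage_bytes(3072, 1))        # bytes for one 3072-dimensional vector
print(storage_bytes(1024, 1))        # bytes for one 1024-dimensional vector
```

Under these assumptions a 3072-dimensional vector takes three times the raw storage of a 1024-dimensional one, and every similarity comparison touches three times as many numbers.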
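The useful property of embeddings is that closeness carries meaning. The standard measure is **cosine similarity**: the cosine of the angle between two vectors, which ranges from -1 (opposite) to 1 (same direction) and approximates how related their texts are. This is what powers semantic search, RAG retrieval, clustering (topic maps), and recommendation systems.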
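Cosine similarity can be sketched in a few lines of plain Python. The vectors below are made-up 3-dimensional toys rather than real model output, and `rank_by_similarity` is a hypothetical helper showing how a semantic search might order candidates; a production system would typically use a vector index instead of scoring every candidate.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, candidates):
    """Return (name, score) pairs, most similar first (hypothetical helper)."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy vectors, invented for illustration only.
query = [1.0, 0.0, 1.0]
candidates = {
    "close": [2.0, 0.1, 2.0],        # nearly the same direction, different length
    "orthogonal": [0.0, 1.0, 0.0],   # unrelated direction
    "opposite": [-1.0, 0.0, -1.0],   # pointing the other way
}

ranking = rank_by_similarity(query, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Note that "close" ranks first even though it is twice as long as the query: cosine similarity compares direction only, which is why it is insensitive to vector magnitude.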
How Pith relates
Every Pith bookmark is embedded once at save time. Embeddings drive the topic map's clustering, the wiki's RAG retrieval, the search's semantic layer, and the auto-tag service's similarity scoring. See the Topic Map feature for the visualisation.
Last reviewed: 10 May 2026 · Licensed CC BY 4.0 · cite freely with attribution to Pith.