Vector Embeddings
Anton Nesterov edited this page 2026-02-21 15:30:07 +01:00
Table of Contents
- Vector Embeddings
- Understanding Embeddings
- Storing Embeddings
- Searching Embeddings
- Practical Examples
- Semantic Document Search
- Product Recommendations
- FAQ Search
- Content Moderation
- Plagiarism Detection
- Hybrid Search (Text + Vector)
- Embedding Update Strategies
- Performance Optimization
- API Endpoints
- Best Practices
Vector Embeddings
Vector embeddings enable semantic search by representing text, images, or other data as numerical vectors. VSKI provides native support for storing and searching embeddings.
Understanding Embeddings
Embeddings convert content (text, images, etc.) into high-dimensional vectors where similar items are close together in vector space. This enables:
- Semantic search (finding meaningfully similar content)
- Recommendation systems
- Clustering and classification
- Duplicate detection
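To make "close together in vector space" concrete, here is a minimal cosine-similarity sketch in plain TypeScript (independent of VSKI). Scores near 1 mean the vectors point in the same direction (similar content); scores near 0 mean they are unrelated:

```typescript
// Cosine similarity: 1 = same direction, 0 = orthogonal, -1 = opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1 (identical)
console.log(cosineSimilarity([1, 0], [0, 1])); // 0 (unrelated)
```

Note that a *distance*, as returned by VSKI's search API, is the inverse notion: lower distance means higher similarity.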
Storing Embeddings
Upsert an Embedding
// Generate embedding (using OpenAI, Cohere, or local models)
const embedding = await generateEmbedding("Machine learning is amazing");
// Store embedding for a record
await client.embeddings.upsert(
"documents", // collection name
"record-id", // record ID
embedding, // vector array
);
The upsert method creates or updates an embedding:
const result = await client.embeddings.upsert(
"documents", // collection name
"record-id-123", // record ID
[0.1, 0.2, 0.3, ...] // Array of numbers (typically 128-1536 dimensions)
);
console.log(result.recordId); // "record-id-123"
console.log(result.dimensions); // 128 (or your vector dimension)
console.log(result.status); // "success"
Generating Embeddings
Using OpenAI
async function generateEmbedding(text: string): Promise<number[]> {
const response = await fetch("https://api.openai.com/v1/embeddings", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "text-embedding-3-small",
input: text,
}),
});
const data = await response.json();
return data.data[0].embedding;
}
Using Cohere
async function generateEmbedding(text: string): Promise<number[]> {
const response = await fetch("https://api.cohere.ai/v1/embed", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.COHERE_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "embed-english-v3.0",
input_type: "search_document", // required for v3 models; use "search_query" when embedding queries
texts: [text],
}),
});
const data = await response.json();
return data.embeddings[0];
}
Using Local Models (Transformers.js)
import { pipeline } from "@xenova/transformers";
let embedder: any; // lazily-initialized pipeline, loaded once on first call
async function generateEmbedding(text: string): Promise<number[]> {
if (!embedder) {
embedder = await pipeline(
"feature-extraction",
"Xenova/all-MiniLM-L6-v2",
);
}
const output = await embedder(text, {
pooling: "mean",
normalize: true,
});
return Array.from(output.data);
}
Searching Embeddings
Basic Similarity Search
const queryEmbedding = await generateEmbedding("AI and machine learning");
const results = await client.embeddings.search(
"documents",
queryEmbedding,
{ limit: 10 },
);
console.log(results);
// {
// results: [
// {
// record: { id: "1", title: "Machine Learning Tutorial", ... },
// distance: 0.23
// },
// {
// record: { id: "2", title: "AI Research", ... },
// distance: 0.45
// },
// ...
// ],
// count: 25
// }
Search with Threshold
Filter results by similarity threshold (lower distance = more similar):
const results = await client.embeddings.search(
"documents",
queryEmbedding,
{
limit: 10,
threshold: 0.5, // Only return results with distance <= 0.5
},
);
Custom Limit
Control how many results to return:
const results = await client.embeddings.search(
"documents",
queryEmbedding,
{ limit: 5 }, // Return top 5 most similar
);
Practical Examples
Semantic Document Search
// Store documents with embeddings
const documents = await client.collection("documents").getList(1, 1000);
for (const doc of documents.items) {
const embedding = await generateEmbedding(doc.title + " " + doc.content);
await client.embeddings.upsert("documents", doc.id, embedding); // collection, recordId, vector
}
// Search semantically
async function searchDocuments(query: string) {
const embedding = await generateEmbedding(query);
const results = await client.embeddings.search(
"documents",
embedding,
{ limit: 10, threshold: 0.5 },
);
return results.results.map((r) => r.record);
}
// Usage
const results = await searchDocuments("artificial intelligence tutorials");
Product Recommendations
// Store product embeddings
const products = await client.collection("products").getList(1, 1000);
for (const product of products.items) {
const text = `${product.name} ${product.description} ${product.category}`;
const embedding = await generateEmbedding(text);
await client.embeddings.upsert("products", product.id, embedding); // collection, recordId, vector
}
// Find similar products
async function findSimilarProducts(productId: string, limit = 5) {
const product = await client.collection("products").getOne(productId);
const text = `${product.name} ${product.description} ${product.category}`;
const embedding = await generateEmbedding(text);
const results = await client.embeddings.search(
"products",
embedding,
{ limit: limit + 1 }, // fetch one extra so the product itself can be filtered out
);
return results.results
.filter((r) => r.record.id !== productId)
.slice(0, limit)
.map((r) => r.record);
}
// Usage
const similar = await findSimilarProducts("product-123", 5);
FAQ Search
// Store FAQ embeddings
const faqs = await client.collection("faqs").getList(1, 1000);
for (const faq of faqs.items) {
const text = `${faq.question} ${faq.answer}`;
const embedding = await generateEmbedding(text);
await client.embeddings.upsert("faqs", faq.id, embedding); // collection, recordId, vector
}
// Semantic FAQ search
async function searchFAQs(query: string) {
const embedding = await generateEmbedding(query);
const results = await client.embeddings.search(
"faqs",
embedding,
{ limit: 3, threshold: 0.6 },
);
return results.results.map((r) => r.record);
}
// Usage
const answers = await searchFAQs("How do I reset my password?");
Content Moderation
// Store flagged content embeddings
const flagged = await client.collection("flagged_content").getList(1, 1000);
for (const item of flagged.items) {
const embedding = await generateEmbedding(item.content);
await client.embeddings.upsert("flagged_content", item.id, embedding); // collection, recordId, vector
}
// Check if new content is similar to flagged content
async function checkSimilarity(content: string) {
const embedding = await generateEmbedding(content);
const results = await client.embeddings.search(
"flagged_content",
embedding,
{ limit: 3, threshold: 0.3 },
);
if (results.results.length > 0) {
console.warn("Content similar to flagged items found!");
return results.results.map((r) => r.record);
}
return [];
}
Plagiarism Detection
// Store document embeddings for comparison
async function indexForPlagiarismCheck(documents: any[]) {
for (const doc of documents) {
const embedding = await generateEmbedding(doc.content);
await client.embeddings.upsert("documents", doc.id, embedding); // collection, recordId, vector
}
}
// Check for potential plagiarism
async function checkPlagiarism(content: string) {
const embedding = await generateEmbedding(content);
const results = await client.embeddings.search(
"documents",
embedding,
{ limit: 5, threshold: 0.2 }, // Low threshold for high similarity
);
return results.results.map((r) => ({
document: r.record,
similarity: 1 - r.distance, // Convert distance to similarity
}));
}
Hybrid Search (Text + Vector)
Combine keyword search with semantic search:
async function hybridSearch(query: string, collection: string) {
const embedding = await generateEmbedding(query);
// Vector search
const vectorResults = await client.embeddings.search(
collection,
embedding,
{ limit: 10 },
);
// Text search (using full-text search)
const textResults = await client.collection(collection).getList(1, 10, {
search: query,
});
// Combine and deduplicate
const combined = new Map();
vectorResults.results.forEach((r) => {
const score = 1 - r.distance;
combined.set(r.record.id, {
record: r.record,
vectorScore: score,
textScore: 0,
});
});
textResults.items.forEach((r) => {
if (combined.has(r.id)) {
combined.get(r.id).textScore = 1;
} else {
combined.set(r.id, {
record: r,
vectorScore: 0,
textScore: 1,
});
}
});
// Calculate combined score (70% vector, 30% text)
const sorted = Array.from(combined.values())
.map((item) => ({
...item.record,
score: (item.vectorScore * 0.7) + (item.textScore * 0.3),
}))
.sort((a, b) => b.score - a.score)
.slice(0, 10);
return sorted;
}
Embedding Update Strategies
Real-time Updates
Update embeddings when content changes:
// Create record with embedding
const record = await client.collection("documents").create({
title: "New Document",
content: "Content here...",
});
const embedding = await generateEmbedding(record.title + " " + record.content);
await client.embeddings.upsert("documents", record.id, embedding); // collection, recordId, vector
// Update embedding when record is updated
async function updateDocument(id: string, data: any) {
const updated = await client.collection("documents").update(id, data);
const embedding = await generateEmbedding(
updated.title + " " + updated.content,
);
await client.embeddings.upsert("documents", id, embedding); // collection, recordId, vector
return updated;
}
Batch Updates
Reindex a collection (this example fetches a single page of up to 1,000 records):
async function reindexCollection(collectionName: string) {
const records = await client.collection(collectionName).getList(1, 1000);
for (const record of records.items) {
const text = `${record.title || ""} ${record.content || ""}`;
const embedding = await generateEmbedding(text);
await client.embeddings.upsert(collectionName, record.id, embedding); // collection, recordId, vector
// Rate limiting
await new Promise((resolve) => setTimeout(resolve, 100));
}
console.log(`Reindexed ${records.items.length} records`);
}
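The loop above only covers the first page of results. A paginated variant can walk every page; the `totalPages` field and the interface shapes below are assumptions made for illustration, so align them with your actual client's list-response shape:

```typescript
// Hypothetical interface shapes for illustration; align them with your client.
interface ListPage {
  items: { id: string; title?: string; content?: string }[];
  totalPages: number;
}
interface VectorClient {
  collection(name: string): { getList(page: number, perPage: number): Promise<ListPage> };
  embeddings: { upsert(collection: string, recordId: string, vector: number[]): Promise<unknown> };
}

// Walk every page of a collection and upsert one embedding per record.
async function reindexAllPages(
  client: VectorClient,
  collectionName: string,
  embed: (text: string) => Promise<number[]>,
  perPage = 200,
): Promise<number> {
  let indexed = 0;
  let totalPages = 1;
  for (let page = 1; page <= totalPages; page++) {
    const result = await client.collection(collectionName).getList(page, perPage);
    totalPages = result.totalPages; // learned from the first response
    for (const record of result.items) {
      const vector = await embed(`${record.title ?? ""} ${record.content ?? ""}`);
      await client.embeddings.upsert(collectionName, record.id, vector);
      indexed++;
    }
  }
  return indexed;
}
```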
Performance Optimization
Batch Processing
Process multiple embeddings in parallel:
async function batchEmbed(texts: string[]): Promise<number[][]> {
// Process in parallel
const embeddings = await Promise.all(
texts.map((text) => generateEmbedding(text)),
);
return embeddings;
}
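`Promise.all` fires every request at once, which can trip provider rate limits on large batches. A chunked variant (a sketch; the `chunkSize` default is arbitrary) keeps only a fixed number of requests in flight at a time:

```typescript
// Embed texts in fixed-size chunks so at most `chunkSize` requests
// run concurrently, preserving input order in the output.
async function batchEmbedChunked(
  texts: string[],
  embed: (text: string) => Promise<number[]>,
  chunkSize = 10,
): Promise<number[][]> {
  const out: number[][] = [];
  for (let i = 0; i < texts.length; i += chunkSize) {
    const chunk = texts.slice(i, i + chunkSize);
    const vectors = await Promise.all(chunk.map(embed));
    out.push(...vectors);
  }
  return out;
}
```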
Caching
Cache embeddings for frequently accessed content:
const embeddingCache = new Map();
async function getCachedEmbedding(text: string): Promise<number[]> {
const cacheKey = text.toLowerCase().trim();
if (embeddingCache.has(cacheKey)) {
return embeddingCache.get(cacheKey);
}
const embedding = await generateEmbedding(text);
embeddingCache.set(cacheKey, embedding);
return embedding;
}
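The plain `Map` above grows without bound. A bounded variant with simple least-recently-used eviction might look like this sketch, which exploits the fact that a JavaScript `Map` iterates keys in insertion order:

```typescript
// Bounded embedding cache with least-recently-used eviction.
// Deleting and re-inserting a key on each hit moves it to the
// "most recent" end of the Map's insertion order.
class EmbeddingCache {
  private cache = new Map<string, number[]>();
  constructor(private maxEntries = 1000) {}

  get(text: string): number[] | undefined {
    const key = text.toLowerCase().trim();
    const hit = this.cache.get(key);
    if (hit) {
      this.cache.delete(key); // refresh recency
      this.cache.set(key, hit);
    }
    return hit;
  }

  set(text: string, embedding: number[]): void {
    const key = text.toLowerCase().trim();
    if (this.cache.size >= this.maxEntries && !this.cache.has(key)) {
      // Evict the least-recently-used entry (first key in the Map).
      this.cache.delete(this.cache.keys().next().value!);
    }
    this.cache.set(key, embedding);
  }
}
```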
Dimension Selection
Choose an embedding model whose dimensionality fits your use case. Smaller vectors are faster to search and cheaper to store; larger vectors capture more semantic nuance:
- all-MiniLM-L6-v2 (local, 384 dimensions) - fast and lightweight, good for simple similarity
- text-embedding-3-small (OpenAI, 1536 dimensions) - a good general-purpose balance
- text-embedding-3-large (OpenAI, 3072 dimensions) - more accurate, but slower and with higher storage cost
API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/collections/:name/embeddings | Upsert embedding for a record |
| POST | /api/collections/:name/embeddings/search | Search similar embeddings |
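For clients without the SDK, the search endpoint can be called directly over HTTP. This is a sketch: the request body field names (`vector`, `limit`) are assumptions, so verify them against your server's request schema, and add whatever auth header your deployment requires:

```typescript
// Build the search endpoint URL following the route in the table above.
function embeddingsSearchUrl(baseUrl: string, collection: string): string {
  return `${baseUrl}/api/collections/${encodeURIComponent(collection)}/embeddings/search`;
}

// Call the endpoint directly. Body field names ("vector", "limit")
// are assumptions - check your server's request schema.
async function searchViaRest(
  baseUrl: string,
  collection: string,
  vector: number[],
  limit = 10,
): Promise<unknown> {
  const response = await fetch(embeddingsSearchUrl(baseUrl, collection), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ vector, limit }),
  });
  if (!response.ok) throw new Error(`Embedding search failed: ${response.status}`);
  return response.json();
}
```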
Best Practices
- Consistent text - Use the same text format for all embeddings
- Normalize text - Clean and normalize text before embedding
- Choose the right model - Select an embedding model suited to your use case
- Batch when possible - Process multiple embeddings together
- Cache results - Cache embeddings for repeated queries
- Monitor performance - Track embedding generation and search times
- Use appropriate dimensions - Balance accuracy and performance
- Update embeddings - Re-embed when content changes
- Set proper thresholds - Tune similarity thresholds for your use case
- Test thoroughly - Validate search quality with real queries
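As one concrete take on the "Normalize text" practice above, a hypothetical `normalizeText` helper (not part of VSKI) might lowercase, strip control characters, and collapse whitespace before embedding, so that the same content always produces the same vector:

```typescript
// Normalize text before embedding: lowercase, replace control
// characters with spaces, collapse whitespace runs, trim ends.
function normalizeText(text: string): string {
  return text
    .toLowerCase()
    .replace(/[\u0000-\u001f]/g, " ") // control chars (incl. \n, \t) -> space
    .replace(/\s+/g, " ")             // collapse whitespace runs
    .trim();
}
```

Applying the same normalization to both indexed content and queries keeps cache keys and embeddings consistent.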