What are Embeddings?
Embeddings are dense vector representations of text that capture semantic meaning. Unlike simple keyword matching, embeddings allow you to find conceptually similar content even when different words are used.
Key Properties:
- Dense vectors: Typically 256-3072 dimensions of floating-point numbers
- Semantic similarity: Similar concepts have similar vectors (measured by cosine similarity)
- Fixed size: All text inputs produce vectors of the same dimensionality
- Continuous space: Smooth transitions between related concepts
Example:
"dog" → [0.23, -0.45, 0.67, ..., 0.12] (384 dimensions)
"puppy" → [0.25, -0.43, 0.69, ..., 0.15] (similar vector!)
"cat" → [0.31, -0.38, 0.58, ..., 0.19] (somewhat similar)
"car" → [-0.12, 0.67, -0.34, ..., 0.89] (very different)
Use Cases
1. Semantic Search
Find documents based on meaning rather than exact keywords:
import (
	"context"

	"github.com/hupe1980/agentmesh/pkg/embedding/openai"
	"github.com/hupe1980/agentmesh/pkg/memory"
	"github.com/hupe1980/agentmesh/pkg/message"
)
// Create embedder
ctx := context.Background()
embedder := openai.NewEmbedder(func(o *openai.Options) {
	o.Model = "text-embedding-3-small"
})

// Create vector memory for semantic search
vectorMem := memory.NewVectorMemory(embedder)

// Store documents
vectorMem.Store(ctx, "session1", []message.Message{
	message.NewHumanMessageFromText("Python is a programming language"),
	message.NewHumanMessageFromText("JavaScript runs in browsers"),
	message.NewHumanMessageFromText("Dogs are popular pets"),
})

// Semantic search - finds the Python document even though the query uses different words
results, _ := vectorMem.Recall(ctx, "session1", memory.RecallFilter{
	Query: "coding language",
	K:     2,
})
// Results: "Python is a programming language", "JavaScript runs in browsers"
2. Similarity Detection
Find duplicate or related content:
func checkSimilarity(ctx context.Context, embedder embedding.Embedder, text1, text2 string) (float32, error) {
	vec1, err := embedder.Embed(ctx, text1)
	if err != nil {
		return 0, err
	}
	vec2, err := embedder.Embed(ctx, text2)
	if err != nil {
		return 0, err
	}
	return embedding.CosineSimilarity(vec1, vec2), nil
}

// Check if two customer inquiries are similar
similarity, _ := checkSimilarity(ctx, embedder,
	"How do I reset my password?",
	"I forgot my login credentials",
)
// similarity ≈ 0.85 (high similarity, likely same intent)
3. Retrieval-Augmented Generation (RAG)
Enhance LLM responses with relevant context:
import "github.com/hupe1980/agentmesh/pkg/agent"
// Create RAG agent with vector memory
embedder := openai.NewEmbedder()
vectorMem := memory.NewVectorMemory(embedder)
// Load knowledge base
vectorMem.Store(ctx, "kb", []message.Message{
message.NewHumanMessageFromText("AgentMesh uses Pregel BSP for graph execution"),
message.NewHumanMessageFromText("Checkpointing enables time-travel debugging"),
message.NewHumanMessageFromText("Tools allow agents to call external APIs"),
})
// Create retriever
retriever := &memoryRetriever{memory: vectorMem, sessionID: "kb"}
// RAG agent automatically finds relevant docs and includes them in context
ragAgent, _ := agent.NewRAG(model, retriever)
// Query uses retrieved context
msgs := []message.Message{
message.NewHumanMessageFromText("How does AgentMesh handle execution?"),
}
for msg, err := range ragAgent.Run(ctx, msgs) {
if err != nil {
log.Fatal(err)
}
fmt.Println(msg.Content())
// Response includes information about Pregel BSP from knowledge base
}
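The memoryRetriever above is user-defined. One possible shape, assuming the agent only needs a single retrieve method (the Retrieve signature and the *memory.VectorMemory type are assumptions for illustration; check pkg/retrieval for the interface agent.NewRAG actually expects):
// Hypothetical adapter from VectorMemory to a retriever; the Retrieve
// signature and the *memory.VectorMemory type are assumptions.
type memoryRetriever struct {
	memory    *memory.VectorMemory
	sessionID string
}

func (r *memoryRetriever) Retrieve(ctx context.Context, query string) ([]message.Message, error) {
	return r.memory.Recall(ctx, r.sessionID, memory.RecallFilter{
		Query: query,
		K:     3,
	})
}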
4. Clustering and Classification
Group similar items or route requests:
// Route customer inquiries to the appropriate department.
// The department embeddings are assumed to be precomputed (see the sketch below).
func routeInquiry(ctx context.Context, embedder embedding.Embedder, inquiry string) string {
	vec, _ := embedder.Embed(ctx, inquiry)

	departments := map[string]embedding.Vector{
		"billing":   billingEmbedding,
		"technical": technicalEmbedding,
		"sales":     salesEmbedding,
	}

	bestDept := ""
	var bestScore float32
	for dept, deptVec := range departments {
		score := embedding.CosineSimilarity(vec, deptVec)
		if score > bestScore {
			bestScore = score
			bestDept = dept
		}
	}
	return bestDept
}
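The department vectors would typically be computed once at startup, for example by embedding a short representative description of each department. A minimal sketch (the descriptions are illustrative):
// Precompute department embeddings once at startup
deptTexts := []string{
	"billing, invoices, payments, refunds",
	"technical support, bugs, errors, troubleshooting",
	"sales, pricing, plans, upgrades",
}
vecs, err := embedder.EmbedBatch(ctx, deptTexts)
if err != nil {
	log.Fatal(err)
}
billingEmbedding, technicalEmbedding, salesEmbedding := vecs[0], vecs[1], vecs[2]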
Available Embedders
SimpleEmbedder (Testing)
A deterministic, hash-based embedder for testing and development:
import "github.com/hupe1980/agentmesh/pkg/embedding"
// Create simple embedder with 384 dimensions
embedder := embedding.NewSimpleEmbedder(384)
// Produces consistent, normalized vectors
vec, err := embedder.Embed(ctx, "test text")
// vec: [0.123, -0.456, 0.789, ...] (length = 384)
Characteristics:
- ✅ No API keys required
- ✅ Deterministic (same input = same output)
- ✅ Normalized vectors (magnitude = 1.0)
- ⚠️ Not semantically meaningful
- ⚠️ Testing/development only
Use Cases:
- Unit tests without external API dependencies
- Local development without API costs
- Proof-of-concept implementations
- CI/CD pipelines
func TestVectorMemory(t *testing.T) {
	ctx := context.Background()
	embedder := embedding.NewSimpleEmbedder(128)
	mem := memory.NewVectorMemory(embedder)

	// Test without the OpenAI API
	mem.Store(ctx, "test", []message.Message{message.NewHumanMessageFromText("hello")})
	results, err := mem.Recall(ctx, "test", memory.RecallFilter{
		Query: "hello",
		K:     1,
	})
	require.NoError(t, err)
	require.Len(t, results, 1)
}
OpenAI Embedder (Production)
High-quality semantic embeddings from OpenAI:
import "github.com/hupe1980/agentmesh/pkg/embedding/openai"
// Basic usage with defaults
embedder := openai.NewEmbedder()
// Custom configuration
embedder = openai.NewEmbedder(func(o *openai.Options) {
	o.Model = "text-embedding-3-large" // Higher quality
	o.Dimensions = 1024                // Reduced dimensions for cost
})

// Embed a single text
vec, err := embedder.Embed(ctx, "semantic search query")

// Batch embed for efficiency
texts := []string{
	"document 1",
	"document 2",
	"document 3",
}
vecs, err := embedder.EmbedBatch(ctx, texts)
Supported Models:
| Model | Dimensions | Cost (per 1M tokens) | Quality | Speed |
|---|---|---|---|---|
| text-embedding-3-small | 1536 | $0.02 | Good | Fast |
| text-embedding-3-large | 3072 | $0.13 | Excellent | Slower |
| text-embedding-ada-002 | 1536 | $0.10 | Good | Fast |
Dimension Reduction:
// Use fewer dimensions for cost/performance trade-off
embedder := openai.NewEmbedder(func(o *openai.Options) {
	o.Model = "text-embedding-3-large"
	o.Dimensions = 256 // Reduce from 3072 to 256
})
// Still maintains good semantic quality at lower cost/size
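At float32 precision, 256 dimensions take about 1 KB per vector versus roughly 12 KB at 3072 (see the storage table under Performance Considerations).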
API Key Setup:
# Set environment variable
export OPENAI_API_KEY="sk-..."
# Or in code
client := openai.NewClient(
	option.WithAPIKey("sk-..."),
)
embedder := openai.NewEmbedderFromClient(client)
For complete configuration options, see pkg/embedding/openai/README.md.
Vector Memory Integration
AgentMesh provides VectorMemory for semantic conversation history:
import (
	"github.com/hupe1980/agentmesh/pkg/embedding/openai"
	"github.com/hupe1980/agentmesh/pkg/memory"
	"github.com/hupe1980/agentmesh/pkg/message"
)
// Create vector memory with OpenAI embeddings
embedder := openai.NewEmbedder()
vectorMem := memory.NewVectorMemory(embedder)
// Store multi-session conversations
vectorMem.Store(ctx, "user123", []message.Message{
message.NewHumanMessageFromText("I love Python"),
message.NewAIMessageFromText("Python is great for data science!"),
})
vectorMem.Store(ctx, "user456", []message.Message{
message.NewHumanMessageFromText("JavaScript is my favorite"),
})
// Semantic recall across sessions
results, err := vectorMem.Recall(ctx, "user123", memory.RecallFilter{
Query: "programming languages",
K: 5,
MinScore: 0.7, // Only high similarity matches
})
Advanced Filtering
// Time-based filtering
oneDayAgo := time.Now().Add(-24 * time.Hour)
results, _ := vectorMem.Recall(ctx, "session", memory.RecallFilter{
	Query: "recent updates",
	K:     10,
	After: &oneDayAgo, // Only messages from the last 24 hours
})

// Message type filtering
results, _ = vectorMem.Recall(ctx, "session", memory.RecallFilter{
	Query: "user questions",
	K:     10,
	Types: []message.Type{message.TypeHuman}, // Only user messages
})

// Metadata filtering
results, _ = vectorMem.Recall(ctx, "session", memory.RecallFilter{
	Query: "important notes",
	K:     10,
	Metadata: map[string]string{
		"priority": "high",
		"category": "technical",
	},
})
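All of these fields live on the same RecallFilter struct, so constraints can be combined in a single call. A minimal sketch:
// Combine time, type, metadata, and score constraints in one filter
results, _ = vectorMem.Recall(ctx, "session", memory.RecallFilter{
	Query:    "open technical questions",
	K:        5,
	After:    &oneDayAgo,
	Types:    []message.Type{message.TypeHuman},
	Metadata: map[string]string{"category": "technical"},
	MinScore: 0.6,
})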
Integration with Agents
// ReAct agent with long-term memory
embedder := openai.NewEmbedder()
vectorMem := memory.NewVectorMemory(embedder)

// Load historical context
vectorMem.Store(ctx, "user", []message.Message{
	message.NewHumanMessageFromText("My project uses TypeScript"),
	message.NewHumanMessageFromText("I prefer functional programming"),
})

// Create agent (avoid naming the variable "agent", which would shadow the package)
reactAgent, _ := agent.NewReAct(model,
	agent.WithTools(searchTool, calculatorTool),
)

// Before each request, recall relevant history
history, _ := vectorMem.Recall(ctx, "user", memory.RecallFilter{
	Query: currentUserMessage,
	K:     3,
})

// Prepend history to the conversation
messages := append(history, currentMessages...)
for msg, err := range reactAgent.Run(ctx, messages) {
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(msg.Content())
}
VectorStore Integration
For document storage and retrieval beyond conversation history, use the vectorstore package:
import (
	"github.com/hupe1980/agentmesh/pkg/embedding/openai"
	"github.com/hupe1980/agentmesh/pkg/vectorstore"
	vsmemory "github.com/hupe1980/agentmesh/pkg/vectorstore/memory"
)

// Create embedder and in-memory vector store
embedder := openai.NewEmbedder()
store := vsmemory.New()

// EmbeddingStore auto-generates embeddings
es := vectorstore.NewEmbeddingStore(store, embedder)

// Add documents with automatic embedding
err := es.AddTexts(ctx, []string{
	"AgentMesh uses Pregel BSP for graph execution",
	"Checkpointing enables time-travel debugging",
	"Tools allow agents to call external APIs",
}, nil)
if err != nil {
	log.Fatal(err)
}

// Search by text query
results, err := es.SearchText(ctx, "graph execution model", vectorstore.SearchOptions{
	K:        5,
	MinScore: 0.7,
})
if err != nil {
	log.Fatal(err)
}
for _, doc := range results {
	fmt.Printf("Score: %.3f Content: %s\n", doc.Score, doc.Content)
}
With Retrieval Pipeline
Create a retriever for RAG workflows:
import "github.com/hupe1980/agentmesh/pkg/retrieval"
// Create retriever from vector store
retriever := retrieval.NewVectorStoreRetriever(store, embedder,
retrieval.WithK(5),
retrieval.WithMinScore(0.7),
)
// Use with RAG agent
ragAgent, _ := agent.NewRAG(model, retriever)
for msg, err := range ragAgent.Run(ctx, messages) {
// Agent automatically retrieves relevant context
}
Metadata Filtering
Filter search results by document metadata:
// Add documents with metadata (pyVec and dogVec are precomputed embeddings)
docs := []vectorstore.Document{
	{Content: "Python guide", Embedding: pyVec, Metadata: map[string]any{"category": "programming"}},
	{Content: "Dog training", Embedding: dogVec, Metadata: map[string]any{"category": "pets"}},
}
store.Add(ctx, docs)

// Search within a category
results, _ := store.Search(ctx, queryVec, vectorstore.SearchOptions{
	K:      10,
	Filter: vectorstore.Eq("category", "programming"),
})
Best Practices
1. Choose Appropriate Dimensions
High Dimensions (1536-3072):
- ✅ Better semantic quality
- ✅ More nuanced similarity detection
- ⚠️ Higher storage costs
- ⚠️ Slower similarity searches
- Use for: Production semantic search, high-quality RAG
Low Dimensions (256-512):
- ✅ Faster similarity computation
- ✅ Lower storage requirements
- ⚠️ Slightly reduced quality
- Use for: Large-scale systems, real-time search
// Production: balance quality and performance
embedder := openai.NewEmbedder(func(o *openai.Options) {
	o.Model = "text-embedding-3-small" // 1536 dimensions
})

// High scale: optimize for speed and storage
embedder = openai.NewEmbedder(func(o *openai.Options) {
	o.Model = "text-embedding-3-large"
	o.Dimensions = 512 // Reduced from 3072
})
2. Normalize Vectors
Use the built-in similarity and normalization helpers:
import "github.com/hupe1980/agentmesh/pkg/embedding"
// Cosine similarity (most common for text embeddings)
sim := embedding.CosineSimilarity(vecA, vecB) // Returns [-1, 1]
// Euclidean distance
dist := embedding.EuclideanDistance(vecA, vecB) // Returns [0, ∞)
// Dot product similarity
dot := embedding.DotProductSimilarity(vecA, vecB)
// Generic with configurable metric
sim := embedding.Similarity(vecA, vecB, embedding.Cosine)
sim := embedding.Similarity(vecA, vecB, embedding.Euclidean)
sim := embedding.Similarity(vecA, vecB, embedding.DotProduct)
// Normalize vectors for dot product similarity
normalized := embedding.Normalize(vec)
magnitude := embedding.Magnitude(vec)
// AgentMesh embedders return normalized vectors by default
vec, _ := embedder.Embed(ctx, "text")
// embedding.Magnitude(vec) ≈ 1.0
3. Batch for Efficiency
Batch embedding reduces API calls and latency:
// ❌ Inefficient: one API call per document
for _, doc := range documents {
	vec, _ := embedder.Embed(ctx, doc)
	store(doc, vec)
}

// ✅ Efficient: a single batched API call
vecs, _ := embedder.EmbedBatch(ctx, documents)
for i, vec := range vecs {
	store(documents[i], vec)
}
OpenAI Batch Limits:
- Max 2048 inputs per batch
- Max 8191 tokens per input
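To stay under these limits, chunk large corpora before calling EmbedBatch. A minimal sketch (the constant mirrors the input limit above):
// Chunk inputs to stay under the 2048-inputs-per-batch limit
const maxBatch = 2048

var vecs []embedding.Vector
for start := 0; start < len(documents); start += maxBatch {
	end := start + maxBatch
	if end > len(documents) {
		end = len(documents)
	}
	batch, err := embedder.EmbedBatch(ctx, documents[start:end])
	if err != nil {
		log.Fatal(err)
	}
	vecs = append(vecs, batch...)
}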
4. Cache Embeddings
Embedding API calls cost money and add latency, so cache the results:
type CachedEmbedder struct {
	embedder embedding.Embedder
	cache    map[string]embedding.Vector
	mu       sync.RWMutex
}

func (ce *CachedEmbedder) Embed(ctx context.Context, text string) (embedding.Vector, error) {
	// Check the cache first
	ce.mu.RLock()
	if vec, ok := ce.cache[text]; ok {
		ce.mu.RUnlock()
		return vec, nil
	}
	ce.mu.RUnlock()

	// Compute and cache
	vec, err := ce.embedder.Embed(ctx, text)
	if err != nil {
		return nil, err
	}

	ce.mu.Lock()
	ce.cache[text] = vec
	ce.mu.Unlock()
	return vec, nil
}
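A usage sketch: the map must be initialized up front (writing to a nil map panics), and this simple cache grows without bound, so production code may want an LRU instead:
// Wrap a real embedder; initialize the cache map before first use
cached := &CachedEmbedder{
	embedder: openai.NewEmbedder(),
	cache:    make(map[string]embedding.Vector),
}
vec, err := cached.Embed(ctx, "frequently repeated text") // repeat calls hit the cache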
5. Monitor Quality
Regularly validate embedding quality. Similarity thresholds are model-dependent, so calibrate values like those below against your embedder:
func validateEmbeddings(ctx context.Context, embedder embedding.Embedder) error {
	// Test semantic similarity
	dog, _ := embedder.Embed(ctx, "dog")
	puppy, _ := embedder.Embed(ctx, "puppy")
	car, _ := embedder.Embed(ctx, "car")

	dogPuppySim := embedding.CosineSimilarity(dog, puppy)
	dogCarSim := embedding.CosineSimilarity(dog, car)

	// Similar concepts should have high similarity
	if dogPuppySim < 0.7 {
		return fmt.Errorf("dog-puppy similarity too low: %f", dogPuppySim)
	}
	// Unrelated concepts should have low similarity
	if dogCarSim > 0.3 {
		return fmt.Errorf("dog-car similarity too high: %f", dogCarSim)
	}
	return nil
}
6. Handle Rate Limits
Implement retry logic for production:
func embedWithRetry(ctx context.Context, embedder embedding.Embedder, text string, maxRetries int) (embedding.Vector, error) {
	backoff := time.Second
	for attempt := 0; attempt < maxRetries; attempt++ {
		vec, err := embedder.Embed(ctx, text)
		if err == nil {
			return vec, nil
		}

		// Check if rate limited (production code should prefer typed errors
		// or HTTP 429 status codes over substring matching)
		if strings.Contains(err.Error(), "rate_limit") {
			time.Sleep(backoff)
			backoff *= 2 // Exponential backoff
			continue
		}
		return nil, err // Non-retryable error
	}
	return nil, fmt.Errorf("max retries exceeded")
}
7. Preprocess Text
Clean text before embedding:
func preprocessText(text string) string {
	// Collapse excessive whitespace (precompile these regexps at package
	// level in hot paths)
	text = strings.TrimSpace(text)
	text = regexp.MustCompile(`\s+`).ReplaceAllString(text, " ")

	// Remove special characters if needed
	text = regexp.MustCompile(`[^\w\s.,!?-]`).ReplaceAllString(text, "")

	// Convert to lowercase for consistency (optional)
	text = strings.ToLower(text)
	return text
}

// Use the preprocessed text
cleanText := preprocessText(userInput)
vec, _ := embedder.Embed(ctx, cleanText)
8. Test with SimpleEmbedder
Use SimpleEmbedder for fast, reproducible tests:
func TestSemanticSearch(t *testing.T) {
	// Use the simple embedder: no API calls, fast and deterministic
	ctx := context.Background()
	embedder := embedding.NewSimpleEmbedder(256)
	mem := memory.NewVectorMemory(embedder)

	// Exercise the logic without an OpenAI dependency
	mem.Store(ctx, "test", []message.Message{message.NewHumanMessageFromText("test message")})
	results, err := mem.Recall(ctx, "test", memory.RecallFilter{
		Query: "test",
		K:     1,
	})
	require.NoError(t, err)
	require.Len(t, results, 1)
}

func BenchmarkEmbedding(b *testing.B) {
	embedder := embedding.NewSimpleEmbedder(384)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_, _ = embedder.Embed(context.Background(), "benchmark text")
	}
}
Performance Considerations
Storage Requirements
| Dimensions | Bytes per Vector | 1M Vectors | 10M Vectors |
|---|---|---|---|
| 256 | 1 KB | 1 GB | 10 GB |
| 512 | 2 KB | 2 GB | 20 GB |
| 1536 | 6 KB | 6 GB | 60 GB |
| 3072 | 12 KB | 12 GB | 120 GB |
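These figures assume 4-byte float32 components: for example, 1536 dimensions × 4 bytes ≈ 6 KB per vector, before any index overhead.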
Similarity Search Speed
- Linear scan: O(n × d) - Acceptable for < 10K vectors
- ANN (Approximate Nearest Neighbors): O(log n) - Required for > 100K vectors
- Consider: FAISS, Annoy, or specialized vector databases for large-scale
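For small collections, the linear scan is easy to write directly. A self-contained sketch that scores every candidate and keeps the top K (O(n × d) per query, fine below roughly 10K vectors):
import (
	"sort"

	"github.com/hupe1980/agentmesh/pkg/embedding"
)

type scored struct {
	index int
	score float32
}

// topK scans all candidate vectors: O(n × d) per query.
func topK(query embedding.Vector, candidates []embedding.Vector, k int) []scored {
	results := make([]scored, 0, len(candidates))
	for i, c := range candidates {
		results = append(results, scored{index: i, score: embedding.CosineSimilarity(query, c)})
	}
	sort.Slice(results, func(a, b int) bool { return results[a].score > results[b].score })
	if len(results) > k {
		results = results[:k]
	}
	return results
}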
For large-scale workloads, use the built-in external vector store integrations:
import (
	"github.com/hupe1980/agentmesh/pkg/vectorstore/pgvector"
	"github.com/hupe1980/agentmesh/pkg/vectorstore/qdrant"
)

// Qdrant (gRPC-based vector database)
store, _ := qdrant.New("localhost:6334",
	qdrant.WithCollectionName("documents"),
	qdrant.WithDimensions(1536),
)

// Alternatively: PostgreSQL with the pgvector extension
store, _ = pgvector.New(ctx,
	pgvector.WithConnectionString("postgres://..."),
	pgvector.WithTableName("documents"),
	pgvector.WithDimensions(1536),
)

// Wrap with the memory.Memory interface for agents
vectorMem := memory.NewVectorMemory(embedder, memory.WithStore(store))
API Costs (OpenAI)
| Model | Cost per 1M tokens | 1K documents (avg 500 tokens each) |
|---|---|---|
| text-embedding-3-small | $0.02 | $0.01 |
| text-embedding-3-large | $0.13 | $0.065 |
| text-embedding-ada-002 | $0.10 | $0.05 |
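The right-hand column applies the per-token rate to 1K × 500 = 500K tokens, i.e. half the listed per-1M price.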
Cost Optimization:
- Cache embeddings aggressively
- Use smaller models for non-critical use cases
- Reduce dimensions with text-embedding-3 models
- Batch requests
- Consider self-hosted alternatives for high volume
Next Steps
- Start Simple: Use SimpleEmbedder for testing
- Upgrade to OpenAI: Add semantic capabilities for production
- Optimize: Monitor performance and costs, adjust dimensions
- Scale: Consider dedicated vector databases for large-scale deployments
For implementation examples, see: