embeddings
Text embeddings for semantic search and similarity. Use when converting text to vectors, choosing embedding models, implementing chunking strategies, or building document similarity features.
$ Install
git clone https://github.com/yonatangross/skillforge-claude-plugin /tmp/skillforge-claude-plugin && cp -r /tmp/skillforge-claude-plugin/.claude/skills/embeddings ~/.claude/skills/skillforge-claude-plugin/
Tip: Run this command in your terminal to install the skill.
SKILL.md
name: embeddings
description: Text embeddings for semantic search and similarity. Use when converting text to vectors, choosing embedding models, implementing chunking strategies, or building document similarity features.
context: fork
agent: data-pipeline-engineer
Embeddings
Convert text to dense vector representations for semantic search and similarity.
When to Use
- Building semantic search systems
- Document similarity comparison
- RAG retrieval (see the rag-retrieval skill)
- Clustering related content
- Duplicate detection
Quick Reference
```python
from openai import OpenAI

client = OpenAI()

# Single text embedding
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="Your text here"
)
vector = response.data[0].embedding  # 1536 dimensions

# Batch embedding (efficient)
texts = ["text1", "text2", "text3"]
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=texts
)
vectors = [item.embedding for item in response.data]
```
Model Selection
| Model | Dims | Cost | Use Case |
|---|---|---|---|
| text-embedding-3-small | 1536 | $0.02/1M tokens | General purpose |
| text-embedding-3-large | 3072 | $0.13/1M tokens | High accuracy |
| nomic-embed-text (Ollama) | 768 | Free | Local/CI |
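For the local row, a minimal sketch that calls an Ollama server over its REST API, assuming the server is running on its default port and `nomic-embed-text` has already been pulled:

```python
import requests

def embed_local(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Embed text via a local Ollama server."""
    resp = requests.post(
        "http://localhost:11434/api/embeddings",
        json={"model": model, "prompt": text},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["embedding"]  # 768 dimensions for nomic-embed-text
```

Vectors from different models (or different dimensions) are not comparable, so index and query with the same model.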
Chunking Strategy
```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks for embedding.

    Sizes are in words here; token counts will differ somewhat.
    """
    words = text.split()
    chunks = []
    for i in range(0, len(words), chunk_size - overlap):
        chunk = " ".join(words[i:i + chunk_size])
        if chunk:
            chunks.append(chunk)
    return chunks
```
Guidelines:
- Chunk size: 256-1024 tokens (512 typical)
- Overlap: 10-20% for context continuity
- Include metadata (title, source) with chunks
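One lightweight way to follow the metadata guideline is to wrap each chunk in a small record before embedding; the field names here are illustrative, not a required schema:

```python
def chunk_with_metadata(text: str, title: str, source: str) -> list[dict]:
    """Attach provenance metadata to each chunk for filtering and display."""
    return [
        {"text": chunk, "title": title, "source": source, "chunk_index": i}
        for i, chunk in enumerate(chunk_text(text))
    ]
```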
Similarity Calculation
```python
import numpy as np

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Calculate cosine similarity between two vectors."""
    a, b = np.array(a), np.array(b)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Usage
similarity = cosine_similarity(vector1, vector2)
# 1.0 = identical, 0.0 = orthogonal, -1.0 = opposite
```
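Pairwise cosine is fine for a few comparisons; to search many documents, it is cheaper to normalize the document matrix once and rank with a single matrix-vector product. A sketch, assuming the vectors sit in a plain NumPy array (at scale, a vector database replaces this):

```python
def top_k(query_vec: list[float], doc_matrix: np.ndarray, k: int = 5) -> list[tuple[int, float]]:
    """Return (row index, cosine score) for the k most similar documents."""
    docs = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    q = np.asarray(query_vec)
    q = q / np.linalg.norm(q)
    scores = docs @ q                     # cosine similarity per document
    order = np.argsort(scores)[::-1][:k]  # best scores first
    return [(int(i), float(scores[i])) for i in order]
```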
Key Decisions
- Dimension reduction: you can truncate text-embedding-3-large to 1536 dims
- Normalization: most models return unit-normalized vectors, so dot product and cosine similarity agree
- Batch size: 100-500 texts per API call for efficiency
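For the text-embedding-3 models, the OpenAI API can do that truncation server-side via the `dimensions` parameter, returning a re-normalized vector:

```python
response = client.embeddings.create(
    model="text-embedding-3-large",
    input="Your text here",
    dimensions=1536,  # truncated and re-normalized by the API
)
```

If you truncate stored vectors yourself instead, re-normalize them before relying on dot products.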
Common Mistakes
- Embedding queries differently than documents
- Not chunking long documents (context gets lost)
- Using wrong similarity metric (cosine vs euclidean)
- Re-embedding unchanged content (cache embeddings)
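The last point can be as simple as keying on a content hash. A minimal in-memory sketch, reusing the `client` from the Quick Reference (the reference material covers a Redis version):

```python
import hashlib

_cache: dict[str, list[float]] = {}

def embed_cached(text: str) -> list[float]:
    """Embed text, reusing the stored vector if the content is unchanged."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in _cache:
        response = client.embeddings.create(
            model="text-embedding-3-small", input=text
        )
        _cache[key] = response.data[0].embedding
    return _cache[key]
```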
Advanced Patterns
See references/advanced-patterns.md for:
- Late Chunking: Embed full document, extract chunk vectors from contextualized tokens
- Batch API: Production batching with rate limiting and retry
- Embedding Cache: Redis-based caching to avoid re-embedding
- Matryoshka Embeddings: Dimension reduction with text-embedding-3
Related Skills
- rag-retrieval - Using embeddings for RAG pipelines
- hyde-retrieval - Hypothetical document embeddings for vocabulary mismatch
- contextual-retrieval - Anthropic's context-prepending technique
- reranking-patterns - Cross-encoder reranking for precision
- ollama-local - Local embeddings with nomic-embed-text
Capability Details
text-to-vector
Keywords: embedding, text to vector, vectorize, embed text
Solves:
- Convert text to vector embeddings
- Choose appropriate embedding models
- Handle embedding API integration
semantic-search
Keywords: semantic search, vector search, similarity search, find similar
Solves:
- Implement semantic search over documents
- Configure similarity thresholds
- Rank results by relevance
chunking-strategies
Keywords: chunk, chunking, split, text splitting, overlap
Solves:
- Split documents into optimal chunks
- Configure chunk size and overlap
- Preserve semantic boundaries
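One way to preserve semantic boundaries is to pack whole sentences into chunks up to a word budget; the regex below is a rough sentence splitter, not a proper tokenizer:

```python
import re

def chunk_by_sentence(text: str, max_words: int = 512) -> list[str]:
    """Group whole sentences into chunks of at most max_words words."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks, current, count = [], [], 0
    for sent in sentences:
        n = len(sent.split())
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sent)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```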
batch-embedding
Keywords: batch, bulk embed, parallel embedding, batch processing
Solves:
- Embed large document collections efficiently
- Handle rate limits and retries
- Optimize embedding costs
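A simplified sketch of the batching-with-retry idea, assuming the OpenAI Python client (v1), whose rate-limit errors surface as `openai.RateLimitError`; the batch size of 200 is just a midpoint of the 100-500 range above:

```python
import time
import openai

def embed_all(texts: list[str], batch_size: int = 200, max_retries: int = 5) -> list[list[float]]:
    """Embed a large collection in batches, backing off on rate limits."""
    vectors: list[list[float]] = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        for attempt in range(max_retries):
            try:
                response = client.embeddings.create(
                    model="text-embedding-3-small", input=batch
                )
                vectors.extend(item.embedding for item in response.data)
                break
            except openai.RateLimitError:
                time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s...
        else:
            raise RuntimeError(f"batch at offset {start} failed after retries")
    return vectors
```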
local-embeddings
Keywords: local, ollama, self-hosted, on-premise, offline
Solves:
- Run embeddings locally with Ollama
- Deploy self-hosted embedding models
- Reduce API costs with local models
Repository
yonatangross/skillforge-claude-plugin/.claude/skills/embeddings