Marketplace

rag-systems

Build RAG systems - embeddings, vector stores, chunking, and retrieval optimization

$ Install

git clone https://github.com/pluginagentmarketplace/custom-plugin-ai-agents /tmp/custom-plugin-ai-agents && cp -r /tmp/custom-plugin-ai-agents/skills/rag-systems ~/.claude/skills/custom-plugin-ai-agents

// tip: Run this command in your terminal to install the skill


name: rag-systems
description: Build RAG systems - embeddings, vector stores, chunking, and retrieval optimization
sasmp_version: "1.3.0"
bonded_agent: 03-rag-systems
bond_type: PRIMARY_BOND
version: "2.0.0"

RAG Systems

Build Retrieval-Augmented Generation systems for grounded responses.

When to Use This Skill

Invoke this skill when:

  • Building Q&A over custom documents
  • Implementing semantic search
  • Setting up vector databases
  • Optimizing retrieval quality

Parameter Schema

| Parameter | Type | Required | Description | Default |
|-----------|------|----------|-------------|---------|
| task | string | Yes | RAG goal | - |
| vector_db | enum | No | pinecone, weaviate, chroma, pgvector | chroma |
| embedding_model | string | No | Embedding model | text-embedding-3-small |
| chunk_size | int | No | Chunk size in chars | 1000 |

Quick Start

from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Split documents
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(documents)

# 2. Create vector store
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(chunks, embeddings)

# 3. Retrieve
docs = vectorstore.similarity_search("query", k=5)
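Under the hood, `similarity_search` ranks stored chunks by embedding similarity to the query embedding. A minimal sketch with made-up toy vectors (not the Chroma implementation):

```python
import math

# Toy sketch of similarity-based retrieval: rank stored vectors by
# cosine similarity to a query vector. All vectors here are invented.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

store = {
    "chunk_about_rag": [0.9, 0.1, 0.0],
    "chunk_about_cooking": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.0, 0.1]

# Highest-similarity chunk first, like similarity_search(..., k=...).
ranked = sorted(store, key=lambda name: cosine(store[name], query_vec), reverse=True)
```

A real vector store does the same ranking over approximate nearest-neighbor indexes rather than a brute-force sort.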

Chunking Strategy

| Content Type | Size (chars) | Overlap | Rationale |
|--------------|--------------|---------|-----------|
| Technical docs | 500-800 | 100 | Preserve code |
| Legal docs | 1000-1500 | 200 | Keep clauses |
| Q&A/FAQ | 200-400 | 50 | Atomic answers |
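The size/overlap pairs above can be sketched with a plain sliding-window chunker (illustrative only; the LangChain splitter adds recursive separator handling on top of this idea):

```python
# Illustrative sliding-window chunker mirroring the table's size/overlap
# settings. Not the LangChain RecursiveCharacterTextSplitter itself.
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks

# Q&A/FAQ-style settings from the table: small atomic chunks, light overlap.
faq_chunks = chunk_text("word " * 200, chunk_size=300, chunk_overlap=50)
```

Each chunk overlaps the next by `chunk_overlap` characters, so sentences cut at a boundary still appear intact in one of the two neighboring chunks.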

Embedding Costs

| Model | Cost / 1M tokens |
|-------|------------------|
| text-embedding-3-small | $0.02 |
| text-embedding-3-large | $0.13 |
| Cohere embed-v3 | $0.10 |
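A quick back-of-the-envelope estimator using the table's prices (the dictionary keys and helper name are illustrative, not an API):

```python
# Prices in USD per 1M tokens, taken from the table above.
EMBEDDING_PRICES_PER_1M = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "cohere-embed-v3": 0.10,
}

def embedding_cost(num_tokens: int, model: str = "text-embedding-3-small") -> float:
    """Estimate one-time embedding cost for a corpus of num_tokens tokens."""
    return num_tokens / 1_000_000 * EMBEDDING_PRICES_PER_1M[model]

# Example: a 10M-token corpus with the small model costs about $0.20.
corpus_cost = embedding_cost(10_000_000)
```

Note that re-embedding on every re-index multiplies this cost, which is one reason to cache embeddings keyed by chunk hash.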

Troubleshooting

| Issue | Solution |
|-------|----------|
| Irrelevant results | Improve chunking, add reranking |
| Missing context | Increase k, use parent retriever |
| Hallucinations | Add "only use context" prompt |
| Slow retrieval | Add caching, reduce k |
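The "only use context" fix for hallucinations is a prompt-construction pattern. A minimal sketch (the exact wording and helper name are illustrative):

```python
# Grounding prompt builder: the model is instructed to answer ONLY from
# the retrieved chunks, and to refuse when the answer is absent.
def build_grounded_prompt(context_chunks: list[str], question: str) -> str:
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(context_chunks))
    return (
        "Answer using ONLY the context below. "
        'If the answer is not in the context, say "I don\'t know."\n\n'
        f"Context:\n{context}\nQuestion: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(["Chroma is a vector store."], "What is Chroma?")
```

Numbering the chunks (`[1]`, `[2]`, ...) also enables source attribution, since the model can cite the chunk index it used.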

Best Practices

  • Always include source attribution
  • Use hybrid search (dense + BM25)
  • Implement reranking for quality
  • Evaluate with RAGAS metrics
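Hybrid search (dense + BM25) needs a way to merge two rankings; Reciprocal Rank Fusion (RRF) is a common, training-free choice. A self-contained sketch over doc-ID rankings (the doc IDs are made up):

```python
# Reciprocal Rank Fusion: each list contributes 1 / (k + rank + 1) to a
# document's score, so items ranked well by BOTH retrievers rise to the top.
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["doc_a", "doc_b", "doc_c"]  # embedding-similarity order
bm25 = ["doc_b", "doc_d", "doc_a"]   # keyword-match order
fused = rrf_fuse([dense, bm25])
# doc_a and doc_b appear in both lists, so they lead the fused ranking.
```

The constant `k` (60 is the value commonly used in the RRF literature) damps the advantage of the very top ranks so a single retriever cannot dominate.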

Related Skills

  • llm-integration - LLM for generation
  • agent-memory - Memory retrieval
  • ai-agent-basics - Agentic RAG
