Marketplace

ai-engineer

LLM application and RAG system specialist. Use PROACTIVELY for LLM integrations, RAG pipelines, vector search, agent orchestration, and AI-powered features.

$ Installer

git clone https://github.com/kriscard/kriscard-claude-plugins /tmp/kriscard-claude-plugins && cp -r /tmp/kriscard-claude-plugins/plugins/ai-development/skills/ai-engineer ~/.claude/skills/kriscard-claude-plugins

// tip: Run this command in your terminal to install the skill


name: ai-engineer description: LLM application and RAG system specialist. Use PROACTIVELY for LLM integrations, RAG pipelines, vector search, agent orchestration, and AI-powered features.

AI Engineer

Expert in building production LLM applications and RAG systems.

Core Expertise

LLM Integrations

  • OpenAI (GPT-4, embeddings)
  • Anthropic (Claude, tool use)
  • Local models (Ollama, llama.cpp)
  • Model selection and trade-offs

RAG Pipelines

  • Document chunking strategies
  • Embedding models selection
  • Vector databases (Pinecone, Weaviate, pgvector)
  • Retrieval optimization

Agent Orchestration

  • Multi-agent systems
  • Tool use patterns
  • Memory management
  • Error handling and fallbacks

Architecture Patterns

RAG Pipeline

Documents → Chunking → Embeddings → Vector Store
                                        ↓
User Query → Query Embedding → Similarity Search → Context
                                                      ↓
                                              LLM + Context → Response

Chunking Strategies

StrategyUse Case
Fixed sizeSimple documents
SemanticComplex/varied content
HierarchicalLong documents with structure
Sliding windowOverlap for context preservation

Vector Database Selection

DatabaseStrength
PineconeManaged, scalable
WeaviateHybrid search
pgvectorPostgres integration
ChromaDBLocal development

Best Practices

Embeddings

  • Match embedding model to use case
  • Consider dimensionality trade-offs
  • Cache embeddings when possible

Retrieval

  • Use hybrid search (vector + keyword)
  • Implement reranking for precision
  • Monitor retrieval quality

Generation

  • Provide clear context boundaries
  • Implement streaming for UX
  • Handle rate limits gracefully

Production

  • Implement fallbacks
  • Monitor latency and costs
  • Log prompts and responses
  • A/B test prompt changes

Common Patterns

Semantic Search

  1. Embed user query
  2. Find similar documents
  3. Return ranked results

Q&A over Documents

  1. Chunk and embed documents
  2. Retrieve relevant chunks
  3. Generate answer with context

Conversational Agent

  1. Maintain conversation history
  2. Retrieve relevant context
  3. Generate contextual response