dspy-rb

Build type-safe LLM applications with DSPy.rb - Ruby's programmatic prompt framework with signatures, modules, agents, and optimization

$ Install

git clone https://github.com/majiayu000/claude-skill-registry /tmp/claude-skill-registry && cp -r /tmp/claude-skill-registry/skills/development/dspy-rb ~/.claude/skills/claude-skill-registry

// tip: Run this command in your terminal to install the skill


name: dspy-rb
description: Build type-safe LLM applications with DSPy.rb - Ruby's programmatic prompt framework with signatures, modules, agents, and optimization

DSPy.rb

Build LLM apps like you build software. Type-safe, modular, testable.

DSPy.rb brings software engineering best practices to LLM development. Instead of tweaking prompts, you define what you want with Ruby types and let DSPy handle the rest.

Overview

DSPy.rb is a Ruby framework for building language model applications with programmatic prompts. It provides:

  • Type-safe signatures - Define inputs/outputs with Sorbet types
  • Modular components - Compose and reuse LLM logic
  • Automatic optimization - Use data to improve prompts, not guesswork
  • Production-ready - Built-in observability, testing, and error handling

Core Concepts

1. Signatures

Define interfaces between your app and LLMs using Ruby types:

class EmailClassifier < DSPy::Signature
  description "Classify customer support emails by category and priority"

  class Priority < T::Enum
    enums do
      Low = new('low')
      Medium = new('medium')
      High = new('high')
      Urgent = new('urgent')
    end
  end

  input do
    const :email_content, String
    const :sender, String
  end

  output do
    const :category, String
    const :priority, Priority  # Type-safe enum with defined values
    const :confidence, Float
  end
end

2. Modules

Build complex workflows from simple building blocks:

  • Predict - Basic LLM calls with signatures
  • ChainOfThought - Step-by-step reasoning
  • ReAct - Tool-using agents
  • CodeAct - Dynamic code generation agents (install the dspy-code_act gem)
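
A minimal usage sketch, wrapping the EmailClassifier signature from above in ChainOfThought; the sample email text is made up, and the exact reasoning fields exposed on the result may vary by version:

classifier = DSPy::ChainOfThought.new(EmailClassifier)

result = classifier.call(
  email_content: "My invoice is wrong and I need it corrected today.",
  sender: "customer@example.com"
)

result.category            # e.g. "billing"
result.priority.serialize  # e.g. "high" (a Priority enum member)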

Lifecycle callbacks

Rails-style lifecycle hooks ship with every DSPy::Module, so you can wrap forward with setup, teardown, or instrumentation without modifying forward itself:

  • before – runs ahead of forward for setup (metrics, context loading)
  • around – wraps forward, calls yield, and lets you pair setup/teardown logic
  • after – fires after forward returns for cleanup or persistence
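
A sketch of how these hooks could be wired up; the registration form below (macro plus private method) is an assumption based on the Rails-style description above, so check REFERENCE.md for the exact API:

class TimedPipeline < DSPy::Module
  # Assumed Rails-style registration; verify the exact syntax in REFERENCE.md
  before :start_timer
  around :with_timing
  after  :log_duration

  private

  def start_timer
    @started_at = Time.now
  end

  def with_timing
    yield                                # runs forward
  ensure
    @elapsed = Time.now - @started_at
  end

  def log_duration
    puts "forward took #{@elapsed}s"
  end
end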

3. Tools & Toolsets

Create type-safe tools for agents with comprehensive Sorbet support:

# Enum-based tool with automatic type conversion
class CalculatorTool < DSPy::Tools::Base
  tool_name 'calculator'
  tool_description 'Performs arithmetic operations with type-safe enum inputs'

  class Operation < T::Enum
    enums do
      Add = new('add')
      Subtract = new('subtract')
      Multiply = new('multiply')
      Divide = new('divide')
    end
  end

  sig { params(operation: Operation, num1: Float, num2: Float).returns(T.any(Float, String)) }
  def call(operation:, num1:, num2:)
    case operation
    when Operation::Add then num1 + num2
    when Operation::Subtract then num1 - num2
    when Operation::Multiply then num1 * num2
    when Operation::Divide
      return "Error: Division by zero" if num2 == 0
      num1 / num2
    end
  end
end

# Multi-tool toolset with rich types
class DataToolset < DSPy::Tools::Toolset
  toolset_name "data_processing"

  class Format < T::Enum
    enums do
      JSON = new('json')
      CSV = new('csv')
      XML = new('xml')
    end
  end

  class ProcessingConfig < T::Struct
    const :max_rows, Integer, default: 1000
    const :include_headers, T::Boolean, default: true
    const :encoding, String, default: 'utf-8'
  end

  tool :convert, description: "Convert data between formats"
  tool :validate, description: "Validate data structure"

  sig { params(data: String, from: Format, to: Format, config: T.nilable(ProcessingConfig)).returns(String) }
  def convert(data:, from:, to:, config: nil)
    config ||= ProcessingConfig.new
    "Converted from #{from.serialize} to #{to.serialize} with config: #{config.inspect}"
  end

  sig { params(data: String, format: Format).returns(T::Hash[Symbol, T.any(String, Integer, T::Boolean)]) }
  def validate(data:, format:)
    {
      valid: true,
      format: format.serialize,
      row_count: 42,
      message: "Data validation passed"
    }
  end
end
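
The sketch below hands the CalculatorTool above to a ReAct agent. The MathQuestion signature is illustrative, and passing tool instances via the tools: keyword is assumed here; see REFERENCE.md for the exact agent constructor:

class MathQuestion < DSPy::Signature
  description "Answer arithmetic word problems"

  input do
    const :question, String
  end

  output do
    const :answer, String
  end
end

agent = DSPy::ReAct.new(MathQuestion, tools: [CalculatorTool.new])
result = agent.call(question: "What is 12.5 multiplied by 4?")
puts result.answer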

4. Type System & Discriminators

DSPy.rb uses sophisticated type discrimination for complex data structures:

  • Automatic _type field injection - DSPy adds discriminator fields to structs for type safety
  • Union type support - T.any() types automatically disambiguated by _type
  • Reserved field name - Avoid defining your own _type fields in structs
  • Recursive filtering - _type fields filtered during deserialization at all nesting levels
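
For example, a union output field might look like the sketch below; the struct and signature names are illustrative, and the _type discriminator is injected by DSPy rather than declared by you:

class TextBlock < T::Struct
  const :text, String
end

class ImageBlock < T::Struct
  const :url, String
  const :alt, String, default: ""
end

class ExtractBlock < DSPy::Signature
  description "Extract the next content block from a document fragment"

  input do
    const :fragment, String
  end

  output do
    const :block, T.any(TextBlock, ImageBlock)  # disambiguated via the injected _type field
  end
end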

5. Optimization

Improve accuracy with real data:

  • MIPROv2 - Advanced multi-prompt optimization with bootstrap sampling and Bayesian optimization
  • GEPA (Genetic-Pareto Reflective Prompt Evolution) - Reflection-driven instruction rewrite loop with feedback maps, experiment tracking, and telemetry
  • Evaluation - Comprehensive framework with built-in and custom metrics, error handling, and batch processing

Quick Start

# Install
gem 'dspy'

# Configure
DSPy.configure do |c|
  c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
  # or use Ollama for local models
  # c.lm = DSPy::LM.new('ollama/llama3.2')
end

# Define a task
class SentimentAnalysis < DSPy::Signature
  description "Analyze sentiment of text"

  input do
    const :text, String
  end

  output do
    const :sentiment, String  # positive, negative, neutral
    const :score, Float       # 0.0 to 1.0
  end
end

# Use it
analyzer = DSPy::Predict.new(SentimentAnalysis)
result = analyzer.call(text: "This product is amazing!")
puts result.sentiment  # => "positive"
puts result.score      # => 0.92

Provider Adapter Gems

Add the adapter gems that match the providers you call:

# Gemfile
gem 'dspy'
gem 'dspy-openai'    # OpenAI, OpenRouter, Ollama
gem 'dspy-anthropic' # Claude
gem 'dspy-gemini'    # Gemini

Each adapter gem already pulls in the official SDK (openai, anthropic, gemini-ai), so you don't need to add those manually.
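
Once an adapter gem is in the bundle, configuration follows the same pattern as the Quick Start; the Anthropic model id below is only illustrative:

DSPy.configure do |c|
  c.lm = DSPy::LM.new('anthropic/claude-3-5-sonnet-20241022', api_key: ENV['ANTHROPIC_API_KEY'])
end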

Guidelines for Claude

When helping users with DSPy.rb:

  1. Focus on signatures - They define the contract with LLMs
  2. Use proper types - T::Enum for categories, T::Struct for complex data
  3. Leverage automatic type conversion - Tools and toolsets automatically convert JSON strings to proper Ruby types (enums, structs, arrays, hashes)
  4. Compose modules - Chain predictors for complex workflows
  5. Create type-safe tools - Use Sorbet signatures for comprehensive tool parameter validation and conversion
  6. Test thoroughly - Use RSpec and VCR for reliable tests
  7. Monitor production - Enable Langfuse by installing the optional o11y gems and setting env vars
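
For guideline 6, a test might look like the sketch below, reusing the SentimentAnalysis signature from the Quick Start; the cassette name and expected values are illustrative:

RSpec.describe SentimentAnalysis do
  it "classifies obviously positive text" do
    # VCR records the LLM call once, then replays it on later runs
    VCR.use_cassette("sentiment/positive") do
      analyzer = DSPy::Predict.new(SentimentAnalysis)
      result = analyzer.call(text: "This product is amazing!")

      expect(result.sentiment).to eq("positive")
      expect(result.score).to be_between(0.0, 1.0)
    end
  end
end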

Signature Best Practices

Keep description concise - The signature description should state the goal, not the field details:

# ✅ Good - concise goal
class ParseOutline < DSPy::Signature
  description 'Extract block-level structure from HTML as a flat list of skeleton sections.'

  input do
    const :html, String, description: 'Raw HTML to parse'
  end

  output do
    const :sections, T::Array[Section], description: 'Block elements: headings, paragraphs, code blocks, lists'
  end
end

# ❌ Bad - putting field docs in signature description
class ParseOutline < DSPy::Signature
  description <<~DESC
    Extract outline from HTML.

    Return sections with:
    - node_type: The type of element
    - text: For headings, the text content
    - level: For headings, 1-6
    ...
  DESC
end

Use defaults over nilable arrays - For OpenAI structured outputs compatibility:

# ✅ Good - works with OpenAI structured outputs
class ASTNode < T::Struct
  const :children, T::Array[ASTNode], default: []
end

# ❌ Bad - causes schema issues with OpenAI
class ASTNode < T::Struct
  const :children, T.nilable(T::Array[ASTNode])
end

Recursive Types with $defs

DSPy.rb supports recursive types in structured outputs using JSON Schema $defs:

class TreeNode < T::Struct
  const :value, String
  const :children, T::Array[TreeNode], default: []  # Self-reference
end

class DocumentAST < DSPy::Signature
  description 'Parse document into tree structure'

  output do
    const :root, TreeNode
  end
end

The schema generator automatically creates #/$defs/TreeNode references for recursive types, compatible with OpenAI and Gemini structured outputs.

Field Descriptions for T::Struct

DSPy.rb extends T::Struct so that fields accept a description: keyword argument, which flows into the generated JSON Schema:

class ASTNode < T::Struct
  const :node_type, NodeType, description: 'The type of node (heading, paragraph, etc.)'
  const :text, String, default: "", description: 'Text content of the node'
  const :level, Integer, default: 0  # No description - field is self-explanatory
  const :children, T::Array[ASTNode], default: []
end

# Access descriptions programmatically
ASTNode.field_descriptions[:node_type]  # => "The type of node (heading, paragraph, etc.)"
ASTNode.field_descriptions[:text]       # => "Text content of the node"
ASTNode.field_descriptions[:level]      # => nil (no description)

The generated JSON Schema includes these descriptions:

{
  "type": "object",
  "properties": {
    "node_type": {
      "type": "string",
      "description": "The type of node (heading, paragraph, etc.)"
    },
    "text": {
      "type": "string",
      "description": "Text content of the node"
    },
    "level": { "type": "integer" }
  }
}

When to use field descriptions:

  • Complex field semantics not obvious from the type
  • Enum-like strings with specific allowed values
  • Fields with constraints (e.g., "1-6 for heading levels")
  • Nested structs where the purpose isn't clear from the name

When to skip descriptions:

  • Self-explanatory fields like name, id, url
  • Fields where the type tells the story (e.g., T::Boolean for flags)

Hierarchical Parsing for Complex Documents

For complex documents that may exceed token limits, consider two-phase parsing:

  1. Phase 1 - Outline: Extract skeleton structure (block types, headings)
  2. Phase 2 - Fill: Parse each section in detail

This keeps each phase within max_tokens limits and produces more complete output.
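
A sketch of the two-phase shape, reusing the ParseOutline signature from the best-practices example above; FillSection and HierarchicalParser are hypothetical names, not part of the library:

class FillSection < DSPy::Signature
  description 'Parse one outlined section of the HTML in detail.'

  input do
    const :html, String
    const :section, Section
  end

  output do
    const :node, ASTNode
  end
end

class HierarchicalParser < DSPy::Module
  def forward(html:)
    outline = DSPy::Predict.new(ParseOutline).call(html: html)   # phase 1: skeleton
    outline.sections.map do |section|                            # phase 2: detail per section
      DSPy::Predict.new(FillSection).call(html: html, section: section).node
    end
  end
end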

See Also

For complete API reference, advanced patterns, and integration guides, see REFERENCE.md.

Version

Current: 0.34.1