name: design-of-experiments

description: Expert guidance for Design of Experiments (DOE) in Python - interactive goal-driven design selection, classical DOE (factorial, response surface, screening), Bayesian optimization with Gaussian processes, model-driven optimal designs, active learning, and sequential experimentation; includes pyDOE3, pycse, GPyOpt, scikit-optimize, statsmodels

allowed-tools: ["*"]

Installation

git clone https://github.com/jkitchin/skillz /tmp/skillz && cp -r /tmp/skillz/skills/scientific/design-of-experiments ~/.claude/skills/design-of-experiments

Tip: Run this command in your terminal to install the skill.



Design of Experiments (DOE) - Interactive Expert

Overview

Master experimental design through interactive, goal-driven guidance that asks the right questions to recommend the best approach for your situation. This skill covers classical DOE, Bayesian optimization, model-driven designs, and active learning—helping you choose between batch and sequential strategies, screening and optimization, and exploration and exploitation.

Core value: Don't start with a method—start with questions. Based on your goals, budget, and constraints, get personalized recommendations for experimental design strategies that maximize information gain per experiment.

CRITICAL: Start with Questions, Not Methods

When a user mentions DOE or experimental design, ASK THESE QUESTIONS FIRST:

Primary Questions (Ask Before Recommending):

  1. "What is your primary goal?"

    • Screening: Identify which factors matter (many factors → few important)
    • Optimization: Find best settings for known factors
    • Exploration: Understand the system/build model
    • Model discrimination: Choose between competing models
    • Robustness: Minimize sensitivity to noise
  2. "Can you run experiments sequentially, or must they be done in a batch?"

    • Sequential: Run one (or few), get results, decide next experiment(s)
    • Batch: Must plan all experiments upfront
    • Hybrid: Sequential with occasional batches
  3. "How expensive is each experiment?"

    • Very expensive: >$1000 or >1 day per experiment
    • Moderate: $100-$1000 or hours per experiment
    • Cheap: <$100 or minutes per experiment
  4. "How many factors are you investigating?"

    • 1-5 factors: Full factorial, RSM feasible
    • 6-15 factors: Need screening or fractional designs
    • >15 factors: Definitive screening, regularization needed
  5. "Do you have prior knowledge, existing data, or a mechanistic model?"

    • Have model: Use model-driven optimal designs
    • Have data: Warm-start Bayesian optimization
    • No knowledge: Need exploration, space-filling designs

Based on Answers, Recommend:

Sequential + Expensive + Unknown landscape
    → Bayesian Optimization (GP + Expected Improvement)

Sequential + Have model + Parameter estimation goal
    → Model-driven sequential optimal design

Batch only + Screening goal + Many factors
    → Fractional factorial or Definitive Screening Design

Batch only + Optimization + Few factors
    → Central Composite Design or Box-Behnken

Sequential + Moderate cost + Want model
    → Active learning with Gaussian Process

Batch only + No prior knowledge + Exploration
    → Latin Hypercube Sampling
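
For the exploration case, a minimal Latin Hypercube sketch with pyDOE3 (the factor names and ranges here are illustrative):

import numpy as np
from pyDOE3 import lhs

# 10 space-filling points in the unit cube; 'maximin' pushes points apart
unit_design = lhs(3, samples=10, criterion='maximin')

# Scale each column from [0, 1] to its real factor range
bounds = np.array([[300, 400],    # e.g., Temperature
                   [1, 5],        # e.g., Pressure
                   [0.1, 1.0]])   # e.g., Catalyst
design = bounds[:, 0] + unit_design * (bounds[:, 1] - bounds[:, 0])
print(design)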

Quick Decision Tree

Can experiments be run sequentially with feedback?
│
├─ YES (Sequential possible)
│   │
│   ├─ Expensive experiments (>$1000 or >1 day each)?
│   │   ├─ YES → Bayesian Optimization
│   │   │        • Expected Improvement for optimization
│   │   │        • Upper Confidence Bound for exploration
│   │   │        • Tools: GPyOpt, scikit-optimize, BoTorch
│   │   │
│   │   └─ NO → Have a model?
│   │       ├─ YES → Model-driven sequential design
│   │       │        • D-optimal for parameter estimation
│   │       │        • Update design as parameters learned
│   │       │
│   │       └─ NO → Active learning
│   │                • GP with uncertainty sampling
│   │                • Tools: modAL, custom GP
│
└─ NO (Batch only)
    │
    ├─ What's your goal?
    │   │
    │   ├─ SCREENING (identify important factors)
    │   │   ├─ <20 runs available → Plackett-Burman, Fractional Factorial
    │   │   ├─ 20-50 runs → Definitive Screening Design
    │   │   └─ Tools: pyDOE3, dexpy
    │   │
    │   ├─ OPTIMIZATION (find best settings)
    │   │   ├─ 2-5 factors → Central Composite Design, Box-Behnken
    │   │   ├─ >5 factors → Sequential screening → then optimize
    │   │   ├─ Have model → D-optimal or I-optimal
    │   │   └─ Tools: pyDOE3, pycse, statsmodels
    │   │
    │   ├─ EXPLORATION (understand system)
    │   │   ├─ No prior → Latin Hypercube Sampling
    │   │   ├─ Build surrogate → Space-filling + modeling
    │   │   └─ Tools: pyDOE3, pycse, scipy
    │   │
    │   └─ ROBUSTNESS (minimize variability)
    │       └─ Control + noise factors → Taguchi / Robust Parameter Design

When to Use Each Approach

Classical DOE (Batch Experiments)

Use when:

  • Must plan all experiments upfront
  • Well-understood system or standard optimization
  • Moderate number of factors (<15)
  • Experiments are cheap enough for comprehensive coverage

Best for:

  • Screening many factors (Plackett-Burman, fractional factorial)
  • Response surface modeling (CCD, Box-Behnken)
  • Standard process optimization
  • Teaching/learning DOE concepts

Tools: pyDOE3, dexpy, pycse, statsmodels
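
As a screening sketch, pyDOE3 generates Plackett-Burman and two-level fractional factorial matrices directly (coded -1/+1 levels; the factor counts below are illustrative):

from pyDOE3 import pbdesign, fracfact

# Plackett-Burman: screen 7 factors in 8 runs
pb = pbdesign(7)
print(pb.shape)   # (8, 7): 8 runs for 7 factors

# 2^(4-1) fractional factorial: the fourth factor is aliased with abc
ff = fracfact('a b c abc')
print(ff.shape)   # (8, 4)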

Bayesian Optimization (Sequential)

Use when:

  • Can run experiments one-at-a-time (or small batches)
  • Experiments are expensive (time, money, resources)
  • Black-box system (no mechanistic model)
  • Want to minimize total number of experiments

Best for:

  • Optimizing expensive processes ($1000+ per run)
  • Materials discovery, drug screening
  • Hyperparameter tuning for ML models
  • Autonomous experimentation / self-driving labs

Tools: GPyOpt, scikit-optimize, BoTorch, Ax

Model-Driven DOE (Optimal Designs)

Use when:

  • Have mechanistic or empirical model
  • Goal is parameter estimation or model discrimination
  • Want statistically optimal experiments
  • Can update designs based on current parameter estimates

Best for:

  • Chemical kinetics (rate constant estimation)
  • Pharmacokinetics (PK parameter estimation)
  • Systems biology (model calibration)
  • Comparing competing models

Tools: Custom implementation with Fisher Information Matrix, pyoptex

Active Learning (Sequential Model Building)

Use when:

  • Want to build accurate surrogate model efficiently
  • Can run experiments sequentially
  • Need to balance exploration and exploitation
  • Moderate experiment cost

Best for:

  • Adaptive sampling for complex landscapes
  • Building GP surrogates for later optimization
  • Uncertainty quantification
  • Smart data collection

Tools: modAL, scikit-learn, GPy

Quick Reference Table

Situation                         | Recommended Approach       | Key Tool       | Typical # Runs
----------------------------------|----------------------------|----------------|---------------
Batch, <5 factors, screening      | Full/Fractional Factorial  | pyDOE3         | 8-32
Batch, 6-15 factors, screening    | Plackett-Burman, DSD       | pyDOE3, dexpy  | 12-50
Batch, optimization, 2-5 factors  | CCD, Box-Behnken           | pyDOE3, pycse  | 13-50
Batch, exploration, unknown       | Latin Hypercube            | pyDOE3, pycse  | 10×factors
Sequential, expensive, optimize   | Bayesian Optimization      | GPyOpt, skopt  | 10-50
Sequential, build model           | Active Learning + GP       | modAL          | 20-100
Have mechanistic model            | Model-driven D-optimal     | Custom FIM     | 10-30
Parameter estimation              | D-optimal, A-optimal       | pyoptex        | Varies
Model discrimination              | T-optimal, KL-optimal      | Custom         | 10-20
Multiple objectives               | Multi-objective BO         | BoTorch        | 30-100
Mixture components                | Simplex designs            | pyDOE3         | 10-30

Quick Start Examples

Example 1: Classical Response Surface (Batch)

Situation: Optimize 3 factors, batch experiments, moderate cost

import pyDOE3 as pyd
import pandas as pd

# Generate Central Composite Design
n_factors = 3
design = pyd.ccdesign(n_factors, center=(0, 4))  # With center points

# Create DataFrame with factor names
factors = ['Temperature', 'Pressure', 'Catalyst']
df = pd.DataFrame(design, columns=factors)

# Scale to actual ranges
ranges = {'Temperature': (300, 400),
          'Pressure': (1, 5),
          'Catalyst': (0.1, 1.0)}

for factor, (low, high) in ranges.items():
    # Map coded levels to real units. Note: the default circumscribed
    # CCD has axial points outside the coded ±1 cube, so a few runs
    # fall outside these ranges; use ccdesign(..., face='inscribed')
    # to keep every run within the stated bounds.
    df[factor] = df[factor] * (high - low)/2 + (high + low)/2

print(f"Generated {len(df)} experiments")
print(df.head())

# Export for lab work
df.to_csv('experimental_design.csv', index=False)

Next steps:

  1. Run experiments and collect responses
  2. Analyze with statsmodels or pycse
  3. Optimize using fitted model
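
A sketch of step 2, assuming the measured responses were added to df as a Yield column (column names follow the example above; the quadratic terms shown are one typical RSM model, not a prescription):

import statsmodels.api as sm
import statsmodels.formula.api as smf

# Fit a second-order response surface model
model = smf.ols('Yield ~ Temperature + Pressure + Catalyst'
                ' + Temperature:Pressure + I(Temperature**2)',
                data=df).fit()
print(model.summary())

# ANOVA table to judge which terms matter
print(sm.stats.anova_lm(model, typ=2))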

Example 2: pycse Surface Response

Situation: Quick RSM with integrated analysis

from pycse import design_sr, analyze_sr, sr_parity
import numpy as np

# Generate design
bounds = np.array([[300, 400],   # Temperature
                   [1, 5],        # Pressure
                   [0.1, 1.0]])   # Catalyst

design = design_sr(bounds,
                   inputs=['Temperature', 'Pressure', 'Catalyst'],
                   outputs=['Yield'])

print(design)

# After running experiments, add results
design['Yield'] = [78, 82, 75, 88, 91, 85, 79, ...]  # Your data, one value per design row

# Analyze
anova_table = analyze_sr(design)
print(anova_table)

# Parity plot (model fit)
sr_parity(design, show=True)

Advantages: Integrated workflow, automatic ANOVA, quick visualization

Example 3: Bayesian Optimization (Sequential)

Situation: Expensive experiments, 4 factors, want to minimize total runs

import numpy as np
import matplotlib.pyplot as plt
from skopt import gp_minimize
from skopt.space import Real
from skopt.plots import plot_convergence

# Define parameter space
space = [
    Real(300, 400, name='Temperature'),
    Real(1, 5, name='Pressure'),
    Real(0.1, 1.0, name='Catalyst'),
    Real(10, 60, name='Time')
]

# Your black-box function (returns value to minimize)
def expensive_experiment(params):
    temp, pressure, catalyst, time = params

    # Run the actual experiment here; run_experiment is a placeholder
    # for your lab procedure (or a simulator while testing the loop)
    yield_value = run_experiment(temp, pressure, catalyst, time)

    return -yield_value  # Minimize negative = maximize yield

# Bayesian optimization
result = gp_minimize(
    expensive_experiment,
    space,
    n_calls=30,          # Maximum experiments
    n_initial_points=5,  # Initial random exploration (n_random_starts is deprecated)
    acq_func='EI',       # Expected Improvement
    random_state=42
)

print(f"Best parameters: {result.x}")
print(f"Best yield: {-result.fun:.2f}")

# Visualize convergence
plot_convergence(result)
plt.show()

Workflow:

  1. Start with 5 random experiments (exploration)
  2. Fit GP surrogate model
  3. Suggest next experiment (maximize EI)
  4. Run experiment, update model
  5. Repeat until convergence
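
When the experiment cannot be wrapped in a Python function (you run it in the lab and return later with the result), scikit-optimize's ask/tell interface expresses the same loop explicitly; a sketch using the space from this example:

from skopt import Optimizer

opt = Optimizer(space, base_estimator='GP', acq_func='EI',
                n_initial_points=5, random_state=42)

x_next = opt.ask()   # suggested [Temperature, Pressure, Catalyst, Time]
print('Run this experiment:', x_next)

# ...perform the experiment, measure the yield...
measured_yield = 87.2               # placeholder for your measurement
opt.tell(x_next, -measured_yield)   # minimizing negative = maximizing yield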

Example 4: Model-Driven D-Optimal (Parameter Estimation)

Situation: Have kinetic model, need to estimate 3 rate constants

import numpy as np
from scipy.optimize import minimize

# Your mechanistic model
def model(t, k1, k2, k3):
    """Concentration vs time model"""
    return k1 * (1 - np.exp(-k2 * t)) + k3 * t

# Fisher Information Matrix for given design points
def fisher_information(t_design, k_guess):
    """Calculate FIM for design points t"""
    k1, k2, k3 = k_guess

    # Jacobian (sensitivity matrix)
    J = np.zeros((len(t_design), 3))

    for i, t in enumerate(t_design):
        J[i, 0] = 1 - np.exp(-k2 * t)
        J[i, 1] = k1 * t * np.exp(-k2 * t)
        J[i, 2] = t

    # FIM = J^T * J (assuming constant variance)
    FIM = J.T @ J
    return FIM

# D-optimal criterion: maximize determinant of FIM
def d_optimal_criterion(t_design, k_guess):
    FIM = fisher_information(t_design, k_guess)
    # slogdet is numerically safer than log(det(FIM))
    sign, logdet = np.linalg.slogdet(FIM)
    return -logdet  # Maximize det = minimize -log(det)

# Find optimal design points
k_initial_guess = [1.0, 0.1, 0.05]
n_points = 6

result = minimize(
    lambda t: d_optimal_criterion(t, k_initial_guess),
    x0=np.linspace(0, 100, n_points),
    bounds=[(0, 100)] * n_points,
    method='L-BFGS-B'
)

optimal_times = result.x
print(f"Optimal measurement times: {optimal_times}")

# Run experiments at these times
# After getting data, update k_guess and re-optimize design if sequential

Sequential workflow:

  1. Initial guess for parameters
  2. Calculate D-optimal design
  3. Run experiments
  4. Fit model, update parameter estimates
  5. Recalculate optimal design with new estimates
  6. Repeat until parameters converge
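
Step 4 can be an ordinary least-squares fit; a sketch with scipy, where y_measured stands in for the concentrations you observed at optimal_times:

from scipy.optimize import curve_fit

# Fit the kinetic model to the new data; k_fit seeds the next design round
k_fit, k_cov = curve_fit(model, optimal_times, y_measured,
                         p0=k_initial_guess)
print('Updated estimates:', k_fit)
print('Standard errors:', np.sqrt(np.diag(k_cov)))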

Example 5: Active Learning (Build Surrogate)

Situation: Want accurate surrogate model, can run sequentially

import numpy as np
from modAL.models import ActiveLearner
from modAL.disagreement import max_std_sampling
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# expensive_function is a placeholder for your experiment or simulator

# Initial random sample
X_initial = np.random.uniform([0, 0], [10, 10], size=(5, 2))
y_initial = np.array([expensive_function(x) for x in X_initial])

# Candidate pool the learner may query from
X_candidate_pool = np.random.uniform([0, 0], [10, 10], size=(500, 2))

# Create active learner with GP; max_std_sampling queries where the GP
# posterior standard deviation is largest (uncertainty_sampling is for
# classifiers and does not work with a regressor)
regressor = GaussianProcessRegressor(kernel=RBF())
learner = ActiveLearner(
    estimator=regressor,
    query_strategy=max_std_sampling,  # Query where most uncertain
    X_training=X_initial,
    y_training=y_initial
)

# Active learning loop
n_queries = 20
for i in range(n_queries):
    # Query next experiment (highest predictive uncertainty)
    query_idx, query_instance = learner.query(X_candidate_pool)

    # Run experiment
    y_new = np.array([expensive_function(query_instance[0])])

    # Update model
    learner.teach(query_instance, y_new)

    # Report the GP's remaining uncertainty at the queried point
    _, std = learner.predict(query_instance, return_std=True)
    print(f"Iteration {i+1}: GP std = {std[0]:.4f}")

# Final model is in learner.estimator

Workflow: How to Use This Skill

Step 1: Answer Questions

When starting a DOE project, expect these questions:

  1. What's your experimental goal?
  2. Can you run sequentially or batch only?
  3. How expensive are experiments?
  4. How many factors?
  5. Do you have a model or prior data?
  6. Any constraints on factor combinations?
  7. What's your total budget (# experiments)?

Step 2: Get Recommendation

Based on answers, receive:

  • Recommended approach with rationale
  • Alternative options with trade-offs
  • Estimated number of runs needed
  • Library/tool recommendations
  • Example code to get started

Step 3: Generate Design

Use provided code to:

  • Generate design matrix
  • Visualize design in factor space
  • Check design properties
  • Export to CSV for lab work
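
A quick way to eyeball a design in factor space, assuming it lives in a pandas DataFrame df as in Example 1:

import pandas as pd
import matplotlib.pyplot as plt

# Pairwise scatter plots reveal coverage gaps and unintended correlation
pd.plotting.scatter_matrix(df, figsize=(6, 6), diagonal='hist')
plt.show()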

Step 4: Run Experiments

Execute experiments according to design:

  • Randomize run order (unless sequential)
  • Record all results
  • Note any deviations or issues

Step 5: Analyze Results

Analyze data with appropriate methods:

  • Classical: ANOVA, regression, diagnostics
  • Bayesian: Update GP, check convergence
  • Model-driven: Fit model, assess parameters

Step 6: Make Decisions

Based on analysis:

  • Classical: Optimize response, validate
  • Bayesian: Continue or stop? Next experiment?
  • Model-driven: Parameters converged? Need more data?

Detailed References

For deep dives into specific topics:

  • references/DESIGN_SELECTION_GUIDE.md - Complete decision trees with questions
  • references/CLASSICAL_DOE.md - Factorial, RSM, screening designs
  • references/BAYESIAN_OPTIMIZATION.md - BO theory, acquisition functions, GP models
  • references/MODEL_DRIVEN_DOE.md - Optimal designs, Fisher Information
  • references/ACTIVE_LEARNING.md - Sequential strategies, uncertainty sampling
  • references/ANALYSIS_METHODS.md - Statistical analysis, ANOVA, diagnostics
  • references/PYCSE_INTEGRATION.md - Using pycse for RSM workflows

Common Patterns by Industry

Chemical Engineering:

  • Reactor optimization → CCD or Bayesian Optimization
  • Catalyst screening → Fractional factorial → BO on hits
  • Process development → Sequential model-driven DOE

Materials Science:

  • Composition optimization → Mixture designs or BO
  • Property mapping → Latin Hypercube + GP surrogate
  • Alloy discovery → High-throughput with active learning

Pharmaceutical:

  • Formulation → Mixture designs, response surface
  • Dose optimization → Bayesian optimization
  • PK/PD modeling → Model-driven D-optimal designs

Manufacturing:

  • Process parameter tuning → CCD or Box-Behnken
  • Quality improvement → Taguchi, robust parameter design
  • Continuous improvement → Sequential BO

Machine Learning:

  • Hyperparameter tuning → Bayesian optimization
  • Architecture search → BO with discrete/categorical variables
  • Neural network training → Adaptive sampling

Installation

# Classical DOE
pip install pyDOE3          # Factorial, RSM, LHS
pip install dexpy           # Modern DOE library

# pycse for integrated RSM
pip install pycse

# Bayesian Optimization
pip install scikit-optimize # skopt - easiest to use
pip install GPyOpt          # Comprehensive BO
pip install ax-platform     # Meta's adaptive experimentation
# pip install botorch torch # Advanced BO (requires PyTorch)

# Active Learning
pip install modAL-python    # Active learning framework (imports as modAL)

# Gaussian Processes
pip install GPy             # GP models
# scikit-learn has GP built-in

# Analysis
pip install statsmodels     # ANOVA, regression
pip install scipy           # Optimization, statistics
pip install scikit-learn    # ML models, cross-validation

# Visualization
pip install matplotlib seaborn plotly

# Sensitivity Analysis
pip install SALib

# All in one
pip install pyDOE3 dexpy pycse scikit-optimize statsmodels scipy scikit-learn modAL-python GPy matplotlib seaborn

Best Practices

1. Always Start with Questions

❌ Don't: "I'll use a Box-Behnken design"
✅ Do: "My goal is optimization, I have 3 factors, I can only batch, so Box-Behnken makes sense"

2. Match Method to Constraints

  • Budget limited → Bayesian optimization
  • Time limited → Batch classical DOE
  • Knowledge limited → Space-filling exploration
  • Have model → Model-driven optimal design

3. Sequential When Possible

Sequential designs (BO, active learning, model-driven):

  • Adapt based on results
  • Stop early if converged
  • Avoid wasted experiments
  • Total runs usually fewer than batch

When batch is better:

  • Parallel equipment available
  • Setup cost dominates run cost
  • Well-understood system

4. Start Simple, Add Complexity

Initial phase: Simple screening (fractional factorial, LHS)
Refinement: Focus on important factors (RSM, BO)
Validation: Confirmation runs

5. Validate Assumptions

Classical DOE assumes:

  • Normal residuals
  • Constant variance
  • Independent observations
  • Linear/quadratic model adequate

Check diagnostics: Residual plots, Q-Q plots, lack-of-fit tests
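
A minimal diagnostics sketch for a fitted statsmodels OLS model (continuing the RSM fit from Example 1's analysis):

import matplotlib.pyplot as plt
import statsmodels.api as sm

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))

# Residuals vs fitted: look for curvature or funnel shapes
ax1.scatter(model.fittedvalues, model.resid)
ax1.axhline(0, color='k', lw=0.5)
ax1.set(xlabel='Fitted value', ylabel='Residual')

# Q-Q plot: points should track the line if residuals are normal
sm.qqplot(model.resid, line='45', fit=True, ax=ax2)
plt.tight_layout()
plt.show()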

Bayesian optimization assumes:

  • GP model appropriate
  • Kernel choice reasonable
  • Enough initial exploration

Check: GP posterior uncertainty, kernel hyperparameters
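
With scikit-optimize, the fitted surrogates are kept on the result object, so the final GP's learned kernel hyperparameters can be inspected directly:

# result from gp_minimize (Example 3); models[-1] is the last fitted GP
gp = result.models[-1]
print(gp.kernel_)   # fitted kernel with learned length scales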

6. Use Confirmation Experiments

After finding "optimum":

  • Run additional experiments at predicted best settings
  • Verify predictions match reality
  • Account for prediction uncertainty
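
For a gp_minimize result, a sketch of checking the predicted mean and uncertainty at the proposed optimum before committing to confirmation runs (skopt fits its GP in a normalized space, hence the transform call):

# Predict at the suggested optimum, with GP uncertainty
X_best = result.space.transform([result.x])
mean, std = result.models[-1].predict(X_best, return_std=True)
print(f'Predicted yield: {-mean[0]:.1f} ± {std[0]:.1f}')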

Common Pitfalls

1. Using Wrong Design Type

Problem: Applying batch RSM to expensive experiments
Solution: Ask questions first, consider sequential approaches

2. Too Few Initial Points (BO)

Problem: GP needs diverse data to model the landscape
Solution: Start with 5-10 space-filling points (LHS)
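
gp_minimize can be warm-started with such points (or any prior data) through its x0/y0 arguments; a sketch with illustrative values:

# x0: settings already run, y0: their observed (negative-yield) results
result = gp_minimize(
    expensive_experiment, space,
    x0=[[320, 2.0, 0.3, 20], [380, 4.5, 0.8, 50]],  # e.g., LHS points
    y0=[-78.0, -85.5],
    n_calls=25, acq_func='EI', random_state=42)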

3. Ignoring Constraints

Problem: Design suggests infeasible experiments
Solution: Specify constraints upfront, use constrained optimization
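
For batch designs, a simple remedy is to oversample a space-filling design and keep only feasible candidates (the constraint below is illustrative):

import numpy as np
from pyDOE3 import lhs

# Oversample, then filter by a feasibility rule, e.g. "high temperature
# and high catalyst loading are not allowed together"
candidates = lhs(2, samples=200)
temp = 300 + candidates[:, 0] * 100    # scale to [300, 400]
cat = 0.1 + candidates[:, 1] * 0.9     # scale to [0.1, 1.0]
feasible = ~((temp > 380) & (cat > 0.8))
design = np.column_stack([temp, cat])[feasible][:20]   # keep 20 runs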

4. Overfitting Surrogate Models

Problem: GP fits noise, poor predictions
Solution: Cross-validation, hold-out test sets, regularization

5. Not Randomizing Run Order

Problem: Systematic effects confounded with factors
Solution: Randomize execution order (except sequential designs)

6. Stopping Too Early (Sequential)

Problem: Declaring convergence before the search has truly converged
Solution: Use stopping criteria (EI < threshold, GP uncertainty low)

Cost-Benefit Analysis: Sequential vs Batch

When Sequential is Worth It:

Expensive experiments:

  • Batch CCD: 20 runs at $1000 = $20,000
  • Sequential BO: 15 runs (converge early) = $15,000
  • Savings: $5,000 + better result

Complex landscapes:

  • Batch might miss global optimum
  • Sequential adapts to findings
  • Better final result

When Batch is Better:

Cheap experiments:

  • Batch: 30 runs in 1 day (parallel)
  • Sequential: 30 runs over 30 days
  • Time savings dominate

Setup costs:

  • If setup takes hours, run time minutes
  • Batch amortizes setup
  • Sequential overhead too high

Response Format

When helping with DOE, Claude should:

  1. Ask questions first - Don't assume method
  2. Explain recommendation - Why this approach for their situation
  3. Provide complete code - Working examples, not pseudocode
  4. Show alternatives - "BO is best, but if you must batch, try..."
  5. Guide through workflow - Design → Execute → Analyze → Optimize
  6. Interpret results - Statistical significance AND practical significance

Additional Resources

Related Skills

  • python-optimization - Response optimization after DOE
  • pycse - Includes regression, ANOVA, confidence intervals for analysis
  • python-multiobjective-optimization - Multi-response optimization
  • python-plotting - Visualization of results and diagnostics