hypothesis

Use Hypothesis for property-based testing: generate test cases automatically, find edge cases, and shrink failures to minimal examples. Includes integration with Polars parametric testing.

$ Install

git clone https://github.com/anam-org/metaxy /tmp/metaxy && cp -r /tmp/metaxy/.claude/skills/hypothesis ~/.claude/skills/metaxy


name: hypothesis
description: Use Hypothesis for property-based testing to automatically generate comprehensive test cases, find edge cases, and write more robust tests with minimal example shrinking. Includes Polars parametric testing integration.

Hypothesis - Property-Based Testing

Property-based testing framework that generates test cases automatically, finds minimal failing examples through shrinking, and verifies invariants.

Official Docs: https://hypothesis.readthedocs.io/en/latest/

Key Features:

  • Automatic test data generation from strategies
  • Minimal failing example shrinking
  • Stateful testing with rule-based state machines
  • pytest integration
  • Deterministic reproducibility
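Shrinking can be observed outside a test with `hypothesis.find`, which searches a strategy for a value satisfying a predicate and returns the simplest one it can reach. This is a minimal sketch; the predicate is an arbitrary illustration, not part of the skill.

```python
from hypothesis import find
from hypothesis import strategies as st

# find() draws lists of integers until the predicate holds, then shrinks
# the match toward a minimal example before returning it.
smallest = find(st.lists(st.integers()), lambda xs: sum(xs) >= 10)
print(smallest)
```

Inside a `@given` test the same shrinking happens automatically whenever an assertion fails, so the reported counterexample is usually much smaller than the first failing input.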

Quick Start

from hypothesis import given
from hypothesis import strategies as st


@given(st.integers())
def test_property(x):
    """Test properties that should always hold"""
    assert abs(x) >= 0


@given(st.lists(st.integers()))
def test_list_property(lst):
    sorted_lst = sorted(lst)
    assert len(sorted_lst) == len(lst)
    # Check monotonic property
    for i in range(len(sorted_lst) - 1):
        assert sorted_lst[i] <= sorted_lst[i + 1]

Strategies

Full reference: https://hypothesis.readthedocs.io/en/latest/data.html

Common strategies:

  • Primitives: st.integers(), st.floats(), st.text(), st.booleans()
  • Collections: st.lists(), st.dictionaries(), st.tuples(), st.sets()
  • Dates/Times: st.dates(), st.datetimes(), st.timedeltas()
  • Combinators: st.one_of(), st.sampled_from(), st.recursive()
  • Type-based: st.from_type(MyClass)
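As one way to combine the strategies listed above, the sketch below builds JSON-like values with `st.recursive` and checks a round-trip property. The name `json_values` and the size bounds are illustrative choices, not something the skill prescribes.

```python
import json

from hypothesis import given
from hypothesis import strategies as st

# Leaves are scalars; each recursion step wraps already-generated children
# in a list or a string-keyed dictionary. Floats are left out on purpose,
# because NaN never compares equal to itself.
json_values = st.recursive(
    st.none() | st.booleans() | st.integers() | st.text(max_size=10),
    lambda children: st.lists(children, max_size=5)
    | st.dictionaries(st.text(max_size=5), children, max_size=5),
    max_leaves=10,
)


@given(json_values)
def test_json_round_trip(value):
    """Serializing to JSON and parsing back yields an equal value."""
    assert json.loads(json.dumps(value)) == value
```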

Composite Strategies

from hypothesis import strategies as st
from hypothesis.strategies import composite


@composite
def user_strategy(draw):
    age = draw(st.integers(min_value=18, max_value=100))
    name = draw(st.text(min_size=1))
    return {"name": name, "age": age, "is_adult": age >= 18}


@given(user_strategy())
def test_user(user):
    assert user["is_adult"] == (user["age"] >= 18)

Strategy Combinators

st.integers().filter(lambda x: x % 2 == 0)  # Filter
st.integers().map(str)  # Transform
st.one_of(st.integers(), st.text())  # Choose between strategies
st.sampled_from([1, 2, 3, 4, 5])  # Pick from collection
st.from_type(MyClass)  # Infer from type hints
st.builds(MyClass, arg1=st.integers())  # Build instances
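The one-liners above can be put together into a runnable test. In this sketch `Point` is a hypothetical class standing in for `MyClass`, and the value ranges are arbitrary.

```python
from dataclasses import dataclass

from hypothesis import given
from hypothesis import strategies as st


@dataclass
class Point:  # hypothetical class, for illustration only
    x: int
    y: int


# .map transforms each drawn value, so every generated integer is even
# without rejecting any draws (unlike .filter).
even_integers = st.integers(min_value=-1000, max_value=1000).map(lambda n: n * 2)

# st.builds calls Point(x=..., y=...) with values drawn from each strategy.
points = st.builds(Point, x=even_integers, y=st.sampled_from([0, 1, 2]))


@given(points)
def test_point_fields(p):
    assert p.x % 2 == 0
    assert p.y in (0, 1, 2)
```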

Settings

from hypothesis import given, settings
from hypothesis import strategies as st


@given(st.integers())
@settings(
    max_examples=1000,  # Default: 100
    deadline=None,  # Remove time limit
    derandomize=True,  # Deterministic ordering
)
def test_example(x):
    pass


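# Profiles for different environments (typically registered in conftest.py)
import os

settings.register_profile("dev", max_examples=10)
settings.register_profile("ci", max_examples=1000, deadline=None)
# Load the profile named by the environment variable, defaulting to "dev"
settings.load_profile(os.getenv("HYPOTHESIS_PROFILE", "dev"))
# Activate: HYPOTHESIS_PROFILE=ci pytest (or: pytest --hypothesis-profile=ci)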

Full settings reference: https://hypothesis.readthedocs.io/en/latest/settings.html

Helpers

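from hypothesis import given, assume, note, example, seed
from hypothesis import strategies as st


@given(st.integers(), st.integers())
def test_division(x, y):
    assume(y != 0)  # Skip invalid cases (prefer constrained strategies where possible)
    note(f"Testing {x} / {y}")  # Shown in the report if the test fails
    # Use exact integer arithmetic; (x / y) * y == x breaks on float rounding
    quotient, remainder = divmod(x, y)
    assert quotient * y + remainder == x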


@given(st.integers())
@example(0)  # Always test specific cases
@seed(12345)  # Reproducible run
def test_something(x):
    pass

Stateful Testing

For testing complex stateful systems with rule-based state machines.

from hypothesis.stateful import RuleBasedStateMachine, rule, invariant
from hypothesis import strategies as st


class MyStateMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.data = []

    @rule(value=st.integers())
    def add(self, value):
        self.data.append(value)

    @invariant()
    def check_invariant(self):
        assert isinstance(self.data, list)


TestMachine = MyStateMachine.TestCase

Full stateful testing guide: https://hypothesis.readthedocs.io/en/latest/stateful.html
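A common pattern for stateful testing is to run the system under test next to a simple reference model and compare them after every step. The sketch below assumes a hypothetical `Stack` class as the system under test; `precondition` keeps the `pop` rule from firing on an empty stack.

```python
from hypothesis import strategies as st
from hypothesis.stateful import (
    RuleBasedStateMachine,
    invariant,
    precondition,
    rule,
)


class Stack:  # hypothetical system under test
    def __init__(self):
        self._items = []

    def push(self, value):
        self._items.append(value)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)


class StackMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.stack = Stack()
        self.model = []  # plain list acting as the reference model

    @rule(value=st.integers())
    def push(self, value):
        self.stack.push(value)
        self.model.append(value)

    @precondition(lambda self: len(self.model) > 0)
    @rule()
    def pop(self):
        # Both implementations must return the same element.
        assert self.stack.pop() == self.model.pop()

    @invariant()
    def sizes_agree(self):
        assert len(self.stack) == len(self.model)


# pytest collects this like any other unittest-style test case.
TestStack = StackMachine.TestCase
```

When a sequence of rules breaks an assertion, Hypothesis shrinks the sequence and prints the remaining steps as a reproducible program.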

Polars Integration

Polars provides built-in parametric testing strategies for generating DataFrames.

Official docs: https://docs.pola.rs/api/python/stable/reference/api/polars.testing.parametric.dataframes.html

from hypothesis import given
import polars as pl
from polars.testing.parametric import dataframes, column


# Generate DataFrames with specific column schemas
@given(
    dataframes(
        cols=[
            column("id", dtype=pl.Int64),
            column("name", dtype=pl.String),
            column("value", dtype=pl.Float64),
        ],
        min_size=1,
        max_size=100,
    )
)
def test_dataframe_property(df: pl.DataFrame):
    """Test properties of DataFrame operations"""
    assert df.shape[0] >= 1
    assert set(df.columns) == {"id", "name", "value"}
    assert df["id"].dtype == pl.Int64


# With Narwhals wrapper
import narwhals as nw


@given(dataframes(cols=[column("a", dtype=pl.Int64)]))
def test_narwhals_operation(df: pl.DataFrame):
    nw_df = nw.from_native(df)
    result = nw_df.select(nw.col("a") * 2)
    assert result.shape[0] == nw_df.shape[0]

Key functions:

  • dataframes(): Generate DataFrames with specified columns
  • column(name, dtype, ...): Define column schemas with constraints
  • series(): Generate standalone Series

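Column and size constraints:

  • allow_null: Enable/disable nulls
  • null_probability: Control null value frequency (deprecated in recent Polars releases in favor of allow_null)
  • unique: Generate unique values
  • strategy: Custom strategy for column values
  • min_size/max_size: Control row count (arguments to dataframes() and series(), not column())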

Best Practices

  • Use constraints over filters: st.integers(min_value=0) not st.integers().filter(lambda x: x >= 0)
  • Test properties, not examples: Focus on invariants that always hold
  • Combine with @example(): Test specific edge cases explicitly
  • Avoid assume() overuse: Rejected examples slow tests down; prefer strategies that generate only valid inputs
  • Document properties: Clear docstrings explain what invariant is tested
  • Set size limits: Always bound collection sizes to prevent memory issues
  • Add .hypothesis/ to .gitignore: The directory stores the local example database
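Several of these practices fit into one small test: bounded and constrained inputs, explicit `@example` cases, and a docstring naming the invariant. `dedupe` is a hypothetical function under test, included only so the sketch is self-contained.

```python
from hypothesis import example, given
from hypothesis import strategies as st


def dedupe(items):
    """Hypothetical function under test: drop duplicates, keep first-seen order."""
    return list(dict.fromkeys(items))


# Constrained values and a bounded size, rather than filters or unbounded lists.
bounded_lists = st.lists(st.integers(min_value=0, max_value=50), max_size=30)


@given(bounded_lists)
@example([])  # explicit edge cases always run
@example([1, 1, 1])
def test_dedupe_keeps_each_value_once(items):
    """dedupe() keeps every distinct value exactly once and is idempotent."""
    result = dedupe(items)
    assert set(result) == set(items)
    assert len(result) == len(set(items))
    assert dedupe(result) == result
```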

Troubleshooting

Common issues and solutions:

  • HealthCheck failures: Too many rejected examples → use constrained strategies or suppress_health_check
  • Flaky tests: Non-deterministic code → use @seed() or @settings(derandomize=True)
  • Slow tests: Too many examples → reduce max_examples or use profiles
  • Deadline exceeded: Complex operations → increase deadline or set to None
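If a filter cannot easily be rewritten, the relevant health checks can be suppressed per test, as sketched below. The filter is deliberately wasteful to illustrate the setting; generating multiples directly with `.map(lambda n: n * 7)` would usually be the better fix.

```python
from hypothesis import HealthCheck, given, settings
from hypothesis import strategies as st


@settings(
    # Silence the checks that complain about rejected or slow examples.
    suppress_health_check=[HealthCheck.filter_too_much, HealthCheck.too_slow],
    deadline=None,  # no per-example time limit
    max_examples=50,  # fewer examples keeps the run short
)
@given(st.integers(min_value=0, max_value=10_000).filter(lambda n: n % 7 == 0))
def test_multiples_of_seven(n):
    assert n % 7 == 0
```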

Resources