healthsim-patientsim

Generate realistic clinical patient data including demographics, encounters, diagnoses, medications, labs, and vitals. Use when user requests: (1) patient records or clinical data, (2) EMR test data, (3) specific clinical cohorts like diabetes or heart failure, (4) HL7v2 or FHIR patient resources.

$ Install

git clone https://github.com/mark64oswald/healthsim-workspace /tmp/healthsim-workspace && cp -r /tmp/healthsim-workspace/skills/patientsim ~/.claude/skills/healthsim-workspace

// tip: Run this command in your terminal to install the skill


name: healthsim-patientsim
description: "Generate realistic clinical patient data including demographics, encounters, diagnoses, medications, labs, and vitals. Use when user requests: (1) patient records or clinical data, (2) EMR test data, (3) specific clinical cohorts like diabetes or heart failure, (4) HL7v2 or FHIR patient resources."

PatientSim - Clinical Patient Data Generation

For Claude

Use this skill when the user requests clinical patient data, EMR/EHR test data, or medical records. This is the primary skill for generating realistic synthetic patients with complete clinical histories.

When to apply this skill:

  • User mentions patients, clinical data, or medical records
  • User requests EMR or EHR test data
  • User specifies clinical cohorts (diabetes, heart failure, oncology, etc.)
  • User asks for HL7v2 messages, FHIR resources, or C-CDA documents
  • User needs encounters, diagnoses, medications, labs, or vitals

Key capabilities:

  • Generate patients with realistic demographics and identifiers
  • Create encounters across care settings (inpatient, outpatient, ED, observation)
  • Apply clinical cohorts from specialized skills (diabetes, oncology, etc.)
  • Produce appropriately coded data (ICD-10, CPT, LOINC, RxNorm)
  • Transform output to healthcare standards (FHIR R4, HL7v2, C-CDA)

For specific clinical cohorts, load the appropriate cohort skill from the table below.

Overview

PatientSim generates realistic synthetic clinical data for EMR/EHR testing, including:

  • Patient demographics
  • Encounters (inpatient, outpatient, emergency, observation)
  • Diagnoses (ICD-10-CM)
  • Procedures (CPT, ICD-10-PCS)
  • Medications (with RxNorm codes)
  • Lab results (with LOINC codes)
  • Vital signs

Quick Start

Simple Patient

Request: "Generate a patient"

{
  "mrn": "MRN00000001",
  "name": { "given_name": "John", "family_name": "Smith" },
  "birth_date": "1975-03-15",
  "gender": "M",
  "address": {
    "street_address": "123 Main Street",
    "city": "Springfield",
    "state": "IL",
    "postal_code": "62701"
  }
}

Clinical Cohort

Request: "Generate a diabetic patient with complications"

Claude loads diabetes-management.md and produces a complete clinical picture.

Cohort Skills

Load the appropriate cohort based on user request:

| Cohort | Trigger Phrases | File |
|---|---|---|
| ADT Workflow | admission, discharge, transfer, ADT, patient movement | adt-workflow.md |
| Behavioral Health | depression, anxiety, bipolar, PTSD, mental health, psychiatric, substance use, PHQ-9, GAD-7 | behavioral-health.md |
| Diabetes Management | diabetes, A1C, glucose, metformin, insulin | diabetes-management.md |
| Heart Failure | CHF, HFrEF, HFpEF, BNP, ejection fraction | heart-failure.md |
| Chronic Kidney Disease | CKD, eGFR, dialysis, nephropathy | chronic-kidney-disease.md |
| Sepsis/Acute Care | sepsis, infection, ICU, critical care | sepsis-acute-care.md |
| Orders & Results | lab order, radiology, ORM, ORU, results | orders-results.md |
| Maternal Health | pregnancy, prenatal, obstetric, labor, delivery, postpartum, GDM, preeclampsia | maternal-health.md |
| Pediatrics | | |
| ↳ Childhood Asthma | asthma, pediatric, inhaler, albuterol, nebulizer, wheeze | pediatrics/childhood-asthma.md |
| ↳ Acute Otitis Media | ear infection, otitis media, AOM, ear pain, amoxicillin pediatric | pediatrics/acute-otitis-media.md |
| Oncology | | |
| ↳ Breast Cancer | breast cancer, mastectomy, ER positive, HER2, tamoxifen | oncology/breast-cancer.md |
| ↳ Lung Cancer | lung cancer, NSCLC, EGFR, ALK, immunotherapy | oncology/lung-cancer.md |
| ↳ Colorectal Cancer | colon cancer, rectal cancer, FOLFOX, colonoscopy | oncology/colorectal-cancer.md |
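
The trigger-phrase lookup above can be pictured as a keyword match. The following is a minimal sketch, not PatientSim's actual routing: `COHORT_TRIGGERS` copies three rows from the table, and `match_cohorts` is a hypothetical helper introduced only for illustration.

```python
import re

# Three rows copied from the cohort table; the dict itself is illustrative.
COHORT_TRIGGERS = {
    "diabetes-management.md": ["diabetes", "A1C", "glucose", "metformin", "insulin"],
    "heart-failure.md": ["CHF", "HFrEF", "HFpEF", "BNP", "ejection fraction"],
    "chronic-kidney-disease.md": ["CKD", "eGFR", "dialysis", "nephropathy"],
}


def match_cohorts(request: str) -> list[str]:
    """Return cohort files whose trigger phrases appear in the request text."""
    matches = []
    for cohort_file, phrases in COHORT_TRIGGERS.items():
        for phrase in phrases:
            # Case-insensitive whole-word match, so "CKD" is not found inside other words.
            if re.search(rf"\b{re.escape(phrase)}\b", request, flags=re.IGNORECASE):
                matches.append(cohort_file)
                break  # one hit per cohort is enough
    return matches
```

A literal match like this misses inflected forms (for example "diabetic" does not match "diabetes"), so it should be read as an approximation of the table, not a replacement for judging the request as a whole.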

Generation Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| age | int or range | 18-90 | Patient age or range |
| gender | M/F/O/U | weighted | M=49%, F=51% |
| conditions | list | none | Specific diagnoses to include |
| severity | string | moderate | mild, moderate, severe |
| encounters | int | 1 | Number of encounters to generate |
| timeline | string | 1 year | How far back to generate history |
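
One way to picture these parameters and their defaults is as a small validated container. This is a sketch under assumptions: the class name `GenerationParams` and the choice to represent a single age as an `(n, n)` tuple are hypothetical and not taken from PatientSim.

```python
from dataclasses import dataclass, field


@dataclass
class GenerationParams:
    """Hypothetical container mirroring the defaults in the parameter table."""

    age: tuple[int, int] = (18, 90)  # a single age can be written as (n, n)
    gender: str = "weighted"         # or one of "M", "F", "O", "U"
    conditions: list[str] = field(default_factory=list)
    severity: str = "moderate"
    encounters: int = 1
    timeline: str = "1 year"

    def __post_init__(self) -> None:
        low, high = self.age
        if low > high:
            raise ValueError("age must be given as (min, max)")
        if self.gender not in {"weighted", "M", "F", "O", "U"}:
            raise ValueError(f"unknown gender: {self.gender!r}")
        if self.severity not in {"mild", "moderate", "severe"}:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if self.encounters < 0:
            raise ValueError("encounters must be non-negative")
```

For example, `GenerationParams(age=(68, 68), conditions=["diabetes"], severity="severe")` would describe a single severe diabetic patient aged 68.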

Output Entities

Patient

Demographics extending the Person model with MRN.

Encounter

Clinical visit with class (I/O/E/U/OBS), timing, location, providers.

Diagnosis

ICD-10-CM code with type (admitting, working, final), dates.

Medication

Drug with RxNorm code, dose, route, frequency, status.

LabResult

Test with LOINC code, value, units, reference range, abnormal flag.

VitalSign

Observation with temperature, HR, RR, BP, SpO2, height, weight.

See data-models.md for complete schemas.

Clinical Coherence Rules

PatientSim ensures generated data is clinically realistic:

  1. Age-appropriate conditions: No pediatric conditions in adults; geriatric conditions require an appropriate age
  2. Gender-appropriate conditions: Prostate conditions for males only, pregnancy for females only
  3. Medication indications: Drugs match diagnoses (metformin requires diabetes)
  4. Lab coherence: Values align with conditions (elevated A1C with diabetes)
  5. Temporal consistency: Diagnoses before treatments, labs after orders

See validation-rules.md for complete rules.
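
Rules 2 and 3 can be sketched as a simple post-generation check. The code below is illustrative only and is not the content of validation-rules.md: `MEDICATION_INDICATIONS` is a deliberately incomplete, hypothetical lookup, and `check_coherence` is an assumed helper name.

```python
# Hypothetical, deliberately incomplete lookup:
# medication name -> ICD-10-CM code prefixes that justify it.
MEDICATION_INDICATIONS = {
    "metformin": ("E11",),   # type 2 diabetes mellitus
    "lisinopril": ("I10",),  # essential hypertension
}


def check_coherence(patient: dict) -> list[str]:
    """Return a list of human-readable coherence problems (empty if none found)."""
    problems = []
    codes = [d["code"] for d in patient.get("diagnoses", [])]

    # Rule 2: ICD-10-CM chapter O covers pregnancy, childbirth and the puerperium.
    if patient.get("gender") == "M" and any(c.startswith("O") for c in codes):
        problems.append("pregnancy-related diagnosis on a male patient")

    # Rule 3: every medication with a known indication needs a matching diagnosis.
    for med in patient.get("medications", []):
        prefixes = MEDICATION_INDICATIONS.get(med["name"].lower())
        if prefixes and not any(c.startswith(prefixes) for c in codes):
            problems.append(f"{med['name']} has no matching diagnosis")

    return problems
```

Medications missing from the lookup are simply not checked, which keeps the sketch from raising false alarms on drugs it does not know about.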

Output Formats

| Format | Request | Use Case |
|---|---|---|
| JSON | default | API testing |
| FHIR R4 | "as FHIR", "FHIR bundle" | Interoperability |
| HL7v2 ADT | "as HL7", "ADT message" | Legacy EMR |
| CSV | "as CSV" | Analytics |

Data Integration (PopulationSim v2.0)

PatientSim integrates with PopulationSim's embedded data package to generate patients grounded in real demographic and health data.

Enabling Data-Driven Generation

Add a geography parameter to any request to enable data-driven generation:

| Parameter | Type | Example | Description |
|---|---|---|---|
| geography | string | "48201" | 5-digit county FIPS code |
| geography | string | "48201002300" | 11-digit census tract FIPS code |

Example request:

Generate a diabetic patient in Harris County, TX (geography: 48201)
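
Because the two accepted forms differ only in length, the geography level can be inferred from the code itself. A minimal sketch, assuming a hypothetical helper `parse_geography`; the `state_fips` and `county_fips` keys are illustrative additions based on the standard FIPS layout (2-digit state, 3-digit county, 6-digit tract).

```python
def parse_geography(fips: str) -> dict:
    """Classify a FIPS code as county (5 digits) or census tract (11 digits)."""
    if not (fips.isascii() and fips.isdigit()):
        raise ValueError(f"FIPS code must be all digits: {fips!r}")
    if len(fips) == 5:
        level = "county"
    elif len(fips) == 11:
        level = "tract"
    else:
        raise ValueError(f"expected 5 or 11 digits, got {len(fips)}")
    return {
        "fips": fips,
        "level": level,
        "state_fips": fips[:2],   # first two digits identify the state
        "county_fips": fips[:5],  # state + county, also valid for tract codes
    }
```

Keeping the code as a string matters: FIPS codes for some states start with a zero, which an integer would drop.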

What Data-Driven Generation Provides

When geography is specified, PatientSim uses real population data:

  1. Demographics: Age, sex, race/ethnicity distributions match real population
  2. Condition Prevalence: Diabetes, obesity, hypertension rates from CDC PLACES
  3. SDOH Context: SVI vulnerability scores affect adherence and outcomes
  4. Comorbidity Rates: Realistic co-occurrence based on area health profile

Embedded Data Sources

| Source | File | Coverage | Use |
|---|---|---|---|
| CDC PLACES 2024 | populationsim/data/county/places_county_2024.csv | 3,144 counties | Health indicators (40 measures) |
| CDC PLACES 2024 | populationsim/data/tract/places_tract_2024.csv | 84,000 tracts | Neighborhood-level health |
| CDC SVI 2022 | populationsim/data/county/svi_county_2022.csv | 3,144 counties | Social vulnerability |
| CDC SVI 2022 | populationsim/data/tract/svi_tract_2022.csv | 84,000 tracts | Tract vulnerability |
| ADI 2023 | populationsim/data/block_group/adi_blockgroup_2023.csv | 242,000 block groups | Area deprivation |

Provenance Tracking

Data-driven generation includes provenance in output metadata:

{
  "patient": { ... },
  "metadata": {
    "generation_mode": "data_driven",
    "geography": {
      "fips": "48201",
      "name": "Harris County, TX",
      "level": "county"
    },
    "data_provenance": [
      {
        "source": "CDC_PLACES_2024",
        "data_year": 2022,
        "file": "populationsim/data/county/places_county_2024.csv",
        "fields_used": ["DIABETES_CrudePrev", "OBESITY_CrudePrev", "BPHIGH_CrudePrev"]
      },
      {
        "source": "CDC_SVI_2022",
        "data_year": 2022,
        "file": "populationsim/data/county/svi_county_2022.csv",
        "fields_used": ["RPL_THEMES", "EP_UNINSUR"]
      }
    ]
  }
}

Foundation Skill

For detailed data integration patterns, see data-integration.md.

For complete mapping specification, see PopulationSim → PatientSim Integration.

Examples

Example 1: Basic Patient with Encounter

Request: "Generate a 45-year-old male with an office visit for hypertension"

Output:

{
  "patient": {
    "mrn": "MRN00000001",
    "name": { "given_name": "Michael", "family_name": "Johnson" },
    "birth_date": "1979-06-22",
    "gender": "M"
  },
  "encounter": {
    "encounter_id": "ENC0000000001",
    "patient_mrn": "MRN00000001",
    "class_code": "O",
    "status": "finished",
    "admission_time": "2025-01-15T09:30:00",
    "discharge_time": "2025-01-15T10:00:00",
    "chief_complaint": "Blood pressure follow-up"
  },
  "diagnoses": [
    {
      "code": "I10",
      "description": "Essential hypertension",
      "type": "final",
      "diagnosed_date": "2024-06-15"
    }
  ],
  "medications": [
    {
      "name": "Lisinopril",
      "code": "104376",
      "dose": "10 mg",
      "route": "PO",
      "frequency": "QD",
      "status": "active"
    }
  ],
  "vitals": {
    "observation_time": "2025-01-15T09:35:00",
    "systolic_bp": 138,
    "diastolic_bp": 88,
    "heart_rate": 72,
    "temperature": 98.4,
    "spo2": 98
  }
}
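
When a request names a specific age, the generated birth_date has to yield that age on the encounter date. A small sketch of the arithmetic, with `age_on` as a hypothetical helper name:

```python
from datetime import date


def age_on(birth_date: date, on: date) -> int:
    """Age in completed years on a given date."""
    years = on.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in the target year.
    if (on.month, on.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years
```

Checking `age_on(birth_date, encounter_date)` against the requested age catches off-by-one errors that arise when the encounter falls before the patient's birthday in that year.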

Example 2: Complex Multi-Condition Patient

Request: "Generate a 68-year-old female with diabetes, hypertension, and CKD stage 3"

Claude combines patterns from multiple cohort skills to generate a coherent patient with:

  • Multiple chronic diagnoses with appropriate onset dates
  • Medications for each condition (metformin, lisinopril, etc.)
  • Quarterly encounters over 2 years
  • Labs showing disease progression (A1C, eGFR trends)
  • Comorbidity interactions (CKD affecting medication choices)
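
The "quarterly encounters over 2 years" pattern amounts to stepping a start date forward three months at a time. A minimal sketch, assuming a hypothetical helper `quarterly_dates`; clamping the day to 28 is a simplification chosen here so every month yields a valid date.

```python
from datetime import date


def quarterly_dates(start: date, years: int) -> list[date]:
    """Return one encounter date per quarter, beginning at `start`."""
    day = min(start.day, 28)  # simplification: avoids invalid dates such as Feb 30
    dates = []
    for i in range(years * 4):
        month_index = (start.month - 1) + 3 * i  # zero-based months since January of start year
        year = start.year + month_index // 12
        month = month_index % 12 + 1
        dates.append(date(year, month, day))
    return dates
```

Each returned date can then anchor one encounter, with labs such as A1C and eGFR attached to show the trend over time.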

Related Skills

Chronic Disease

Behavioral Health

Acute Care

Pediatrics

Oncology

Cross-Product: MemberSim (Claims)

PatientSim clinical encounters generate corresponding claims in MemberSim:

| PatientSim Cohort | MemberSim Skill | Typical Timing |
|---|---|---|
| Office visits | professional-claims.md | Same day |
| Inpatient stays | facility-claims.md | +2-14 days |
| Surgeries | prior-authorization.md, facility-claims.md | PA before, claim after |
| Behavioral health | behavioral-health.md | Same day |

Integration Pattern: Generate clinical encounter in PatientSim first, then use MemberSim to create corresponding claims with matching dates, diagnoses, and procedures.
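
The pattern can be sketched as a function that copies dates and codes from the encounter into a claim stub. This is illustrative only: the claim field names are hypothetical rather than MemberSim's schema, and counting the lag from the discharge date is an assumption.

```python
from datetime import date, timedelta


def claim_from_encounter(encounter: dict, diagnoses: list[dict], lag_days: int = 0) -> dict:
    """Build a hypothetical claim stub that mirrors an encounter's dates and diagnoses."""
    # Service dates come straight from the encounter so both records stay aligned.
    service_from = date.fromisoformat(encounter["admission_time"][:10])
    service_to = date.fromisoformat(encounter["discharge_time"][:10])
    return {
        "encounter_id": encounter["encounter_id"],
        "patient_mrn": encounter["patient_mrn"],
        "service_from": service_from.isoformat(),
        "service_to": service_to.isoformat(),
        # Assumption: the timing offset in the table is counted from discharge.
        "claim_date": (service_to + timedelta(days=lag_days)).isoformat(),
        "diagnosis_codes": [d["code"] for d in diagnoses],
    }
```

An office visit would use `lag_days=0`, while an inpatient stay would pass a value in the 2 to 14 day range from the table.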

Cross-Product: RxMemberSim (Pharmacy)

PatientSim medication orders generate prescription fills in RxMemberSim:

| PatientSim Cohort | RxMemberSim Skill | Typical Timing |
|---|---|---|
| Chronic disease meds | retail-pharmacy.md | Same day or +1-3 days |
| Discharge meds | retail-pharmacy.md | +0-3 days post-discharge |
| Specialty drugs | specialty-pharmacy.md | +1-7 days |
| High-cost drugs | rx-prior-auth.md | PA required first |

Integration Pattern: Generate medication orders in PatientSim, then use RxMemberSim to model pharmacy fills with matching NDCs and appropriate fill timing.

Cross-Product: PopulationSim (Demographics & SDOH) - v2.0 Data Integration

PopulationSim v2.0 provides embedded real-world data for statistically accurate patient generation. When a geography is specified, PatientSim uses actual CDC PLACES, SVI, and ADI data to ground demographics and health patterns.

Data-Driven Generation Pattern

Step 1: Look up real population data

# For Harris County, TX (FIPS: 48201)
Read from: skills/populationsim/data/county/places_county_2024.csv
→ DIABETES_CrudePrev: 12.1%
→ OBESITY_CrudePrev: 32.8%
→ BPHIGH_CrudePrev: 32.4%
→ TotalPopulation: 4,731,145

Read from: skills/populationsim/data/county/svi_county_2022.csv
→ RPL_THEMES (overall SVI): 0.68
→ EP_POV150: 22.3% (below 150% poverty)
→ EP_MINRTY: 72.1% (minority percentage)
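
In code, the lookup in Step 1 could look roughly like the sketch below. It reads from an in-memory sample instead of the real file, and the key column name `CountyFIPS` is an assumption (the actual header in places_county_2024.csv may differ); the measure column names and values are the ones quoted in this document.

```python
import csv
import io

# In-memory stand-in for places_county_2024.csv. "CountyFIPS" is an assumed
# column name; the prevalence values are the Harris County figures quoted above.
SAMPLE_CSV = """CountyFIPS,DIABETES_CrudePrev,OBESITY_CrudePrev,BPHIGH_CrudePrev
48201,12.1,32.8,32.4
"""


def lookup_county(csv_file, fips: str, fields: list[str]) -> dict[str, float]:
    """Return the requested measures for one county, as floats (percent)."""
    for row in csv.DictReader(csv_file):
        # Compare as strings so leading zeros in FIPS codes are preserved.
        if row["CountyFIPS"] == fips:
            return {name: float(row[name]) for name in fields}
    raise KeyError(f"county {fips} not found")


rates = lookup_county(io.StringIO(SAMPLE_CSV), "48201", ["DIABETES_CrudePrev", "OBESITY_CrudePrev"])
```

Against the real data, `io.StringIO(SAMPLE_CSV)` would be replaced by an open file handle, for example `open(path, newline="")`.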

Step 2: Apply rates to patient generation

{
  "cohort_parameters": {
    "geography": { "county_fips": "48201", "name": "Harris County, TX" },
    "condition_weights": {
      "diabetes": 0.121,
      "obesity": 0.328,
      "hypertension": 0.324
    },
    "demographic_distribution": {
      "minority_percentage": 0.721,
      "poverty_percentage": 0.223
    },
    "sdoh_context": {
      "svi_overall": 0.68,
      "vulnerability_category": "high"
    },
    "data_provenance": {
      "source": "CDC_PLACES_2024",
      "data_year": 2022
    }
  }
}

Step 3: Generate patients matching real rates

  • Assign diabetes to ~12.1% of patients (not generic 10%)
  • Weight demographics toward 72% minority representation
  • Apply SDOH factors consistent with SVI 0.68

PopulationSim Data Files

| Dataset | File | Key Measures | Use Case |
|---|---|---|---|
| CDC PLACES County | populationsim/data/county/places_county_2024.csv | 40 health measures | Condition prevalence by county |
| CDC PLACES Tract | populationsim/data/tract/places_tract_2024.csv | 40 health measures | Neighborhood-level health |
| SVI County | populationsim/data/county/svi_county_2022.csv | 16 vulnerability vars | County SDOH context |
| SVI Tract | populationsim/data/tract/svi_tract_2022.csv | 16 vulnerability vars | Tract SDOH context |
| ADI Block Group | populationsim/data/block_group/adi_blockgroup_2023.csv | National/state ADI | Deprivation scoring |

Integration Skills

| PopulationSim Skill | PatientSim Application | Data Source |
|---|---|---|
| data-lookup.md | Exact prevalence rates | CDC PLACES 2024 |
| county-profile.md | County demographics, health patterns | PLACES + SVI |
| census-tract-analysis.md | Neighborhood health context | Tract PLACES + SVI |
| svi-analysis.md | Social vulnerability factors | CDC SVI 2022 |
| adi-analysis.md | Area deprivation | ADI 2023 |
| cohort-specification.md | Data-driven cohort definition | All sources |

Example: Data-Grounded Patient Generation

Request: "Generate 50 diabetic patients for Harris County, TX"

Process:

  1. Data Lookup: Read Harris County from places_county_2024.csv

    • Diabetes: 12.1% (used to weight comorbidities)
    • Obesity: 32.8%, Hypertension: 32.4%, CKD: 3.2%
  2. SVI Context: Read from svi_county_2022.csv

    • Overall SVI: 0.68 (high vulnerability)
    • Poverty: 22.3%, Uninsured: 18.1%
  3. Patient Generation: Apply real rates

    • ~85% of diabetics have obesity (county rate 32.8% baseline)
    • ~75% have hypertension (county rate 32.4% baseline)
    • SDOH factors reflect high vulnerability (transportation barriers, food insecurity)
  4. Output with Provenance:

{
  "patient": { "mrn": "MRN00000001", "...": "..." },
  "generation_context": {
    "geography": "Harris County, TX (48201)",
    "data_sources": ["CDC_PLACES_2024", "CDC_SVI_2022"],
    "condition_rates_applied": {
      "diabetes": { "rate": 0.121, "source": "places_county_2024.csv" }
    }
  }
}

Key Principle: When geography is specified, always ground generation in real PopulationSim data. Never use generic national averages when local data is available.

Cross-Product: NetworkSim (Provider Networks)

NetworkSim provides realistic provider and facility entities for clinical encounters:

| PatientSim Need | NetworkSim Skill | Generated Entity |
|---|---|---|
| Attending physician | provider-for-encounter.md | Provider with NPI, credentials |
| Hospital/facility | synthetic-facility.md | Facility with CCN |
| Specialty referral | synthetic-provider.md | Specialist with taxonomy |

Integration Pattern: Generate encounters in PatientSim first, then use NetworkSim to add realistic provider entities with proper NPIs, credentials, and hospital affiliations.

Cross-Product: TrialSim (Clinical Trials)

For patients enrolled in clinical trials:

Integration Pattern: Use PatientSim for clinical care journeys. When a patient enrolls in a trial, apply TrialSim skills for trial-specific data (RECIST, SDTM format, randomization).

Output Formats

Reference Data


Generative Framework Integration

PatientSim integrates with the Generative Framework for specification-driven generation at scale.

Profile-Driven Generation

Use profile specifications to generate patient cohorts:

"Use the Medicare diabetic profile to generate 100 patients"

The Profile Executor will:

  1. Sample demographics from profile distributions
  2. Generate clinical attributes (diagnoses, medications, labs)
  3. Link to NetworkSim providers
  4. Apply condition-specific patterns

Journey-Driven Generation

Attach journey specifications to create temporal event sequences:

"Add the diabetic first-year journey to each patient"

The Journey Executor will:

  1. Generate encounters over time (PCP visits, specialist referrals)
  2. Create appropriate labs at each visit
  3. Generate medication prescriptions and changes
  4. Apply branching logic for complications

Cross-Domain Sync

When generating across products, PatientSim entities are automatically linked:

| PatientSim Entity | Links To |
|---|---|
| Patient | MemberSim Member (via SSN) |
| Encounter | MemberSim Claim |
| Prescription | RxMemberSim Fill |
| Trial Subject | TrialSim Subject |

See: ../generation/executors/cross-domain-sync.md