healthsim-populationsim

PopulationSim provides population-level intelligence using public data sources. Use this skill for ANY request involving: (1) population demographics or profiles, (2) geographic health patterns or disparities, (3) social determinants of health (SDOH), (4) SVI or ADI analysis, (5) cohort definition or specification, (6) clinical trial feasibility, site selection, or enrollment projection, (7) service area analysis, (8) health equity assessment, (9) census data or ACS variables, (10) CDC PLACES health indicators.

$ Install

git clone https://github.com/mark64oswald/healthsim-workspace /tmp/healthsim-workspace && cp -r /tmp/healthsim-workspace/skills/populationsim ~/.claude/skills/healthsim-workspace

// tip: Run this command in your terminal to install the skill



PopulationSim - Population Intelligence & Cohort Generation

Overview

PopulationSim provides population-level intelligence using public data sources (Census ACS, CDC PLACES, Social Vulnerability Index, Area Deprivation Index) to enable:

  1. Standalone Analysis: Geographic profiling, health disparities analysis, population comparisons
  2. Cross-Product Integration: Cohort specifications that drive realistic data generation in PatientSim, MemberSim, RxMemberSim, and TrialSim

Key Differentiator: Unlike other HealthSim products that generate synthetic records, PopulationSim analyzes real population characteristics and creates specifications for generation.

Quick Reference

| I want to... | Use This Skill | Key Triggers |
|---|---|---|
| Data Access (v2.0) | | |
| Look up exact data values | data-access/data-lookup.md | "what is the exact", "look up", "from PLACES" |
| Resolve FIPS codes | data-access/geography-lookup.md | "FIPS for", "which county is", "list counties in MSA" |
| Aggregate geographic data | data-access/data-aggregation.md | "aggregate tracts", "metro total", "combine counties" |
| Geographic Intelligence | | |
| Profile a county or region | geographic/county-profile.md | "county profile", "demographics for", "health indicators" |
| Analyze census tracts | geographic/census-tract-analysis.md | "tract level", "granular", "hotspots" |
| Profile a metro area | geographic/metro-area-profile.md | "metro", "MSA", "metropolitan" |
| Define custom region | geographic/custom-region-builder.md | "service area", "combine", "custom region" |
| Health Patterns | | |
| Analyze disease prevalence | health-patterns/chronic-disease-prevalence.md | "diabetes rate", "prevalence", "CDC PLACES" |
| Analyze health behaviors | health-patterns/health-behavior-patterns.md | "smoking rate", "obesity", "physical activity" |
| Assess healthcare access | health-patterns/healthcare-access-analysis.md | "uninsured", "provider ratio", "access" |
| Identify health disparities | health-patterns/health-outcome-disparities.md | "disparities", "equity", "by race" |
| SDOH Analysis | | |
| Analyze SVI | sdoh/svi-analysis.md | "SVI", "social vulnerability", "vulnerable" |
| Analyze ADI | sdoh/adi-analysis.md | "ADI", "area deprivation", "deprived" |
| Analyze economics | sdoh/economic-indicators.md | "poverty", "income", "unemployment" |
| Analyze community factors | sdoh/community-factors.md | "housing", "transportation", "food access" |
| Cohort Definition | | |
| Define a cohort | cohorts/cohort-specification.md | "define cohort", "cohort spec", "population segment" |
| Build demographics | cohorts/demographic-distribution.md | "age distribution", "demographics for cohort" |
| Build clinical profile | cohorts/clinical-prevalence-profile.md | "comorbidity rates", "clinical profile" |
| Build SDOH profile | cohorts/sdoh-profile-builder.md | "SDOH profile", "Z-code rates" |
| Trial Support | | |
| Estimate trial feasibility | trial-support/feasibility-estimation.md | "feasibility", "eligible population" |
| Select trial sites | trial-support/site-selection-support.md | "site selection", "best locations" |
| Project enrollment | trial-support/enrollment-projection.md | "enrollment timeline", "recruitment rate" |
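
The Quick Reference table above acts as a routing map from request wording to a sub-skill file. The Python sketch below illustrates that idea with a few rows copied from the table. The `ROUTES` dictionary layout, the `route` function, and the plain substring-matching rule are assumptions made for illustration; the source does not describe how PopulationSim actually dispatches requests.

```python
# Hypothetical keyword router. Trigger strings and file paths come from the
# Quick Reference table; the matching logic is an illustrative assumption.
ROUTES = {
    "data-access/geography-lookup.md": ["fips for", "which county is", "list counties in msa"],
    "geographic/county-profile.md": ["county profile", "demographics for", "health indicators"],
    "sdoh/svi-analysis.md": ["svi", "social vulnerability", "vulnerable"],
    "trial-support/feasibility-estimation.md": ["feasibility", "eligible population"],
}


def route(request: str) -> list[str]:
    """Return every skill file whose trigger phrase appears in the request."""
    text = request.lower()
    return [
        skill
        for skill, triggers in ROUTES.items()
        if any(trigger in text for trigger in triggers)
    ]


print(route("Estimate feasibility for a T2DM trial"))
print(route("What's the SVI for Harris County?"))
```

Substring matching is deliberately naive: a short trigger such as "svi" can match inside unrelated words, so a real router would likely need something more careful.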

Trigger Phrases

Data Access (v2.0)

  • "What is the exact [measure] in [geography]?"
  • "Look up [measure] from CDC PLACES"
  • "What's the FIPS code for [county]?"
  • "Which counties are in the [metro] MSA?"
  • "Aggregate tract data for [county]"

Geographic Intelligence

  • "What's the population profile for [county/region]?"
  • "Show me demographics for [geography]"
  • "Compare [region A] to [region B]"
  • "Analyze census tracts in [area] with high vulnerability"
  • "Profile the [metro area] MSA"

Health Patterns

  • "What's the diabetes prevalence in [geography]?"
  • "Show health disparities by race in [region]"
  • "Compare chronic disease rates across [geographies]"
  • "What are the smoking rates in [county]?"
  • "Which counties have the highest obesity?"

SDOH Analysis

  • "What's the SVI for [geography]?"
  • "Show me high-deprivation areas in [state]"
  • "Analyze social determinants in [region]"
  • "Which tracts have transportation barriers?"
  • "Find food deserts in [county]"

Cohort Definition

  • "Define a cohort of diabetics in underserved California areas"
  • "Create a population specification for high-risk heart failure patients"
  • "Build a cohort spec for PatientSim generation"
  • "Specify a population segment for claims testing"
  • "What are the comorbidity rates for diabetics?"

Trial Support

  • "Estimate feasibility for a T2DM trial"
  • "How many patients are eligible for [criteria]?"
  • "Rank trial sites for cardiovascular outcomes study"
  • "Best locations for diabetes trial enrollment"
  • "Project enrollment for 2,000 subjects across 40 sites"

Output Types

PopulationProfile

Geographic entity with demographics, health indicators, and SDOH indices:

{
  "geography": {
    "type": "county",
    "fips": "06073",
    "name": "San Diego County",
    "state": "CA",
    "region": "Pacific"
  },
  "demographics": {
    "total_population": 3286069,
    "median_age": 37.1,
    "age_distribution": {
      "0-17": 0.21,
      "18-64": 0.62,
      "65+": 0.17
    },
    "race_ethnicity": {
      "white_nh": 0.43,
      "hispanic": 0.34,
      "asian": 0.12,
      "black": 0.05,
      "other": 0.06
    },
    "median_household_income": 102285,
    "poverty_rate": 0.103
  },
  "health_indicators": {
    "source": "CDC_PLACES_2024",
    "diabetes_prevalence": 0.095,
    "obesity_prevalence": 0.280,
    "hypertension_prevalence": 0.285,
    "depression_prevalence": 0.195,
    "smoking_prevalence": 0.098
  },
  "sdoh_indices": {
    "svi_overall": 0.42,
    "svi_themes": {
      "socioeconomic": 0.38,
      "household_composition": 0.45,
      "minority_language": 0.52,
      "housing_transportation": 0.35
    },
    "adi_national_rank": 35
  },
  "healthcare_access": {
    "uninsured_rate": 0.071,
    "pcp_per_100k": 82.4,
    "insurance_mix": {
      "employer": 0.52,
      "medicare": 0.15,
      "medicaid": 0.18,
      "individual": 0.08,
      "uninsured": 0.07
    }
  }
}

CohortSpecification

Generation input for other HealthSim products:

{
  "cohort_id": "houston_diabetics_2024",
  "name": "Houston Metro Diabetic Adults",
  "target_size": 10000,
  "geography": {
    "type": "msa",
    "cbsa_code": "26420",
    "name": "Houston-The Woodlands-Sugar Land, TX"
  },
  "demographics": {
    "age": {
      "min": 18, "max": 85, "mean": 58.4,
      "brackets": { "18-44": 0.18, "45-64": 0.42, "65-74": 0.28, "75+": 0.12 }
    },
    "sex": { "male": 0.48, "female": 0.52 },
    "race_ethnicity": { "white_nh": 0.28, "black": 0.22, "hispanic": 0.38, "asian": 0.08 }
  },
  "clinical_profile": {
    "primary_condition": { "icd10": "E11", "name": "Type 2 Diabetes" },
    "comorbidities": {
      "I10": { "name": "Hypertension", "rate": 0.71 },
      "E78": { "name": "Hyperlipidemia", "rate": 0.68 },
      "E66": { "name": "Obesity", "rate": 0.62 }
    }
  },
  "sdoh_profile": {
    "poverty_rate": 0.18,
    "uninsured_rate": 0.16,
    "food_insecurity": 0.15,
    "svi_mean": 0.58
  },
  "z_code_rates": {
    "Z59.6": { "name": "Low income", "rate": 0.18 },
    "Z59.41": { "name": "Food insecurity", "rate": 0.15 }
  },
  "insurance_mix": {
    "medicare": 0.38, "medicaid": 0.22, "commercial": 0.32, "uninsured": 0.08
  }
}
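
Because a CohortSpecification is meant to drive generation, a consumer may want to confirm that each categorical distribution sums to 1 before sampling from it. The sketch below shows one hypothetical way to do that. The `check_distribution` helper and its 0.005 tolerance are assumptions; the shares are copied from the example above and regrouped into a flat dictionary for brevity.

```python
import math

# Distributions copied from the CohortSpecification example above.
spec = {
    "age_brackets": {"18-44": 0.18, "45-64": 0.42, "65-74": 0.28, "75+": 0.12},
    "sex": {"male": 0.48, "female": 0.52},
    "race_ethnicity": {"white_nh": 0.28, "black": 0.22, "hispanic": 0.38, "asian": 0.08},
    "insurance_mix": {"medicare": 0.38, "medicaid": 0.22, "commercial": 0.32, "uninsured": 0.08},
}


def check_distribution(shares: dict[str, float], tol: float = 0.005) -> float:
    """Return the unallocated share (1 - sum), or 0.0 if within tolerance."""
    gap = 1.0 - math.fsum(shares.values())
    return 0.0 if abs(gap) <= tol else round(gap, 4)


for name, shares in spec.items():
    gap = check_distribution(shares)
    status = "ok" if gap == 0.0 else f"unallocated share {gap}"
    print(f"{name}: {status}")
```

With the example values, `race_ethnicity` leaves 0.04 unallocated, which a consumer might treat as an implicit "other" category; the remaining three distributions pass the check.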

Cross-Product Integration

Integration Flow

                    ┌─────────────────────┐
                    │   PopulationSim     │
                    │  CohortSpecification│
                    └──────────┬──────────┘
                               │
           ┌───────────────────┼───────────────────┐
           │                   │                   │
           ▼                   ▼                   ▼
    ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
    │ PatientSim  │     │ MemberSim   │     │  TrialSim   │
    │ - patients  │     │ - members   │     │ - subjects  │
    │ - diagnoses │     │ - claims    │     │ - diversity │
    │ - SDOH codes│     │ - plans     │     │ - sites     │
    └──────┬──────┘     └──────┬──────┘     └─────────────┘
           │                   │
           └─────────┬─────────┘
                     ▼
              ┌─────────────┐
              │ RxMemberSim │
              │ - Rx claims │
              │ - adherence │
              └─────────────┘

Integration Patterns

| PopulationSim Output | Receiving Product | Result |
|---|---|---|
| CohortSpecification | PatientSim | Patients matching demographic/clinical profile |
| CohortSpecification | MemberSim | Members with realistic plan/utilization mix |
| CohortSpecification | TrialSim | Diverse trial subjects meeting FDA guidance |
| PopulationProfile | NetworkSim | Service area provider network design |

Data Sources (Embedded v2.0)

PopulationSim includes an embedded data package (148 MB) with 100% US coverage:

| Source | File | Records | Data Year |
|---|---|---|---|
| CDC PLACES (County) | data/county/places_county_2024.csv | 3,143 | 2022 BRFSS |
| CDC PLACES (Tract) | data/tract/places_tract_2024.csv | 83,522 | 2022 BRFSS |
| SVI (County) | data/county/svi_county_2022.csv | 3,144 | 2018-2022 ACS |
| SVI (Tract) | data/tract/svi_tract_2022.csv | 84,120 | 2018-2022 ACS |
| ADI (Block Group) | data/block_group/adi_blockgroup_2023.csv | 242,336 | 2019-2023 ACS |
| Geography Crosswalks | data/crosswalks/*.csv | Various | 2023 Census |
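
The embedded files span three geographic levels (county, tract, block group). Standard Census GEOIDs nest inside one another: 2 digits of state, then 3 of county, 6 of tract, and 1 of block group. The sketch below shows how a key from a finer-grained file could be rolled up to its parent county. `split_geoid` is a hypothetical helper, and it assumes identifiers are held as zero-padded strings, which the source does not confirm for these particular files.

```python
def split_geoid(geoid: str) -> dict[str, str]:
    """Split a Census GEOID into its nested components.

    Accepts 5-digit county, 11-digit tract, or 12-digit block group codes.
    """
    if not geoid.isdigit() or len(geoid) not in (5, 11, 12):
        raise ValueError(f"unexpected GEOID: {geoid!r}")
    parts = {"state": geoid[:2], "county": geoid[:5]}
    if len(geoid) >= 11:
        parts["tract"] = geoid[:11]
    if len(geoid) == 12:
        parts["block_group"] = geoid
    return parts


# San Diego County is 06073; the tract and block group digits are made up.
print(split_geoid("06073"))
print(split_geoid("060730001001"))
```

Keeping the codes as strings matters: reading `06073` as an integer would drop the leading zero and break joins against the crosswalk files.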

DuckDB Reference Tables

For SQL-based analysis, reference data is also available in the DuckDB database:

| Table | Source | Purpose |
|---|---|---|
| population.places_tract | CDC PLACES | Tract-level health indicators |
| population.places_county | CDC PLACES | County-level health indicators |
| population.svi_tract | CDC SVI | Tract-level vulnerability |
| population.svi_county | CDC SVI | County-level vulnerability |
| population.adi_blockgroup | ADI | Block group deprivation |

See Data Architecture for details.
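
As a rough illustration of SQL-based access, the sketch below builds a parameterized lookup against one of the reference tables. Only the table names come from this document. The column names (`county_fips`, `diabetes_prevalence`) and the `county_lookup_sql` helper are hypothetical placeholders, since the actual schema is not shown here.

```python
# Table names come from the list above; the column names used below are
# hypothetical placeholders, not the real schema.
ALLOWED_TABLES = {
    "population.places_tract",
    "population.places_county",
    "population.svi_tract",
    "population.svi_county",
    "population.adi_blockgroup",
}


def county_lookup_sql(table: str, columns: list[str], key_column: str = "county_fips") -> str:
    """Build a parameterized SELECT for one geography key."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unknown reference table: {table}")
    for name in [*columns, key_column]:
        if not name.isidentifier():
            raise ValueError(f"unsafe column name: {name}")
    return f"SELECT {', '.join(columns)} FROM {table} WHERE {key_column} = ?"


sql = county_lookup_sql("population.places_county", ["diabetes_prevalence"])
print(sql)
# With the duckdb package and an open connection `con`, this could be run as:
#   con.execute(sql, ["06073"]).fetchall()
```

Passing the FIPS code as a bound parameter rather than formatting it into the string keeps the query safe to reuse across geographies.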

Directory Structure

skills/populationsim/
├── SKILL.md                           # This file - master router
├── README.md                          # Product overview
├── population-intelligence-domain.md  # Core domain knowledge
│
├── data/                              # Embedded Data Package (v2.0)
│   ├── README.md                      # Data dictionary
│   ├── county/                        # County-level files
│   ├── tract/                         # Tract-level files
│   ├── block_group/                   # Block group files (ADI)
│   └── crosswalks/                    # FIPS and CBSA mappings
│
├── data-access/                       # Data Access Skills (v2.0)
│   ├── README.md                      # Category overview
│   ├── data-lookup.md                 # Direct value lookups
│   ├── geography-lookup.md            # FIPS code resolution
│   └── data-aggregation.md            # Geographic aggregation
│
├── geographic/                        # Geographic Intelligence
│   ├── README.md                      # Category overview
│   ├── county-profile.md              # County-level profiles
│   ├── census-tract-analysis.md       # Tract-level analysis
│   ├── metro-area-profile.md          # MSA/CBSA profiles
│   └── custom-region-builder.md       # Custom region aggregation
│
├── health-patterns/                   # Health Analysis
│   ├── README.md                      # Category overview
│   ├── chronic-disease-prevalence.md  # Disease burden analysis
│   ├── health-behavior-patterns.md    # Risk factor analysis
│   ├── healthcare-access-analysis.md  # Coverage and access
│   └── health-outcome-disparities.md  # Disparity analysis
│
├── sdoh/                              # Social Determinants
│   ├── README.md                      # SDOH framework overview
│   ├── svi-analysis.md                # Social Vulnerability Index
│   ├── adi-analysis.md                # Area Deprivation Index
│   ├── economic-indicators.md         # Income, poverty, employment
│   └── community-factors.md           # Housing, transportation, food
│
├── cohorts/                           # Cohort Definition
│   ├── README.md                      # CohortSpecification schema
│   ├── cohort-specification.md        # Complete cohort definition
│   ├── demographic-distribution.md    # Age, sex, race distributions
│   ├── clinical-prevalence-profile.md # Comorbidity patterns
│   └── sdoh-profile-builder.md        # SDOH and Z-code rates
│
└── trial-support/                     # Clinical Trial Support
    ├── README.md                      # Trial integration overview
    ├── feasibility-estimation.md      # Eligible population modeling
    ├── site-selection-support.md      # Site network optimization
    └── enrollment-projection.md       # Timeline projections

Quick Examples

Example 1: County Health Profile

Request: "Profile San Diego County health indicators"

Response:

San Diego County, CA (FIPS 06073) Population Profile:

Demographics (ACS 2022 5-year):

  • Total Population: 3,286,069
  • Median Age: 37.1 years
  • Race/Ethnicity: 43% White NH, 34% Hispanic, 12% Asian, 5% Black

Health Indicators (CDC PLACES 2024):

  • Diabetes: 9.5% (vs 10.1% national)
  • Obesity: 28.0% (vs 32.1% national)
  • Hypertension: 28.5% (vs 32.4% national)

SDOH Indices:

  • SVI Overall: 0.42 (moderate vulnerability)
  • ADI National Rank: 35th percentile
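
The "vs national" comparisons in this response are percentage-point gaps between the county rate and the national rate. A small sketch of that arithmetic follows, using the three indicator pairs from the example; `gap_vs_national` is an illustrative name rather than part of PopulationSim.

```python
# Local and national rates (percent) copied from the example response above.
indicators = {
    "diabetes": (9.5, 10.1),
    "obesity": (28.0, 32.1),
    "hypertension": (28.5, 32.4),
}


def gap_vs_national(local: float, national: float) -> float:
    """Percentage-point difference; negative means below the national rate."""
    return round(local - national, 1)


for name, (local, national) in indicators.items():
    print(f"{name}: {gap_vs_national(local, national):+.1f} pp")
```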

Example 2: Cohort for PatientSim

Request: "Define a cohort of high-risk diabetics in underserved California"

Response: Creates CohortSpecification with:

  • Geography: CA census tracts with SVI ≥ 0.70
  • Demographics: Hispanic 58%, mean age 58.4
  • Comorbidities: HTN 71%, obesity 62%
  • SDOH: Rx cost barrier 31%, food insecurity 22%

Example 3: Trial Feasibility

Request: "Feasibility for T2DM trial: age 40-70, HbA1c 8-11%"

Response:

| Stage | Population | Conversion |
|---|---|---|
| T2DM Prevalent | 34.2M | - |
| Age 40-70 | 24.8M | 72.5% |
| HbA1c 8-11% | 7.4M | 29.8% |
| After exclusions | 4.2M | - |

Top Metros: Houston (128K), Miami (115K), Los Angeles (108K)
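
The Conversion column in the feasibility table is each stage's population divided by the previous stage's. The sketch below reproduces that arithmetic from the stage figures; the `funnel` list and `conversions` helper are illustrative names, not part of the skill.

```python
# Stage populations in millions, copied from the feasibility table above.
funnel = [
    ("T2DM Prevalent", 34.2),
    ("Age 40-70", 24.8),
    ("HbA1c 8-11%", 7.4),
    ("After exclusions", 4.2),
]


def conversions(stages: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Return (stage, percent of previous stage) for every stage after the first."""
    return [
        (name, round(100 * size / prev_size, 1))
        for (_, prev_size), (name, size) in zip(stages, stages[1:])
    ]


for name, pct in conversions(funnel):
    print(f"{name}: {pct}% of previous stage")
```

The table leaves the exclusions row without a conversion; the figures given imply that roughly 57% of the HbA1c-eligible population remains after exclusions.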

Related Products

Domain Knowledge

For detailed concepts and methodology, see:


Generative Framework Integration

PopulationSim provides real population data that drives realistic generation in the Generative Framework.

PopulationSim's Unique Role

Unlike other products that generate synthetic data, PopulationSim:

  1. Queries real reference data (Census, CDC PLACES, SVI/ADI)
  2. Outputs specifications that define cohort characteristics
  3. Feeds the Profile Builder with realistic distributions

Integration Flow

User Request
     │
     ▼
┌───────────────┐   ┌─────────────────┐   ┌──────────────────┐
│ PopulationSim │──►│ Profile Builder │──►│ Profile Executor │
│ "TX Medicare" │   │ Add conditions  │   │ Generate 100     │
│ demographics  │   │ Add coverage    │   │ patients         │
└───────────────┘   └─────────────────┘   └──────────────────┘

Example: Realistic Geographic Cohort

"Generate 200 Medicare diabetics in Harris County, TX with realistic demographics"

PopulationSim provides:

  • Age distribution from Census (mean 72.3, std 8.1)
  • Gender split (47% M, 53% F)
  • Race/ethnicity from ACS (38% Hispanic, 32% White, 22% Black, 8% Asian)
  • SDOH indicators from SVI (moderate vulnerability)
  • Diabetes prevalence from CDC PLACES (16.2%)

This becomes a ProfileSpecification that the Profile Executor uses.
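
To make the hand-off concrete, the sketch below draws 200 synthetic demographic records from the distributions listed above. It rests on several assumptions: the real ProfileSpecification schema and Profile Executor are not shown in this document, so the `profile` dictionary layout, the `weighted_pick` and `sample_person` helpers, and the choice of a normal age distribution clipped to 18-100 are all hypothetical.

```python
import random

# Values copied from the Harris County example; the dict layout is hypothetical.
profile = {
    "age": {"mean": 72.3, "std": 8.1},
    "sex": {"M": 0.47, "F": 0.53},
    "race_ethnicity": {"hispanic": 0.38, "white": 0.32, "black": 0.22, "asian": 0.08},
}


def weighted_pick(rng: random.Random, shares: dict[str, float]) -> str:
    """Pick one category with probability proportional to its share."""
    return rng.choices(list(shares), weights=list(shares.values()), k=1)[0]


def sample_person(rng: random.Random) -> dict:
    """Draw one synthetic person; age is clipped to a plausible adult range."""
    age = round(rng.gauss(profile["age"]["mean"], profile["age"]["std"]))
    return {
        "age": min(max(age, 18), 100),
        "sex": weighted_pick(rng, profile["sex"]),
        "race_ethnicity": weighted_pick(rng, profile["race_ethnicity"]),
    }


rng = random.Random(42)  # fixed seed so repeated runs give the same cohort
cohort = [sample_person(rng) for _ in range(200)]
print(len(cohort), cohort[0])
```

Sampling each attribute independently ignores correlations (for example, between age and race/ethnicity), which a fuller generator would presumably need to model.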

See: integration/ for detailed integration patterns