Marketplace

data-modeling

Data modeling with Entity-Relationship Diagrams (ERDs), data dictionaries, and conceptual/logical/physical models. Documents data structures, relationships, and attributes.

allowed_tools: Read, Glob, Grep, Task, Skill

$ インストール

git clone https://github.com/melodic-software/claude-code-plugins /tmp/claude-code-plugins && cp -r /tmp/claude-code-plugins/plugins/business-analysis/skills/data-modeling ~/.claude/skills/claude-code-plugins

// tip: Run this command in your terminal to install the skill


name: data-modeling description: Data modeling with Entity-Relationship Diagrams (ERDs), data dictionaries, and conceptual/logical/physical models. Documents data structures, relationships, and attributes. allowed-tools: Read, Glob, Grep, Task, Skill

Data Modeling

When to Use This Skill

Use this skill when:

  • Data Modeling tasks - Working on data modeling with entity-relationship diagrams (erds), data dictionaries, and conceptual/logical/physical models. documents data structures, relationships, and attributes
  • Planning or design - Need guidance on Data Modeling approaches
  • Best practices - Want to follow established patterns and standards

Overview

Create and document data structures using Entity-Relationship Diagrams (ERDs), data dictionaries, and structured data models. Supports conceptual, logical, and physical modeling levels for database design and data architecture.

What is Data Modeling?

Data modeling creates visual and structured representations of data elements and their relationships. It documents:

  • Entities: Things about which data is stored
  • Attributes: Properties of entities
  • Relationships: How entities connect
  • Constraints: Rules governing data

Modeling Levels

LevelPurposeAudienceDetail
ConceptualBusiness conceptsBusiness usersEntities, high-level relationships
LogicalData structureAnalysts, designersEntities, attributes, all relationships
PhysicalImplementationDevelopers, DBAsTables, columns, types, indexes

Conceptual Model

High-level view of business concepts:

  • Major entities only
  • Key relationships
  • No attributes (or minimal)
  • No technical details

Logical Model

Technology-independent data structure:

  • All entities and attributes
  • Primary and foreign keys
  • All relationships with cardinality
  • Normalization applied
  • No physical implementation details

Physical Model

Database-specific implementation:

  • Table names (physical naming)
  • Column names and data types
  • Indexes and constraints
  • Views and stored procedures
  • Database-specific features

ERD Notation

Entity (Rectangle)

An entity represents a thing about which data is stored.

┌─────────────────┐
│    CUSTOMER     │
├─────────────────┤
│ customer_id PK  │
│ name            │
│ email           │
│ created_at      │
└─────────────────┘

Entity Types:

TypeDescriptionExample
StrongIndependent existenceCustomer, Product
WeakDepends on another entityOrder Line (depends on Order)
AssociativeResolves M:N relationshipsEnrollment (Student-Course)

Attributes

TypeSymbolDescription
Primary Key (PK)Underlined/PKUnique identifier
Foreign Key (FK)FKReference to another entity
Required* or NOT NULLMust have value
Optional○ or NULLMay be empty
Derived/Calculated from other attributes
Composite{attrs}Made of sub-attributes
Multi-valued[attr]Can have multiple values

Relationships (Lines)

Notation Styles:

StyleUsed In
ChenAcademic, conceptual
Crow's FootIndustry standard
UMLSoftware design
IDEF1XGovernment, structured

Crow's Foot Notation:

SymbolMeaning
──One (mandatory)
──○Zero or one (optional)
──<Many
──○<Zero or many

Cardinality

NotationMeaningExample
1:1One to oneEmployee → Workstation
1:MOne to manyCustomer → Orders
M:NMany to manyStudents ↔ Courses

Reading Cardinality:

"One [Entity A] has [min]..[max] [Entity B]"

Example: "One Customer has 0..many Orders"

Workflow

Phase 1: Identify Entities

Step 1: Extract Nouns from Requirements

From business requirements, identify:

  • Things the business tracks
  • Subjects of business rules
  • Sources and targets of data

Step 2: Filter Candidates

KeepExclude
Independent conceptsAttributes (properties of entities)
Things with multiple instancesSynonyms (same concept, different name)
Things requiring data storageActions (verbs, not nouns)

Step 3: Document Entities

## Entities

| Entity | Description | Example |
|--------|-------------|---------|
| Customer | Person or organization that purchases | John Smith, Acme Corp |
| Order | Purchase transaction | Order #12345 |
| Product | Item available for sale | Widget, Gadget |

Phase 2: Define Attributes

Step 1: List Attributes for Each Entity

For each entity, identify:

  • What do we need to know about this entity?
  • What uniquely identifies it?
  • What data does the business reference?

Step 2: Classify Attributes

AttributeTypeRequiredNotes
customer_idPKYesSurrogate key
emailUniqueYesBusiness key
nameStringYes
phoneStringNoOptional

Step 3: Identify Keys

  • Primary Key (PK): Unique identifier
  • Natural Key: Business-meaningful identifier
  • Surrogate Key: System-generated identifier
  • Composite Key: Multiple attributes combined

Phase 3: Define Relationships

Step 1: Identify Connections

For each pair of entities:

  • Is there a business connection?
  • What is the nature of the relationship?
  • What is the cardinality?

Step 2: Document Relationships

## Relationships

| Relationship | From | To | Cardinality | Description |
|--------------|------|-----|-------------|-------------|
| places | Customer | Order | 1:M | Customer places orders |
| contains | Order | Product | M:N | Order contains products |

Step 3: Resolve Many-to-Many

M:N relationships require associative entities:

Student ──M:N── Course

Becomes:

Student ──1:M── Enrollment ──M:1── Course

Phase 4: Normalize (Logical Model)

Normal Forms:

FormRuleViolation Example
1NFAtomic values, no repeating groupsPhone1, Phone2, Phone3
2NFNo partial dependenciesNon-key depends on part of composite key
3NFNo transitive dependenciesNon-key depends on non-key
BCNFEvery determinant is a candidate keyOverlap in candidate keys

When to Denormalize:

  • Read performance critical
  • Reporting/analytics use cases
  • Data warehouse design
  • Justified with clear trade-off analysis

Phase 5: Create Physical Model

Step 1: Map to Physical Types

Logical TypePhysical (PostgreSQL)Physical (SQL Server)
String(50)VARCHAR(50)NVARCHAR(50)
IntegerINTEGERINT
Decimal(10,2)NUMERIC(10,2)DECIMAL(10,2)
DateDATEDATE
TimestampTIMESTAMPDATETIME2
BooleanBOOLEANBIT

Step 2: Define Constraints

  • Primary key constraints
  • Foreign key constraints
  • Unique constraints
  • Check constraints
  • Default values

Step 3: Plan Indexes

  • Primary key (automatic)
  • Foreign keys (for joins)
  • Frequently queried columns
  • Covering indexes for performance

Output Formats

Mermaid ERD

erDiagram
    CUSTOMER ||--o{ ORDER : places
    ORDER ||--|{ ORDER_LINE : contains
    PRODUCT ||--o{ ORDER_LINE : includes

    CUSTOMER {
        int customer_id PK
        string name
        string email UK
        date created_at
    }

    ORDER {
        int order_id PK
        int customer_id FK
        date order_date
        decimal total
        string status
    }

    ORDER_LINE {
        int order_id PK,FK
        int product_id PK,FK
        int quantity
        decimal unit_price
    }

    PRODUCT {
        int product_id PK
        string name
        string sku UK
        decimal price
        int stock_qty
    }

Data Dictionary

## Data Dictionary

### CUSTOMER

| Column | Type | Null | Key | Default | Description |
|--------|------|------|-----|---------|-------------|
| customer_id | INT | No | PK | AUTO | Unique identifier |
| name | VARCHAR(100) | No | | | Customer full name |
| email | VARCHAR(255) | No | UK | | Contact email |
| phone | VARCHAR(20) | Yes | | NULL | Contact phone |
| created_at | TIMESTAMP | No | | NOW() | Record creation |

**Indexes:**
- `pk_customer` (customer_id) - Primary
- `uk_customer_email` (email) - Unique
- `ix_customer_name` (name) - Search

**Constraints:**
- Email format validation (CHECK)
- Name length minimum 2 characters

Structured Data (YAML)

data_model:
  name: "E-Commerce"
  version: "1.0"
  date: "2025-01-15"
  level: "logical"  # conceptual, logical, physical
  analyst: "data-modeler"

  entities:
    - name: "Customer"
      type: "strong"
      description: "Person or organization that makes purchases"
      attributes:
        - name: "customer_id"
          type: "integer"
          key: "primary"
          required: true
          generated: true

        - name: "email"
          type: "string"
          length: 255
          key: "unique"
          required: true

        - name: "name"
          type: "string"
          length: 100
          required: true

    - name: "Order"
      type: "strong"
      description: "Purchase transaction"
      attributes:
        - name: "order_id"
          type: "integer"
          key: "primary"
          required: true

        - name: "customer_id"
          type: "integer"
          key: "foreign"
          references: "Customer.customer_id"
          required: true

  relationships:
    - name: "places"
      from: "Customer"
      to: "Order"
      cardinality: "1:M"
      from_participation: "optional"  # 0..1
      to_participation: "mandatory"   # 1..M
      description: "Customer places orders"

  constraints:
    - entity: "Customer"
      type: "check"
      expression: "LENGTH(name) >= 2"
      description: "Name minimum length"

  indexes:
    - entity: "Order"
      name: "ix_order_date"
      columns: ["order_date"]
      purpose: "Date range queries"

Narrative Summary

## Data Model: E-Commerce

**Version:** 1.0
**Date:** [ISO Date]
**Level:** Logical

### Entity Summary

| Entity | Description | Key Relationships |
|--------|-------------|-------------------|
| Customer | Purchasers | Places Orders |
| Order | Transactions | Belongs to Customer, Contains Products |
| Product | Items for sale | Included in Orders |
| Order Line | Order details | Links Order to Product |

### Key Relationships

1. **Customer → Order (1:M)**
   - One customer can place many orders
   - Each order belongs to exactly one customer

2. **Order ↔ Product (M:N via Order Line)**
   - An order can contain many products
   - A product can appear in many orders

### Data Integrity Rules

1. Orders cannot exist without a customer
2. Order lines must reference valid order and product
3. Stock quantity cannot be negative
4. Email must be unique per customer

### Notes

- Consider partitioning Orders by date for large volumes
- Product price stored in Order Line for historical accuracy

Common Patterns

Inheritance (Subtype/Supertype)

erDiagram
    PERSON ||--o| EMPLOYEE : "is a"
    PERSON ||--o| CUSTOMER : "is a"

    PERSON {
        int person_id PK
        string name
        string email
    }

    EMPLOYEE {
        int person_id PK,FK
        date hire_date
        decimal salary
    }

    CUSTOMER {
        int person_id PK,FK
        string company
        decimal credit_limit
    }

Self-Referencing

erDiagram
    EMPLOYEE ||--o{ EMPLOYEE : "manages"

    EMPLOYEE {
        int employee_id PK
        string name
        int manager_id FK
    }

Audit Trail

erDiagram
    ENTITY ||--o{ ENTITY_HISTORY : "has history"

    ENTITY {
        int id PK
        string data
        timestamp updated_at
    }

    ENTITY_HISTORY {
        int history_id PK
        int entity_id FK
        string data
        timestamp valid_from
        timestamp valid_to
        string changed_by
    }

Integration

Upstream

  • Requirements - Data requirements source
  • domain-storytelling - Domain concepts
  • process-modeling - Data in processes

Downstream

  • Database design - Physical implementation
  • API design - Data contracts
  • Integration - Data exchange

Related Skills

  • process-modeling - Process context for data
  • journey-mapping - Customer data touchpoints
  • decision-analysis - Data-driven decisions
  • capability-mapping - Data supporting capabilities

Version History

  • v1.0.0 (2025-12-26): Initial release