excel-formula-analyzer
Analyze Excel spreadsheet formulas to build dependency DAGs (Directed Acyclic Graphs) and understand calculation chains. This skill should be used when the user wants to reverse-engineer Excel formula dependencies, trace how values are calculated from inputs to outputs, validate formula logic, or create reusable calculators from spreadsheet logic.
$ Installer
git clone https://github.com/majiayu000/claude-skill-registry /tmp/claude-skill-registry && cp -r /tmp/claude-skill-registry/skills/development/excel-formula-analyzer ~/.claude/skills/claude-skill-registry// tip: Run this command in your terminal to install the skill
name: excel-formula-analyzer description: Analyze Excel spreadsheet formulas to build dependency DAGs (Directed Acyclic Graphs) and understand calculation chains. This skill should be used when the user wants to reverse-engineer Excel formula dependencies, trace how values are calculated from inputs to outputs, validate formula logic, or create reusable calculators from spreadsheet logic.
Excel Formula Analyzer
Quick Start
To trace formula dependencies in an Excel workbook:
python .claude/skills/excel-formula-analyzer/scripts/trace-formula-dag.py "Workbook.xlsx" "Sheet Name" \
--table MyTable \
--columns 16-36,57,58,59
Sample output:
[Col 36] End of Harvest
= =IF([@[Actual End of Harvest]]="",[@[Expected End of Harvest]],[@[Actual ...
Depends on: ['Actual End of Harvest', 'Expected End of Harvest']
|- [31] Expected End of Harvest
= =[@[Expected Beginning of Harvest]]+[@[Harvest Window]]+[@[Additional ...
|- [28] Expected Beginning of Harvest (calculated)
|- [59] Harvest Window (INPUT)
|- [33] Additional Days of Harvest (INPUT)
Overview
This skill enables analysis of Excel spreadsheet formulas to understand how calculations flow from inputs to outputs. It builds dependency graphs (DAGs) from formula references and can help create equivalent code implementations.
When to Use This Skill
- Tracing how a specific cell's value is calculated
- Building a complete dependency graph of formula relationships
- Reverse-engineering business logic embedded in spreadsheets
- Creating TypeScript/Python implementations that replicate Excel calculations
- Validating that code implementations match Excel behavior
Core Workflow
Step 1: Load the Workbook
Load with data_only=False to get formulas instead of computed values:
import openpyxl
wb = openpyxl.load_workbook("spreadsheet.xlsx", data_only=False)
ws = wb["SheetName"]
Step 2: Parse Structured Table References
Excel tables use structured references like TableName[[#This Row],[Column]]. See references/structured-refs.md for complete syntax details.
Key patterns:
Table[[#This Row],[Column]]- Same-row referenceTable[Column]- Full column reference (used in XLOOKUP)
Step 3: Build the DAG
Run the trace script to analyze dependencies:
python .claude/skills/excel-formula-analyzer/scripts/trace-formula-dag.py "file.xlsx" "Sheet" --columns 16-36
Step 4: Implement in Code
Translate formulas to TypeScript/Python using patterns in references/formula-patterns.md.
What Didn't Work (Lessons Learned)
The formulas Library
The Python formulas library can parse Excel formulas into ASTs, but fails on structured table references:
Error: Not a valid formula: =MEDIAN(Crops[Revenue - Copy])
The library expects A1-style references and cannot handle TableName[Column] syntax.
xlwings COM Interface
xlwings can use Excel's native DirectPrecedents API, but:
- Requires Excel to be installed (Windows/Mac only)
- Can timeout on large workbooks
- COM interface can be unreliable
Converting Table References to A1 Notation
While Excel can convert structured refs to A1 notation (Table menu > Convert to Range), this destroys the semantic meaning and makes the formulas harder to understand.
Key Nuances
Excel Date Handling
Excel stores dates as numbers (days since 1900-01-01, with a leap year bug):
from datetime import datetime, timedelta
def excel_to_date(excel_num):
if excel_num is None or excel_num < 1:
return None
return datetime(1899, 12, 30) + timedelta(days=excel_num)
Formula vs Value Reading
data_only=True: Returns calculated values (requires the file was saved with calculations)data_only=False: Returns formulas as strings
To get both, load the workbook twice.
COALESCE Patterns
Excel implements COALESCE with IF(ISBLANK()). In code, use nullish coalescing:
const value = actualValue ?? plannedValue;
Resources
scripts/
- trace-formula-dag.py - Build formula DAGs with CLI arguments for table name and columns
- validate-calculator.py - Validate code implementations against Excel values
references/
- structured-refs.md - Complete guide to Excel structured table reference syntax, parsing patterns, and gotchas
- formula-patterns.md - Common Excel formulas and their TypeScript/Python equivalents
Repository
