extracting-session-data
Locates, lists, filters, and extracts structured data from Claude Code native session logs. Supports both single and multiple session analysis.
Install
git clone https://github.com/bitwarden/ai-plugins /tmp/ai-plugins && cp -r /tmp/ai-plugins/plugins/claude-retrospective/skills/extracting-session-data ~/.claude/skills/ai-plugins/
Tip: Run this command in your terminal to install the skill.
name: extracting-session-data
description: Locates, lists, filters, and extracts structured data from Claude Code native session logs. Supports both single and multiple session analysis.
Extracting Session Data Skill
Core Responsibility
Provide raw access to Claude Code session logs stored in ~/.claude/projects/{project-dir}/{session-id}.jsonl.
Key Principle: This skill extracts data only - return raw data to calling skills for analysis. Do not analyze or interpret within this skill.
Available Scripts
All scripts are located in the scripts/ subdirectory relative to this skill.
1. locate-logs.sh
Find log directory or specific session file path.
# Get logs directory for current working directory
scripts/locate-logs.sh
# Get logs directory for specific project
scripts/locate-logs.sh /path/to/project
# Get specific session log file path
scripts/locate-logs.sh /path/to/project abc123-session-id
Use when: Building dynamic paths, verifying logs exist before processing.
2. list-sessions.sh
Enumerate all sessions with metadata (ID, size, lines, date, branch).
# List all sessions (table format)
scripts/list-sessions.sh
# JSON output
scripts/list-sessions.sh --format json
# Sort by size or lines
scripts/list-sessions.sh --sort size
scripts/list-sessions.sh --sort lines
# Specific project
scripts/list-sessions.sh /path/to/project
Output formats: table, json, csv
Sort options: date, size, lines
Use when: Starting retrospective, showing available sessions to user, checking for recent sessions.
3. extract-data.sh
Parse JSONL logs and extract specific data types.
Available extraction types:
- metadata: Session info (ID, timestamps, branch, working dir)
- user-prompts: All user messages
- tool-usage: Tool call statistics
- errors: Failed tool calls with timestamps
- thinking: Thinking blocks (if extended thinking enabled)
- text-responses: Assistant text responses only
- statistics: Session metrics (message counts, tool calls, errors)
- all: Combined extraction
# Extract from specific session
scripts/extract-data.sh --type statistics --session SESSION_ID
scripts/extract-data.sh --type errors --session SESSION_ID
scripts/extract-data.sh --type tool-usage --session SESSION_ID
# Extract from all sessions (omit --session)
scripts/extract-data.sh --type statistics
# Limit output
scripts/extract-data.sh --type user-prompts --limit 10
# Different project
scripts/extract-data.sh --type metadata --project /path/to/project
Use when: Need specific data without loading entire log, generating metrics, identifying errors.
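When one session needs several extraction types, the calls can be wrapped in a small loop. The following is a minimal sketch rather than part of the skill: `extract_types` and the `EXTRACT_SCRIPT` variable are hypothetical names introduced here, and the sketch assumes `extract-data.sh` exits non-zero on failure as described under Error Handling.

```shell
#!/usr/bin/env bash
# EXTRACT_SCRIPT is a hypothetical override for this sketch; by default it
# points at the skill's extract-data.sh relative to the current directory.
EXTRACT_SCRIPT=${EXTRACT_SCRIPT:-scripts/extract-data.sh}

# Run several extraction types against one session,
# stopping at the first failure and returning its exit status.
extract_types() {
  local session type
  session=$1
  shift
  for type in "$@"; do
    "$EXTRACT_SCRIPT" --type "$type" --session "$session" || return
  done
}

# Usage: extract_types SESSION_ID statistics errors tool-usage
```

Each type is still a separate invocation, so the caller can decide which types to request after looking at the statistics output.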
4. filter-sessions.sh
Find sessions matching criteria.
Filter options:
- --since DATE: Sessions modified since date ("2 days ago", "2025-10-20")
- --until DATE: Sessions modified until date
- --branch NAME: Sessions on specific git branch
- --min-size SIZE: Minimum file size ("1M", "500K")
- --max-size SIZE: Maximum file size
- --min-lines N: Minimum line count
- --max-lines N: Maximum line count
- --has-errors: Only sessions with failed tool calls
- --keyword WORD: Sessions containing keyword
Output formats: list, paths, json
# Recent sessions
scripts/filter-sessions.sh --since "2 days ago"
# Large sessions with errors
scripts/filter-sessions.sh --min-lines 500 --has-errors
# Sessions on main branch in last week
scripts/filter-sessions.sh --branch main --since "7 days ago"
# Sessions containing keyword
scripts/filter-sessions.sh --keyword "authentication"
# Get paths only (for piping)
scripts/filter-sessions.sh --since "1 day ago" --format paths
Use when: User requests analysis of recent sessions, finding sessions for specific feature/branch, identifying problematic sessions.
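The `paths` format lends itself to line-by-line processing. The sketch below assumes `--format paths` prints one log file path per line (the exact output layout is not shown in this document). `session_ids_from_paths` is a hypothetical helper name, not part of the skill.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read session log paths on stdin (one per line)
# and print the corresponding session IDs.
session_ids_from_paths() {
  local path
  while IFS= read -r path; do
    # Skip blank lines so an empty filter result prints nothing.
    [ -n "$path" ] || continue
    basename -- "$path" .jsonl
  done
}

# Usage (assumes the skill's scripts/ directory is present):
#   scripts/filter-sessions.sh --since "1 day ago" --format paths | session_ids_from_paths
```

Reading with `while IFS= read -r` keeps paths containing spaces intact, which a plain `for` loop over unquoted output would split.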
Working Process
Single Session Analysis
# 1. Verify session exists and get metadata
scripts/extract-data.sh --type metadata --session SESSION_ID
# 2. Get session statistics (to determine size)
scripts/extract-data.sh --type statistics --session SESSION_ID
# 3. Extract specific data as needed
scripts/extract-data.sh --type errors --session SESSION_ID
scripts/extract-data.sh --type tool-usage --session SESSION_ID
Multiple Session Analysis
# 1. Filter to find relevant sessions
scripts/filter-sessions.sh --since "7 days ago" --branch main
# 2. Extract data from every session in the project (omitting --session does not apply the filter above)
scripts/extract-data.sh --type statistics
# 3. Or iterate through a filtered subset
scripts/filter-sessions.sh --has-errors --format paths | while IFS= read -r session; do
  SESSION_ID=$(basename "$session" .jsonl)
  scripts/extract-data.sh --type errors --session "$SESSION_ID"
done
Integration Pattern for Calling Skills
When another skill (like retrospecting) needs session data:
- Discovery: Use list-sessions.sh or filter-sessions.sh to find relevant sessions
- Size Check: Use extract-data.sh --type statistics to determine session complexity
- Targeted Extraction: Use extract-data.sh with specific types for needed data
- Return Raw Data: Return extracted data to caller for analysis
Example:
# Get latest session ID
LATEST=$(scripts/list-sessions.sh --format json --sort date | jq -r '.[0].sessionId')
# Check size before processing
STATS=$(scripts/extract-data.sh --type statistics --session "$LATEST")
LINE_COUNT=$(echo "$STATS" | awk '/Total Lines:/ {print $3}')
# Extract based on size
if [ "$LINE_COUNT" -lt 500 ]; then
  # Small session: extract detail
  scripts/extract-data.sh --type errors --session "$LATEST"
  scripts/extract-data.sh --type tool-usage --session "$LATEST"
else
  # Large session: summary only
  scripts/extract-data.sh --type statistics --session "$LATEST"
fi
Context Budget Management
CRITICAL: This skill is designed for context efficiency.
Use Bash Processing, Not Read Tool
# GOOD: Extract via bash, stays in bash context
STATS=$(scripts/extract-data.sh --type statistics)
# Process $STATS in bash
# BAD: Reading full log files
Read ~/.claude/projects/-path/session.jsonl
# Loads entire file into context unnecessarily
Check Session Size Before Loading
Never load full session logs into context without checking size first.
# Always check statistics first
scripts/extract-data.sh --type statistics --session SESSION_ID
# Shows total lines, message counts, etc.
# Decision rules:
# - Small (<500 lines): Can extract detail safely
# - Medium (500-2000 lines): Use selective extraction
# - Large (>2000 lines): Statistics only, offer targeted deep-dives
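The decision rules above can be written as a small function. This is a sketch only: `classify_session_size` is a hypothetical helper, and the tier names it prints are labels chosen for illustration.

```shell
#!/usr/bin/env bash
# Map a session's line count to the size tiers listed in the decision rules.
classify_session_size() {
  local lines
  lines=$1
  # Reject empty or non-numeric input rather than guessing a tier.
  case $lines in
    ''|*[!0-9]*) echo "unknown"; return 1 ;;
  esac
  if [ "$lines" -lt 500 ]; then
    echo "small"    # detailed extraction is safe
  elif [ "$lines" -le 2000 ]; then
    echo "medium"   # selective extraction
  else
    echo "large"    # statistics only, targeted deep-dives on request
  fi
}

# Usage: classify_session_size 450
```

Rejecting non-numeric input explicitly means a failed statistics call is reported instead of silently falling into one of the tiers.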
Return Raw Data to Caller
This skill should:
- Execute bash scripts to extract data
- Return raw text output to calling skill
- Let calling skill manage context for analysis
- Avoid interpretation or analysis within this skill
Output Format
Return raw extracted data with minimal formatting:
# Statistics output
Session: abc123-def456-ghi789
Total Lines: 450
User Messages: 12
Assistant Messages: 23
Tool Calls: 45
Errors: 2
# Tool usage output
=== Tool Usage: abc123-def456-ghi789 ===
Read 15
Bash 12
Edit 8
Grep 5
Write 3
No analysis, no interpretation - just data extraction.
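Callers that need a single number from the statistics output can match on the label instead of on field positions. The sketch below assumes the output keeps the `Label: value` layout shown above. `stat_field` is a hypothetical helper, not part of the skill.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of one "Label: value" line
# from statistics text supplied on stdin.
stat_field() {
  local label
  label=$1
  # Split on the colon plus any following spaces; print the first match only.
  awk -F': *' -v label="$label" '$1 == label { print $2; exit }'
}

# Usage (assumes the skill's scripts/ directory is present):
#   scripts/extract-data.sh --type statistics --session SESSION_ID | stat_field "Total Lines"
```

If the label is absent the function prints nothing, so callers should check for an empty result before using it in a numeric comparison.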
Error Handling
All scripts exit with a non-zero status on error and write the error message to stderr.
Check exit status before processing:
if ! scripts/locate-logs.sh /path/to/project &>/dev/null; then
# Handle: logs directory doesn't exist
echo "Project has no session logs yet"
fi
if ! scripts/extract-data.sh --type metadata --session abc123 &>/dev/null; then
# Handle: session doesn't exist
echo "Session not found"
fi
Common error messages:
- Error: Logs directory not found: ~/.claude/projects/-path
- Error: Session file not found: ~/.claude/projects/-path/session-id.jsonl
- Error: --type is required
- Error: jq is required but not installed. Install with: brew install jq
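Redirecting everything to /dev/null hides which of these messages was produced. A caller that wants to keep the message can capture stderr separately. The wrapper below is a sketch under that assumption: `run_extraction` is a hypothetical name, and it works with any command, not only the skill's scripts.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command, pass its stdout through, and on failure
# print a one-line summary built from the first line of its stderr.
run_extraction() {
  local errfile status
  errfile=$(mktemp) || return 1
  if "$@" 2>"$errfile"; then
    status=0
  else
    status=$?
    echo "Command failed (exit $status): $(head -n 1 "$errfile")" >&2
  fi
  rm -f "$errfile"
  return "$status"
}

# Usage (assumes the skill's scripts/ directory is present):
#   run_extraction scripts/extract-data.sh --type metadata --session abc123
```

The wrapper returns the original exit status, so it can be used directly in an `if` condition in place of the bare script call.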
Path Calculation
Claude Code stores sessions using this pattern:
~/.claude/projects/{project-identifier}/{session-id}.jsonl
Where {project-identifier} is calculated by replacing all / with - in the absolute working directory path:
# Example: /Users/user/project → -Users-user-project
PROJECT_ID=$(echo "${PWD}" | sed 's/\//\-/g')
LOGS_DIR="${HOME}/.claude/projects/${PROJECT_ID}"
All scripts use locate-logs.sh internally for consistent path calculation.
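For illustration, the rule above can be expressed as a function that accepts any project path. This is a sketch of the documented pattern only: `project_logs_dir` is a hypothetical name, and `locate-logs.sh` remains the supported entry point.

```shell
#!/usr/bin/env bash
# Sketch of the documented rule: replace every "/" in the absolute
# project path with "-" and append the result to ~/.claude/projects/.
project_logs_dir() {
  local project_path project_id
  project_path=${1:-$PWD}
  project_id=$(printf '%s' "$project_path" | tr '/' '-')
  printf '%s\n' "${HOME}/.claude/projects/${project_id}"
}

# Usage: project_logs_dir /Users/user/project
```

A trailing slash on the input would become a trailing `-`, so pass the path without one.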
Anti-Patterns to Avoid
Don't:
- Load full session logs into context without checking size
- Parse JSONL manually - use extract-data.sh
- Hardcode log paths - use locate-logs.sh
- Analyze or interpret data - return raw data to caller
- Process large logs synchronously without user awareness
Do:
- Check session size with --type statistics before processing
- Use appropriate extraction type for specific needs
- Filter sessions before extraction for efficiency
- Stream/pipe data when processing multiple sessions
- Return raw data for caller to analyze
Success Criteria
Effective use of this skill means:
- Efficient Discovery: Quickly find relevant sessions without manual searching
- Targeted Extraction: Get exactly the data needed, nothing more
- Context Preservation: Avoid loading unnecessary data into context
- Raw Data Focus: Return unprocessed data for caller to analyze
- Multi-Session Support: Handle analysis across timeframes or branches efficiently
Dependencies
Required:
- bash (v4.0+)
- jq (JSON parser)
Scripts check for jq and provide installation instructions if missing:
Error: jq is required but not installed. Install with: brew install jq
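A calling skill can run a similar preflight check before invoking any script. The sketch below is not taken from the skill's scripts: `require_commands` is a hypothetical helper, and its message only mirrors the wording above.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check: succeed only if every named command is on PATH.
require_commands() {
  local cmd missing
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "Error: $cmd is required but not installed." >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: require_commands bash jq || exit 1
```

Checking all commands before returning reports every missing dependency in one pass instead of stopping at the first.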
