Marketplace

data-extraction-patterns

Common patterns for extracting analytics data from GA4 and GSC with API handling

$ Installieren

git clone https://github.com/MadAppGang/claude-code /tmp/claude-code && cp -r /tmp/claude-code/plugins/seo/skills/data-extraction-patterns ~/.claude/skills/claude-code

// tip: Run this command in your terminal to install the skill


name: data-extraction-patterns description: Common patterns for extracting analytics data from GA4 and GSC with API handling

Data Extraction Patterns

When to Use

  • Setting up analytics data pipelines
  • Combining data from multiple sources
  • Handling API rate limits and errors
  • Caching frequently accessed data
  • Building data collection workflows

API Reference

Google Analytics 4 (GA4)

MCP Server: mcp-server-google-analytics

Key Operations:

get_report({
  propertyId: "properties/123456789",
  dateRange: { startDate: "30daysAgo", endDate: "today" },
  dimensions: ["pagePath", "date"],
  metrics: ["screenPageViews", "averageSessionDuration", "bounceRate"]
})

Useful Metrics:

MetricDescriptionUse Case
screenPageViewsTotal page viewsTraffic volume
sessionsUser sessionsVisitor count
averageSessionDurationAvg time in sessionEngagement
bounceRateSingle-page visitsContent quality
engagementRateEngaged sessions %True engagement
scrolledUsersUsers who scrolledContent consumption

Useful Dimensions:

DimensionDescription
pagePathURL path
dateDate (for trending)
sessionSourceTraffic source
deviceCategoryDesktop/mobile/tablet

Google Search Console (GSC)

MCP Server: mcp-server-gsc

Key Operations:

search_analytics({
  siteUrl: "https://example.com",
  startDate: "2025-11-27",
  endDate: "2025-12-27",
  dimensions: ["query", "page"],
  rowLimit: 1000
})

get_url_inspection({
  siteUrl: "https://example.com",
  inspectionUrl: "https://example.com/page"
})

Available Metrics:

MetricDescriptionUse Case
clicksTotal clicks from searchTraffic from Google
impressionsTimes shown in resultsVisibility
ctrClick-through rateSnippet effectiveness
positionAverage rankingSEO success

Dimensions:

DimensionDescription
querySearch query
pageLanding page URL
countryUser country
deviceDesktop/mobile/tablet
dateDate (for trending)

Parallel Execution Pattern

Optimal Data Fetch (All Sources)

## Parallel Data Fetch Pattern

When fetching from multiple sources, issue all requests in a SINGLE message
for parallel execution:

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  MESSAGE 1: Parallel Data Requests                              โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚                                                                  โ”‚
โ”‚  [MCP Call 1]: google-analytics.get_report(...)                 โ”‚
โ”‚  [MCP Call 2]: google-search-console.search_analytics(...)      โ”‚
โ”‚                                                                  โ”‚
โ”‚  โ†’ All execute simultaneously                                    โ”‚
โ”‚  โ†’ Results return when all complete                              โ”‚
โ”‚  โ†’ ~2x faster than sequential                                    โ”‚
โ”‚                                                                  โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Sequential (When Needed)

Some operations require sequential execution:

## Sequential Pattern (Dependencies)

When one request depends on another's result:

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  MESSAGE 1: Get list of pages                                   โ”‚
โ”‚  โ†’ Returns: ["/page1", "/page2", "/page3"]                      โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  MESSAGE 2: Get details for each page                           โ”‚
โ”‚  โ†’ Uses page list from Message 1                                โ”‚
โ”‚  โ†’ Can parallelize within this message                          โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Rate Limiting

API Rate Limits

APILimitStrategy
GA410 QPS per propertyBatch dimensions
GSC1,200 requests/minPaginate large exports

Retry Pattern

#!/bin/bash
# Retry with exponential backoff

MAX_RETRIES=3
RETRY_DELAY=5

fetch_with_retry() {
    local url="$1"
    local attempt=1

    while [ $attempt -le $MAX_RETRIES ]; do
        response=$(curl -s -w "%{http_code}" -o /tmp/response.json "$url")
        http_code="${response: -3}"

        if [ "$http_code" = "200" ]; then
            cat /tmp/response.json
            return 0
        elif [ "$http_code" = "429" ]; then
            echo "Rate limited, waiting ${RETRY_DELAY}s..." >&2
            sleep $RETRY_DELAY
            RETRY_DELAY=$((RETRY_DELAY * 2))
        else
            echo "Error: HTTP $http_code" >&2
            return 1
        fi

        attempt=$((attempt + 1))
    done

    echo "Max retries exceeded" >&2
    return 1
}

Caching Pattern

Session-Based Cache

# Cache structure
SESSION_PATH="/tmp/seo-performance-20251227-143000-example"
CACHE_DIR="${SESSION_PATH}/cache"
CACHE_TTL=3600  # 1 hour in seconds

mkdir -p "$CACHE_DIR"

# Cache key generation
cache_key() {
    echo "$1" | md5sum | cut -d' ' -f1
}

# Check cache
get_cached() {
    local key=$(cache_key "$1")
    local cache_file="${CACHE_DIR}/${key}.json"

    if [ -f "$cache_file" ]; then
        local age=$(($(date +%s) - $(stat -f%m "$cache_file" 2>/dev/null || stat -c%Y "$cache_file")))
        if [ $age -lt $CACHE_TTL ]; then
            cat "$cache_file"
            return 0
        fi
    fi
    return 1
}

# Save to cache
save_cache() {
    local key=$(cache_key "$1")
    local cache_file="${CACHE_DIR}/${key}.json"
    cat > "$cache_file"
}

# Usage
CACHE_KEY="ga4_${URL}_${DATE_RANGE}"
if ! RESULT=$(get_cached "$CACHE_KEY"); then
    RESULT=$(fetch_from_api)
    echo "$RESULT" | save_cache "$CACHE_KEY"
fi

Date Range Standardization

Common Date Ranges

# Standard date range calculations
TODAY=$(date +%Y-%m-%d)

case "$RANGE" in
    "7d")
        START_DATE=$(date -v-7d +%Y-%m-%d 2>/dev/null || date -d "7 days ago" +%Y-%m-%d)
        ;;
    "30d")
        START_DATE=$(date -v-30d +%Y-%m-%d 2>/dev/null || date -d "30 days ago" +%Y-%m-%d)
        ;;
    "90d")
        START_DATE=$(date -v-90d +%Y-%m-%d 2>/dev/null || date -d "90 days ago" +%Y-%m-%d)
        ;;
    "mtd")
        START_DATE=$(date +%Y-%m-01)
        ;;
    "ytd")
        START_DATE=$(date +%Y-01-01)
        ;;
esac

END_DATE="$TODAY"

API-Specific Formats

APIFormatExample
GA4Relative or ISO"30daysAgo", "2025-12-01"
GSCISO 8601"2025-12-01"

Graceful Degradation

Data Source Fallback

## Fallback Strategy

When a data source is unavailable:

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  PRIMARY SOURCE      โ”‚  FALLBACK           โ”‚  LAST RESORT       โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚  GA4 traffic data    โ”‚  GSC clicks         โ”‚  Estimate from GSC โ”‚
โ”‚  GSC search perf     โ”‚  Manual SERP check  โ”‚  WebSearch SERP    โ”‚
โ”‚  CWV (CrUX)          โ”‚  PageSpeed API      โ”‚  Lighthouse CLI    โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Partial Data Output

## Analysis Report (Partial Data)

### Data Availability

| Source | Status | Impact |
|--------|--------|--------|
| GA4 | NOT CONFIGURED | Missing engagement metrics |
| GSC | AVAILABLE | Full search data |

### Analysis Notes

This analysis is based on limited data sources:
- Search performance metrics are complete (GSC)
- Engagement metrics unavailable (no GA4)

**Recommendation**: Configure GA4 for complete analysis.
Run `/setup-analytics` to add Google Analytics.

Unified Data Model

Combined Output Structure

{
  "metadata": {
    "url": "https://example.com/page",
    "fetchedAt": "2025-12-27T14:30:00Z",
    "dateRange": {
      "start": "2025-11-27",
      "end": "2025-12-27"
    }
  },
  "sources": {
    "ga4": {
      "available": true,
      "metrics": {
        "pageViews": 2450,
        "avgTimeOnPage": 222,
        "bounceRate": 38.2,
        "engagementRate": 64.5
      }
    },
    "gsc": {
      "available": true,
      "metrics": {
        "impressions": 15200,
        "clicks": 428,
        "ctr": 2.82,
        "avgPosition": 4.2
      },
      "topQueries": [
        {"query": "seo guide", "clicks": 156, "position": 4}
      ]
    }
  },
  "computed": {
    "healthScore": 72,
    "status": "GOOD"
  }
}

Error Handling

Common Errors

ErrorCauseResolution
401 UnauthorizedInvalid/expired credentialsRe-run /setup-analytics
403 ForbiddenMissing permissionsCheck API access in console
429 Too Many RequestsRate limitWait and retry with backoff
404 Not FoundInvalid property/siteVerify IDs in configuration
500 Server ErrorAPI issueRetry later, check status page

Error Output Pattern

## Data Fetch Error

**Source**: Google Analytics 4
**Error**: 403 Forbidden
**Message**: "User does not have sufficient permissions for this property"

### Troubleshooting Steps

1. Verify Service Account email in GA4 Admin
2. Ensure "Viewer" role is granted
3. Check Analytics Data API is enabled
4. Wait 5 minutes for permission propagation

### Workaround

Proceeding with available data sources (GSC).
GA4 engagement metrics will not be included in this analysis.