Marketplace

warehouse-optimization

Query optimization, partitioning, clustering, and warehouse tuning.

$ インストール

git clone https://github.com/timequity/vibe-coder /tmp/vibe-coder && cp -r /tmp/vibe-coder/skills/data/warehouse-optimization ~/.claude/skills/vibe-coder

// tip: Run this command in your terminal to install the skill


name: warehouse-optimization description: Query optimization, partitioning, clustering, and warehouse tuning.

Warehouse Optimization

Partitioning

-- Snowflake: Automatic clustering
ALTER TABLE fct_sales
CLUSTER BY (date_key, store_key);

-- BigQuery: Partition by date
CREATE TABLE fct_sales
PARTITION BY DATE(ordered_at)
CLUSTER BY store_id, product_id;

-- Redshift: Distribution and sort keys
CREATE TABLE fct_sales (
    ...
)
DISTKEY(store_id)
SORTKEY(ordered_at, store_id);

Query Optimization

Use EXPLAIN

EXPLAIN ANALYZE
SELECT ...
FROM fct_sales
WHERE date_key BETWEEN 20240101 AND 20240131;

Common Issues

ProblemSolution
Full table scanAdd partition filter
Skewed joinsRedistribute data
Large sortsPre-aggregate
Too many columnsSelect only needed

Efficient Joins

-- ❌ Bad: Large table on left
SELECT * FROM fct_sales s
JOIN dim_date d ON s.date_key = d.date_key;

-- ✅ Good: Small table on left (broadcast)
SELECT * FROM dim_date d
JOIN fct_sales s ON d.date_key = s.date_key;

Materialized Views

CREATE MATERIALIZED VIEW mv_daily_sales AS
SELECT
    date_key,
    store_key,
    SUM(amount) as total_sales,
    COUNT(*) as transaction_count
FROM fct_sales
GROUP BY 1, 2;

Cost Management

  • Monitor credit/byte usage
  • Set query timeouts
  • Use warehouse scheduling
  • Archive old partitions
  • Implement query tagging