
Data Engineering

525 skills in Data & AI > Data Engineering

bigquery-ethereum-data-acquisition

Workflow for acquiring historical Ethereum blockchain data using the Google BigQuery free tier. Empirically validated for cost estimation, streaming downloads, and DuckDB integration. Use when planning bulk historical data acquisition or comparing data source options for blockchain network metrics.

majiayu000/claude-skill-registry

Updated 6d ago

historical-backfill-execution

Execute chunked historical blockchain data backfills using the canonical 1-year pattern. Use when loading multi-year historical data, filling gaps in ClickHouse, or preventing OOM failures on Cloud Run. Keywords: chunked_backfill.sh, BigQuery historical, gap filling, memory-safe backfill.

terrylica/gapless-network-data

Updated 6d ago

architecture-documentation-creator

Create comprehensive technical documentation for code systems including data flow diagrams, architecture overviews, algorithm documentation, cheat sheets, and multi-file documentation sets. Use when documenting pipelines, algorithms, system architecture, data flow, multi-stage processes, similarity algorithms, or creating developer onboarding materials. Covers Mermaid diagrams, progressive disclosure, critical patterns, JSON schemas, Pydantic models, and print-friendly reference materials.

majiayu000/claude-skill-registry

Updated 6d ago

workflow-patterns

Industry-specific workflow patterns and templates for finance, healthcare, logistics, manufacturing, retail, and common use cases like AI document processing, API integration, business rules, ETL, RAG, security, and project management. Use when asking about 'workflow examples', 'workflow templates', 'industry workflows', 'finance workflows', 'healthcare workflows', 'logistics workflows', 'manufacturing workflows', 'retail workflows', 'ETL workflows', 'RAG workflows', 'API workflows', 'document processing', 'business rules', or 'workflow patterns'.

Integrum-Global/kaizen-studio

Updated 6d ago

workflow

Develop, test, and register PMC workflows. Workflows are JSON state machines for Claude CLI, shell, sub-workflows.

WORKFLOW:
1. DEFINE - Create workflow JSON with states, transitions
2. VALIDATE - pmc validate <workflow.json>
3. MOCK - Create mock scripts for each state
4. TEST MOCK - pmc run --mock to test transitions
5. TEST REAL - pmc run with real data
6. REGISTER - Add to registry.json

Use when:
- User says "create workflow", "new workflow", "automate"
- Automating repetitive multi-step processes
- Building CI/CD or development pipelines

jayprimer/pmc-marketplace

Updated 6d ago

plan-audit-orchestrator

Coordinate planning, auditing, and validation workflows. Use when the user mentions plan/planning, audit/auditing, review, or validation for engineering, data, or pipeline work.

majiayu000/claude-skill-registry

Updated 6d ago

mongodb-aggregation-pipeline

Master MongoDB aggregation pipeline for complex data transformations. Learn pipeline stages, grouping, filtering, and data transformation. Use when analyzing data, creating reports, or transforming documents.

pluginagentmarketplace/custom-plugin-mongodb

Updated 6d ago

metadata-manager

Use this skill when creating or updating DAG configurations (dags.yaml), schema.yaml, and metadata.yaml files for BigQuery tables. Handles creating new DAGs when needed and coordinates test updates when queries are modified (invokes sql-test-generator as needed). Works with bigquery-etl-core, query-writer, and sql-test-generator skills.

mozilla/bigquery-etl-skills

Updated 6d ago

orchestration-coordination-framework

Production-scale multi-agent coordination, task orchestration, and workflow automation. Use for distributed system orchestration, agent communication protocols, DAG workflows, state machines, error handling, resource allocation, load balancing, and observability. Covers Apache Airflow, Temporal, Prefect, Celery, Step Functions, and orchestration patterns.

majiayu000/claude-skill-registry

Updated 6d ago

gcp-specialist

Expert GCP specialist for BigQuery, Google Groups, IAM, and L'Oréal BTDP infrastructure. Use when working on any GCP project with the gcloud command (listing resources in GCP products such as BigQuery, Cloud Run, Cloud Functions, IAM permissions and roles, Service Accounts, Spanner, Bigtable, Dataflow, Firestore, Cloud Storage). Also use it to grant or remove permissions, deploy or delete resources, or perform any create/read/update/delete operation on Google Cloud Platform.

smorand/claude-config

Updated 6d ago

data-orchestrator

Coordinates data pipeline tasks (ETL, analytics, feature engineering). Use when implementing data ingestion, transformations, quality checks, or analytics. Applies data-quality-standard.md (95% minimum).

majiayu000/claude-skill-registry

Updated 6d ago

dbt

dbt (data build tool) patterns for data transformation and analytics engineering. Use when building data models, implementing data quality tests, or managing data transformation pipelines.

jpoutrin/product-forge

Updated 6d ago

etl-tools

Apache Airflow, Spark, Kafka, Flink, dbt, and modern data transformation tools

pluginagentmarketplace/custom-plugin-data-engineer

Updated 6d ago

adk-agent-handling

Google ADK (Agent Development Kit) multi-agent system architecture for BigQuery data analytics. Covers BigQuery agent vs conversational agent patterns, ADK Single Parent Rule, domain routing with sub-agents, agent selection mechanisms, SQL error recovery with ReflectAndRetryToolPlugin, transfer_to_agent workflows, and frontend-backend agent coordination. Use when working with Google ADK agents, multi-agent systems, BigQuery SQL automation, domain expert routing, agent orchestration, or implementing error recovery strategies in AI agent applications.

nfbs2000/vibe-with-google-ai-divorce-agent-inflearn

Updated 6d ago

gitops-principles-skill

Comprehensive GitOps methodology and principles skill for cloud-native operations. Use when (1) Designing GitOps architecture for Kubernetes deployments, (2) Implementing declarative infrastructure with Git as single source of truth, (3) Setting up continuous deployment pipelines with ArgoCD/Flux/Kargo, (4) Establishing branching strategies and repository structures, (5) Troubleshooting drift, sync failures, or reconciliation issues, (6) Evaluating GitOps tooling decisions, (7) Teaching or explaining GitOps concepts and best practices, (8) Deploying ArgoCD on Azure Arc-enabled Kubernetes or AKS with workload identity. Covers the 4 pillars of GitOps (OpenGitOps), patterns, anti-patterns, tooling ecosystem, Azure Arc integration, and operational guidance.

majiayu000/claude-skill-registry

Updated 6d ago

keboola-data-engineering

Expert assistant for Keboola data platform. Builds working data pipelines, not just advice. Use for: data extraction, transformation, validation, orchestration, dashboard creation.

Updated 6d ago

architecture-paradigm-pipeline

Compose processing stages using a pipes-and-filters model for ETL, media processing, or compiler-like workloads.

athola/archetypes

Updated 6d ago

qa-acceptance

Evaluate pipeline runs against ACCEPTANCE_MATRIX.md. Use after feature releases to verify acceptance criteria.

majiayu000/claude-skill-registry

Updated 6d ago

data-pipeline-patterns

Follow these patterns when implementing data pipelines, ETL, data ingestion, or data validation in OptAIC. Use for point-in-time (PIT) correctness, Arrow schemas, quality checks, and Prefect orchestration.

majiayu000/claude-skill-registry

Updated 6d ago

backend-engineering

Design and implement robust, production-grade backend systems with strong architecture, correctness, performance, and operational rigor. Use this skill when the user asks to build APIs, services, data pipelines, system architectures, or backend-heavy applications.

kanishka-sahoo/opencode-config

Updated 6d ago