Data Engineering
525 skills in Data & AI > Data Engineering
dataset-comparer
Compare two datasets to find differences, added/removed rows, changed values. Use for data validation, ETL verification, or tracking changes.
github-workflow-automation
Advanced GitHub Actions workflow automation with AI swarm coordination, intelligent CI/CD pipelines, and comprehensive repository management
cliftonsites-backend
Use this skill when working with the CliftonSites Supabase backend for any task including understanding database schemas, debugging issues, adding features, querying data, managing RPC functions, reviewing triggers/policies, working with the automation pipeline, security architecture (MFA authentication, RLS, SECURITY DEFINER functions, API route protection), or any database operation. Provides complete expertise on all 12 tables, 25 RPC functions, 5 triggers, RLS policies, SECURITY DEFINER functions, admin MFA authentication, internal API token validation, views, indexes, data flows, and Supabase MCP server operations.
plantix-bigquery
Query Plantix disease detection data and other tables from BigQuery.
council-gate
Quality gate using LLM Council multi-model consensus for CI/CD pipelines. Use for automated approval workflows and pipeline quality checks. Keywords: gate, CI, CD, pipeline, automated approval, quality gate, GitHub Actions
data-quality-auditor
Assess data quality with checks for missing values, duplicates, type issues, and inconsistencies. Use for data validation, ETL pipelines, or dataset documentation.
synthetic-data-generation
Generate realistic synthetic data using Faker and Spark, with non-linear distributions, integrity constraints, and save to Databricks Volumes. Use when creating test data, demo datasets, or synthetic tables.
data-querying
Write and verify SQL queries with BigQuery. Use when executing bq commands, writing SQL queries, or including query results in documents.
cloud-devops-skill
Master cloud platforms (AWS, GCP, Azure), containerization (Docker), orchestration (Kubernetes), infrastructure as code, CI/CD pipelines, and DevOps practices for deploying and managing scalable applications.
databases
Work with MongoDB (document database, BSON documents, aggregation pipelines, Atlas cloud) and PostgreSQL (relational database, SQL queries, psql CLI, pgAdmin). Use when designing database schemas, writing queries and aggregations, optimizing indexes for performance, performing database migrations, configuring replication and sharding, implementing backup and restore strategies, managing database users and permissions, analyzing query performance, or administering production databases. | Use when working with databases, SQL, queries, schemas, or migrations.
ci-monitoring
Use after creating a PR: monitor the CI pipeline and resolve failures iteratively until it is green or the issue is identified as unresolvable.
moai-lang-python
Python 3.13+ development specialist covering FastAPI, Django, async patterns, data science, testing with pytest, and modern Python features. Use when developing Python APIs, web applications, data pipelines, or writing tests.
ahu-conductor
Air Handler Design Pipeline Orchestrator
court-divorce-bigquery-indexing
Claude directly analyzes court-precedent Markdown files and generates JSON metadata.
troubleshooting-assistant
Diagnoses and resolves MCP server registration failures, GPU detection, BigQuery authentication, index build failures, import errors, search quality issues, and performance problems.
configuring-dapr-pubsub
Configures Dapr pub/sub components for event-driven microservices with Kafka or Redis. Use when wiring agent-to-agent communication, setting up event subscriptions, or integrating Dapr sidecars. Covers component configuration, subscription patterns, publishing events, and Kubernetes deployment. NOT when using direct Kafka clients or non-Dapr messaging patterns.
ci-error-fix
Fix CI errors systematically. Follows a structured workflow of understanding the error, checking whether it exists on the main branch, reproducing it locally, fixing it, and verifying the fix. Use when CI pipelines fail and you need to diagnose and fix the errors.
workflow-devkit
Build durable, resumable TypeScript workflows with Vercel Workflow DevKit. Use when creating long-running processes, AI agents, background jobs, multi-step pipelines, webhooks, or event-driven systems. Triggers on "workflow", "durable", "resumable", "use workflow", "use step".
new-domain-setup
Complete domain hosting setup workflow combining Plesk, Cloudflare, Let's Encrypt, and GitHub Actions deployment. Use when setting up a new domain from scratch, including DNS configuration, SSL certificates, and automated deployment pipelines. Orchestrates plesk-domain-setup, cloudflare-domain-setup, and github-actions-deploy skills.
cicd
GitHub Actions expert for CI/CD pipelines, workflows, build failures, test failures, lint errors, format checks, gh run, gh pr checks, ESLint, Prettier, TypeScript errors, quality gates, automated fixes, pipeline debugging, workflow monitoring