數據工程
525 skills in 數據與 AI > 數據工程
schemachange
Deploying and managing Snowflake database objects using version control with schemachange. Use this skill when you need to manage database migrations for objects not handled by dbt, implement CI/CD pipelines for schema changes, or coordinate deployments across multiple environments.
dbt-testing
dbt testing strategies using dbt_constraints for database-level enforcement, generic tests, and singular tests. Use this skill when implementing data quality checks, adding primary/foreign key constraints, creating custom tests, or establishing comprehensive testing frameworks across bronze/silver/gold layers.
dbt-modeling
Writing dbt models with proper CTE patterns, SQL structure, and layer-specific templates. Use this skill when writing or refactoring dbt models, implementing CTE patterns, creating staging/intermediate/mart models, or ensuring proper SQL structure and dependencies.
dbt-artifacts
Monitor dbt execution using the dbt Artifacts package. Use this skill when you need to track test and model execution history, analyze run patterns over time, monitor data quality metrics, or enable programmatic access to dbt execution metadata across any dbt version or platform.
data-engineer
Expert data engineer specializing in building scalable data pipelines, ETL/ELT processes, and data infrastructure. Masters big data technologies and cloud platforms with focus on reliable, efficient, and cost-optimized data platforms.
iot-engineer
Expert IoT engineer specializing in connected device architectures, edge computing, and IoT platform development. Masters IoT protocols, device management, and data pipelines with focus on building scalable, secure, and reliable IoT solutions.
qa-testing-playwright
End-to-end web application testing with Playwright 1.57. Write and run browser automation tests from natural language, implement page object models, handle authentication, test responsive designs, and integrate with CI/CD pipelines. Includes Playwright Agents for AI-assisted test generation and Chrome for Testing builds.
data-lake-platform
Universal data lake and lakehouse patterns covering ingestion (dlt, Airbyte), transformation (SQLMesh, dbt), storage formats (Iceberg, Delta, Hudi, Parquet), query engines (ClickHouse, DuckDB, Doris, StarRocks), streaming (Kafka, Flink), orchestration (Dagster, Airflow, Prefect), and visualization (Metabase, Superset, Grafana). Self-hosted and cloud options.
ops-devops-platform
Production-grade DevOps patterns with Kubernetes 1.34+, Terraform 1.9+, Docker 27+, ArgoCD/FluxCD GitOps, SRE, eBPF-based observability, AI-driven monitoring, CI/CD security, and cloud-native operations (AWS, GCP, Azure, Kafka).
design
Design Session - collaborative brainstorming to turn ideas into designs using Double Diamond methodology OR 6-phase planning pipeline. Use when user types "ds", "pl", "/plan", or wants to explore/design a feature before implementation. MUST load maestro-core skill first for routing.
managing-github-ci
Configures GitHub Actions workflows and CI/CD pipelines. Manages automated releases via Changesets, PR validation, and Husky hooks. Troubleshoots CI failures. Triggers on: GitHub Actions, CI pipeline, workflow, release automation, Husky hooks, gh CLI, workflow failure.
devops-engineer
Copilot agent that assists with CI/CD pipeline creation, infrastructure automation, Docker/Kubernetes deployment, and DevOps best practices Trigger terms: CI/CD, DevOps, pipeline, Docker, Kubernetes, deployment automation, containerization, infrastructure automation, GitHub Actions, GitLab CI Use when: User requests involve devops engineer tasks.
media-processing
Process multimedia files with FFmpeg (video/audio encoding, conversion, streaming, filtering, hardware acceleration), ImageMagick (image manipulation, format conversion, batch processing, effects, composition), and RMBG (AI-powered background removal). Use when converting media formats, encoding videos with specific codecs (H.264, H.265, VP9), resizing/cropping images, removing backgrounds from images, extracting audio from video, applying filters and effects, optimizing file sizes, creating streaming manifests (HLS/DASH), generating thumbnails, batch processing images, creating composite images, or implementing media processing pipelines. Supports 100+ formats, hardware acceleration (NVENC, QSV), and complex filtergraphs.
rails-devops
DevOps and infrastructure specialist for Rails applications. Use when setting up Docker, CI/CD pipelines, deployment configurations, monitoring, logging, or production optimizations. Covers GitHub Actions, Docker, Kubernetes, and cloud platforms.
kafka-cli-tools
Expert knowledge of Kafka CLI tools (kcat, kcli, kaf, kafkactl). Auto-activates on keywords kcat, kafkacat, kcli, kaf, kafkactl, kafka cli, kafka command line, produce message, consume topic, list topics, kafka metadata. Provides command examples, installation guides, and tool comparisons.
mlops-dag-builder
Design DAG-based MLOps pipeline architectures with Airflow, Dagster, Kubeflow, or Prefect. Activates for DAG orchestration, workflow automation, pipeline design patterns, CI/CD for ML. Use for platform-agnostic MLOps infrastructure - NOT for SpecWeave increment-based ML (use ml-pipeline-orchestrator instead).
confluent-kafka-connect
Kafka Connect integration expert. Covers source and sink connectors, JDBC, Elasticsearch, S3, Debezium CDC, SMT (Single Message Transforms), connector configuration, and data pipeline patterns. Activates for kafka connect, connectors, source connector, sink connector, jdbc connector, debezium, smt, data pipeline, cdc.
sf-deploy
Comprehensive Salesforce DevOps automation using sf CLI v2. Use when deploying metadata, managing scratch orgs, setting up CI/CD pipelines, or troubleshooting deployment errors.
kafka-architecture
Expert knowledge of Apache Kafka architecture, cluster design, capacity planning, partitioning strategies, replication, and high availability. Auto-activates on keywords kafka architecture, cluster sizing, partition strategy, replication factor, kafka ha, kafka scalability, broker count, topic design, kafka performance, kafka capacity planning.
confluent-ksqldb
ksqlDB stream processing expert. Covers SQL-like queries on Kafka topics, stream and table concepts, joins, aggregations, windowing, materialized views, and real-time data transformations. Activates for ksqldb, ksql, stream processing, kafka sql, real-time analytics, windowing, stream joins, table joins, materialized views.