Data Engineering
525 skills in Data & AI > Data Engineering
ml-pipeline-automation
Automate ML workflows with Airflow, Kubeflow, and MLflow. Use for reproducible pipelines, retraining schedules, MLOps, or when encountering task failures, dependency errors, or experiment tracking issues.
wdk
Tether Wallet Development Kit (WDK) for building non-custodial multi-chain wallets. Use when working with @tetherto/wdk-core, wallet modules (wdk-wallet-btc, wdk-wallet-evm, wdk-wallet-evm-erc-4337, wdk-wallet-solana, wdk-wallet-spark, wdk-wallet-ton, wdk-wallet-tron, ton-gasless, tron-gasfree), and protocol modules including swap (wdk-protocol-swap-velora-evm), bridge (wdk-protocol-bridge-usdt0-evm), and lending (wdk-protocol-lending-aave-evm). Covers wallet creation, transactions, token transfers, DEX swaps, cross-chain bridges, and DeFi lending/borrowing.
pipeline-debugger
Debug and monitor GitLab CI/CD pipelines for merge requests. Check pipeline status, view job logs, and troubleshoot CI failures. Use this when the user needs to investigate GitLab CI pipeline issues, check job statuses, or view specific job logs.
gh-run-failure
Use to analyze failures in GitHub pipelines or jobs.
workers-ci-cd
Complete CI/CD guide for Cloudflare Workers using GitHub Actions and GitLab CI. Use for automated testing, deployment pipelines, preview environments, secrets management, or when encountering deployment failures, workflow errors, or environment configuration issues.
prepare-dataset
Process and validate datasets for training. Use when setting up data pipelines.
docker-containerization
Package applications into secure, portable Docker images with validated pipelines.
ci-fixer
Debug and fix CI/CD pipeline failures. Analyze workflow logs, identify issues, and apply fixes. Use when CI is failing, build errors occur, tests fail in CI, or the pipeline is broken.
observability
Establish observability for research systems, experiments, and data pipelines with guardrails and confidence ceilings.
improvement-pipeline
Coordinate sequential improvement stages (analyze → propose → build → validate) with Prompt Architect clarity and Skill Forge guardrails.
streak
Gmail-integrated CRM for managing pipelines, deals (boxes), contacts, and email threads.
lang-python
Python 3.13+ development specialist covering FastAPI, Django, async patterns, data science, testing with pytest, and modern Python features. Use when developing Python APIs, web applications, data pipelines, or writing tests.
infra-engineer
Comprehensive infrastructure engineering covering DevOps, cloud platforms, FinOps, and DevSecOps. Platforms: AWS (EC2, Lambda, S3, ECS, EKS, RDS, CloudFormation), Azure basics, Cloudflare (Workers, R2, D1, Pages), GCP (GKE, Cloud Run, Cloud Storage), Docker, Kubernetes. Capabilities: CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins), GitOps, infrastructure as code (Terraform, CloudFormation), container orchestration, cost optimization, security scanning, vulnerability management, secrets management, compliance (SOC2, HIPAA). Actions: deploy, configure, manage, scale, monitor, secure, optimize cloud infrastructure. Keywords: AWS, EC2, Lambda, S3, ECS, EKS, RDS, CloudFormation, Azure, Kubernetes, k8s, Docker, Terraform, CI/CD, GitHub Actions, GitLab CI, Jenkins, ArgoCD, Flux, cost optimization, FinOps, reserved instances, spot instances, security scanning, SAST, DAST, vulnerability management, secrets management, Vault, compliance, monitoring, observability. Use when: deploying to AWS/Azure/GCP/Cloudflare, setting up CI/CD pipelines, implementing GitOps workflows, managing Kubernetes clusters, optimizing cloud costs, implementing security best practices, managing infrastructure as code, container orchestration, compliance requirements, cost analysis and optimization.
vscode-test-setup
This skill provides comprehensive guidance for setting up and configuring test environments for VS Code extension projects. Use when initializing a new test infrastructure, configuring test runners (Mocha, Jest), setting up CI/CD test pipelines, integrating coverage tools (c8, nyc), or troubleshooting test configuration issues.
Konflux is a build tool
Use this skill to query Konflux objects from the Kubernetes cluster. Konflux objects are application, component, pipelinerun, taskrun, snapshot and release. The skill can be used to query logs from failed or removed pipelines and pods.
agent-coordination-patterns
Coordinate multi-agent workflows: sequential, parallel, and iterative patterns. Defines agent handoffs, dependencies, communication protocols, and integration. Use when designing multi-agent workflows, coordinating agent handoffs, planning agent dependencies, or building complex agent pipelines.
deploy
Help users deploy their portfolio to production. Covers Vercel, Netlify, Cloudflare Pages, GitHub Pages, Docker, and MCP integration for AI-driven deployment.
openshift-expert
OpenShift platform and Kubernetes expert with deep knowledge of cluster architecture, operators, networking, storage, troubleshooting, and CI/CD pipelines. Use for analyzing test failures, debugging cluster issues, understanding operator behavior, investigating build problems, or any OpenShift/Kubernetes-related questions.
navigating-github-to-konflux-pipelines
Use when a GitHub PR or branch has failing checks and you need to find Konflux pipeline information (cluster, namespace, PipelineRun name). Teaches gh CLI commands to identify Konflux checks (filtering out Prow/SonarCloud), extract PipelineRun URLs from builds and integration tests, and parse URLs for kubectl debugging.
debugging-pipeline-failures
Use when Konflux pipelines fail, are stuck, time out, or show errors like ImagePullBackOff. Covers PipelineRun failures, TaskRun issues (Pending, Failed, stuck Running), build errors, and systematic debugging of Tekton pipeline problems using kubectl and logs.