Data Engineering

525 skills in Data & AI > Data Engineering

airflow-dag-patterns

Marketplace

Build production Apache Airflow DAGs with best practices for operators, sensors, testing, and deployment. Use when creating data pipelines, orchestrating workflows, or scheduling batch jobs.

wshobson/agents
24.2k
2.7k
Updated 4d ago

spark-optimization

Marketplace

Optimize Apache Spark jobs with partitioning, caching, shuffle optimization, and memory tuning. Use when improving Spark performance, debugging slow jobs, or scaling data processing pipelines.

wshobson/agents
24.2k
2.7k
Updated 4d ago

data-quality-frameworks

Marketplace

Implement data quality validation with Great Expectations, dbt tests, and data contracts. Use when building data quality pipelines, implementing validation rules, or establishing data contracts.

wshobson/agents
24.2k
2.7k
Updated 4d ago

senior-data-engineer

Marketplace

Data engineering skill for building scalable data pipelines, ETL/ELT systems, and data infrastructure. Covers Python, SQL, Spark, Airflow, dbt, Kafka, and the modern data stack, including data modeling, pipeline orchestration, data quality, and DataOps. Use when designing data architectures, building data pipelines, optimizing data workflows, or implementing data governance.

davila7/claude-code-templates
14.5k
1.3k
Updated 4d ago

cocoindex

Marketplace

Comprehensive toolkit for developing with the CocoIndex library. Use when users need to create data transformation pipelines (flows), write custom functions, or operate flows via CLI or API. Covers building ETL workflows for AI data processing, including embedding documents into vector databases, building knowledge graphs, creating search indexes, or processing data streams with incremental updates.

davila7/claude-code-templates
14.5k
1.3k
Updated 4d ago

ena-database

Marketplace

Access the European Nucleotide Archive via API/FTP. Retrieve DNA/RNA sequences, raw reads (FASTQ), and genome assemblies by accession for genomics and bioinformatics pipelines. Supports multiple formats.

davila7/claude-code-templates
14.5k
1.3k
Updated 4d ago

senior-devops

Marketplace

Comprehensive DevOps skill for CI/CD, infrastructure automation, containerization, and cloud platforms (AWS, GCP, Azure). Includes pipeline setup, infrastructure as code, deployment automation, and monitoring. Use when setting up pipelines, deploying applications, managing infrastructure, implementing monitoring, or optimizing deployment processes.

davila7/claude-code-templates
14.5k
1.3k
Updated 4d ago

marketing-demand-acquisition

Marketplace

Multi-channel demand generation, paid media optimization, SEO strategy, and partnership programs for Series A+ startups. Includes CAC calculator, channel playbooks, HubSpot integration, and international expansion tactics. Use when planning demand generation campaigns, optimizing paid media, building SEO strategies, establishing partnerships, or when user mentions demand gen, paid ads, LinkedIn ads, Google ads, CAC, acquisition, lead generation, or pipeline generation.

davila7/claude-code-templates
14.5k
1.3k
Updated 4d ago

dnanexus-integration

Marketplace

DNAnexus cloud genomics platform. Build apps/applets, manage data (upload/download), use the dxpy Python SDK, run workflows, and handle FASTQ/BAM/VCF files for genomics pipeline development and execution.

davila7/claude-code-templates
14.5k
1.3k
Updated 4d ago

pysam

Marketplace

Genomic file toolkit. Read and write SAM/BAM/CRAM alignments, VCF/BCF variants, and FASTA/FASTQ sequences; extract regions and calculate coverage for NGS data processing pipelines.

davila7/claude-code-templates
14.5k
1.3k
Updated 4d ago

github-workflow-automation

Marketplace

Advanced GitHub Actions workflow automation with AI swarm coordination, intelligent CI/CD pipelines, and comprehensive repository management.

ruvnet/claude-flow
9.9k
1.3k
Updated 3d ago

stream-chain

Marketplace

Stream-JSON chaining for multi-agent pipelines, data transformation, and sequential workflows.

ruvnet/claude-flow
9.9k
1.3k
Updated 3d ago

pipeline-assistant

Marketplace

This skill should be used when users need to create or fix Redpanda Connect pipeline configurations. Trigger when users mention "config", "pipeline", "YAML", "create a config", "fix my config", "validate my pipeline", or describe a streaming pipeline need like "read from Kafka and write to S3".

redpanda-data/connect
8.5k
905
Updated 3d ago

advanced-evaluation

Marketplace

This skill should be used when the user asks to "implement LLM-as-judge", "compare model outputs", "create evaluation rubrics", "mitigate evaluation bias", or mentions direct scoring, pairwise comparison, position bias, evaluation pipelines, or automated quality assessment.

muratcankoylan/Agent-Skills-for-Context-Engineering
5.4k
424
Updated 3d ago

docetl

Build and run LLM-powered data processing pipelines with DocETL. Use when users say "docetl", want to analyze unstructured data, process documents, extract information, or run ETL tasks on text. Helps with data collection, pipeline creation, execution, and optimization.

ucbepic/docetl
3.4k
359
Updated 3d ago

pysam

Marketplace

Genomic file toolkit. Read and write SAM/BAM/CRAM alignments, VCF/BCF variants, and FASTA/FASTQ sequences; extract regions and calculate coverage for NGS data processing pipelines.

K-Dense-AI/claude-scientific-skills
3.0k
334
Updated 3d ago

ena-database

Marketplace

Access the European Nucleotide Archive via API/FTP. Retrieve DNA/RNA sequences, raw reads (FASTQ), and genome assemblies by accession for genomics and bioinformatics pipelines. Supports multiple formats.

K-Dense-AI/claude-scientific-skills
3.0k
334
Updated 3d ago

dnanexus-integration

Marketplace

DNAnexus cloud genomics platform. Build apps/applets, manage data (upload/download), use the dxpy Python SDK, run workflows, and handle FASTQ/BAM/VCF files for genomics pipeline development and execution.

K-Dense-AI/claude-scientific-skills
3.0k
334
Updated 3d ago

multi-tool-pipeline

Template for chaining multiple MCP tools in a single script.

parcadei/Continuous-Claude-v2
1.3k
76
Updated 3d ago

backend-development

Marketplace

Build robust backend systems with modern technologies (Node.js, Python, Go, Rust), frameworks (NestJS, FastAPI, Django), databases (PostgreSQL, MongoDB, Redis), APIs (REST, GraphQL, gRPC), authentication (OAuth 2.1, JWT), testing strategies, security best practices (OWASP Top 10), performance optimization, scalability patterns (microservices, caching, sharding), DevOps practices (Docker, Kubernetes, CI/CD), and monitoring. Use when designing APIs, implementing authentication, optimizing database queries, setting up CI/CD pipelines, handling security vulnerabilities, building microservices, or developing production-ready backend systems.

mrgoonie/claudekit-skills
1.1k
227
Updated 3d ago