cloud-infrastructure
Cloud platforms (AWS, Cloudflare, GCP, Azure), containerization (Docker), Kubernetes, Infrastructure as Code (Terraform), CI/CD, and observability.
$ Install
git clone https://github.com/pluginagentmarketplace/custom-plugin-cloudflare /tmp/custom-plugin-cloudflare && cp -r /tmp/custom-plugin-cloudflare/skills/cloud-infrastructure ~/.claude/skills/custom-plugin-cloudflare
Tip: Run this command in your terminal to install the skill.
═══════════════════════════════════════════════════════════════════════════
SKILL: Cloud Infrastructure
Version: 2.0.0 | Updated: 2025-01
═══════════════════════════════════════════════════════════════════════════
name: cloud-infrastructure
description: Cloud platforms (AWS, Cloudflare, GCP, Azure), containerization (Docker), Kubernetes, Infrastructure as Code (Terraform), CI/CD, and observability.
ACTIVATION TRIGGERS
triggers:
- aws
- kubernetes
- docker
- cloud
- terraform
- devops
- ci-cd
- cloudflare
- gcp
- azure
SKILL PARAMETERS
parameters:
  platform:
    type: string
    enum: [aws, gcp, azure, cloudflare, multi-cloud]
    required: true
  focus:
    type: string
    enum: [compute, containers, iac, cicd, monitoring]
    required: false
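A minimal sketch of how a caller might check these parameters before invoking the skill. The allowed values are copied from the parameter block; the function name `validate_parameters` and its return shape are hypothetical and not part of the skill itself.

```python
from typing import Optional

# Allowed values copied from the SKILL PARAMETERS block.
PLATFORMS = {"aws", "gcp", "azure", "cloudflare", "multi-cloud"}
FOCUS_AREAS = {"compute", "containers", "iac", "cicd", "monitoring"}


def validate_parameters(platform: str, focus: Optional[str] = None) -> dict:
    """Return the parameters unchanged, or raise ValueError (hypothetical helper)."""
    # platform is required and must be one of the enum values.
    if platform not in PLATFORMS:
        raise ValueError(f"platform must be one of {sorted(PLATFORMS)}, got {platform!r}")
    # focus is optional, but when given it must be one of the enum values.
    if focus is not None and focus not in FOCUS_AREAS:
        raise ValueError(f"focus must be one of {sorted(FOCUS_AREAS)}, got {focus!r}")
    return {"platform": platform, "focus": focus}
```

For example, `validate_parameters("aws", "iac")` passes, while an unlisted platform raises `ValueError`.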
OUTPUT SPECIFICATION
outputs:
  architecture:
    type: object
  services:
    type: array
  learning_path:
    type: array
RELIABILITY
retry:
  max_attempts: 3
  backoff: exponential
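The retry policy names only an attempt count and a backoff type. The sketch below shows one common reading of it: up to three attempts, with the wait doubling after each failure. The one-second `base_delay`, the injectable `sleep` argument, and the function name are assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], max_attempts: int = 3,
          base_delay: float = 1.0,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as an argument keeps the function easy to test without real waiting; production code would usually also add jitter and catch only errors known to be transient.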
OBSERVABILITY
observability:
  log_level: info
level: advanced
prerequisites:
- linux-basics
- networking-basics
sasmp_version: "1.3.0"
bonded_agent: 01-core-paths
bond_type: PRIMARY_BOND
Cloud Infrastructure Skill
Quick Reference
| Platform | Market share | Best for | Learning time |
|---|---|---|---|
| AWS | 32% | General-purpose workloads | 3-6 mo |
| Azure | 24% | Microsoft stack | 3-6 mo |
| GCP | 11% | Data, ML | 3-6 mo |
| Cloudflare | Edge | CDN, Workers | 2-4 wk |
Learning Paths
AWS
[1] IAM + VPC (1-2 wk)
│ └─ Roles, policies, networking
│
▼
[2] Compute: EC2, Lambda (2-3 wk)
│
▼
[3] Storage: S3, EBS (1-2 wk)
│
▼
[4] Database: RDS, DynamoDB (2-3 wk)
│
▼
[5] Containers: ECS, EKS (3-4 wk)
│
▼
[6] Monitoring: CloudWatch (1-2 wk)
Docker & Containers
[1] Docker Basics (1 wk)
│ └─ Images, containers, Dockerfile
│
▼
[2] Multi-stage Builds (1 wk)
│ └─ Optimization, layer caching
│
▼
[3] Docker Compose (1 wk)
│ └─ Multi-container apps
│
▼
[4] Registry & Security (1 wk)
└─ Push/pull, scanning, non-root
Kubernetes
[1] Pods & Deployments (2 wk)
│
▼
[2] Services & Networking (1-2 wk)
│
▼
[3] ConfigMaps & Secrets (1 wk)
│
▼
[4] Helm Charts (2 wk)
│
▼
[5] Production Patterns (ongoing)
└─ HPA, PDB, resource limits
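To make the HPA step concrete, the sketch below reproduces the core scaling formula from the Kubernetes Horizontal Pod Autoscaler documentation: desired replicas = ceil(current replicas × current metric / target metric). The function name and the default bounds are assumptions; the real controller also applies a tolerance band and stabilization windows that are left out here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Approximate the HPA replica calculation (tolerance and stabilization omitted)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # Core formula: scale the replica count by the ratio of observed to target load.
    raw = math.ceil(current_replicas * (current_metric / target_metric))
    # Clamp to the configured bounds, as minReplicas/maxReplicas do on a real HPA.
    return max(min_replicas, min(max_replicas, raw))
```

With 4 replicas averaging 90% CPU against a 60% target, the formula asks for 6 replicas; at 30% it scales down to 2.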
Terraform (IaC)
[1] Resources & State (1 wk)
│
▼
[2] Variables & Outputs (1 wk)
│
▼
[3] Modules (1-2 wk)
│
▼
[4] Remote State (1 wk)
│
▼
[5] Workspaces & Environments (1 wk)
Kubernetes Quick Reference
| Resource | Purpose | Example |
|---|---|---|
| Pod | Smallest unit | Single container |
| Deployment | Manage replicas | Web app |
| Service | Network access | ClusterIP, LoadBalancer |
| Ingress | HTTP routing | Path-based routing |
| ConfigMap | Configuration | Environment variables |
| Secret | Sensitive data | Credentials |
| StatefulSet | Stateful apps | Databases |
Terraform Structure
project/
├── main.tf # Resources
├── variables.tf # Inputs
├── outputs.tf # Outputs
├── providers.tf # Provider config
├── versions.tf # Version constraints
├── modules/
│ ├── vpc/
│ ├── eks/
│ └── rds/
└── environments/
    ├── dev.tfvars
    ├── staging.tfvars
    └── prod.tfvars
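One way to use this layout is to select the variable file per environment when planning. The sketch below only builds the command lines; `-var-file` and `-out` are standard `terraform plan` flags, while the helper names and the `tfplan` file name are assumptions.

```python
from pathlib import PurePosixPath
from typing import List

# Environment names taken from the environments/ directory in the layout.
ENVIRONMENTS = ("dev", "staging", "prod")


def plan_command(environment: str, plan_file: str = "tfplan") -> List[str]:
    """Build a terraform plan command that loads environments/<env>.tfvars."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment {environment!r}")
    var_file = PurePosixPath("environments") / f"{environment}.tfvars"
    # Saving the plan with -out lets apply run exactly what was reviewed.
    return ["terraform", "plan", f"-var-file={var_file}", f"-out={plan_file}"]


def apply_command(plan_file: str = "tfplan") -> List[str]:
    """Build a terraform apply command for a previously saved plan file."""
    return ["terraform", "apply", plan_file]
```

The returned lists can be passed to `subprocess.run(..., check=True)` from the project root, which keeps the plan-then-apply order explicit.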
CI/CD Pipeline Template
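# GitHub Actions
# Assumes the runner is already authenticated to the registry and the cluster.
name: CI/CD
on:
  push:
    branches: [main]
jobs:
  build-test-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build
        # Tag with the registry path and commit SHA so the Push step can find the image
        run: docker build -t registry/app:${{ github.sha }} .
      - name: Test
        run: docker run --rm registry/app:${{ github.sha }} pytest
      - name: Push
        run: docker push registry/app:${{ github.sha }}
      - name: Deploy
        if: github.ref == 'refs/heads/main'
        run: kubectl set image deployment/app app=registry/app:${{ github.sha }}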
Monitoring Stack
┌─────────────────────────────────────────┐
│ OBSERVABILITY STACK │
├─────────────────────────────────────────┤
│ Metrics: Prometheus → Grafana │
│ Logs: Loki / ELK │
│ Traces: Jaeger / Tempo │
│ Alerts: Alertmanager → PagerDuty │
└─────────────────────────────────────────┘
Troubleshooting
Container not starting?
├─► docker logs <container>
├─► Check port conflicts
├─► Check image name/tag
└─► Check resource limits
Pod in CrashLoopBackOff?
├─► kubectl describe pod <name>
├─► kubectl logs <pod>
├─► Check resource limits
├─► Check probes configuration
└─► Check image pull secrets
Terraform apply fails?
├─► terraform plan first
├─► Check state lock
├─► terraform import resources that already exist
└─► Restore state from backup
High cloud bill?
├─► Enable cost alerts
├─► Right-size instances
├─► Use spot instances
├─► Delete unused resources
└─► Storage lifecycle policies
Common Failure Modes
| Symptom | Root Cause | Recovery |
|---|---|---|
| Pod CrashLoopBackOff | App error or OOM | Check logs, increase limits |
| ImagePullBackOff | Wrong image or auth | Verify image, check secrets |
| Terraform drift | Manual changes outside Terraform | terraform import, or terraform apply to reconcile |
| Slow deploys | Large images | Multi-stage builds, layer caching |
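The first two rows of the table can be detected automatically from the output of `kubectl get pods -o json`. The sketch below reads the `status.containerStatuses[].state.waiting.reason` field of each pod; the function name and the result shape are assumptions, and init containers (reported under `initContainerStatuses`) are not covered.

```python
import json
from typing import Dict, List

# Waiting reasons that correspond to rows in the failure-mode table.
FAILURE_REASONS = {"CrashLoopBackOff", "ImagePullBackOff"}


def failing_containers(pod_list_json: str) -> List[Dict[str, object]]:
    """List containers stuck in a known failure state (hypothetical helper)."""
    findings = []
    for pod in json.loads(pod_list_json).get("items", []):
        for status in pod.get("status", {}).get("containerStatuses", []):
            # "waiting" is only present while the container is not running.
            waiting = status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") in FAILURE_REASONS:
                findings.append({
                    "pod": pod["metadata"]["name"],
                    "container": status["name"],
                    "reason": waiting["reason"],
                    "restarts": status.get("restartCount", 0),
                })
    return findings
```

A high restart count alongside `CrashLoopBackOff` points toward the logs and resource limits; `ImagePullBackOff` points toward the image name and pull secrets.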
Best Practices
Docker
- Use multi-stage builds
- Run as non-root user
- Use .dockerignore
- Pin base image versions
- Scan for vulnerabilities
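The "pin base image versions" practice can be checked with a small script. The sketch below is a simplified, hypothetical linter: it treats a `FROM` image as unpinned when it has no tag, uses `latest`, and has no digest. It does not resolve `ARG` variables or handle line continuations.

```python
from typing import List


def unpinned_base_images(dockerfile_text: str) -> List[str]:
    """Return FROM images that have no explicit tag or digest (simplified check)."""
    stages = set()  # names introduced with "FROM ... AS <name>"
    unpinned = []
    for line in dockerfile_text.splitlines():
        parts = line.strip().split()
        if len(parts) < 2 or parts[0].upper() != "FROM":
            continue
        # Drop flags such as --platform=linux/amd64.
        args = [p for p in parts[1:] if not p.startswith("--")]
        if not args:
            continue
        image = args[0]
        # scratch and earlier build stages are not registry images.
        if image != "scratch" and image.lower() not in stages:
            # Look for the tag after the last "/" so registry ports are ignored.
            tail = image.rsplit("/", 1)[-1]
            if "@" not in image and (":" not in tail or tail.endswith(":latest")):
                unpinned.append(image)
        if len(args) >= 3 and args[1].upper() == "AS":
            stages.add(args[2].lower())
    return unpinned
```

Running it over a Dockerfile in CI and failing the build when the list is non-empty is one way to enforce the rule.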
Kubernetes
- Set resource requests/limits
- Use readiness/liveness probes
- Store config in ConfigMaps
- Use namespaces for isolation
- Enable network policies
Terraform
- Use remote state (S3, GCS)
- Lock state file
- Use modules for reuse
- Plan before apply
- Tag all resources
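"Plan before apply" and "tag all resources" can be combined by inspecting a saved plan. The sketch below reads the JSON produced by `terraform show -json <planfile>` and lists resources whose planned `tags` attribute is empty. It assumes providers that expose tags as a `tags` attribute (as the AWS provider does); the function name is hypothetical.

```python
import json
from typing import List


def untagged_resources(plan_json: str) -> List[str]:
    """Return addresses of planned resources that support tags but have none."""
    addresses = []
    for rc in json.loads(plan_json).get("resource_changes", []):
        # "after" is null for resources that are being destroyed.
        after = rc.get("change", {}).get("after") or {}
        # Only resource types that expose a "tags" attribute are checked.
        if "tags" in after and not after["tags"]:
            addresses.append(rc["address"])
    return addresses
```

A CI step could fail the pipeline when this list is non-empty, before `terraform apply` runs.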
Next Actions
Specify your cloud platform and focus area for detailed guidance.