scalability-advisor
Guidance for scaling systems from startup to enterprise scale. Use when planning for growth, diagnosing bottlenecks, or designing systems that need to handle 10x-1000x current load.
$ Installer
git clone https://github.com/alirezarezvani/claude-cto-team /tmp/claude-cto-team && cp -r /tmp/claude-cto-team/skills/scalability-advisor ~/.claude/skills/claude-cto-team/
Tip: Run this command in your terminal to install the skill.
SKILL.md
name: scalability-advisor
description: Guidance for scaling systems from startup to enterprise scale. Use when planning for growth, diagnosing bottlenecks, or designing systems that need to handle 10x-1000x current load.
Scalability Advisor
Provides systematic guidance for scaling systems at different growth stages, identifying bottlenecks, and designing for horizontal scalability.
When to Use
- Planning for 10x, 100x, or 1000x growth
- Diagnosing current performance bottlenecks
- Designing new systems for scale
- Evaluating scaling strategies (vertical vs. horizontal)
- Capacity planning and infrastructure sizing
Scaling Stages Framework
Stage Overview
┌─────────────────────────────────────────────────────────────────────┐
│                           SCALING JOURNEY                           │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  Stage 1          Stage 2          Stage 3          Stage 4         │
│  Startup          Growth           Scale            Enterprise      │
│  0-10K users      10K-100K         100K-1M          1M+ users       │
│                                                                     │
│  Single           Add caching,     Horizontal       Global,         │
│  server           read replicas    scaling          multi-region    │
│                                                                     │
│  $100/mo          $1K/mo           $10K/mo          $100K+/mo       │
└─────────────────────────────────────────────────────────────────────┘
Stage 1: Startup (0-10K Users)
Architecture
┌────────────────────────────────────────┐
│             Single Server              │
│  ┌──────────────────────────────────┐  │
│  │  App Server (Node/Python/etc)    │  │
│  │  + Database (PostgreSQL)         │  │
│  │  + File Storage (local/S3)       │  │
│  └──────────────────────────────────┘  │
└────────────────────────────────────────┘
Key Metrics
| Metric | Target | Warning |
|---|---|---|
| Response time (P95) | < 500ms | > 1s |
| Database queries/request | < 10 | > 20 |
| Server CPU | < 70% | > 85% |
| Database connections | < 50% pool | > 80% pool |
What to Focus On
DO:
- Write clean, maintainable code
- Use database indexes on frequently queried columns
- Implement basic monitoring (uptime, errors)
- Keep architecture simple (monolith is fine)
DON'T:
- Over-engineer for scale you don't have
- Add caching before you need it
- Split into microservices prematurely
- Worry about multi-region yet
When to Move to Stage 2
- Database CPU consistently > 70%
- Response times degrading
- Single queries taking > 100ms
- Server resources maxed
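The Stage 1 warning thresholds and the Stage 2 triggers above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `Stage1Metrics` snapshot whose field names are illustrative only; a real check would look at sustained values over a window rather than a single sample, since the triggers say "consistently".

```python
from dataclasses import dataclass


@dataclass
class Stage1Metrics:
    """Hypothetical metrics snapshot; field names are illustrative."""
    p95_response_ms: float
    queries_per_request: float
    server_cpu_pct: float
    db_cpu_pct: float
    pool_usage_pct: float
    slowest_query_ms: float


def stage2_signals(m: Stage1Metrics) -> list[str]:
    """Return the warning thresholds and Stage 2 triggers this snapshot crosses."""
    checks = [
        (m.p95_response_ms > 1000, "P95 response time above 1s"),
        (m.queries_per_request > 20, "more than 20 queries per request"),
        (m.server_cpu_pct > 85, "server CPU above 85%"),
        (m.pool_usage_pct > 80, "connection pool above 80%"),
        (m.db_cpu_pct > 70, "database CPU above 70%"),
        (m.slowest_query_ms > 100, "single query slower than 100ms"),
    ]
    return [label for crossed, label in checks if crossed]


snapshot = Stage1Metrics(420, 8, 55, 75, 40, 60)
print(stage2_signals(snapshot))  # ['database CPU above 70%']
```

An empty list means none of the listed thresholds are crossed, which is a reasonable signal to stay on the simpler Stage 1 architecture.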
Stage 2: Growth (10K-100K Users)
Architecture
┌─────────────────────────────────────────────────────────────┐
│                                                             │
│  ┌─────────┐     ┌─────────────────────────────────┐        │
│  │   CDN   │     │          Load Balancer          │        │
│  └────┬────┘     └────────────────┬────────────────┘        │
│       │                           │                         │
│       │            ┌──────────────┼──────────────┐          │
│       │            │              │              │          │
│       ▼            ▼              ▼              ▼          │
│  ┌─────────┐  ┌─────────┐    ┌─────────┐    ┌─────────┐     │
│  │ Static  │  │  App 1  │    │  App 2  │    │  App 3  │     │
│  │ Assets  │  └────┬────┘    └────┬────┘    └────┬────┘     │
│  └─────────┘       │              │              │          │
│                    └──────────────┼──────────────┘          │
│                                   │                         │
│                    ┌──────────────┼──────────────┐          │
│                    │              │              │          │
│                    ▼              ▼              ▼          │
│               ┌─────────┐    ┌─────────┐    ┌─────────┐     │
│               │ Primary │    │  Read   │    │  Redis  │     │
│               │   DB    │────│ Replica │    │  Cache  │     │
│               └─────────┘    └─────────┘    └─────────┘     │
│                                                             │
└─────────────────────────────────────────────────────────────┘
Key Additions
| Component | Purpose | When to Add |
|---|---|---|
| CDN | Static asset caching | Images, JS, CSS taking > 20% bandwidth |
| Load Balancer | Distribute traffic | Single server CPU > 70% |
| Read Replicas | Offload reads | > 80% database ops are reads |
| Redis Cache | Application caching | Same queries repeated frequently |
| Job Queue | Async processing | Background tasks blocking requests |
Caching Strategy
Request Flow with Caching:
1. Check CDN (static assets) ─► HIT: Return cached
   │
2. Check Application Cache (Redis) ─► HIT: Return cached
   │
3. Check Database ─► Return + Cache result
What to Cache:
- Session data (TTL: session duration)
- User profile data (TTL: 5-15 minutes)
- API responses (TTL: varies by freshness needs)
- Database query results (TTL: 1-5 minutes)
- Computed values (TTL: based on computation cost)
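Steps 2 and 3 of the request flow describe what is usually called the cache-aside pattern. Below is a minimal sketch of that pattern, assuming an in-memory `TTLCache` as a stand-in for Redis and a caller-supplied `load_from_db` function; both names, the key format, and the 300-second TTL are illustrative choices, not part of the skill.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """In-memory stand-in for Redis: each value is stored with an expiry time."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)


def get_user_profile(user_id: int, cache: TTLCache,
                     load_from_db: Callable[[int], dict]) -> dict:
    """Cache-aside read: try the cache, fall back to the database, cache the result."""
    key = f"user_profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    profile = load_from_db(user_id)
    # 300s is the low end of the 5-15 minute range suggested for profile data.
    cache.set(key, profile, ttl_seconds=300)
    return profile
```

With a real Redis client the structure stays the same; only the `get` and `set` calls change, and expiry is handled by the server instead of a timestamp check.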
Database Optimization
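-- Find slow queries (requires the pg_stat_statements extension).
-- On PostgreSQL 13 and later these columns are named mean_exec_time and total_exec_time.
SELECT query, calls, mean_time, total_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 20;
-- Find tables that may be missing indexes: many sequential scans, no index scans.
-- seq_scan lives in pg_stat_user_tables, not pg_stat_user_indexes.
-- idx_scan is NULL when the table has no indexes at all.
SELECT schemaname, relname, seq_scan, idx_scan
FROM pg_stat_user_tables
WHERE seq_scan > 1000 AND COALESCE(idx_scan, 0) = 0
ORDER BY seq_scan DESC;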
When to Move to Stage 3
- Write traffic overwhelming single primary
- Cache hit rate plateauing despite optimization
- Read replicas falling behind (replication lag keeps growing)
- Need independent scaling of components
Stage 3: Scale (100K-1M Users)
Architecture
┌─────────────────────────────────────────────────────────────────────┐
│                             CDN / Edge                              │
└─────────────────────────────────────────────────────────────────────┘
                                   │
┌─────────────────────────────────────────────────────────────────────┐
│                             API Gateway                             │
│                   (Rate limiting, Auth, Routing)                    │
└─────────────────────────────────────────────────────────────────────┘
                                   │
        ┌──────────────────────────┼──────────────────────────┐
        │                          │                          │
        ▼                          ▼                          ▼
┌───────────────┐          ┌───────────────┐          ┌───────────────┐
│   Service A   │          │   Service B   │          │   Service C   │
│    (Users)    │          │   (Orders)    │          │   (Search)    │
│  Auto-scale   │          │  Auto-scale   │          │  Auto-scale   │
└───────┬───────┘          └───────┬───────┘          └───────┬───────┘
        │                          │                          │
        ▼                          ▼                          ▼
┌───────────────┐          ┌───────────────┐          ┌───────────────┐
│    User DB    │          │   Order DB    │          │ Elasticsearch │
│   (Sharded)   │          │   (Sharded)   │          │   (Cluster)   │
└───────────────┘          └───────────────┘          └───────────────┘
                                   │
                                   ▼
                     ┌───────────────────────────┐
                     │       Message Queue       │
                     │       (Kafka / SQS)       │
                     └───────────────────────────┘
Key Patterns
Database Sharding
Sharding Strategies:
1. Hash-based (user_id % num_shards)
   PRO: Even distribution
   CON: Hard to add shards
2. Range-based (user_id 1-1M → shard 1)
   PRO: Easy to add shards
   CON: Hotspots possible
3. Directory-based (lookup table)
   PRO: Flexible
   CON: Lookup overhead
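The three strategies can be compared side by side as routing functions. This is a minimal sketch under stated assumptions: shards are numbered from 0 (the text numbers them from 1), the function names are illustrative, and `fraction_moved` is a hypothetical helper added to show why hash-based sharding makes it hard to add shards.

```python
import bisect


def hash_shard(user_id: int, num_shards: int) -> int:
    """Hash-based routing as written above: user_id % num_shards."""
    return user_id % num_shards


def range_shard(user_id: int, upper_bounds: list[int]) -> int:
    """Range-based routing: upper_bounds[i] is the largest user_id on shard i."""
    index = bisect.bisect_left(upper_bounds, user_id)
    if index == len(upper_bounds):
        raise ValueError(f"user_id {user_id} is above every configured range")
    return index


def directory_shard(user_id: int, directory: dict[int, int]) -> int:
    """Directory-based routing: an explicit lookup table from user_id to shard."""
    return directory[user_id]


def fraction_moved(num_users: int, old_shards: int, new_shards: int) -> float:
    """Share of users whose hash shard changes when the shard count changes."""
    moved = sum(
        1 for uid in range(num_users)
        if hash_shard(uid, old_shards) != hash_shard(uid, new_shards)
    )
    return moved / num_users


print(range_shard(1_000_000, [1_000_000, 2_000_000]))  # 0
print(fraction_moved(10_000, 4, 5))                    # 0.8
```

Going from 4 to 5 hash shards relocates 80% of the users in this example, whereas adding a range shard only requires appending a new upper bound. Consistent hashing is a common way to reduce that movement, but it is outside what the list above covers.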
Event-Driven Architecture
Synchronous → Asynchronous
Before:
API → Service A → Service B → Service C → Response (slow)
After:
API → Service A → Queue → Response (fast)
                    ↓
          Service B, C process async
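The "After" flow can be illustrated in a single process with the standard library. This is only a sketch: `queue.Queue` and a background thread stand in for a real broker such as Kafka or SQS, and `handle_request`, `worker`, and `run_demo` are hypothetical names.

```python
import queue
import threading


def handle_request(order_id: int, events: queue.Queue) -> dict:
    """Service A: enqueue the follow-up work and respond immediately."""
    events.put({"type": "order_created", "order_id": order_id})
    return {"status": "accepted", "order_id": order_id}


def worker(events: queue.Queue, processed: list) -> None:
    """Stand-in for Services B and C: consume events off the request path."""
    while True:
        event = events.get()
        try:
            if event is None:  # sentinel: stop the worker
                return
            processed.append(event["order_id"])
        finally:
            events.task_done()


def run_demo(order_ids: list) -> tuple:
    events = queue.Queue()
    processed = []
    thread = threading.Thread(target=worker, args=(events, processed), daemon=True)
    thread.start()
    responses = [handle_request(oid, events) for oid in order_ids]
    events.join()     # wait until every queued event has been handled
    events.put(None)  # ask the worker to exit
    thread.join()
    return responses, processed
```

The request handler returns as soon as the event is queued, so its latency no longer depends on how long the downstream services take. The trade-off is that callers get an "accepted" response rather than a final result.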
Scaling Checklist
- Stateless application servers (no local state)
- Database read/write separation
- Asynchronous processing for non-critical paths
- Circuit breakers between services
- Distributed tracing implemented
- Auto-scaling configured with proper metrics
- Database connection pooling (PgBouncer, ProxySQL)
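Of the checklist items, circuit breakers are the one most easily shown in code. Below is a minimal sketch of the pattern, assuming a hypothetical `CircuitBreaker` class with a failure threshold and a reset timeout; production libraries add per-error policies, metrics, and thread safety that are omitted here.

```python
import time
from typing import Any, Callable


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so one trial call is let through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping each call to a downstream service in `breaker.call(...)` means a failing dependency is skipped quickly instead of tying up request threads while it times out.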
When to Move to Stage 4
- Need geographic distribution for latency
- Regulatory requirements (data residency)
- Single region can't handle failover
- Global user base with latency requirements
Stage 4: Enterprise (1M+ Users)
Architecture
┌────────────────────────────────────────────────────────────────────┐
│                        Global Load Balancer                        │
│                     (GeoDNS, Anycast, Route53)                     │
└────────────────────────────────────────────────────────────────────┘
                │                                   │
       ┌────────┴────────┐                 ┌────────┴────────┐
       │                 │                 │                 │
       ▼                 ▼                 ▼                 ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│   US-East    │  │   US-West    │  │   EU-West    │  │   AP-South   │
│    Region    │  │    Region    │  │    Region    │  │    Region    │
│ ┌──────────┐ │  │ ┌──────────┐ │  │ ┌──────────┐ │  │ ┌──────────┐ │
│ │ Services │ │  │ │ Services │ │  │ │ Services │ │  │ │ Services │ │
│ └──────────┘ │  │ └──────────┘ │  │ └──────────┘ │  │ └──────────┘ │
│ ┌──────────┐ │  │ ┌──────────┐ │  │ ┌──────────┐ │  │ ┌──────────┐ │
│ │ Database │ │  │ │ Database │ │  │ │ Database │ │  │ │ Database │ │
│ │(Primary) │ │  │ │(Replica) │ │  │ │(Primary) │ │  │ │(Replica) │ │
│ └──────────┘ │  │ └──────────┘ │  │ └──────────┘ │  │ └──────────┘ │
└──────────────┘  └──────────────┘  └──────────────┘  └──────────────┘
                        │                   │
                        └─────────┬─────────┘
                                  │
                      Cross-Region Replication
Multi-Region Patterns
| Pattern | Consistency | Latency | Complexity |
|---|---|---|---|
| Active-Passive | Strong | High during failover | Low |
| Active-Active | Eventual | Low | High |
| Follow-the-Sun | Strong per region | Medium | Medium |
Data Consistency Strategies
CAP Theorem Trade-offs:
Strong Consistency (CP):
- All regions see same data
- Higher latency for writes
- Use for: Financial transactions, inventory
Eventual Consistency (AP):
- Regions may have stale data briefly
- Low latency always
- Use for: Social feeds, analytics, non-critical
Causal Consistency:
- Related operations ordered correctly
- Balance of latency and correctness
- Use for: Messaging, collaboration
Enterprise Checklist
- Multi-region deployment
- Cross-region data replication
- Global CDN with edge caching
- Disaster recovery tested
- Compliance (SOC 2, GDPR, data residency)
- 99.99% SLA architecture
- Zero-downtime deployments
- Chaos engineering practice
Bottleneck Diagnosis Guide
Finding the Bottleneck
Systematic Diagnosis:
1. Where is time spent?
   └─► Distributed tracing (Jaeger, Datadog)
2. Is it the database?
   └─► Check slow query logs, connection pool
3. Is it the application?
   └─► CPU profiling, memory analysis
4. Is it the network?
   └─► Latency between services, DNS resolution
5. Is it external services?
   └─► Third-party API latency, rate limits
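Before a tracing system is in place, step 1 ("Where is time spent?") can be approximated by timing each stage of a request by hand. This is a rough sketch, not a replacement for Jaeger or Datadog; `span` and `slowest` are hypothetical helper names, and the sleeps stand in for real work.

```python
import time
from contextlib import contextmanager


@contextmanager
def span(name: str, timings: dict):
    """Accumulate wall-clock time spent inside the block under the given name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = timings.get(name, 0.0) + (time.perf_counter() - start)


def slowest(timings: dict) -> str:
    """Return the name of the stage with the largest total time."""
    return max(timings, key=timings.get)


timings = {}
with span("database", timings):
    time.sleep(0.05)   # placeholder for a query
with span("external_api", timings):
    time.sleep(0.01)   # placeholder for a third-party call
print(slowest(timings))
```

The layer that dominates the timings tells you which of questions 2 through 5 to investigate first.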
Common Bottlenecks by Layer
| Layer | Symptoms | Solutions |
|---|---|---|
| Database | Slow queries, high CPU | Indexing, read replicas, caching |
| Application | High CPU, memory | Optimize code, scale horizontally |
| Network | High latency, timeouts | CDN, edge caching, connection pooling |
| Storage | Slow I/O, high wait | SSD, object storage, caching |
| External APIs | Timeouts, rate limits | Circuit breakers, caching, fallbacks |
Database Bottleneck Checklist
Quick Database Health Check
1. Connection Pool
- Current connections vs max?
- Connection wait time?
- Pool exhaustion events?
2. Query Performance
- Slowest queries (pg_stat_statements)?
- Missing indexes (seq scans > 10K)?
- Lock contention?
3. Replication
- Replica lag?
- Write throughput?
- Read distribution?
4. Storage
- Disk I/O wait?
- Table/index bloat?
- WAL write latency?
Scaling Calculations
Capacity Planning Formula
Required Capacity = Peak Traffic × Growth Factor × Safety Margin
Example:
- Current peak: 1,000 req/sec
- Expected growth: 3x in 12 months
- Safety margin: 1.5x
Required: 1,000 × 3 × 1.5 = 4,500 req/sec capacity
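The formula translates directly into a small helper. In this sketch, `required_capacity` mirrors the formula above, while `instances_needed` and the 400 req/sec per-instance figure are assumptions added for illustration.

```python
import math


def required_capacity(peak_rps: float, growth_factor: float, safety_margin: float) -> float:
    """Required Capacity = Peak Traffic x Growth Factor x Safety Margin."""
    return peak_rps * growth_factor * safety_margin


def instances_needed(capacity_rps: float, per_instance_rps: float) -> int:
    """Hypothetical follow-up step: round up to whole instances."""
    return math.ceil(capacity_rps / per_instance_rps)


capacity = required_capacity(1_000, 3, 1.5)
print(capacity)                         # 4500.0
print(instances_needed(capacity, 400))  # 12
```

The per-instance throughput should come from a load test of your own service, not from a default.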
Database Sizing
Connection Pool Size:
connections = (num_cores × 2) + effective_spindle_count
Example: 8 cores, SSD
connections = (8 × 2) + 1 = 17 connections per instance
Read Replica Sizing:
replicas = ceiling(read_traffic / single_replica_capacity)
Example: 10,000 reads/sec, 3,000/replica capacity
replicas = ceiling(10,000 / 3,000) = 4 replicas
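Both sizing rules are simple enough to encode and rerun as the inputs change. This is a minimal sketch with illustrative function names; following the example above, an SSD is counted as one effective spindle.

```python
import math


def pool_size(num_cores: int, effective_spindle_count: int = 1) -> int:
    """connections = (num_cores x 2) + effective_spindle_count."""
    return num_cores * 2 + effective_spindle_count


def replicas_needed(read_rps: float, per_replica_rps: float) -> int:
    """replicas = ceiling(read_traffic / single_replica_capacity)."""
    return math.ceil(read_rps / per_replica_rps)


print(pool_size(8, 1))                 # 17
print(replicas_needed(10_000, 3_000))  # 4
```

Treat both results as starting points to validate under load rather than as fixed limits.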
Cache Sizing
Cache Size:
memory = working_set_size × (1 + overhead_factor)
Working set = frequently accessed data (usually 10-20% of total)
Overhead factor = ~0.5 for Redis data structures (about 1.5x the raw data size)
Example: 10GB working set
Redis memory = 10GB × (1 + 0.5) = 15GB
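The cache sizing rule can be sketched the same way. This assumes the example's 1.5x multiplier corresponds to `overhead_factor = 0.5` in the formula, and uses 15% as a hypothetical default for the hot fraction, taken from the middle of the 10-20% range.

```python
def working_set_gb(total_data_gb: float, hot_fraction: float = 0.15) -> float:
    """Estimate the working set as the frequently accessed share of all data."""
    return total_data_gb * hot_fraction


def cache_memory_gb(working_set: float, overhead_factor: float = 0.5) -> float:
    """memory = working_set_size x (1 + overhead_factor)."""
    return working_set * (1 + overhead_factor)


print(cache_memory_gb(10))  # 15.0
```

Measuring the real working set (for example from key access statistics) is more reliable than the percentage estimate once the system is in production.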
Quick Reference
Scaling Decision Matrix
| Symptom | First Try | Then Try | Finally |
|---|---|---|---|
| Slow page loads | Add caching | CDN | Edge compute |
| Database slow | Add indexes | Read replicas | Sharding |
| API timeouts | Async processing | Circuit breakers | Event-driven |
| High server CPU | Vertical scale | Horizontal scale | Optimize code |
| High memory | Increase RAM | Fix memory leaks | Redesign data structures |
Infrastructure Cost at Scale
| Users | Architecture | Monthly Cost |
|---|---|---|
| 10K | Single server | $100-300 |
| 100K | Load balanced + cache | $1,000-3,000 |
| 1M | Microservices + sharding | $10,000-30,000 |
| 10M | Multi-region | $100,000+ |
References
- Bottleneck Diagnosis Guide - Detailed troubleshooting
- Capacity Planning Calculator - Sizing formulas
Repository: alirezarezvani/claude-cto-team/skills/scalability-advisor
Author: alirezarezvani