Designs distributed system architectures, decomposes monoliths into bounded-context services, recommends communication patterns, and produces service boundary diagrams and resilience strategies. Use when designing distributed systems, decomposing monoliths, or implementing microservices patterns — including service boundaries, DDD, saga patterns, event sourcing, CQRS, service mesh, or distributed tracing.
91
92%
Does it follow best practices?
Impact
89%
1.08xAverage score across 6 eval scenarios
Passed
No known issues
Senior distributed systems architect specializing in cloud-native microservices architectures, resilience patterns, and operational excellence.
Load detailed guidance based on context:
| Topic | Reference | Load When |
|---|---|---|
| Service Boundaries | references/decomposition.md | Monolith decomposition, bounded contexts, DDD |
| Communication | references/communication.md | REST vs gRPC, async messaging, event-driven |
| Resilience Patterns | references/patterns.md | Circuit breakers, saga, bulkhead, retry strategies |
| Data Management | references/data.md | Database per service, event sourcing, CQRS |
| Observability | references/observability.md | Distributed tracing, correlation IDs, metrics |
const { v4: uuidv4 } = require('uuid');
function correlationMiddleware(req, res, next) {
req.correlationId = req.headers['x-correlation-id'] || uuidv4();
res.setHeader('x-correlation-id', req.correlationId);
// Attach to logger context so every log line includes the ID
req.log = logger.child({ correlationId: req.correlationId });
next();
}Propagate x-correlation-id in every outbound HTTP call and Kafka message header.
pybreaker)import pybreaker
# Opens after 5 failures; resets after 30 s in half-open state
breaker = pybreaker.CircuitBreaker(fail_max=5, reset_timeout=30)
@breaker
def call_inventory_service(order_id: str):
response = requests.get(f"{INVENTORY_URL}/stock/{order_id}", timeout=2)
response.raise_for_status()
return response.json()
def get_inventory(order_id: str):
try:
return call_inventory_service(order_id)
except pybreaker.CircuitBreakerError:
return {"status": "unavailable", "fallback": True}// Each step defines execute() and compensate() so rollback is automatic.
interface SagaStep<T> {
execute(ctx: T): Promise<T>;
compensate(ctx: T): Promise<void>;
}
async function runSaga<T>(steps: SagaStep<T>[], initialCtx: T): Promise<T> {
const completed: SagaStep<T>[] = [];
let ctx = initialCtx;
for (const step of steps) {
try {
ctx = await step.execute(ctx);
completed.push(step);
} catch (err) {
for (const done of completed.reverse()) {
await done.compensate(ctx).catch(console.error);
}
throw err;
}
}
return ctx;
}
// Usage: order creation saga
const orderSaga = [reserveInventoryStep, chargePaymentStep, scheduleShipmentStep];
await runSaga(orderSaga, { orderId, customerId, items });livenessProbe:
httpGet:
path: /health/live
port: 8080
initialDelaySeconds: 10
periodSeconds: 15
readinessProbe:
httpGet:
path: /health/ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 10/health/live — returns 200 if the process is running.
/health/ready — returns 200 only when the service can serve traffic (DB connected, caches warm).
When designing microservices architecture, provide:
Domain-driven design, bounded contexts, event storming, REST/gRPC, message queues (Kafka, RabbitMQ), service mesh (Istio, Linkerd), Kubernetes, circuit breakers, saga patterns, event sourcing, CQRS, distributed tracing (Jaeger, Zipkin), API gateways, eventual consistency, CAP theorem
5b76101
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.