Migration & Upgrades

API Migration: Monolith to Microservices

How to decompose a monolithic API into microservices — domain boundary identification, API gateway introduction, shared database decomposition, and status code consistency.

When to Migrate — and When Not To

Microservices solve organizational and operational problems, not technical ones. Before starting, be clear about your actual pain points:

Indicators You Should Migrate

  • Team scaling: Multiple teams stepping on each other's code, merge conflicts are constant, a single team owns "everything" and is the bottleneck
  • Deployment coupling: A bug fix to the email service requires deploying the entire application, including the payment system
  • Technology diversity: You need to use Python for ML inference, Go for high-throughput streaming, and Node.js for real-time collaboration — impractical in a single codebase
  • Scaling asymmetry: Your image processing service needs 50 servers but your admin panel needs 2 — you cannot scale them independently

Indicators You Should NOT Migrate

  • You have fewer than 3-5 engineering teams
  • Your monolith is not yet well-understood or documented
  • You want to "fix" code quality issues — microservices distribute complexity, they do not eliminate it
  • You lack observability infrastructure (distributed tracing is non-negotiable)

Amazon CTO Werner Vogels: *"Don't do microservices unless you have a monolith that is too painful to work with."*

Step 1: Domain Boundary Identification

Event Storming

Event Storming is a collaborative workshop technique to discover domain boundaries. Gather engineers, product managers, and domain experts around a long paper roll:

1. Domain Events (orange sticky) — things that happen: "Order Placed", "Payment Captured"
2. Commands (blue sticky)         — what triggers events: "Place Order", "Capture Payment"
3. Aggregates (yellow sticky)     — the data that commands act on: Order, Payment, User
4. Bounded Contexts              — circles grouping related aggregates
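
The sticky-note categories above map naturally onto code. A minimal sketch (the names PlaceOrder, OrderPlaced, and Order are illustrative, not from any particular system):

```python
# Hypothetical sketch: Event Storming sticky-note categories as types.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class PlaceOrder:            # Command (blue sticky) — an intent
    user_id: str
    item_ids: list[str]

@dataclass(frozen=True)
class OrderPlaced:           # Domain event (orange sticky) — a fact that happened
    order_id: str
    user_id: str
    placed_at: str

@dataclass
class Order:                 # Aggregate (yellow sticky) — the data the command acts on
    order_id: str
    user_id: str
    item_ids: list[str]
    status: str = 'placed'

def handle_place_order(cmd: PlaceOrder, next_id: str) -> tuple[Order, OrderPlaced]:
    """A command handler creates/updates the aggregate and emits the event."""
    order = Order(order_id=next_id, user_id=cmd.user_id, item_ids=cmd.item_ids)
    event = OrderPlaced(order_id=next_id, user_id=cmd.user_id,
                        placed_at=datetime.now(timezone.utc).isoformat())
    return order, event
```

Commands and events that consistently reference the same aggregates tend to cluster into the same bounded context.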

API Surface Analysis

Analyze your monolith's internal module dependencies to find natural seams:

# Find Python module dependencies (using pydeps):
pip install pydeps
pydeps your_app --max-bacon 2 --pylib-all  # Generate dependency graph

# Count cross-module imports to identify coupling:
grep -r 'from orders import\|from payments import' --include='*.py' | \
  awk -F: '{print $1}' | sort | uniq -c | sort -rn

Tight clusters of modules that rarely import from other clusters are natural service boundaries. High coupling between potential service boundaries means high coordination cost — avoid splitting there first.
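
If grep pipelines get unwieldy, the same coupling count can be done in pure Python with the ast module. A sketch, reusing the hypothetical 'orders' and 'payments' module names from the shell snippet:

```python
# Count how often a source file imports each of the candidate modules,
# using ast instead of grep. Module names are illustrative.
import ast

def count_imports_of(source: str, targets: set[str]) -> dict[str, int]:
    """Count imports of each module in `targets` within `source`."""
    counts = {t: 0 for t in targets}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                root = alias.name.split('.')[0]   # 'payments.api' -> 'payments'
                if root in counts:
                    counts[root] += 1
        elif isinstance(node, ast.ImportFrom) and node.module:
            root = node.module.split('.')[0]
            if root in counts:
                counts[root] += 1
    return counts
```

Run it over every *.py file and sum the counts per directory to get a rough module-to-module coupling matrix.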

Bounded Context Example

┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐
│  Order Context   │  │ Payment Context  │  │ Catalog Context  │
│                  │  │                  │  │                  │
│ Order            │  │ Payment          │  │ Product          │
│ OrderItem        │  │ Invoice          │  │ Category         │
│ ShippingAddress  │  │ Refund           │  │ Inventory        │
└──────────────────┘  └──────────────────┘  └──────────────────┘

Step 2: API Gateway Introduction

The API gateway is the single entry point for all clients. It handles cross-cutting concerns so individual services do not have to.

Routing Rules

# Kong API Gateway configuration example:
services:
  - name: order-service
    url: http://order-service:8080
    routes:
      - name: orders-route
        paths: ['/api/orders', '/api/orders/*']

  - name: payment-service
    url: http://payment-service:8080
    routes:
      - name: payments-route
        paths: ['/api/payments', '/api/payments/*']

  # During migration: monolith handles everything else
  - name: monolith
    url: http://monolith:8080
    routes:
      - name: monolith-fallback
        paths: ['/api/*']
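
The precedence rule that makes the fallback route safe is longest-prefix-wins: a request to /api/orders/42 matches both orders-route and the monolith fallback, but the more specific path takes priority. A framework-free sketch of that rule, mirroring the config above:

```python
# Sketch of gateway routing precedence: the longest matching path prefix
# wins, so specific service routes beat the /api/ monolith fallback.
ROUTES = [
    ('/api/orders', 'order-service'),
    ('/api/payments', 'payment-service'),
    ('/api/', 'monolith'),             # migration fallback
]

def route(path: str) -> str:
    """Pick the service whose route prefix matches the most of the path."""
    matches = [(prefix, svc) for prefix, svc in ROUTES if path.startswith(prefix)]
    if not matches:
        raise LookupError(f'no route for {path}')
    return max(matches, key=lambda m: len(m[0]))[1]
```

As services are carved out of the monolith, each new route simply shadows part of the fallback's path space — clients never notice the cutover.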

Authentication Centralization

Move auth from each service to the gateway:

# Kong JWT plugin on all routes:
plugins:
  - name: jwt
    config:
      claims_to_verify: [exp]
      key_claim_name: kid

Each microservice receives the already-validated claims as headers and trusts the gateway:

# In your microservice — trust the gateway's user context:
from fastapi import Request

def get_current_user(request: Request) -> dict:
    # Gateway passes validated claims as headers:
    user_id = request.headers.get('X-User-Id')
    user_roles = request.headers.get('X-User-Roles', '').split(',')
    return {'id': user_id, 'roles': user_roles}
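
Authorization decisions can build on the same gateway-supplied headers. A sketch with two hypothetical helpers (framework-agnostic, since they only look at a header dict):

```python
# Hypothetical helpers: build the user context from gateway-injected
# headers, then enforce a role. Header names match the example above.
def parse_user_headers(headers: dict[str, str]) -> dict:
    """Build the user context from gateway-injected headers."""
    roles = headers.get('X-User-Roles', '')
    return {
        'id': headers.get('X-User-Id'),
        'roles': [r for r in roles.split(',') if r],   # drop empty entries
    }

def require_role(user: dict, role: str) -> None:
    """Raise if the gateway-authenticated user lacks the required role."""
    if role not in user['roles']:
        raise PermissionError(f'requires role {role!r}')
```

Because validation happened once at the gateway, these checks stay cheap and identical across services.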

Request Aggregation

The gateway can combine multiple microservice calls into a single client response, sparing mobile clients several round-trips:

GET /api/dashboard  (single client request)
     │
     ├─ GET /api/orders?limit=5         (order-service)
     ├─ GET /api/notifications/unread   (notification-service)
     └─ GET /api/user/profile           (user-service)
     │
     └─ Aggregated JSON response to client
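
The three backend calls should fan out concurrently, not sequentially. A sketch using asyncio.gather; fetch() is a stand-in for a real HTTP client call (httpx, aiohttp):

```python
# Sketch of the dashboard aggregation above: fan out the three backend
# calls concurrently with asyncio.gather, then merge into one response.
import asyncio

async def fetch(service: str, path: str) -> dict:
    # Placeholder for an HTTP call to the named service.
    await asyncio.sleep(0)                    # simulate network I/O
    return {'service': service, 'path': path}

async def get_dashboard() -> dict:
    orders, unread, profile = await asyncio.gather(
        fetch('order-service', '/api/orders?limit=5'),
        fetch('notification-service', '/api/notifications/unread'),
        fetch('user-service', '/api/user/profile'),
    )
    return {'orders': orders, 'notifications': unread, 'profile': profile}
```

With concurrent fan-out, the aggregated endpoint's latency is the slowest backend call, not the sum of all three.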

Step 3: Database Decomposition

The Shared Database Anti-Pattern

The most common mistake: run microservices but keep a single shared database. This couples services at the data layer and negates most microservice benefits.

WRONG:
  Order Service ──┐
  Payment Service─┼──► shared PostgreSQL DB  (tight coupling!)
  User Service  ──┘

RIGHT:
  Order Service   ──► orders_db (PostgreSQL)
  Payment Service ──► payments_db (PostgreSQL)
  User Service    ──► users_db (PostgreSQL)

Data Sync During Transition

During migration, you will temporarily need data in multiple services. Options:

Option 1: Database view sharing (read-only bridge)
  Orders DB ──► read replica ──► Payment Service (temporary read access)

Option 2: Event-driven synchronization
  Order Service publishes OrderCreated event
  Payment Service subscribes and maintains its own order snapshot

Option 3: API calls between services
  Payment Service calls GET /api/users/{id} when it needs user data
  (acceptable for low-frequency lookups, problematic for high-volume joins)
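
Option 2 in code: a sketch of the subscriber side, where the Payment Service keeps its own snapshot of order data fed by events. The in-memory dict stands in for the service's own database table:

```python
# Sketch of event-driven sync (Option 2): a payment-service-local copy of
# order data, updated from OrderCreated events. Message brokers redeliver,
# so the handler must be idempotent.
import json

class OrderSnapshotStore:
    """Payment-service-local snapshot of order data, fed by events."""
    def __init__(self) -> None:
        self.orders: dict[str, dict] = {}

    def handle_order_created(self, payload: str) -> None:
        event = json.loads(payload)
        # Upsert keyed by order_id makes redelivered events harmless.
        self.orders[event['order_id']] = event
```

The upsert-by-key pattern is the simplest form of idempotency; at-least-once delivery then costs nothing but a redundant write.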

Eventual Consistency

Accept that data across services will be temporarily inconsistent. Design UX accordingly:

# Order service publishes event after database write (Order and
# MessagePublisher are the service's own model class and broker client):
import json
from dataclasses import dataclass, asdict
@dataclass
class OrderCreatedEvent:
    order_id: str
    user_id: str
    total_cents: int
    created_at: str

def publish_order_created(order: Order, publisher: MessagePublisher) -> None:
    event = OrderCreatedEvent(
        order_id=str(order.id),
        user_id=str(order.user_id),
        total_cents=order.total_cents,
        created_at=order.created_at.isoformat(),
    )
    publisher.publish('orders.created', json.dumps(asdict(event)))

Step 4: Status Code Consistency Across Services

One of the most common microservices mistakes: each service invents its own error format, leaving clients to parse different error schemas per endpoint.

Unified Error Contract (RFC 7807)

Define a shared error library used by all services:

# shared-lib/error_response.py — used by all microservices
from dataclasses import dataclass
from typing import Any

@dataclass
class ErrorResponse:
    type: str       # URL to error docs
    title: str      # Human-readable summary
    status: int     # HTTP status code
    detail: str     # Request-specific explanation
    instance: str   # Request path
    trace_id: str   # Correlation ID for distributed tracing

    def to_dict(self) -> dict[str, Any]:
        return {
            'type': self.type,
            'title': self.title,
            'status': self.status,
            'detail': self.detail,
            'instance': self.instance,
            'traceId': self.trace_id,
        }

Correlation IDs and Distributed Tracing

The API gateway injects a unique X-Trace-Id header on every request. Each service passes it downstream and logs it with every line:

import uuid
from typing import Any

from fastapi import FastAPI, Request, Response

app = FastAPI()

@app.middleware('http')
async def trace_id_middleware(request: Request, call_next: Any) -> Response:
    trace_id = request.headers.get('X-Trace-Id', str(uuid.uuid4()))
    request.state.trace_id = trace_id
    response = await call_next(request)
    response.headers['X-Trace-Id'] = trace_id
    return response
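
The middleware covers the inbound side; the outbound side matters just as much. Every call a service makes downstream must copy the incoming X-Trace-Id into its own request headers, or the chain breaks at that hop. A sketch (build_outgoing_headers is a hypothetical helper):

```python
# Downstream propagation sketch: forward the incoming trace ID on every
# outgoing call, minting a new one only if the gateway didn't supply it.
import uuid

def build_outgoing_headers(incoming: dict[str, str]) -> dict[str, str]:
    """Headers for an outgoing service-to-service call."""
    trace_id = incoming.get('X-Trace-Id') or str(uuid.uuid4())
    return {'X-Trace-Id': trace_id}
```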

When an error occurs, the client receives the traceId in the error response and can provide it to support. Engineers query Jaeger, Zipkin, or Datadog APM to trace the full request path across all services.

HTTP Status Code Consistency Rules

Enforce these rules across all services via an API style guide:

200 OK           — successful GET/PATCH/PUT (with response body)
201 Created      — successful POST that created a resource (with Location header)
204 No Content   — successful DELETE or POST with no response body
400 Bad Request  — invalid input, schema validation failure
401 Unauthorized — missing or invalid authentication token
403 Forbidden    — valid token, insufficient permissions
404 Not Found    — resource does not exist
409 Conflict     — optimistic locking conflict, duplicate resource
422 Unprocessable Entity — semantic validation failure (valid schema, invalid logic)
429 Too Many Requests   — rate limit exceeded (add Retry-After header)
500 Internal Server Error — unexpected server failure (never expose internals)
503 Service Unavailable — dependent service is down, circuit breaker open

Run automated contract tests (using Pact or Dredd) against every service to verify it returns the agreed status codes for each scenario. Catching a service returning 200 with {"error": "..."} in CI is far better than discovering it in production.
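
A bare-bones, framework-free version of such a check (the article suggests Pact or Dredd; this sketch shows only the assertion logic a CI job would run against each service's responses):

```python
# Contract check: fail CI if an error response violates the shared
# RFC 7807 contract agreed above.
def assert_error_contract(status: int, body: dict) -> None:
    """Fail if an error response violates the shared error contract."""
    assert status >= 400, 'error bodies must not ride on 2xx responses'
    required = {'type', 'title', 'status', 'detail', 'instance', 'traceId'}
    missing = required - body.keys()
    assert not missing, f'missing fields: {missing}'
    assert body['status'] == status, 'body status must match HTTP status'
```

Note that the first assertion is exactly the 200-with-error-body mistake described above: it turns that production bug into a CI failure.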
