The fastest way to set a migration on fire is to rewrite a working monolith into microservices in one heroic project. You inherit all the coordination cost of distributed systems before you have learned to operate any of it, and you usually end up with a distributed monolith — the same tight coupling, now over a network, with worse latency and no transactional guarantees. This article lays out the incremental alternative: containerise as-is, get the platform under the running monolith first, and only then strangle out services along real domain boundaries.
When not to do this
Be honest about whether you should migrate at all. Kubernetes is an operational tax you pay every day; it pays back only at a certain scale and team size.
If you have one team, predictable traffic, and the monolith deploys cleanly, you may want managed compute (ECS Fargate, App Runner, a PaaS) instead of Kubernetes. Adopt Kubernetes for genuine multi-team autonomy, heterogeneous workloads, or scale that a single deployable cannot serve.
The distributed-monolith trap is the dominant failure mode. If two "services" must be deployed together, share a database schema, and call each other synchronously in a request chain, you have not decomposed anything — you have added network failure modes to a monolith. Decompose only where you have a real seam: an independent business capability with its own data and its own release cadence.
Step 1 — Containerise the monolith as-is
Do not refactor while you containerise; change one variable at a time. The goal is the same application, running identically, inside an image. The work here is closing 12-factor gaps: externalise config, secrets and state so the container is stateless and disposable.
```dockerfile
# Multi-stage build for a JVM monolith; keep the runtime image lean
FROM eclipse-temurin:21-jdk AS build
WORKDIR /app
COPY . .
RUN ./gradlew --no-daemon bootJar

FROM eclipse-temurin:21-jre
RUN useradd -r -u 1001 appuser
WORKDIR /app
# Assumes the build is configured to emit build/libs/app.jar
COPY --from=build /app/build/libs/app.jar app.jar
# Config and secrets come from the environment, never baked in
ENV JAVA_OPTS="-XX:MaxRAMPercentage=75 -XX:+UseG1GC"
USER 1001
EXPOSE 8080
# exec replaces the shell so the JVM runs as PID 1 and receives SIGTERM
ENTRYPOINT ["sh","-c","exec java $JAVA_OPTS -jar app.jar"]
```
The common gaps to fix at this stage:
- Config: read from environment variables / a config map, not bundled property files per environment.
- Secrets: mount from a secrets manager (External Secrets Operator backed by AWS Secrets Manager), never in the image.
- State: sticky in-memory sessions must move to Redis or become stateless JWTs. Local file writes must go to S3 or a persistent volume. Logs go to stdout.
- Health: expose real `/livez` and `/readyz` endpoints; readiness must reflect downstream dependency health.
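The config and health items in the list above can be sketched in a few lines. This is a minimal illustration, not the monolith's actual code: the variable names `DB_URL`, `REDIS_URL` and `LOG_LEVEL`, the `AppConfig` type, and the shape of the `checks` mapping are all assumptions made for the example.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, Mapping, Tuple


@dataclass(frozen=True)
class AppConfig:
    # Hypothetical settings; a real monolith will have many more.
    db_url: str
    redis_url: str
    log_level: str = "INFO"


def load_config(env: Mapping[str, str] = os.environ) -> AppConfig:
    """Build config from the environment and fail fast on missing values."""
    missing = [k for k in ("DB_URL", "REDIS_URL") if not env.get(k)]
    if missing:
        raise RuntimeError(f"missing required env vars: {', '.join(missing)}")
    return AppConfig(
        db_url=env["DB_URL"],
        redis_url=env["REDIS_URL"],
        log_level=env.get("LOG_LEVEL", "INFO"),
    )


def livez() -> int:
    """Liveness: the process is up and can answer. No dependency checks."""
    return 200


def readyz(checks: Mapping[str, Callable[[], bool]]) -> Tuple[int, Dict[str, bool]]:
    """Readiness: 200 only if every dependency check passes, else 503."""
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises counts as a failed dependency.
            results[name] = False
    status = 200 if all(results.values()) else 503
    return status, results
```

Keeping liveness free of dependency checks is a deliberate choice in this sketch: a downstream outage should take the pod out of rotation through readiness, not trigger a restart through liveness.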
Step 2 — Stand up the platform, run the monolith on it
Build the platform and prove it by running the monolith on it — not a new service. This de-risks the platform itself (networking, ingress, secrets, CI/CD, observability) against a known-good workload before you add the complexity of decomposition.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: monolith
spec:
  replicas: 3
  selector:
    matchLabels: { app: monolith }
  template:
    metadata:
      labels: { app: monolith }
    spec:
      containers:
        - name: monolith
          image: registry.example.com/monolith:1.42.0
          ports: [{ containerPort: 8080 }]
          envFrom:
            - secretRef: { name: monolith-secrets }
          resources:
            requests: { cpu: "500m", memory: "1Gi" }
            limits: { memory: "1Gi" } # no CPU limit: avoid throttling
          readinessProbe:
            httpGet: { path: /readyz, port: 8080 }
            initialDelaySeconds: 20
          livenessProbe:
            httpGet: { path: /livez, port: 8080 }
            periodSeconds: 15
---
apiVersion: v1
kind: Service
metadata:
  name: monolith
spec:
  selector: { app: monolith }
  ports: [{ port: 80, targetPort: 8080 }]
```
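The 1Gi memory limit interacts with the `-XX:MaxRAMPercentage=75` flag set in the Dockerfile: a container-aware JVM sizes its maximum heap as a percentage of the container's memory limit. The arithmetic below is a simplified sketch; `parse_k8s_memory` is a hypothetical helper that only handles integer `Ki`/`Mi`/`Gi` quantities, not the full Kubernetes quantity syntax.

```python
def parse_k8s_memory(quantity: str) -> int:
    """Convert a binary-suffixed quantity such as '1Gi' to bytes (simplified)."""
    units = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: plain bytes


def max_heap_bytes(limit: str, max_ram_percentage: float) -> int:
    """Heap ceiling the JVM would derive from the container memory limit."""
    return int(parse_k8s_memory(limit) * max_ram_percentage / 100)


limit = "1Gi"
heap = max_heap_bytes(limit, 75)
headroom = parse_k8s_memory(limit) - heap
print(heap // 1024**2, headroom // 1024**2)  # 768 256
```

That leaves roughly 256 MiB for everything outside the heap (metaspace, thread stacks, direct buffers). If the monolith is heavy on any of those, the percentage or the limit likely needs adjusting before the pod starts getting OOM-killed.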
Stand up the supporting platform in parallel:
- Cluster: EKS with managed node groups (or Karpenter for node autoscaling).
- Ingress: an ingress controller (AWS Load Balancer Controller or NGINX).
- Observability: Prometheus + Grafana, OpenTelemetry tracing and centralised logs, wired up before you decompose. You cannot debug a distributed system you cannot see.
- CI/CD: a pipeline doing image build and scan, with GitOps deployment via Argo CD or Flux (paired with Argo Rollouts or Flagger for progressive rollout).
Step 3 — Strangler-fig decomposition
Now route traffic through an ingress/API gateway and carve services out one capability at a time. The ingress fronts both the monolith and each extracted service; you move routes over incrementally. The monolith keeps serving everything you have not yet extracted.
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app
  annotations:
    nginx.ingress.kubernetes.io/use-regex: "true"
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /api/payments(/|$)(.*) # extracted: new service
            pathType: ImplementationSpecific
            backend: { service: { name: payments-svc, port: { number: 80 } } }
          - path: / # everything else: still the monolith
            pathType: Prefix
            backend: { service: { name: monolith, port: { number: 80 } } }
```
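The routing decision the Ingress encodes is small enough to model directly, which is handy for unit-testing a route table before it goes near a cluster. The sketch below is a model of the idea, not of how the NGINX controller matches paths internally; `ROUTES` and `pick_backend` are illustrative names.

```python
import re
from typing import List, Tuple

# (pattern, backend) pairs for extracted capabilities, most specific first.
# Backend names mirror the Ingress above.
ROUTES: List[Tuple[re.Pattern, str]] = [
    (re.compile(r"^/api/payments(/|$)"), "payments-svc"),
]

# Anything not yet extracted still goes to the monolith.
DEFAULT_BACKEND = "monolith"


def pick_backend(path: str) -> str:
    """Return the backend that should serve this request path."""
    for pattern, backend in ROUTES:
        if pattern.search(path):
            return backend
    return DEFAULT_BACKEND
```

Extracting the next capability means adding one entry to `ROUTES`; rolling it back means removing that entry, and the default sends the traffic to the monolith again.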
Pick the first service deliberately: a capability with a clear boundary, low coupling and limited shared data — a notifications or read-only reporting service is a far safer first cut than the order/payment core.
Decoupling the shared database
This is the genuinely hard part; the network routing is trivial by comparison. The new service must own its data, but the monolith still reads and writes the same tables. Three approaches, each with its own pitfalls:
| Approach | How it works | Pitfall |
|---|---|---|
| Anti-corruption layer | New service exposes an API; monolith calls it instead of the shared tables | Requires changing the monolith first; chatty if boundary is wrong |
| Dual-write | App writes to both old and new stores | No atomicity — partial failure silently diverges the data |
| CDC (Debezium etc.) | Stream changes from the source DB into the new service's store | Eventual consistency; ordering, replay and schema-drift handling |
Avoid dual-write. Without a distributed transaction it cannot be made consistent under failure, and you will spend more time reconciling drift than the migration saved. Prefer CDC for the transitional read path, and an anti-corruption layer to firm up the write boundary.
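To see why dual-write diverges, consider a toy simulation with two in-memory stores, one of which fails for a single key. The `Store` class and the store names are invented for the example; real failures are timeouts and crashes rather than a scripted exception, but the outcome is the same.

```python
class Store:
    """In-memory stand-in for a database; fails on the keys in fail_on."""

    def __init__(self, name, fail_on=None):
        self.name = name
        self.rows = {}
        self.fail_on = fail_on or set()

    def write(self, key, value):
        if key in self.fail_on:
            raise ConnectionError(f"{self.name} unavailable")
        self.rows[key] = value


def dual_write(old, new, key, value):
    old.write(key, value)  # takes effect immediately
    new.write(key, value)  # may fail after the first write has already landed


old = Store("monolith-db")
new = Store("payments-db", fail_on={"order-2"})

for key in ("order-1", "order-2", "order-3"):
    try:
        dual_write(old, new, key, "PAID")
    except ConnectionError:
        pass  # the caller sees an error, but the old store has already changed

diverged = sorted(set(old.rows) - set(new.rows))
print(diverged)  # ['order-2']
```

Swapping the write order only moves the problem to the other store. Retrying helps until the process crashes between the two writes, at which point nothing remembers that a second write was owed.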
The sequence that works: introduce the anti-corruption layer so the monolith goes through the new service's API; use CDC to keep the new store hydrated during transition; then cut writes over to the new service and retire the shared tables. Expect the database to be the long pole of the whole programme.
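The CDC hydration step has to tolerate replays and stale deliveries, since change streams are typically at-least-once. A minimal sketch of an idempotent apply loop follows. The `ChangeEvent` shape is an assumption made for illustration and is not Debezium's actual envelope; `position` stands in for whatever monotonically increasing log offset the source exposes.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ChangeEvent:
    key: str
    op: str                      # "upsert" or "delete"
    position: int                # increasing log position from the source DB
    value: Optional[dict] = None


class ReplicaStore:
    """The new service's store, hydrated from the change stream."""

    def __init__(self):
        self.rows: Dict[str, dict] = {}
        self.applied: Dict[str, int] = {}  # last position applied per key

    def apply(self, event: ChangeEvent) -> bool:
        """Apply an event once; return False for duplicates or stale events."""
        if event.position <= self.applied.get(event.key, -1):
            return False
        if event.op == "delete":
            self.rows.pop(event.key, None)
        else:
            self.rows[event.key] = dict(event.value or {})
        # Record the position even for deletes so a replayed upsert
        # cannot resurrect a removed row.
        self.applied[event.key] = event.position
        return True
```

For example, applying positions 10 and 11 for the same key and then replaying position 10 leaves the row at its position-11 value, and the replay returns `False`.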
State, sessions and traffic shifting
Sessions must already be externalised (step 1), or extracted services cannot serve authenticated traffic. Shift traffic progressively, never all at once: start with a canary (mirror traffic or send 1–5%), watch error rate and latency against the monolith baseline, then ramp. A service mesh or weighted ingress routing makes this controllable; keep the monolith route live as your instant rollback.
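The canary-then-ramp loop can be sketched as two small functions: one that assigns a stable share of traffic to the new service, and one that decides whether to ramp or roll back. The thresholds, the ramp steps and the metric names (`error_rate`, `p95_ms`) are assumptions for the example; in practice a mesh or ingress controller does the weighting and your own SLOs set the gates.

```python
import hashlib


def in_canary(request_key: str, percent: int) -> bool:
    """Deterministically place a stable share of keys in the canary bucket."""
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # 0..99
    return bucket < percent


def next_step(current_percent, canary, baseline,
              max_error_delta=0.005, max_latency_ratio=1.2,
              ramp=(1, 5, 25, 50, 100)):
    """Return the next traffic percentage, or 0 to roll back to the monolith."""
    if canary["error_rate"] - baseline["error_rate"] > max_error_delta:
        return 0
    if canary["p95_ms"] > baseline["p95_ms"] * max_latency_ratio:
        return 0
    higher = [p for p in ramp if p > current_percent]
    return higher[0] if higher else current_percent
```

Hashing on a stable key such as a user or session ID keeps each user on one side of the split, which makes regressions easier to attribute than per-request random routing.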
Conway's law is the real constraint
You ship your org chart. If one team owns both the monolith and the new service, the boundary will erode back into coupling — because the path of least resistance is a shared call, not an API contract.
Reorganise teams around the service boundaries you intend to create before you create them. A small platform team owns the cluster, ingress and CI/CD as a paved road; stream-aligned teams own services end to end. Without that topology, the technical decomposition will not hold.
The failure modes to watch for: extracting too many services too fast (operational overload), distributed transactions hiding inside synchronous call chains, no distributed tracing when latency regresses, and treating the database split as an afterthought. Go slow on boundaries, fast on automation, and keep the monolith as your safety net until each replacement has earned production traffic.