Migrating a Monolith to Kubernetes Without a Big Bang

The fastest way to set a migration on fire is to rewrite a working monolith into microservices in one heroic project. You inherit all the coordination cost of distributed systems before you have learned to operate any of it, and you usually end up with a distributed monolith — the same tight coupling, now over a network, with worse latency and no transactional guarantees. This article lays out the incremental alternative: containerise as-is, get the platform under the running monolith first, and only then strangle out services along real domain boundaries.

When not to do this

Be honest about whether you should migrate at all. Kubernetes is an operational tax you pay every day, and it pays back only once you have enough teams, workload variety or scale to need it.

If you have one team, predictable traffic, and the monolith deploys cleanly, you may want managed compute (ECS Fargate, App Runner, a PaaS) instead of Kubernetes. Adopt Kubernetes for genuine multi-team autonomy, heterogeneous workloads, or scale that a single deployable cannot serve.

The distributed-monolith trap is the dominant failure mode. If two "services" must be deployed together, share a database schema, and call each other synchronously in a request chain, you have not decomposed anything — you have added network failure modes to a monolith. Decompose only where you have a real seam: an independent business capability with its own data and its own release cadence.

Step 1 — Containerise the monolith as-is

Do not refactor while you containerise; change one variable at a time. The goal is the same application, running identically, inside an image. The work here is closing 12-factor gaps: externalise config, secrets and state so the container is stateless and disposable.

# Multi-stage build for a JVM monolith — keep the runtime image lean
FROM eclipse-temurin:21-jdk AS build
WORKDIR /app
COPY . .
RUN ./gradlew --no-daemon bootJar

FROM eclipse-temurin:21-jre
RUN useradd -r -u 1001 appuser
WORKDIR /app
COPY --from=build /app/build/libs/app.jar app.jar
# Config and secrets come from the environment, never baked in
ENV JAVA_OPTS="-XX:MaxRAMPercentage=75 -XX:+UseG1GC"
USER 1001
EXPOSE 8080
# exec replaces the shell with the JVM so it runs as PID 1 and receives SIGTERM on shutdown
ENTRYPOINT ["sh","-c","exec java $JAVA_OPTS -jar app.jar"]

The common gaps to fix at this stage:

Config: read from environment variables / a config map, not bundled property files per environment.
Secrets: pull from a secrets manager at runtime (External Secrets Operator backed by AWS Secrets Manager syncs them into Kubernetes Secrets), never bake them into the image.
State: sticky in-memory sessions must move to Redis or become stateless JWTs. Local file writes must go to S3 or a persistent volume. Logs go to stdout.
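Health: expose real /livez and /readyz endpoints. Readiness must reflect whether the pod can actually serve traffic, including the health of the downstream dependencies it cannot work without; liveness should check only the process itself, or a dependency outage will restart every pod.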
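
As a rough illustration of the config and secrets items above, the manifests below sketch one way to supply both from outside the image. Apart from the monolith-secrets name, everything here is an assumption rather than something taken from the monolith: the ConfigMap keys, the ClusterSecretStore called aws-secrets-manager and the Secrets Manager key prod/monolith are hypothetical, and the ExternalSecret apiVersion depends on the External Secrets Operator release installed in your cluster.

```yaml
# Non-secret, per-environment settings (keys are hypothetical examples)
apiVersion: v1
kind: ConfigMap
metadata:
  name: monolith-config
data:
  SPRING_PROFILES_ACTIVE: "kubernetes"
  SESSION_STORE: "redis"
---
# Syncs a secret from AWS Secrets Manager into a Kubernetes Secret
apiVersion: external-secrets.io/v1beta1   # assumption: check the version your operator serves
kind: ExternalSecret
metadata:
  name: monolith-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager             # hypothetical store, created separately
    kind: ClusterSecretStore
  target:
    name: monolith-secrets                # the Kubernetes Secret the operator creates
  dataFrom:
    - extract:
        key: prod/monolith                # hypothetical Secrets Manager entry
```

A Deployment can then load both through envFrom, with a configMapRef for monolith-config alongside a secretRef for monolith-secrets.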

Step 2 — Stand up the platform, run the monolith on it

Build the platform and prove it by running the monolith on it — not a new service. This de-risks the platform itself (networking, ingress, secrets, CI/CD, observability) against a known-good workload before you add the complexity of decomposition.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: monolith
spec:
  replicas: 3
  selector:
    matchLabels: { app: monolith }
  template:
    metadata:
      labels: { app: monolith }
    spec:
      containers:
        - name: monolith
          image: registry.example.com/monolith:1.42.0
          ports: [{ containerPort: 8080 }]
          envFrom:
            - secretRef: { name: monolith-secrets }
          resources:
            requests: { cpu: "500m", memory: "1Gi" }
            limits:   { memory: "1Gi" }   # no CPU limit: avoid throttling
          readinessProbe:
            httpGet: { path: /readyz, port: 8080 }
            initialDelaySeconds: 20
          livenessProbe:
            httpGet: { path: /livez, port: 8080 }
            periodSeconds: 15
---
apiVersion: v1
kind: Service
metadata:
  name: monolith
spec:
  selector: { app: monolith }
  ports: [{ port: 80, targetPort: 8080 }]
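
The probes in the Deployment only help if the application really serves /readyz and /livez. The bootJar task in the Dockerfile suggests a Spring Boot application; if that holds, an application.yaml along these lines is one possible way to expose them. This is a sketch under assumptions: Spring Boot Actuator is on the classpath, the Spring Boot version supports additional paths for health groups, and the db and redis health indicators match the dependencies the monolith actually needs.

```yaml
# application.yaml (sketch; assumes Spring Boot Actuator)
management:
  endpoint:
    health:
      probes:
        enabled: true                        # enables the liveness and readiness health groups
      group:
        liveness:
          include: livenessState             # process state only, no dependency checks
          additional-path: "server:/livez"   # serve on the main port (8080)
        readiness:
          include: readinessState,db,redis   # assumption: these are the hard dependencies
          additional-path: "server:/readyz"
```

Keeping dependency checks out of the liveness group means a database outage takes pods out of rotation without restarting them.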

Stand up the supporting platform in parallel:

Cluster: EKS with managed node groups (or Karpenter for autoscaling).
Ingress: an ingress controller (AWS Load Balancer Controller or NGINX).
Observability: Prometheus + Grafana, OpenTelemetry tracing and centralised logs, wired up before you decompose. You cannot debug a distributed system you cannot see.
CI/CD: a pipeline doing image build, scan, and progressive rollout via Argo CD or Flux.
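
For the CI/CD piece, here is a minimal sketch of deploying the monolith through GitOps, assuming Argo CD is installed in an argocd namespace. The repository URL, path and target namespace are hypothetical placeholders. Argo CD on its own syncs manifests from Git; the progressive rollout step typically needs an additional tool such as Argo Rollouts or Flagger.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: monolith
  namespace: argocd                  # assumption: default Argo CD install namespace
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/deploy.git   # hypothetical repository
    targetRevision: main
    path: apps/monolith              # hypothetical directory holding the manifests
  destination:
    server: https://kubernetes.default.svc   # the cluster Argo CD itself runs in
    namespace: monolith
  syncPolicy:
    automated:
      prune: true                    # delete resources that were removed from Git
      selfHeal: true                 # revert manual drift in the cluster
    syncOptions:
      - CreateNamespace=true
```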

Step 3 — Strangler-fig decomposition

Now route traffic through an ingress/API gateway and carve services out one capability at a time. The ingress fronts both the monolith and each extracted service; you move routes over incrementally. The monolith keeps serving everything you have not yet extracted.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app
  annotations:
    nginx.ingress.kubernetes.io/use-regex: "true"
spec:
  ingressClassName: nginx   # assumes the NGINX ingress controller's class is named "nginx"
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /api/payments(/|$)(.*)   # extracted — new service
            pathType: ImplementationSpecific
            backend: { service: { name: payments-svc, port: { number: 80 } } }
          - path: /                          # everything else — still the monolith
            pathType: Prefix
            backend: { service: { name: monolith, port: { number: 80 } } }

Pick the first service deliberately: a capability with a clear boundary, low coupling and limited shared data — a notifications or read-only reporting service is a far safer first cut than the order/payment core.

Decoupling the shared database

This is the genuinely hard part; the network routing is trivial by comparison. The new service must own its data, but the monolith still reads and writes the same tables. Three patterns, three sets of pitfalls:

Approach	How it works	Pitfall
Anti-corruption layer	New service exposes an API; monolith calls it instead of the shared tables	Requires changing the monolith first; chatty if boundary is wrong
Dual-write	App writes to both old and new stores	No atomicity — partial failure silently diverges the data
CDC (Debezium etc.)	Stream changes from the source DB into the new service's store	Eventual consistency; ordering, replay and schema-drift handling

Avoid dual-write. Without a distributed transaction it cannot be made consistent under failure, and you will spend more time reconciling drift than the migration saved. Prefer CDC for the transitional read path, and an anti-corruption layer to firm up the write boundary.

The sequence that works: introduce the anti-corruption layer so the monolith goes through the new service's API; use CDC to keep the new store hydrated during transition; then cut writes over to the new service and retire the shared tables. Expect the database to be the long pole of the whole programme.
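
The CDC step in that sequence might be sketched as a Debezium connector. The example below makes several assumptions the article does not state: the monolith's database is PostgreSQL, Kafka Connect is run by the Strimzi operator with a KafkaConnect cluster named connect-cluster, and the tables being streamed are called public.payments and public.payment_events. The host, user and database names are placeholders too.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: monolith-payments-cdc
  labels:
    strimzi.io/cluster: connect-cluster    # hypothetical KafkaConnect cluster name
spec:
  class: io.debezium.connector.postgresql.PostgresConnector
  tasksMax: 1                              # this connector runs as a single task
  config:
    plugin.name: pgoutput                  # PostgreSQL's built-in logical decoding plugin
    database.hostname: monolith-db.internal        # placeholder
    database.port: 5432
    database.user: debezium                        # placeholder; needs replication rights
    database.password: "<injected-by-config-provider>"   # placeholder, never commit a real value
    database.dbname: monolith                      # placeholder
    topic.prefix: monolith                 # topics become monolith.<schema>.<table>
    table.include.list: public.payments,public.payment_events   # hypothetical tables
```

The new service consumes the resulting topics to hydrate its own store, and the connector is removed once writes have been cut over and the shared tables retired.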

State, sessions and traffic shifting

Sessions must already be externalised (step 1), or extracted services cannot serve authenticated traffic. Shift traffic progressively, never all at once: optionally mirror traffic to the new service first, then send it a 1–5% canary, watch error rate and latency against the monolith baseline, and ramp only while both hold. A service mesh or a weighted ingress makes this controllable; keep the monolith route live as your instant rollback.
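
With the NGINX ingress controller, a weighted canary can be sketched as a second Ingress carrying canary annotations. This assumes the primary Ingress still has a matching /api/payments rule on app.example.com pointing at the monolith, since the controller pairs a canary with a non-canary rule for the same host and path. The Ingress name and the 5% weight are illustrative.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-payments-canary          # hypothetical name
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "5"   # percent of requests sent to the new service
spec:
  ingressClassName: nginx            # assumes the controller's class is named "nginx"
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /api/payments
            pathType: Prefix
            backend:
              service:
                name: payments-svc
                port:
                  number: 80
```

Ramping is a matter of raising canary-weight; setting it to "0" or deleting the canary Ingress sends all traffic back to the monolith.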

Conway's law is the real constraint

You ship your org chart. If one team owns both the monolith and the new service, the boundary will erode back into coupling — because the path of least resistance is a shared call, not an API contract.

Reorganise teams around the service boundaries you intend to create before you create them. A small platform team owns the cluster, ingress and CI/CD as a paved road; stream-aligned teams own services end to end. Without that topology, the technical decomposition will not hold.

The failure modes to watch for: extracting too many services too fast (operational overload), distributed transactions hiding inside synchronous call chains, no distributed tracing when latency regresses, and treating the database split as an afterthought. Go slow on boundaries, fast on automation, and keep the monolith as your safety net until each replacement has earned production traffic.
