☁️ Hands-On Tutorial

Cloud-Native .NET 8: Health Checks, Resilience, and Zero-Downtime Deployments

Your API works on your machine. It passes all tests. You deploy it to Kubernetes and a rolling update drops 3% of requests. The readiness probe keeps failing on startup. A slow downstream service cascades into your entire API timing out. The database migration ran on three pods simultaneously and two threw exceptions.

These aren't edge cases. They're the standard failure modes of containerized APIs and every one of them has a specific, well-understood fix. This tutorial wires them all into a single Orders API you can run, containerize, and deploy with confidence.

What You'll Build

A production-ready containerized Orders API (tutorials/cloud-native/OrdersApiCloudNative/) with every cloud-native layer in place:

  • Liveness and readiness health endpoints with separate DB and queue dependency checks
  • Resilience pipeline — timeouts, exponential-backoff retries, and circuit breaker via Polly v8 / Microsoft.Extensions.Resilience
  • Background order processor — IHostedService consuming a queue with cooperative cancellation and graceful shutdown
  • Production Dockerfile — multi-stage, non-root user, health check, env-var configuration
  • Safe migration strategy — init container pattern, no startup race conditions
  • Kubernetes manifests — Deployment with rolling update, startup/liveness/readiness probes, resource limits
  • Quick load test with k6 and what to watch in structured logs and metrics

Project Setup

Start with a Minimal API — fewer abstractions means less to debug when something goes wrong at the infrastructure layer. Add only the packages you'll actually use.

Terminal
dotnet new webapi -n OrdersApiCloudNative   # Minimal APIs are the template default in .NET 8
cd OrdersApiCloudNative

# Health checks
dotnet add package Microsoft.Extensions.Diagnostics.HealthChecks.EntityFrameworkCore

# Resilience (wraps Polly v8 with DI integration)
dotnet add package Microsoft.Extensions.Http.Resilience
dotnet add package Microsoft.Extensions.Resilience

# EF Core + SQLite for demo (swap for Postgres/SQL Server in production)
dotnet add package Microsoft.EntityFrameworkCore.Sqlite
dotnet add package Microsoft.EntityFrameworkCore.Design

# OpenTelemetry for metrics/tracing
dotnet add package OpenTelemetry.Extensions.Hosting
dotnet add package OpenTelemetry.Instrumentation.AspNetCore
dotnet add package OpenTelemetry.Instrumentation.Http      # AddHttpClientInstrumentation
dotnet add package OpenTelemetry.Instrumentation.Runtime   # AddRuntimeInstrumentation
dotnet add package OpenTelemetry.Exporter.Console

Domain Model & DbContext

Models/Order.cs & Data/OrdersDbContext.cs
public enum OrderStatus { Pending, Processing, Completed, Failed }

public class Order
{
    public int         Id         { get; set; }
    public string      CustomerId { get; set; } = "";
    public decimal     Total      { get; set; }
    public OrderStatus Status     { get; set; } = OrderStatus.Pending;
    public DateTime    CreatedAt  { get; set; } = DateTime.UtcNow;
    public DateTime?   ProcessedAt { get; set; }
}

public class OrdersDbContext(DbContextOptions<OrdersDbContext> options)
    : DbContext(options)
{
    public DbSet<Order> Orders => Set<Order>();
}

// Register in Program.cs
builder.Services.AddDbContext<OrdersDbContext>(opt =>
    opt.UseSqlite(builder.Configuration.GetConnectionString("DefaultConnection")
        ?? "Data Source=orders.db"));
SQLite for Local Development, Real Database for Production

SQLite lets you run this tutorial with zero external dependencies. Every pattern here — health checks, resilience, migrations — works identically with PostgreSQL or SQL Server. Swap the package and connection string and the rest of the code is unchanged. The EF Core abstraction is the point.
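For example, moving to PostgreSQL is a one-line provider change. A hedged sketch, assuming the Npgsql.EntityFrameworkCore.PostgreSQL package; everything else (health checks, migrations, services) is unchanged:

```csharp
// Hypothetical production registration — assumes:
//   dotnet add package Npgsql.EntityFrameworkCore.PostgreSQL
builder.Services.AddDbContext<OrdersDbContext>(opt =>
    opt.UseNpgsql(builder.Configuration.GetConnectionString("DefaultConnection")
        ?? throw new InvalidOperationException("Connection string is required")));
```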

Health Checks: Liveness & Readiness

Health probes are the contract between your API and its orchestrator. Kubernetes reads them to decide whether to route traffic to your pod and whether to restart it. Get the semantics wrong and you'll either restart healthy pods or keep unhealthy ones serving traffic.

The Two Probes and What They Mean

Liveness — /health/live

Is this process still healthy enough to keep running? Failure triggers a container restart. Use only for detecting catastrophic, unrecoverable states: deadlocked thread pool, corrupted internal state, OOM conditions. Never put slow external dependency checks here — a 5-second database timeout would restart a perfectly healthy API pod.

Readiness — /health/ready

Is this instance ready to accept traffic right now? Failure removes the pod from the load balancer without restarting it. Use for: startup warm-up (EF Core model compilation, cache priming), dependency availability (DB reachable, queue connected), and temporary overload detection.

Register Health Check Endpoints

Program.cs — Health Check Setup
builder.Services.AddHealthChecks()
    // Liveness: just checks the process is alive and responding
    .AddCheck("self", () => HealthCheckResult.Healthy("Process is running"),
              tags: ["live"])

    // Readiness: DB reachable
    .AddDbContextCheck<OrdersDbContext>(
        name: "database",
        tags: ["ready"])

    // Readiness: queue reachable (custom check — see Section 3)
    .AddCheck<QueueHealthCheck>(
        name: "queue",
        tags: ["ready"]);

var app = builder.Build();

// Separate endpoints filtered by tag
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate      = check => check.Tags.Contains("live"),
    ResponseWriter = WriteHealthResponse
});

app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate      = check => check.Tags.Contains("ready"),
    ResponseWriter = WriteHealthResponse
});

// Full diagnostic endpoint — restrict to internal networks in production
app.MapHealthChecks("/health", new HealthCheckOptions
{
    ResponseWriter = WriteHealthResponse
})
.RequireAuthorization("InternalOnly"); // not exposed publicly

JSON Health Response Writer

The default response is a plain-text "Healthy" string. Replace it with structured JSON so monitoring systems can parse individual check results.

Structured JSON Health Response
static Task WriteHealthResponse(HttpContext ctx, HealthReport report)
{
    ctx.Response.ContentType = "application/json";

    var result = new
    {
        status   = report.Status.ToString(),
        duration = report.TotalDuration.TotalMilliseconds,
        checks   = report.Entries.Select(e => new
        {
            name        = e.Key,
            status      = e.Value.Status.ToString(),
            description = e.Value.Description,
            duration    = e.Value.Duration.TotalMilliseconds,
            error       = e.Value.Exception?.Message
        })
    };

    return ctx.Response.WriteAsJsonAsync(result);
}

// Example response
// {
//   "status": "Healthy",
//   "duration": 12.4,
//   "checks": [
//     { "name": "database", "status": "Healthy", "duration": 8.1 },
//     { "name": "queue",    "status": "Healthy", "duration": 4.3 }
//   ]
// }
Health Endpoints Must Be Fast

Kubernetes probes fire every few seconds. If your health endpoint takes more than 1–2 seconds, probe timeouts cascade into false readiness failures, which cascade into traffic routing gaps. There is no single global-timeout switch in Microsoft.Extensions.Diagnostics.HealthChecks; instead, set a timeout per check via the timeout parameter on the AddCheck registration overloads (surfaced as HealthCheckRegistration.Timeout). A check that exceeds its timeout is cancelled and reported with its configured failure status instead of blocking the response.
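A per-check timeout can be attached at registration time. A sketch, assuming the AddCheck overload with a timeout parameter available in recent Microsoft.Extensions.Diagnostics.HealthChecks versions:

```csharp
using Microsoft.Extensions.Diagnostics.HealthChecks;

builder.Services.AddHealthChecks()
    .AddCheck<QueueHealthCheck>(
        name:          "queue",
        failureStatus: HealthStatus.Degraded,    // what a timeout or failure reports as
        tags:          ["ready"],
        timeout:       TimeSpan.FromSeconds(2)); // cancels the check's token after 2s
```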

Dependency Health Checks (DB & Queue)

The built-in AddDbContextCheck handles database reachability. For queues, external HTTP services, and any other dependency, write a custom IHealthCheck. Keep them lightweight — a ping, not a full integration test.

Custom Queue Health Check

HealthChecks/QueueHealthCheck.cs
public class QueueHealthCheck(IOrderQueue queue) : IHealthCheck
{
    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken  cancellationToken = default)
    {
        try
        {
            // Lightweight ping — don't dequeue or produce a real message
            var isReachable = await queue.PingAsync(cancellationToken);

            return isReachable
                ? HealthCheckResult.Healthy("Queue is reachable")
                : HealthCheckResult.Unhealthy("Queue ping returned false");
        }
        catch (Exception ex)
        {
            // Degraded vs Unhealthy: Degraded means "still usable but impaired"
            // Unhealthy means "do not route traffic here"
            return HealthCheckResult.Unhealthy(
                description: "Queue health check threw an exception",
                exception:   ex,
                data: new Dictionary<string, object>
                {
                    ["error"]     = ex.Message,
                    ["timestamp"] = DateTime.UtcNow
                });
        }
    }
}

External HTTP Dependency Check

HealthChecks/PaymentServiceHealthCheck.cs
public class PaymentServiceHealthCheck(IHttpClientFactory factory) : IHealthCheck
{
    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken  cancellationToken = default)
    {
        try
        {
            using var client  = factory.CreateClient("PaymentService");
            using var cts     = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
            cts.CancelAfter(TimeSpan.FromSeconds(2)); // local timeout — don't rely on global

            var response = await client.GetAsync("/health", cts.Token);

            return response.IsSuccessStatusCode
                ? HealthCheckResult.Healthy($"Payment service returned {(int)response.StatusCode}")
                : HealthCheckResult.Degraded($"Payment service returned {(int)response.StatusCode}");
        }
        catch (Exception ex) when (ex is HttpRequestException or TaskCanceledException)
        {
            return HealthCheckResult.Unhealthy("Payment service is unreachable", ex);
        }
    }
}

// Register with a tag so it only appears in readiness checks
builder.Services.AddHealthChecks()
    .AddCheck<PaymentServiceHealthCheck>("payment-service", tags: ["ready"]);
Never Include Slow or Flaky Checks in Liveness

A liveness probe failure restarts your container. If you add a database check to liveness, a 30-second DB maintenance window restarts every pod in your deployment simultaneously — replacing a planned maintenance event with an unplanned outage. Keep liveness checks limited to: memory pressure checks, self-diagnostic assertions, and basic process health. All dependency checks belong in readiness only.
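A liveness-safe check inspects only in-process state. As an illustration, a hedged sketch of a memory-pressure check (the 1 GiB threshold is arbitrary, not a recommendation):

```csharp
using Microsoft.Extensions.Diagnostics.HealthChecks;

public class MemoryPressureHealthCheck : IHealthCheck
{
    private const long ThresholdBytes = 1_024L * 1_024 * 1_024; // illustrative 1 GiB cap

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        // GC.GetTotalMemory(false) is a cheap in-process read — no I/O, no dependencies
        var allocated = GC.GetTotalMemory(forceFullCollection: false);

        return Task.FromResult(allocated < ThresholdBytes
            ? HealthCheckResult.Healthy($"Allocated: {allocated / (1024 * 1024)} MB")
            : HealthCheckResult.Degraded($"High memory pressure: {allocated / (1024 * 1024)} MB"));
    }
}

// Tag it "live" so it participates in the liveness endpoint only:
// builder.Services.AddHealthChecks()
//     .AddCheck<MemoryPressureHealthCheck>("memory", tags: ["live"]);
```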

Resilience Pipeline: Timeouts, Retries & Circuit Breaker

Every external call your API makes — to a database, a payment service, a notification queue — can fail, slow down, or become unavailable. Without a resilience pipeline, one slow downstream service turns into a thread-pool exhaustion cascade that takes down your entire API. Polly v8 and Microsoft.Extensions.Resilience give you a composable pipeline to handle all three failure modes.

Standard Resilience Handler for HttpClient

For HTTP calls to downstream services, AddStandardResilienceHandler() gives you a pre-configured pipeline with sensible defaults in one line.

Program.cs — Standard HttpClient Resilience
builder.Services.AddHttpClient("PaymentService", client =>
{
    client.BaseAddress = new Uri(
        builder.Configuration["Services:PaymentService:BaseUrl"]
            ?? throw new InvalidOperationException("PaymentService URL is required"));
    client.DefaultRequestHeaders.Add("Accept", "application/json");
})
.AddStandardResilienceHandler(options =>
{
    // Total request timeout (outermost) — caps the entire retry cycle
    options.TotalRequestTimeout = new HttpTimeoutStrategyOptions
    {
        Timeout = TimeSpan.FromSeconds(30)
    };

    // Per-attempt timeout — fail fast on a single slow attempt
    options.AttemptTimeout = new HttpTimeoutStrategyOptions
    {
        Timeout = TimeSpan.FromSeconds(8)
    };

    // Retry with jittered exponential backoff — 3 retries by default
    options.Retry = new HttpRetryStrategyOptions
    {
        MaxRetryAttempts = 3,
        BackoffType      = DelayBackoffType.Exponential,
        UseJitter        = true,  // avoids thundering herd on recovery
        Delay            = TimeSpan.FromMilliseconds(500),
        // Overriding ShouldHandle replaces the handler's defaults entirely —
        // include transport exceptions as well as transient status codes
        ShouldHandle     = args => ValueTask.FromResult(
            args.Outcome.Exception is HttpRequestException ||
            args.Outcome.Result?.StatusCode is
                HttpStatusCode.RequestTimeout or
                HttpStatusCode.TooManyRequests or
                HttpStatusCode.BadGateway or
                HttpStatusCode.ServiceUnavailable or
                HttpStatusCode.GatewayTimeout)
    };

    // Circuit breaker — opens after too many failures
    options.CircuitBreaker = new HttpCircuitBreakerStrategyOptions
    {
        SamplingDuration         = TimeSpan.FromSeconds(30),
        FailureRatio             = 0.5,   // open when 50% of requests fail
        MinimumThroughput        = 10,    // need at least 10 requests to evaluate
        BreakDuration            = TimeSpan.FromSeconds(30) // stay open for 30s
    };
});

Manual Resilience Pipeline for Non-HTTP Operations

Database calls, queue writes, and internal service calls need resilience too. Register a named pipeline with AddResiliencePipeline, then resolve it through ResiliencePipelineProvider&lt;string&gt;.

Custom Resilience Pipeline — DB Operations
using Polly;
using Polly.CircuitBreaker;
using Polly.Retry;

// Note: the lambda parameter can't be named "builder" — that would collide
// with the top-level WebApplicationBuilder variable already in scope
builder.Services.AddResiliencePipeline("db-pipeline", pipeline =>
{
    pipeline
        // Timeout: single operation cap
        .AddTimeout(TimeSpan.FromSeconds(5))

        // Retry: transient DB errors (deadlocks, connection resets)
        .AddRetry(new RetryStrategyOptions
        {
            MaxRetryAttempts = 3,
            BackoffType      = DelayBackoffType.Exponential,
            UseJitter        = true,
            Delay            = TimeSpan.FromMilliseconds(200),
            ShouldHandle     = args => args.Outcome.Exception switch
            {
                DbUpdateException      => new ValueTask<bool>(true),
                TimeoutException       => new ValueTask<bool>(true),
                OperationCanceledException => new ValueTask<bool>(false),
                _                      => new ValueTask<bool>(false)
            },
            OnRetry = args =>
            {
                // Structured log on each retry attempt
                Console.WriteLine(
                    $"DB retry attempt {args.AttemptNumber}, delay {args.RetryDelay.TotalMilliseconds}ms");
                return default;
            }
        })

        // Circuit breaker: stop hammering a struggling DB
        .AddCircuitBreaker(new CircuitBreakerStrategyOptions
        {
            SamplingDuration  = TimeSpan.FromSeconds(60),
            FailureRatio      = 0.6,
            MinimumThroughput = 5,
            BreakDuration     = TimeSpan.FromSeconds(45),
            OnOpened = args =>
            {
                Console.WriteLine($"DB circuit breaker OPENED. Break duration: {args.BreakDuration}");
                return default;
            },
            OnClosed = _ =>
            {
                Console.WriteLine("DB circuit breaker CLOSED — DB recovered.");
                return default;
            }
        });
});

// Use in a service
public class OrderService(
    OrdersDbContext                          db,
    ResiliencePipelineProvider<string>       pipelines)
{
    private readonly ResiliencePipeline _pipeline =
        pipelines.GetPipeline("db-pipeline");

    public async Task<Order?> GetOrderAsync(int id, CancellationToken ct)
    {
        return await _pipeline.ExecuteAsync(
            async token => await db.Orders.FindAsync([id], token),
            ct);
    }
}
Jitter Is Not Optional

Without jitter, all retrying clients hit a recovering service at exactly the same intervals — 500ms, 1000ms, 2000ms — creating a thundering herd that re-overwhelms the service just as it's coming back. UseJitter = true adds random variance to each delay, spreading retry load across time. The performance difference during a real outage recovery is significant.
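To see the spreading effect, here is an illustrative calculation. This uses simple "full jitter" purely for demonstration; Polly's exponential backoff with UseJitter is based on a decorrelated-jitter algorithm, not this exact formula:

```csharp
// Illustrative only — shows why jitter matters, not Polly's internal math.
var rng    = new Random();
var baseMs = 500.0;

for (int attempt = 0; attempt < 3; attempt++)
{
    // Fixed backoff: 500, 1000, 2000 — identical for every retrying client
    var exponential = baseMs * Math.Pow(2, attempt);

    // Full jitter: uniform in [0, exponential) — clients spread across the window
    var jittered = exponential * rng.NextDouble();

    Console.WriteLine($"attempt {attempt}: fixed {exponential}ms, jittered {jittered:F0}ms");
}
```

Without jitter every client hits the recovering service at the same instants; with jitter the same total retry load arrives spread across the whole backoff window.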

Background Worker & Graceful Shutdown

The Orders API processes orders asynchronously — a background service pulls from a queue and processes each order. When a pod shuts down, in-flight orders must complete or be safely returned to the queue. Dropping them silently is not acceptable.

IHostedService Order Processor

Workers/OrderProcessorService.cs
public class OrderProcessorService(
    IOrderQueue                      queue,
    IServiceScopeFactory             scopeFactory,
    ILogger<OrderProcessorService>   logger)
    : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        logger.LogInformation("Order processor started");

        // stoppingToken is cancelled when the host begins shutting down
        await foreach (var orderId in queue.ReadAllAsync(stoppingToken))
        {
            // Create a new scope per order — avoids DbContext lifetime issues
            await using var scope = scopeFactory.CreateAsyncScope();
            var orderService = scope.ServiceProvider.GetRequiredService<OrderService>();

            try
            {
                logger.LogInformation("Processing order {OrderId}", orderId);
                await orderService.ProcessAsync(orderId, stoppingToken);
                logger.LogInformation("Order {OrderId} processed successfully", orderId);
            }
            catch (OperationCanceledException) when (stoppingToken.IsCancellationRequested)
            {
                // Shutdown requested — return message to queue before exiting
                await queue.RequeueAsync(orderId);
                logger.LogWarning("Shutdown: requeued order {OrderId}", orderId);
                break; // exit the processing loop cleanly
            }
            catch (Exception ex)
            {
                logger.LogError(ex, "Failed to process order {OrderId}", orderId);
                await queue.DeadLetterAsync(orderId, ex.Message);
            }
        }

        logger.LogInformation("Order processor stopped");
    }
}

// Register in Program.cs
builder.Services.AddHostedService<OrderProcessorService>();

Graceful Shutdown Configuration

When Kubernetes sends SIGTERM, it simultaneously starts removing the pod from load balancers. There's a propagation delay — requests can still arrive for several seconds after SIGTERM. Add a shutdown delay to absorb it.

Program.cs — Graceful Shutdown
// Allow up to 30 seconds for in-flight work to complete
builder.Services.Configure<HostOptions>(options =>
{
    options.ShutdownTimeout = TimeSpan.FromSeconds(30);
});

// Kestrel stops accepting new connections and drains in-flight requests on
// shutdown automatically; no extra configuration is needed for that.
// Unrelated hardening such as request size limits can still be set here:
builder.WebHost.ConfigureKestrel(options =>
{
    options.Limits.MaxRequestBodySize = 10 * 1024 * 1024; // 10 MB
});

var app = builder.Build();

// Register shutdown lifecycle hooks
var lifetime = app.Services.GetRequiredService<IHostApplicationLifetime>();

lifetime.ApplicationStopping.Register(() =>
{
    // SIGTERM received — log it immediately
    var logger = app.Services.GetRequiredService<ILogger<Program>>();
    logger.LogWarning("Application is shutting down — draining traffic");

    // Delay to absorb Kubernetes endpoint propagation lag
    // New requests may still arrive for 5-10s after SIGTERM
    Thread.Sleep(TimeSpan.FromSeconds(10));
});

lifetime.ApplicationStopped.Register(() =>
{
    var logger = app.Services.GetRequiredService<ILogger<Program>>();
    logger.LogInformation("Application stopped cleanly");
});
The preStop Hook Pairs with the Shutdown Delay

The Kubernetes preStop lifecycle hook runs before SIGTERM is sent. Set it to sleep 10 in your pod spec — this gives Kubernetes time to drain the endpoint before your app even starts its shutdown sequence. The combination of preStop sleep + shutdown delay in code means you absorb the full propagation window without relying on either alone.

Production Dockerfile & Configuration

A production container image needs to be small, run as non-root, and expose only what it needs to. Multi-stage builds handle the size. Explicit user creation handles the security.

Multi-Stage Dockerfile

Dockerfile
##── Stage 1: Build ──────────────────────────────────────────
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src

# Copy csproj first for layer caching — restores only when dependencies change
COPY ["OrdersApiCloudNative.csproj", "./"]
# --locked-mode requires a committed packages.lock.json
# (set RestorePackagesWithLockFile=true in the csproj); drop the flag otherwise
RUN dotnet restore --locked-mode

# Copy source and publish
COPY . .
RUN dotnet publish -c Release -o /app/publish \
    --no-restore \
    -p:PublishSingleFile=false \
    -p:PublishReadyToRun=true

##── Stage 2: Runtime ────────────────────────────────────────
FROM mcr.microsoft.com/dotnet/aspnet:8.0 AS runtime
WORKDIR /app

# Create non-root user — never run as root in production
# (the .NET 8 images also ship a built-in non-root "app" user you can switch to)
RUN groupadd -r appgroup && useradd -r -g appgroup -s /sbin/nologin appuser

# curl is used by the HEALTHCHECK below and is NOT in the aspnet base image
RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Copy published output
COPY --from=build --chown=appuser:appgroup /app/publish .

# Switch to non-root user
USER appuser

# Expose only the port Kestrel listens on
EXPOSE 8080

# Container-level health check (Docker Compose / standalone Docker)
HEALTHCHECK --interval=30s --timeout=5s --start-period=20s --retries=3 \
  CMD curl -f http://localhost:8080/health/live || exit 1

# Entry point
ENTRYPOINT ["dotnet", "OrdersApiCloudNative.dll"]

Configuration via Environment Variables

Never bake secrets into the image. Pass all environment-specific config at runtime via environment variables. .NET maps __ to : automatically.

docker-compose.yml — Local Run
services:
  orders-api:
    build: .
    ports:
      - "8080:8080"
    environment:
      - ASPNETCORE_ENVIRONMENT=Production
      - ASPNETCORE_URLS=http://+:8080
      - ConnectionStrings__DefaultConnection=Data Source=/data/orders.db
      - Services__PaymentService__BaseUrl=http://payment-svc:8081
      - Jwt__Key=${JWT_KEY}        # from .env file — never hard-coded
      - Jwt__Issuer=https://orders.local
    volumes:
      - orders-data:/data
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health/live"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 20s
    restart: unless-stopped

volumes:
  orders-data:
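The `__`-to-`:` mapping used above can be verified with a minimal snippet. A sketch, assuming the Microsoft.Extensions.Configuration and Microsoft.Extensions.Configuration.EnvironmentVariables packages (both already present in an ASP.NET Core app):

```csharp
using Microsoft.Extensions.Configuration;

// Simulate what the container runtime injects
Environment.SetEnvironmentVariable(
    "Services__PaymentService__BaseUrl", "http://payment-svc:8081");

var config = new ConfigurationBuilder()
    .AddEnvironmentVariables()
    .Build();

// Double underscores become section separators
Console.WriteLine(config["Services:PaymentService:BaseUrl"]);
// → http://payment-svc:8081
```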
ASPNETCORE_URLS Must Be Set in Containers

The .NET 8 aspnet base images set ASPNETCORE_HTTP_PORTS=8080, which is why Kestrel listens on 8080 inside the container by default; that default lives in the image, not in your code. Always set ASPNETCORE_URLS=http://+:8080 explicitly: it overrides ASPNETCORE_HTTP_PORTS, survives base-image changes, and makes the binding visible in your compose file and manifests. Without it, running outside those images (or with the variable stripped) leaves Kestrel binding to port 5000, your EXPOSE 8080 directive becomes meaningless, and health checks fail silently.

Safe Database Migrations Strategy

Running db.Database.MigrateAsync() on startup in a Kubernetes deployment causes a race condition. Three pods start simultaneously, all hit the migrations check at the same time, and two throw concurrency exceptions. The fix is to separate migration execution from application startup entirely.

The Init Container Pattern

Dedicated Migrate Command
// Program.cs — check for migrate flag before building the full app
if (args.Contains("--migrate"))
{
    // Minimal host — just enough to run migrations
    var host = Host.CreateDefaultBuilder(args)
        .ConfigureServices((ctx, services) =>
        {
            services.AddDbContext<OrdersDbContext>(opt =>
                opt.UseSqlite(
                    ctx.Configuration.GetConnectionString("DefaultConnection")));
        })
        .Build();

    using var scope = host.Services.CreateScope();
    var db     = scope.ServiceProvider.GetRequiredService<OrdersDbContext>();
    var logger = scope.ServiceProvider.GetRequiredService<ILogger<Program>>();

    logger.LogInformation("Applying pending migrations…");
    await db.Database.MigrateAsync();
    logger.LogInformation("Migrations complete");

    return; // exit — don't start the API
}

// Normal startup continues below
var builder = WebApplication.CreateBuilder(args);

Kubernetes Init Container

k8s/deployment.yaml — Init Container
spec:
  initContainers:
    # Runs BEFORE any app containers start
    # Completes migrations once — then Kubernetes starts the rolling app update
    - name: db-migrate
      image: your-registry/orders-api:latest
      command: ["dotnet", "OrdersApiCloudNative.dll", "--migrate"]
      env:
        - name: ConnectionStrings__DefaultConnection
          valueFrom:
            secretKeyRef:
              name: orders-secrets
              key: db-connection-string
      resources:
        requests:
          memory: "128Mi"
          cpu: "100m"
        limits:
          memory: "256Mi"
          cpu: "500m"

  containers:
    - name: orders-api
      image: your-registry/orders-api:latest
      # No --migrate flag — application starts clean
Backward-Compatible Migrations Only

During a rolling deployment, old and new versions of your API run simultaneously for a window of time. A migration that drops a column, renames a field, or changes a constraint will break the still-running old pods. Write schema changes in two phases: Phase 1 ("expand") deploys version n+1 with additive changes that stay compatible with the old schema; Phase 2 ("contract") runs the cleanup migration that removes the old structures, only after all old pods are gone. This is the "expand/contract" (parallel change) migration pattern.
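A hedged sketch of the two phases for migrating `CustomerId` to a hypothetical `CustomerRef` column (all names here are illustrative, not part of the tutorial's schema):

```csharp
using Microsoft.EntityFrameworkCore.Migrations;

// Phase 1 — "expand": ship with BOTH columns. Old pods keep reading/writing
// CustomerId; new pods use CustomerRef. Safe to roll out alongside old pods.
public partial class AddCustomerRef : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.AddColumn<string>(
            name: "CustomerRef", table: "Orders", nullable: true);

        // Backfill existing rows from the old column
        migrationBuilder.Sql("UPDATE Orders SET CustomerRef = CustomerId");
    }

    protected override void Down(MigrationBuilder migrationBuilder)
        => migrationBuilder.DropColumn(name: "CustomerRef", table: "Orders");
}

// Phase 2 — "contract": deploy only after every pod reading CustomerId is gone.
public partial class DropCustomerId : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
        => migrationBuilder.DropColumn(name: "CustomerId", table: "Orders");

    protected override void Down(MigrationBuilder migrationBuilder)
        => migrationBuilder.AddColumn<string>(
            name: "CustomerId", table: "Orders", nullable: true);
}
```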

Kubernetes Manifests & Zero-Downtime Rollouts

A rolling deployment replaces old pods with new ones incrementally. Zero-downtime means traffic never routes to a pod that isn't ready, and no request is dropped when a pod shuts down. Every K8s configuration choice here serves one of those two guarantees.

Deployment with All Probes

k8s/deployment.yaml — Complete
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-api
  labels:
    app: orders-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: orders-api

  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0    # never take a pod down before a new one is ready
      maxSurge:       1    # allow one extra pod during the update

  template:
    metadata:
      labels:
        app: orders-api
    spec:
      terminationGracePeriodSeconds: 60   # must exceed shutdownDelay + longest request

      initContainers:
        - name: db-migrate
          image: your-registry/orders-api:latest
          command: ["dotnet", "OrdersApiCloudNative.dll", "--migrate"]
          envFrom:
            - secretRef:
                name: orders-secrets

      containers:
        - name: orders-api
          image: your-registry/orders-api:latest
          ports:
            - containerPort: 8080

          envFrom:
            - secretRef:
                name: orders-secrets
            - configMapRef:
                name: orders-config

          resources:
            requests:
              memory: "256Mi"
              cpu:    "250m"
            limits:
              memory: "512Mi"
              cpu:    "1000m"

          # Startup probe: check readiness during initial container boot
          # Fails = restart (not just remove from LB)
          # Total startup budget: failureThreshold * periodSeconds = 60s
          startupProbe:
            httpGet:
              path: /health/ready
              port: 8080
            failureThreshold: 12
            periodSeconds:    5
            timeoutSeconds:   3

          # Liveness: is this pod healthy enough to keep running?
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 0    # startup probe handles initial delay
            periodSeconds:       15
            timeoutSeconds:      3
            failureThreshold:    3

          # Readiness: is this pod ready to accept traffic?
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            periodSeconds:    10
            timeoutSeconds:   3
            failureThreshold: 3
            successThreshold: 1

          # Graceful shutdown: drain traffic before SIGTERM
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]

ConfigMap & Secret

k8s/configmap.yaml & secret.yaml
## ConfigMap — non-sensitive configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: orders-config
data:
  ASPNETCORE_ENVIRONMENT: "Production"
  ASPNETCORE_URLS: "http://+:8080"
  Services__PaymentService__BaseUrl: "http://payment-svc:8081"
  Jwt__Issuer: "https://api.yourcompany.com"

---

## Secret — sensitive values (use External Secrets Operator in real clusters)
apiVersion: v1
kind: Secret
metadata:
  name: orders-secrets
type: Opaque
stringData:
  ConnectionStrings__DefaultConnection: "Server=postgres-svc;Database=orders;..."
  Jwt__Key: "your-production-signing-key-min-32-chars"
Three Probe Types, Three Jobs

Startup probe: handles slow initialization (EF Core model compilation, cache warming). Once it succeeds, Kubernetes switches to liveness and readiness. Liveness probe: detects stuck/broken processes, triggers restart. Readiness probe: gates traffic routing during normal operation. The startup probe is the most commonly forgotten — without it, a slow-starting pod fails liveness immediately on first deploy and enters a restart loop.

Load Testing & Observability

A resilience pipeline that was never tested under load is a guess. Run a quick load test before you deploy to production. Watch the metrics and logs while it runs — the patterns you see locally are the same ones you'll see in production under stress.

k6 Load Test Script

load-tests/orders-test.js
import http  from 'k6/http';
import { check, sleep } from 'k6';
import { Rate } from 'k6/metrics';

const errorRate = new Rate('errors');

export const options = {
    stages: [
        { duration: '30s', target: 20  },  // ramp up
        { duration: '60s', target: 50  },  // steady state
        { duration: '30s', target: 100 },  // peak
        { duration: '30s', target: 0   },  // ramp down
    ],
    thresholds: {
        http_req_duration: ['p(95)<500'],  // 95th percentile under 500ms
        errors:            ['rate<0.01'],  // error rate under 1%
    },
};

const BASE_URL = __ENV.BASE_URL || 'http://localhost:8080';

export default function () {
    // POST: create order
    const payload = JSON.stringify({
        customerId: `customer-${Math.floor(Math.random() * 1000)}`,
        // Number(...) — toFixed returns a string, which System.Text.Json
        // won't bind to a decimal by default
        total:      Number((Math.random() * 500).toFixed(2))
    });

    const createRes = http.post(`${BASE_URL}/orders`, payload, {
        headers: { 'Content-Type': 'application/json' }
    });

    const created = check(createRes, {
        'create order 201':        r => r.status === 201,
        'create order fast <200ms': r => r.timings.duration < 200,
    });
    errorRate.add(!created);

    sleep(0.5);

    // GET: fetch orders list
    const listRes = http.get(`${BASE_URL}/orders`);
    check(listRes, {
        'list orders 200': r => r.status === 200,
    });

    // Check health endpoint under load
    const healthRes = http.get(`${BASE_URL}/health/ready`);
    check(healthRes, {
        'health ready 200': r => r.status === 200,
    });

    sleep(0.5);
}
Terminal — Run k6 Test
# Install k6: https://k6.io/docs/getting-started/installation/
k6 run load-tests/orders-test.js

# Against a specific environment
BASE_URL=https://staging.yourcompany.com k6 run load-tests/orders-test.js

# With output to InfluxDB for Grafana visualisation
k6 run --out influxdb=http://localhost:8086/k6 load-tests/orders-test.js

OpenTelemetry Metrics Setup

Program.cs — OpenTelemetry
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;
using OpenTelemetry.Metrics;

builder.Services.AddOpenTelemetry()
    .WithTracing(tracing => tracing
        .SetResourceBuilder(ResourceBuilder.CreateDefault()
            .AddService("OrdersApiCloudNative"))
        .AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddConsoleExporter())      // swap for OTLP exporter (Jaeger/Tempo) in production

    .WithMetrics(metrics => metrics
        .SetResourceBuilder(ResourceBuilder.CreateDefault()
            .AddService("OrdersApiCloudNative"))
        .AddAspNetCoreInstrumentation()   // http.server.request.duration histogram
        .AddHttpClientInstrumentation()   // outgoing call metrics
        .AddRuntimeInstrumentation()      // GC, thread pool, memory
        .AddConsoleExporter());           // swap for OTLP / Prometheus in production
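The built-in instrumentation covers HTTP and runtime signals; business-level metrics need a custom `Meter`. A minimal sketch — the meter and counter names here are illustrative, not part of the tutorial project:

```csharp
// OrderMetrics.cs — hypothetical custom business metric
using System.Diagnostics.Metrics;

public static class OrderMetrics
{
    // One Meter per logical component; the name must match the AddMeter() registration
    public static readonly Meter Meter = new("OrdersApiCloudNative.Orders");

    // Incremented in the POST /orders handler after a successful save
    public static readonly Counter<long> OrdersCreated =
        Meter.CreateCounter<long>("orders.created",
            description: "Orders accepted via POST /orders");
}
```

Wire it into the pipeline above with `.WithMetrics(m => m.AddMeter("OrdersApiCloudNative.Orders") …)` and call `OrderMetrics.OrdersCreated.Add(1)` where an order is persisted.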
What to Watch During a Deployment

Four signals tell you whether a rolling deployment is going well:

1. HTTP 5xx rate — any spike above the 0.1% baseline during the rollout needs investigation.
2. Readiness probe failures — repeated failures stall the rollout and usually indicate a startup or migration issue.
3. p95 latency — a spike signals the new version is slower or a dependency is degraded.
4. Circuit breaker state changes — a breaker opening during a deploy means a downstream service is unhappy with the new version's request patterns.

End-to-End: Complete Program.cs

All layers assembled — health checks, resilience, background worker, graceful shutdown, OpenTelemetry, and the full middleware pipeline in the correct order.

Program.cs — Complete
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using OpenTelemetry.Metrics;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;
using Polly;                  // resilience builder extensions, DelayBackoffType
using Polly.Registry;         // ResiliencePipelineProvider<TKey>
using System.Threading.RateLimiting;

// ─── Migrate-only mode ────────────────────────────────────────────────────
if (args.Contains("--migrate"))
{
    var migrateHost = Host.CreateDefaultBuilder(args)
        .ConfigureServices((ctx, svc) => svc.AddDbContext<OrdersDbContext>(o =>
            o.UseSqlite(ctx.Configuration.GetConnectionString("DefaultConnection"))))
        .Build();

    using var scope = migrateHost.Services.CreateScope();
    await scope.ServiceProvider.GetRequiredService<OrdersDbContext>()
               .Database.MigrateAsync();
    Console.WriteLine("Migrations complete");
    return;
}

// ─── Normal startup ───────────────────────────────────────────────────────
var builder = WebApplication.CreateBuilder(args);

// Database
builder.Services.AddDbContext<OrdersDbContext>(opt =>
    opt.UseSqlite(builder.Configuration.GetConnectionString("DefaultConnection")
        ?? "Data Source=orders.db"));

// Resilience pipelines
builder.Services.AddResiliencePipeline("db-pipeline", rb =>
    rb.AddTimeout(TimeSpan.FromSeconds(5))
      .AddRetry(new() { MaxRetryAttempts = 3, BackoffType = DelayBackoffType.Exponential, UseJitter = true })
      .AddCircuitBreaker(new() { FailureRatio = 0.6, BreakDuration = TimeSpan.FromSeconds(45) }));

builder.Services.AddHttpClient("PaymentService", c =>
    c.BaseAddress = new Uri(builder.Configuration["Services:PaymentService:BaseUrl"]!))
    .AddStandardResilienceHandler();

// Health checks
builder.Services.AddHealthChecks()
    .AddCheck("self",        () => HealthCheckResult.Healthy(), tags: ["live"])
    .AddDbContextCheck<OrdersDbContext>("database",  tags: ["ready"])
    .AddCheck<QueueHealthCheck>("queue",             tags: ["ready"]);

// Background worker
builder.Services.AddHostedService<OrderProcessorService>();

// Graceful shutdown
builder.Services.Configure<HostOptions>(o => o.ShutdownTimeout = TimeSpan.FromSeconds(30));

// Rate limiting
builder.Services.AddRateLimiter(o =>
{
    o.AddPolicy("per-ip", ctx =>
        RateLimitPartition.GetFixedWindowLimiter(
            ctx.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            _ => new() { PermitLimit = 100, Window = TimeSpan.FromMinutes(1) }));
    o.OnRejected = async (ctx, ct) =>
    {
        ctx.HttpContext.Response.StatusCode = 429;
        await ctx.HttpContext.Response.WriteAsJsonAsync(new
        {
            type = "https://tools.ietf.org/html/rfc6585#section-4",
            title = "Too Many Requests", status = 429
        }, ct);
    };
});

// OpenTelemetry
builder.Services.AddOpenTelemetry()
    .WithTracing(t => t
        .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("OrdersApiCloudNative"))
        .AddAspNetCoreInstrumentation()
        .AddHttpClientInstrumentation()
        .AddConsoleExporter())
    .WithMetrics(m => m
        .AddAspNetCoreInstrumentation()
        .AddRuntimeInstrumentation()
        .AddConsoleExporter());

var app = builder.Build();

// ─── Middleware pipeline ──────────────────────────────────────────────────
app.UseHttpsRedirection();
app.UseRateLimiter();

// Security headers
app.Use(async (ctx, next) =>
{
    ctx.Response.Headers["X-Content-Type-Options"] = "nosniff";
    ctx.Response.Headers["X-Frame-Options"]         = "DENY";
    await next();
});

// ─── Health endpoints ─────────────────────────────────────────────────────
app.MapHealthChecks("/health/live",  new() { Predicate = c => c.Tags.Contains("live"),  ResponseWriter = WriteHealthResponse });
app.MapHealthChecks("/health/ready", new() { Predicate = c => c.Tags.Contains("ready"), ResponseWriter = WriteHealthResponse });

// ─── API endpoints ────────────────────────────────────────────────────────
app.MapGet("/orders", async (OrdersDbContext db, ResiliencePipelineProvider<string> pipelines) =>
{
    var pipeline = pipelines.GetPipeline("db-pipeline");
    var orders   = await pipeline.ExecuteAsync(async ct =>
        await db.Orders.OrderByDescending(o => o.CreatedAt).Take(50).ToListAsync(ct));
    return TypedResults.Ok(orders);
})
.RequireRateLimiting("per-ip")
.WithTags("Orders");

app.MapPost("/orders", async (CreateOrderRequest req, OrdersDbContext db) =>
{
    var order = new Order
    {
        CustomerId = req.CustomerId,
        Total      = req.Total,
        Status     = OrderStatus.Pending
    };
    db.Orders.Add(order);
    await db.SaveChangesAsync();
    return TypedResults.Created($"/orders/{order.Id}", order);
})
.RequireRateLimiting("per-ip")
.WithTags("Orders");

app.MapGet("/orders/{id:int}", async (int id, OrdersDbContext db) =>
{
    var order = await db.Orders.FindAsync(id);
    return order is not null ? TypedResults.Ok(order) : TypedResults.NotFound();
})
.WithTags("Orders");

// ─── Graceful shutdown hook ───────────────────────────────────────────────
var lifetime = app.Services.GetRequiredService<IHostApplicationLifetime>();
lifetime.ApplicationStopping.Register(() =>
{
    app.Logger.LogWarning("SIGTERM received — draining traffic (10s delay)");
    Thread.Sleep(TimeSpan.FromSeconds(10));  // blocking here intentionally delays shutdown while the load balancer deregisters the pod
});

app.Run();

// ─── Helpers ──────────────────────────────────────────────────────────────
static Task WriteHealthResponse(HttpContext ctx, HealthReport report)
{
    ctx.Response.ContentType = "application/json";
    return ctx.Response.WriteAsJsonAsync(new
    {
        status   = report.Status.ToString(),
        duration = report.TotalDuration.TotalMilliseconds,
        checks   = report.Entries.Select(e => new
        {
            name     = e.Key,
            status   = e.Value.Status.ToString(),
            duration = e.Value.Duration.TotalMilliseconds,
            error    = e.Value.Exception?.Message
        })
    });
}

record CreateOrderRequest(string CustomerId, decimal Total);

// Expose the implicit Program class to WebApplicationFactory<Program> in integration tests
public partial class Program { }

Health Check Integration Test

HealthChecks.Tests/HealthCheckTests.cs
using System.Net;
using System.Net.Http.Json;
using System.Text.Json;
using Microsoft.AspNetCore.Mvc.Testing;
using Xunit;

public class HealthCheckTests : IClassFixture<WebApplicationFactory<Program>>
{
    private readonly HttpClient _client;

    public HealthCheckTests(WebApplicationFactory<Program> factory)
        => _client = factory.CreateClient();

    [Fact]
    public async Task Liveness_Returns_200_When_Healthy()
    {
        var response = await _client.GetAsync("/health/live");
        Assert.Equal(HttpStatusCode.OK, response.StatusCode);

        var body = await response.Content.ReadFromJsonAsync<JsonElement>();
        Assert.Equal("Healthy", body.GetProperty("status").GetString());
    }

    [Fact]
    public async Task Readiness_Returns_200_When_All_Dependencies_Up()
    {
        var response = await _client.GetAsync("/health/ready");
        Assert.Equal(HttpStatusCode.OK, response.StatusCode);
    }

    [Fact]
    public async Task Health_Response_Is_Json_With_Check_Details()
    {
        var response = await _client.GetAsync("/health/live");
        Assert.Equal("application/json", response.Content.Headers.ContentType?.MediaType);

        var body   = await response.Content.ReadFromJsonAsync<JsonElement>();
        var checks = body.GetProperty("checks");
        Assert.True(checks.GetArrayLength() > 0);
    }
}
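The tests above only exercise the healthy path. To cover the unhealthy path without breaking a real dependency, you can register an extra failing check in the factory — a sketch, assuming an xUnit project with `using Microsoft.Extensions.DependencyInjection;` and `using Microsoft.Extensions.Diagnostics.HealthChecks;` added (the check name "queue-down" is illustrative):

```csharp
[Fact]
public async Task Readiness_Returns_503_When_A_Dependency_Is_Down()
{
    // Layer an always-failing "ready" check on top of the real registrations
    await using var factory = new WebApplicationFactory<Program>().WithWebHostBuilder(b =>
        b.ConfigureServices(s => s.AddHealthChecks()
            .AddCheck("queue-down",
                      () => HealthCheckResult.Unhealthy("simulated outage"),
                      tags: ["ready"])));

    var response = await factory.CreateClient().GetAsync("/health/ready");

    // The default health middleware maps Unhealthy to 503 Service Unavailable
    Assert.Equal(HttpStatusCode.ServiceUnavailable, response.StatusCode);
}
```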

Resources & Next Steps

The Orders API now has every production-readiness layer that matters: it tells Kubernetes when it's healthy, it handles downstream failures without cascading, it shuts down without dropping requests, and it deploys without downtime. These patterns scale from a single-service deployment to a 50-service microservices mesh.

Next Steps

- Add the HealthChecks UI package (AspNetCore.HealthChecks.UI) for a visual dashboard of all check statuses.
- Integrate with Azure Service Bus or RabbitMQ for a production-grade queue instead of the in-memory stub.
- Deploy to AKS or GKE and wire up Prometheus scraping of your OpenTelemetry metrics endpoint, then build a Grafana dashboard with SLO burn-rate alerts.
- Add chaos engineering with Simmy (Polly's fault-injection strategies) to verify your resilience pipeline holds under random dependency failures.

Frequently Asked Questions

What is the difference between liveness and readiness probes?

A liveness probe answers: is this process still alive and worth keeping? If it fails, the container is restarted. A readiness probe answers: is this instance ready to accept traffic? If it fails, the pod is removed from the load balancer — but not restarted. Keep liveness checks minimal and fast. Put all dependency checks in readiness. A slow database will not be fixed by restarting your API.

Should I run EF Core migrations automatically on startup?

Only in development. In production, running migrations on every startup creates a race condition when multiple pods start simultaneously — all try to apply the same migration, and two will throw concurrency exceptions. Use the init container pattern instead: a dedicated --migrate command runs once before the rolling update begins, then all app pods start against an already-migrated schema.

How does graceful shutdown prevent dropped requests?

When Kubernetes sends SIGTERM to a pod, it simultaneously removes the pod from the load balancer — but there's a propagation delay of several seconds during which new requests can still arrive. Adding a shutdown delay of 10–15 seconds lets in-flight requests complete and absorbs that propagation window. Without this, you'll see a spike of 502 errors on every rolling update, even with a perfect resilience pipeline.

When should a circuit breaker open vs stay closed?

The circuit opens when failures exceed a threshold — for example, 50% of requests fail in a 30-second window. While open, calls fail immediately without touching the downstream service. This protects your API from timeout cascades and gives the struggling dependency breathing room to recover. Set break duration to at least 30 seconds — shorter and you risk hammering a recovering service back into failure before it stabilises.
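Those thresholds map directly onto Polly v8's `CircuitBreakerStrategyOptions`. A sketch using the example numbers above — the pipeline name is illustrative, and the file needs `using Polly.CircuitBreaker;`:

```csharp
builder.Services.AddResiliencePipeline("payments-breaker", rb =>
    rb.AddCircuitBreaker(new CircuitBreakerStrategyOptions
    {
        FailureRatio      = 0.5,                       // open when ≥50% of calls fail...
        SamplingDuration  = TimeSpan.FromSeconds(30),  // ...within a rolling 30-second window
        MinimumThroughput = 10,                        // but only once at least 10 calls were sampled
        BreakDuration     = TimeSpan.FromSeconds(30)   // stay open 30s before allowing a trial call
    }));
```

`MinimumThroughput` matters: without it, two failures out of three calls during a quiet period would trip the breaker.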

What should I check in logs during a rolling deployment?

Watch four signals: (1) HTTP 5xx error rate — any spike above baseline is a problem; (2) readiness probe failures — repeated failures signal a startup or migration issue; (3) graceful shutdown logs — confirm "Application is shutting down" and "Hosted service stopped" appear before the pod terminates; (4) migration init container exit code — confirm it completed successfully before the first app pod starts.

Is Polly v8 the same as Microsoft.Extensions.Resilience?

Polly v8 is the core library. Microsoft.Extensions.Resilience wraps it with DI integration, pre-built pipeline configurations, telemetry, and HttpClient integration via AddStandardResilienceHandler(). For most APIs, use the Extensions package — it gives you sensible defaults with one line of code and integrates with OpenTelemetry automatically. Use raw Polly v8 APIs when you need custom pipeline logic or fine-grained control over non-HTTP operations.
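For reference, the one-liner and a light customisation look like this — the client names and option values are illustrative:

```csharp
// Defaults: rate limiter, 30s total timeout, retries with backoff,
// circuit breaker, and a 10s per-attempt timeout
builder.Services.AddHttpClient("PaymentService")
    .AddStandardResilienceHandler();

// Tweak individual strategies while keeping the rest of the defaults
builder.Services.AddHttpClient("InventoryService")
    .AddStandardResilienceHandler(options =>
    {
        options.Retry.MaxRetryAttempts = 5;
        options.AttemptTimeout.Timeout = TimeSpan.FromSeconds(2);
    });
```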

How do I test health checks locally without a Kubernetes cluster?

Use Docker Compose with the HEALTHCHECK directive to test liveness and readiness locally — the compose healthcheck command runs curl against your endpoints on an interval. Add the AspNetCore.HealthChecks.UI package to get a live visual dashboard at /healthchecks-ui showing all check statuses in real time. For load testing without a cluster, k6 runs locally and produces the same p95 latency and error-rate graphs you'd see in production.
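A sketch of the compose-side healthcheck, assuming the service listens on 8080 — note the official aspnet base images don't ship curl, so the image would need it installed (or substitute wget on Alpine):

```yaml
services:
  orders-api:
    build: .
    ports: ["8080:8080"]
    healthcheck:
      test: ["CMD", "curl", "-fsS", "http://localhost:8080/health/ready"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 15s   # grace period while the app warms up
```

`docker compose ps` then shows the container as `healthy` or `unhealthy`, mirroring what a readiness probe would report in-cluster.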

Back to Tutorials