Zero-Downtime Deploys

How Lift performs zero-downtime deployments using a scale-based blue-green strategy with automatic rollback.

Lift uses a scale-based blue-green strategy to deploy updates without downtime. The existing container keeps serving traffic until the new one is confirmed healthy, so users do not see an error page during a deploy.

How It Works

1. Current container stays running -- The existing container continues serving traffic throughout the deploy
2. New container scales up -- A second container is started alongside the old one (scale 1 to 2)
3. Traefik routes to both -- The load balancer distributes traffic across both containers, sending requests to the new one only once it reports a healthy status
4. Health check is monitored -- The new container's Docker healthcheck is watched for a healthy status
5. Healthy: old container removed -- Once the new container passes its health check, the old one is stopped (scale back to 1)
6. Unhealthy: new container removed -- If the new container fails its health check, it is removed and the old container continues serving (automatic rollback)

                    ┌─────────────┐
   Traffic ────────>│   Traefik   │
                    └──────┬──────┘
                           │
              ┌────────────┼────────────┐
              v                         v
     ┌────────────────┐      ┌────────────────┐
     │  Old Container  │      │  New Container  │
     │   (healthy)     │      │  (starting...)  │
     └────────────────┘      └────────────────┘
              │                         │
              │   New healthy? ─────────┘
              │         │
              v         v
         Remove old   Keep old
         (success)    (rollback)
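
The flow above can be sketched as a small Python function. This is an illustrative sketch, not Lift's implementation: the function name, the injected callbacks (`start_new`, `get_health`, `remove`), and the timeout and poll interval are all assumptions made for the example. The source does not say how long Lift waits before treating a deploy as failed.

```python
import time
from enum import Enum
from typing import Callable


class Health(Enum):
    STARTING = "starting"
    HEALTHY = "healthy"
    UNHEALTHY = "unhealthy"


def zero_downtime_deploy(
    start_new: Callable[[], str],            # hypothetical: starts the second container, returns its ID
    get_health: Callable[[str], Health],     # hypothetical: reads the Docker health status
    remove: Callable[[str], None],           # hypothetical: stops and removes a container
    old_id: str,
    timeout_s: float = 120.0,                # assumed value, not documented by Lift
    poll_s: float = 2.0,                     # assumed value, not documented by Lift
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the ID of the container left serving traffic."""
    new_id = start_new()                     # scale 1 -> 2; the old container keeps serving
    waited = 0.0
    while waited < timeout_s:
        status = get_health(new_id)
        if status is Health.HEALTHY:
            remove(old_id)                   # success: scale 2 -> 1
            return new_id
        if status is Health.UNHEALTHY:
            break                            # fail fast instead of waiting for the timeout
        sleep(poll_s)
        waited += poll_s
    remove(new_id)                           # rollback: the old container never stopped
    return old_id
```

Because the old container is only removed after the new one reports healthy, every exit path leaves exactly one container serving traffic.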

When It's Used

Zero-downtime deploys activate automatically when both conditions are met:

The tool is being updated (not a first install)
The tool has a domain configured (uses Traefik reverse proxy)

First installs always use the standard flow since there is no existing traffic to protect.
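
As a minimal sketch, the activation rule reduces to a two-input check. The function name and the returned strategy labels are hypothetical. The example also assumes that an update to a tool without a domain falls back to the standard flow, which the text implies but does not state outright.

```python
def deploy_strategy(is_update: bool, has_domain: bool) -> str:
    """Pick the deploy flow; both conditions must hold for zero-downtime."""
    if is_update and has_domain:
        return "zero-downtime"
    # First installs, and (by assumption) updates without a domain.
    return "standard"
```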

Automatic Health Checks

Health is verified at two levels during the deploy:

Docker healthcheck -- The container's built-in healthcheck command is monitored to determine when the new container is ready
Traefik healthcheck labels -- Traefik only routes traffic to containers that report a healthy status, preventing users from hitting an unready instance

Two settings control these checks:

Configurable health path -- The health endpoint path is derived from the tool's proxy settings
Start period -- Containers are given a configurable start period before health checks begin, allowing time for initialization
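
To make the two levels concrete, the sketch below builds the kind of Compose `healthcheck` block and Traefik labels that such a setup typically uses. It is not Lift's generated output. The function name, the default path `/health`, the interval, timeout, and retry values, and the use of `curl` inside the container are all assumptions. The `healthcheck` keys and the `loadbalancer.healthcheck` labels are standard Docker Compose and Traefik options.

```python
def build_health_config(
    service: str,
    port: int,
    health_path: str = "/health",   # assumed default; Lift derives this from proxy settings
    start_period: str = "30s",      # assumed default; the source only says it is configurable
) -> dict:
    """Return a Compose-style healthcheck plus Traefik health labels for one service."""
    url = f"http://localhost:{port}{health_path}"
    return {
        # Level 1: Docker healthcheck, run inside the container.
        "healthcheck": {
            "test": ["CMD", "curl", "-f", url],   # assumes curl exists in the image
            "interval": "10s",
            "timeout": "5s",
            "retries": 3,
            "start_period": start_period,          # failures in this window do not count
        },
        # Level 2: Traefik probes the same path before routing traffic.
        "labels": [
            f"traefik.http.services.{service}.loadbalancer.server.port={port}",
            f"traefik.http.services.{service}.loadbalancer.healthcheck.path={health_path}",
            f"traefik.http.services.{service}.loadbalancer.healthcheck.interval=10s",
        ],
    }
```

Using the same path for both checks keeps Docker's and Traefik's view of readiness aligned.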

Autoheal Recovery

A single autoheal container runs on each server and covers all tools deployed there:

Monitors all containers that define Docker healthchecks
Automatically restarts containers that enter an unhealthy state
Allows a 60-second grace period after a container starts before monitoring it
Uses the willfarrell/autoheal Docker image

This lets containers recover from transient failures (memory spikes, temporary resource exhaustion) without manual intervention.
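
The selection rule described above can be sketched as a pure function. This only models the decision, not the autoheal image's actual code, which talks to the Docker API. The `ContainerState` type and function name are hypothetical; the 60-second grace period is taken from the list above.

```python
from dataclasses import dataclass
from typing import List, Optional

GRACE_PERIOD_S = 60  # grace period after container start, from the text above


@dataclass
class ContainerState:
    name: str
    health: Optional[str]   # "starting", "healthy", "unhealthy", or None if no healthcheck
    started_at: float       # start time in seconds (same clock as `now`)


def containers_to_restart(containers: List[ContainerState], now: float) -> List[str]:
    """Return names of containers that are unhealthy and past the grace period."""
    return [
        c.name
        for c in containers
        if c.health == "unhealthy" and now - c.started_at >= GRACE_PERIOD_S
    ]
```

Containers without a healthcheck are never selected, matching the rule that only containers defining Docker healthchecks are monitored.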

Restart Policy

All services automatically receive a restart: unless-stopped policy. This ensures containers recover from crashes and survive host reboots without requiring manual restarts.
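
A minimal sketch of applying the policy to a parsed Compose file follows. The function name is hypothetical, and the example assumes the policy overwrites any `restart` value a service already declares. The source says all services receive the policy but does not describe how existing values are handled.

```python
import copy


def apply_restart_policy(compose: dict, policy: str = "unless-stopped") -> dict:
    """Return a copy of the Compose mapping with `restart` set on every service."""
    result = copy.deepcopy(compose)          # leave the caller's mapping untouched
    for service in result.get("services", {}).values():
        service["restart"] = policy          # assumption: overrides an existing value
    return result
```

With `unless-stopped`, Docker restarts a container after a crash or daemon restart unless it was stopped explicitly.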

Related

Compose & Health Checks -- Build detection and health check details
Replicas & Scaling -- Running multiple instances behind load balancing
Production Deploy -- Full production deploy features
