Zero-Downtime Deploys
How Lift performs zero-downtime deployments using a scale-based blue-green strategy with automatic rollback.
Lift uses a scale-based blue-green strategy to deploy updates without any downtime. Users never see an error page during a deploy.
How It Works
- Current container stays running -- The existing container continues serving traffic throughout the deploy
- New container scales up -- A second container is started alongside the old one (scale 1 to 2)
- Traefik routes to both -- The load balancer distributes traffic across both containers
- Health check is monitored -- The new container's Docker healthcheck is watched for a healthy status
- Healthy: old container removed -- Once the new container passes its health check, the old one is stopped (scale back to 1)
- Unhealthy: new container removed -- If the new container fails its health check, it is removed and the old container continues serving (automatic rollback)
                 ┌─────────────┐
Traffic ────────>│   Traefik   │
                 └──────┬──────┘
                        │
           ┌────────────┼────────────┐
           v                         v
  ┌────────────────┐        ┌────────────────┐
  │ Old Container  │        │ New Container  │
  │   (healthy)    │        │ (starting...)  │
  └────────────────┘        └────────────────┘
           │                         │
           │      New healthy? ──────┘
           │            │
           v            v
      Remove old    Keep old
       (success)   (rollback)
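The flow above can be sketched as a small control loop. This is an illustrative sketch, not Lift's actual implementation: the function name, the injected callables (`start_new`, `health_of`, `remove`), and the timeout and polling values are all hypothetical, and treating a timeout as a failed deploy is an assumption. The status strings match the values Docker reports for a container healthcheck.

```python
import time
from typing import Callable


def zero_downtime_deploy(
    start_new: Callable[[], str],
    health_of: Callable[[str], str],
    remove: Callable[[str], None],
    old_id: str,
    timeout_s: float = 120.0,  # hypothetical value
    poll_s: float = 2.0,       # hypothetical value
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the ID of the container left serving traffic."""
    # Scale 1 -> 2: the old container keeps serving while the new one starts.
    new_id = start_new()
    waited = 0.0
    while waited < timeout_s:
        # Docker reports "starting", "healthy", or "unhealthy".
        status = health_of(new_id)
        if status == "healthy":
            # Success: stop the old container (scale back to 1).
            remove(old_id)
            return new_id
        if status == "unhealthy":
            break
        sleep(poll_s)
        waited += poll_s
    # Rollback: drop the new container; the old one was never touched.
    # Assumption: a timeout is handled the same way as an unhealthy status.
    remove(new_id)
    return old_id
```

Passing the Docker operations in as callables keeps the decision logic separate from any particular Docker client, so the success and rollback branches can be exercised without a running daemon.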
When It's Used
Zero-downtime deploys activate automatically when both conditions are met:
- The tool is being updated (not a first install)
- The tool has a domain configured (uses Traefik reverse proxy)
First installs always use the standard flow since there is no existing traffic to protect.
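The two conditions reduce to a simple predicate. The sketch below is illustrative only: the function name and return values are hypothetical, and it assumes that an update without a configured domain falls back to the standard flow, which the text above implies but does not state outright.

```python
from typing import Optional


def deploy_strategy(is_update: bool, domain: Optional[str]) -> str:
    """Pick a deploy flow from the two conditions listed above."""
    # Both conditions must hold: an existing install and a configured domain.
    if is_update and domain:
        return "zero-downtime"
    # First installs (and, by assumption, tools without a domain) use the standard flow.
    return "standard"
```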
Automatic Health Checks
Health is verified at two levels during the deploy, Docker and Traefik, and two settings control how the checks run:
- Docker healthcheck -- The container's built-in healthcheck command is monitored to determine when the new container is ready
- Traefik healthcheck labels -- Traefik only routes traffic to containers that report a healthy status, preventing users from hitting an unready instance
- Configurable health path -- The health endpoint path is derived from the tool's proxy settings
- Start period -- Containers are given a configurable start period before health checks begin, allowing time for initialization
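As a rough illustration of how these pieces could fit together, the sketch below builds a Compose-style `healthcheck` block and matching Traefik health check labels for one service. The function name, default values, and the `curl` probe command are assumptions for illustration (the probe only works if `curl` exists in the image); the `healthcheck` keys and the `loadbalancer.healthcheck` label names follow the Docker Compose and Traefik v2 conventions, not anything Lift is confirmed to generate.

```python
def health_settings(
    service: str,
    port: int,
    health_path: str = "/",    # hypothetical default
    start_period_s: int = 30,  # hypothetical default
) -> dict:
    """Build Docker healthcheck settings and Traefik labels for one service."""
    url = f"http://localhost:{port}{health_path}"
    prefix = f"traefik.http.services.{service}.loadbalancer.healthcheck"
    return {
        "healthcheck": {
            # Assumption: curl is available inside the container image.
            "test": ["CMD", "curl", "-f", url],
            "interval": "10s",
            "timeout": "5s",
            "retries": 3,
            # Failures during the start period do not count against retries.
            "start_period": f"{start_period_s}s",
        },
        "labels": [
            # Traefik stops routing to instances that fail this check.
            f"{prefix}.path={health_path}",
            f"{prefix}.interval=10s",
        ],
    }
```

For example, `health_settings("web", 8080, "/healthz", 45)` yields a probe against `http://localhost:8080/healthz` with a `45s` start period.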
Autoheal Recovery
A server-level autoheal container runs once per server and covers all deployed tools:
- Monitors all containers that define Docker healthchecks
- Automatically restarts containers that enter an unhealthy state
- Waits for a 60-second grace period after container start before monitoring begins
- Uses the willfarrell/autoheal Docker image
This ensures that transient failures (memory spikes, temporary resource exhaustion) are recovered from without manual intervention.
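The selection rule described above can be sketched as a pure function. This models the behavior only; it is not the autoheal image's code, and the `ContainerState` type and function name are hypothetical. The 60-second grace period comes from the list above.

```python
from dataclasses import dataclass
from typing import List, Optional

GRACE_PERIOD_S = 60  # grace period after container start, per the list above


@dataclass
class ContainerState:
    name: str
    health: Optional[str]  # None when the container defines no healthcheck
    started_at: float      # start time in seconds since the epoch


def containers_to_restart(states: List[ContainerState], now: float) -> List[str]:
    """Return names of containers that are unhealthy and past the grace period."""
    return [
        c.name
        for c in states
        if c.health == "unhealthy" and now - c.started_at >= GRACE_PERIOD_S
    ]
```

Containers without a healthcheck never match, since their health is `None`, and a container that turns unhealthy within its first 60 seconds is left alone until the grace period has passed.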
Restart Policy
All services automatically receive a restart: unless-stopped policy. This ensures containers recover from crashes and survive host reboots without requiring manual restarts.
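One way to picture this is as a transformation over a parsed Compose file. The sketch below is an assumption about the shape of that step, not Lift's code: the function name is hypothetical, and it assumes the policy overrides any `restart` value a service already declares.

```python
import copy


def with_restart_policy(compose: dict) -> dict:
    """Return a copy of a parsed Compose file with the restart policy on every service."""
    result = copy.deepcopy(compose)  # leave the caller's data untouched
    for service in result.get("services", {}).values():
        # Assumption: an existing restart value is overridden.
        service["restart"] = "unless-stopped"
    return result
```

With `unless-stopped`, Docker restarts a container after a crash or daemon restart unless it was stopped explicitly.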
Related
- Compose & Health Checks -- Build detection and health check details
- Replicas & Scaling -- Running multiple instances behind load balancing
- Production Deploy -- Full production deploy features