Fix: Docker container health status unhealthy

FixDevs

Quick Answer

How to fix Docker container health check failing with unhealthy status, including HEALTHCHECK syntax, timing issues, missing curl/wget, endpoint problems, and Compose healthcheck configuration.

The Error

You run docker ps and see your container stuck in an unhealthy state:

CONTAINER ID   IMAGE       STATUS                     PORTS
a1b2c3d4e5f6   my-app      Up 2 minutes (unhealthy)   0.0.0.0:8080->8080/tcp

Or you check the container health explicitly:

docker inspect --format='{{.State.Health.Status}}' my-app
unhealthy

Docker Compose may refuse to start dependent services entirely:

dependency failed to start: container my-app is unhealthy

This means Docker ran the HEALTHCHECK instruction defined in your Dockerfile (or docker-compose.yml) and it failed more times than the allowed retry count. The container is running, but Docker considers it unfit to serve traffic or satisfy dependency conditions.

Why This Happens

Docker health checks are commands that Docker runs inside the container at regular intervals. When the command exits with code 0, the container is healthy. When it exits with a non-zero code (conventionally 1), it counts as a failure. After a configured number of consecutive failures, the container transitions from starting (or healthy) to unhealthy.

Several things can cause health check failures:

  • Wrong command syntax in the HEALTHCHECK instruction.
  • Timing issues where the health check runs before the application is ready.
  • Missing tools like curl or wget inside the container image.
  • Wrong endpoint or the application listening on a different address than what the health check targets.
  • Network misconfiguration where the health check uses localhost but the app binds to a specific interface.
  • Compose-specific configuration errors in the healthcheck section.
  • Dependency chains where depends_on with condition: service_healthy fails because an upstream service is itself unhealthy.

The key insight is that the health check runs inside the container’s filesystem and network namespace. What works from your host machine may not work inside the container.

Fix 1: Check Health Check Command Syntax in Dockerfile

The most common cause is a syntax error in the HEALTHCHECK instruction. There are two valid forms:

Shell form (runs through /bin/sh -c):

HEALTHCHECK CMD curl -f http://localhost:8080/health || exit 1

Exec form (runs directly, no shell):

HEALTHCHECK CMD ["curl", "-f", "http://localhost:8080/health"]

A frequent mistake is mixing the two forms:

# Wrong - this passes the entire string as argv[0]
HEALTHCHECK CMD ["curl -f http://localhost:8080/health"]

Each argument must be a separate array element in exec form. If you need shell features like || or pipes, use the shell form or explicitly invoke the shell:

HEALTHCHECK CMD ["/bin/sh", "-c", "curl -f http://localhost:8080/health || exit 1"]

Also verify the endpoint path is correct. If your app exposes /healthz but your health check hits /health, it will get a 404 and fail.

Check what health check is currently configured:

docker inspect --format='{{json .Config.Healthcheck}}' my-app | python -m json.tool

This shows the exact command, interval, timeout, and retry settings Docker is using.

Pro Tip: The -f flag on curl makes it return a non-zero exit code on HTTP errors (4xx, 5xx). Without it, curl exits 0 even on a 500 response, and your health check would pass when it should fail.

Fix 2: Fix Health Check Timing (interval, timeout, start-period, retries)

Your application might need time to start up. If the health check runs before the app is ready, it will fail during the startup window. Docker provides four timing parameters:

HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1

Here is what each parameter controls:

  • --interval: Time between health check attempts (default: 30s).
  • --timeout: Maximum time for a single health check to complete (default: 30s). If the command does not finish within this time, it counts as a failure.
  • --start-period: Grace period after container start during which failed checks do not count toward the retry limit (default: 0s). This is the one most people miss.
  • --retries: Number of consecutive failures required to mark the container unhealthy (default: 3).

If your Java application takes 45 seconds to boot, set --start-period=60s to give it room. During the start period, health check failures are not counted. Only after the start period do failures begin counting toward the retry limit.

A common mistake is setting --timeout too low. If your health endpoint queries a database or performs initialization on first call, it may take longer than expected:

# Too aggressive for a heavy app
HEALTHCHECK --interval=5s --timeout=2s --retries=1 CMD curl -f http://localhost:8080/health

# More forgiving
HEALTHCHECK --interval=15s --timeout=10s --start-period=30s --retries=3 CMD curl -f http://localhost:8080/health

Fix 3: Fix curl/wget Not Available in Container

Minimal base images like alpine, distroless, or scratch do not include curl or wget. If your health check uses one of these tools but it is not installed, the check fails immediately.

Check by running the command manually inside the container:

docker exec my-app which curl

If it returns nothing, you have a few options.

Option A: Install curl in the Dockerfile:

# For Alpine
RUN apk add --no-cache curl

# For Debian/Ubuntu
RUN apt-get update && apt-get install -y --no-install-recommends curl && rm -rf /var/lib/apt/lists/*

Option B: Use wget instead (pre-installed on Alpine):

HEALTHCHECK CMD wget --no-verbose --tries=1 --spider http://localhost:8080/health || exit 1

Alpine ships with BusyBox wget, so this works without installing anything extra.

Option C: Use a built-in language tool:

For Node.js apps, avoid adding curl entirely:

HEALTHCHECK CMD node -e "require('http').get('http://localhost:8080/health', (r) => { process.exit(r.statusCode === 200 ? 0 : 1) }).on('error', () => process.exit(1))"

For Python apps:

HEALTHCHECK CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8080/health')"

Option D: Write a tiny health check binary:

For distroless or scratch images, compile a small static binary (in Go, Rust, or C) and copy it into the image. This avoids installing a shell or any extra runtime.

This approach keeps your image small while still supporting health checks.

Fix 4: Fix Health Check Endpoint Not Ready

Sometimes the health check command is correct, but the endpoint it targets is not available. This differs from a timing issue because the endpoint might never become available due to a code or configuration problem.

Check whether the endpoint responds from inside the container:

docker exec my-app curl -v http://localhost:8080/health

If you get Connection refused, the application is either not running, not listening on that port, or listening on a different interface (see Fix 8).

If you get a non-200 response, check the application logs:

docker logs my-app

Common causes:

  • The application crashed after starting but Docker has not restarted it yet.
  • The health endpoint depends on a database connection that is not available. If your /health route queries the database, a database outage will make your container unhealthy. Consider splitting into a liveness check (is the process alive) and a readiness check (can it serve traffic). For Docker’s HEALTHCHECK, use the liveness check — something lightweight like returning 200 OK if the HTTP server is responding.
  • The application listens on a different port internally than what you expect. Verify with:
docker exec my-app ss -tlnp

Or if ss is not available:

docker exec my-app netstat -tlnp

This shows exactly which ports have listeners inside the container.

Fix 5: Fix Docker Compose healthcheck Config

Docker Compose has its own healthcheck syntax that overrides any HEALTHCHECK in the Dockerfile. The YAML structure trips people up:

services:
  web:
    image: my-app
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      start_period: 60s
      retries: 3

Note these common mistakes:

Wrong: Using CMD-SHELL without a string:

# Wrong
test: ["CMD-SHELL", "curl", "-f", "http://localhost:8080/health"]

# Right - CMD-SHELL takes a single string
test: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]

CMD-SHELL passes the second element as a single shell command string. CMD passes each element as a separate argument.

Valid: the short string form, which implicitly runs through CMD-SHELL:

# This works - a plain string is wrapped in CMD-SHELL
test: curl -f http://localhost:8080/health || exit 1

Wrong: Indentation or field name errors:

# Wrong - hyphen instead of underscore
healthcheck:
  start-period: 60s   # Wrong field name

# Right for Compose v2
healthcheck:
  start_period: 60s

To disable a health check inherited from the base image:

healthcheck:
  disable: true

If you are migrating from a Compose build that failed, double-check that your rebuilt image still has the correct health check configuration.

Fix 6: Fix depends_on with condition: service_healthy

Docker Compose lets you delay starting a service until its dependency is healthy:

services:
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

  web:
    image: my-app
    depends_on:
      db:
        condition: service_healthy

If db never becomes healthy, web will never start, and you will see:

dependency failed to start: container db is unhealthy

Debug this by checking the dependency container first:

docker compose ps
docker inspect --format='{{json .State.Health}}' project-db-1 | python -m json.tool

Common issues with depends_on health chains:

  • The dependency container’s health check is misconfigured (apply Fixes 1-5 to that container).
  • The dependency container has no health check defined at all. In that case condition: service_healthy will hang indefinitely.
  • Circular dependencies where service A depends on B and B depends on A.

If your dependency exits unexpectedly, the container might be running into OOM kills or entrypoint errors before it can ever become healthy.

Common Mistake: Defining depends_on with condition: service_healthy but forgetting to add a healthcheck block to the dependency service. Without a health check, Docker Compose has no way to determine if the service is healthy and will wait forever.

Fix 7: Debug with docker inspect --format='{{json .State.Health}}'

When the previous fixes do not resolve the issue, you need detailed diagnostic information. Docker stores the last five health check results, including stdout and stderr from each attempt.

Get the full health check state:

docker inspect --format='{{json .State.Health}}' my-app | python -m json.tool

This returns something like:

{
    "Status": "unhealthy",
    "FailingStreak": 5,
    "Log": [
        {
            "Start": "2026-03-10T10:00:00.000000000Z",
            "End": "2026-03-10T10:00:01.500000000Z",
            "ExitCode": 1,
            "Output": "curl: (7) Failed to connect to localhost port 8080: Connection refused\n"
        }
    ]
}

Key fields to examine:

  • FailingStreak: How many consecutive failures have occurred. Once it reaches your configured retry count, the container is marked unhealthy.
  • ExitCode: 0 means the check passed; any other value means it failed. Exit code 2 is reserved by Docker, so do not use it in your health check commands.
  • Output: The stdout/stderr from the health check command. This is where you find the actual error.

If the output is empty, the health check command might be failing to execute at all. Check that the binary exists and has execute permissions:

docker exec my-app ls -la /usr/bin/curl

If you see permission denied errors, the container might be running as a non-root user without access to the health check binary. Either fix the permissions or switch to a tool available to that user.

You can also watch health check results in real time using Docker events:

docker events --filter container=my-app --filter event=health_status

This streams health status changes as they happen, which is useful when you are adjusting timing parameters and want to see the effect immediately.

Fix 8: Fix Network Issues in Health Checks (localhost vs 0.0.0.0)

A subtle but common issue: the health check targets localhost, but the application is not listening on the loopback interface. Inside a container, localhost resolves to 127.0.0.1. If your application binds only to the container's assigned IP (rather than 0.0.0.0 or 127.0.0.1), localhost will not reach it.

Check what address your application binds to:

docker exec my-app ss -tlnp

You might see:

State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port  Process
LISTEN  0       128     0.0.0.0:8080        0.0.0.0:*          users:(("node",pid=1,fd=3))

If Local Address shows 0.0.0.0:8080, then localhost:8080 will work, because 0.0.0.0 means the app listens on all interfaces, including loopback.

But if it shows 172.17.0.2:8080 (a specific container IP), then localhost:8080 will fail because the app is not listening on 127.0.0.1.

Fix for common frameworks:

Node.js/Express:

// Wrong - binds to the loopback interface only
app.listen(8080, '127.0.0.1');

// Right - binds to all interfaces
app.listen(8080, '0.0.0.0');

Python/Flask:

# Wrong
app.run(port=8080)

# Right
app.run(host='0.0.0.0', port=8080)

Go:

// Wrong
http.ListenAndServe("localhost:8080", handler)

// Right
http.ListenAndServe(":8080", handler)

Another network issue arises with health checks that call external services. If your health check hits an external URL, it depends on container DNS resolution and outbound network access. Prefer checking local endpoints only. A health check should verify that this container is working, not that the internet is available.

If your container uses a custom Docker network and you reference other services by name, make sure the DNS resolution works inside the container:

docker exec my-app nslookup db

If DNS fails, the container might not be connected to the right network. Verify with:

docker network inspect my-network

Still Not Working?

If none of the fixes above resolved the issue, try these less obvious solutions:

Check for filesystem issues. Some health checks write to a file to signal readiness. If the container’s filesystem is read-only or a volume mount has wrong permissions, the check fails silently:

docker exec my-app touch /tmp/test-write

Check container resource limits. If the container is CPU-throttled or near its memory limit, the health check command itself may time out. Check resource usage:

docker stats my-app --no-stream

If the container is consistently at its memory limit, it might be getting OOM killed intermittently, causing health check failures.

Inspect cgroup throttling. Even without OOM kills, CPU throttling can cause the health check process to stall beyond the timeout:

docker exec my-app cat /sys/fs/cgroup/cpu.stat

Look for a high nr_throttled value.

Try overriding the health check at runtime. To test whether the Dockerfile health check is the problem, override it at run time:

docker run --health-cmd="curl -f http://localhost:8080/health || exit 1" \
  --health-interval=10s \
  --health-timeout=5s \
  --health-start-period=30s \
  --health-retries=3 \
  my-app

Or disable it entirely to confirm the container works without it:

docker run --no-healthcheck my-app

Check Docker version compatibility. The --start-period flag was added in Docker 17.05. On older versions the option is unavailable, and health check failures start counting immediately after container start. Check your version:

docker version

Review the application’s graceful shutdown. If the container is being restarted by an orchestrator (like Kubernetes) due to being unhealthy, and the application does not shut down gracefully, it may leave stale PID files or lock files that prevent the next start from succeeding, creating a cycle of unhealthy restarts.

Use a dedicated health check script. Instead of inlining the check in the Dockerfile, create a script with better error handling:

#!/bin/sh
set -e

# Check if the HTTP server responds
curl -f http://localhost:8080/health > /dev/null 2>&1 || exit 1

# Optionally check other conditions
if [ ! -f /tmp/app-ready ]; then
  exit 1
fi

exit 0

Copy it into the image and reference it:

COPY healthcheck.sh /usr/local/bin/healthcheck.sh
RUN chmod +x /usr/local/bin/healthcheck.sh
HEALTHCHECK --interval=15s --timeout=10s --start-period=30s --retries=3 CMD /usr/local/bin/healthcheck.sh

This gives you a single place to add logging, check multiple conditions, and debug failures without rebuilding the image for every health check change.

FixDevs

Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.
