
Fix: Elasticsearch Cluster Health Red Status

FixDevs

Quick Answer

Fix Elasticsearch cluster health red status by resolving unassigned shards, disk watermark issues, node failures, and shard allocation problems.

The Error

You check your Elasticsearch cluster health and see:

curl -X GET "localhost:9200/_cluster/health?pretty"
{
  "cluster_name": "my-cluster",
  "status": "red",
  "number_of_nodes": 3,
  "active_primary_shards": 45,
  "unassigned_shards": 10
}

A red status means one or more primary shards are not allocated. Data in those shards is unavailable for search and indexing. This is the most critical cluster health state.

Why This Happens

Elasticsearch distributes data across shards, which live on nodes in the cluster. When a node goes down, crashes, or runs out of disk space, the shards it hosted become unassigned. If those are primary shards (not replicas), the cluster turns red because data is genuinely missing from the cluster.

Common triggers include disk space exhaustion triggering the watermark threshold, node crashes due to JVM heap issues, network partitions causing split-brain scenarios, or corrupted shard data that Elasticsearch can’t recover.

Fix 1: Identify Unassigned Shards

First, find out which shards are unassigned and why:

curl -X GET "localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason&s=state"

This shows each unassigned shard with its reason code. Common reasons:

  • NODE_LEFT — The node hosting the shard left the cluster
  • ALLOCATION_FAILED — Elasticsearch tried to allocate but failed
  • CLUSTER_RECOVERED — Shard left unassigned after a full cluster restart and recovery
  • INDEX_CREATED — New index, shards not yet assigned
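On a cluster with many indices, the _cat output gets long. A quick tally by reason code tells you whether you are facing one failure mode or several. A minimal sketch working from a hypothetical saved copy of the _cat/shards output (the file path and sample rows are illustrative):

```shell
# Hypothetical sample of the _cat/shards output (index shard prirep state unassigned.reason)
cat <<'EOF' > /tmp/shards.txt
logs-2024-03 0 p UNASSIGNED NODE_LEFT
logs-2024-03 1 p STARTED
logs-2024-02 0 p UNASSIGNED ALLOCATION_FAILED
logs-2024-02 0 r UNASSIGNED NODE_LEFT
EOF

# Tally unassigned shards by reason code (state is column 4, reason is column 5)
awk '$4 == "UNASSIGNED" { count[$5]++ } END { for (r in count) print count[r], r }' /tmp/shards.txt | sort -rn
```

In practice, pipe the curl output from Fix 1 straight into the awk command instead of saving a file.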

For detailed allocation explanation:

curl -X GET "localhost:9200/_cluster/allocation/explain?pretty"

This tells you exactly why Elasticsearch can’t allocate a specific shard and what you need to fix.
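The explain response is verbose; usually only a few fields tell you what to do next. A sketch that pulls them out of a saved response with grep (the file and its contents are illustrative):

```shell
# Hypothetical saved response from the allocation explain API
cat <<'EOF' > /tmp/explain.json
{
  "index": "my-index",
  "shard": 0,
  "current_state": "unassigned",
  "unassigned_info": { "reason": "NODE_LEFT" },
  "can_allocate": "no",
  "allocate_explanation": "cannot allocate because all found copies of the shard are either stale or corrupt"
}
EOF

# Keep only the fields that tell you what to fix
grep -E '"(current_state|reason|can_allocate|allocate_explanation)"' /tmp/explain.json
```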

Fix 2: Reroute Unassigned Shards Manually

If shards are stuck as unassigned, you can force allocation:

curl -X POST "localhost:9200/_cluster/reroute?pretty" -H 'Content-Type: application/json' -d'
{
  "commands": [
    {
      "allocate_stale_primary": {
        "index": "my-index",
        "shard": 0,
        "node": "node-1",
        "accept_data_loss": true
      }
    }
  ]
}'

Warning: allocate_stale_primary with accept_data_loss: true may result in data loss if the shard data on that node is outdated. Use this only when the original node is permanently gone.

For replica shards, use allocate_replica instead:

curl -X POST "localhost:9200/_cluster/reroute" -H 'Content-Type: application/json' -d'
{
  "commands": [
    {
      "allocate_replica": {
        "index": "my-index",
        "shard": 0,
        "node": "node-2"
      }
    }
  ]
}'

Fix 3: Fix Disk Watermark Issues

Elasticsearch stops allocating shards when disk usage exceeds thresholds:

  • Low watermark (85%): Elasticsearch stops allocating new shards to the node
  • High watermark (90%): Elasticsearch tries to relocate shards off the node
  • Flood stage (95%): every index with a shard on the node is blocked as read-only

Check disk usage:

curl -X GET "localhost:9200/_cat/nodes?v&h=name,disk.used_percent,disk.avail"

Free up disk space:

# Delete old indices
curl -X DELETE "localhost:9200/logs-2024-01-*"

# Force merge to reduce segment count
curl -X POST "localhost:9200/my-index/_forcemerge?max_num_segments=1"

# Clear the fielddata cache
curl -X POST "localhost:9200/_cache/clear?fielddata=true"

If indices are stuck in read-only mode after the flood stage, unlock them:

curl -X PUT "localhost:9200/_all/_settings" -H 'Content-Type: application/json' -d'
{
  "index.blocks.read_only_allow_delete": null
}'

Pro Tip: Set up disk monitoring alerts before you hit watermarks. The default thresholds are conservative — adjust them if your nodes have large disks where 85% still leaves hundreds of GB free.
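The thresholds can be raised through the cluster settings API. The values below are illustrative, not recommendations — pick numbers that still leave enough headroom for recovery and segment merges:

```shell
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.high": "95%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "97%"
  }
}'
```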

Fix 4: Recover from Node Failures

If a node crashed or was shut down, restart it:

sudo systemctl start elasticsearch

Check the node’s logs for the crash reason:

tail -100 /var/log/elasticsearch/my-cluster.log

If the node can’t rejoin, verify:

  • Cluster name matches in elasticsearch.yml
  • Discovery settings point to the correct seed nodes
  • Network binding allows communication between nodes
# elasticsearch.yml
cluster.name: my-cluster
node.name: node-1
network.host: 0.0.0.0
discovery.seed_hosts: ["node-1:9300", "node-2:9300", "node-3:9300"]

After the node rejoins, shard recovery starts automatically. Monitor progress:

curl -X GET "localhost:9200/_cat/recovery?v&active_only=true"

Fix 5: Prevent Split-Brain

Split-brain occurs when nodes can’t communicate and form separate clusters, each electing its own master. This causes data inconsistency and a red status when the partitions reconnect.

Configure master election properly. In Elasticsearch 7+, quorum is managed automatically; you only set the initial master nodes once, to bootstrap a brand-new cluster:

# elasticsearch.yml
cluster.initial_master_nodes: ["node-1", "node-2", "node-3"]

For a 3-node cluster, Elasticsearch requires a quorum of 2 master-eligible nodes to elect a master. Never run a production cluster with only 2 master-eligible nodes — a single node failure loses quorum.

Common Mistake: Setting discovery.zen.minimum_master_nodes in Elasticsearch 7+ has no effect — the setting was removed. The cluster manages quorum automatically through its voting configuration; cluster.initial_master_nodes is only used to bootstrap a new cluster and is ignored afterwards.

Fix 6: Tune JVM Heap Settings

Insufficient JVM heap causes garbage collection pauses that make nodes appear to leave the cluster:

# Check current heap usage
curl -X GET "localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.max"

Set heap size in jvm.options:

-Xms4g
-Xmx4g

Rules for heap sizing:

  • Set -Xms and -Xmx to the same value to avoid resizing pauses
  • Never exceed 50% of available RAM — the other 50% is for the filesystem cache
  • Never exceed ~31GB — above roughly 32GB, the JVM can’t use compressed object pointers
  • For nodes with 64GB RAM, use -Xms31g -Xmx31g
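The rules above boil down to min(RAM / 2, 31 GB). A sketch of the arithmetic (the RAM value is an example):

```shell
# Pick a heap size: half of RAM, capped at 31 GB to keep compressed object pointers
total_ram_gb=64   # example machine
heap_gb=$(( total_ram_gb / 2 ))
if [ "$heap_gb" -gt 31 ]; then
  heap_gb=31
fi
echo "-Xms${heap_gb}g -Xmx${heap_gb}g"
```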

Check for GC issues in the logs:

grep "GC overhead" /var/log/elasticsearch/my-cluster.log
grep "breaker" /var/log/elasticsearch/my-cluster.log

Fix 7: Adjust Replica Configuration

If you have a single-node cluster with replicas configured, the cluster stays yellow — a replica can never be assigned to the same node as its primary:

# Check index settings
curl -X GET "localhost:9200/my-index/_settings?pretty" | grep number_of_replicas

For single-node clusters, set replicas to 0:

curl -X PUT "localhost:9200/my-index/_settings" -H 'Content-Type: application/json' -d'
{
  "index": {
    "number_of_replicas": 0
  }
}'

For all future indices, set a default with an index template (the legacy _template API shown here still works but is deprecated; on Elasticsearch 7.8+ prefer the composable _index_template API):

curl -X PUT "localhost:9200/_template/default" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["*"],
  "settings": {
    "number_of_replicas": 0
  }
}'

For multi-node clusters, ensure you have enough nodes to host all replicas. The formula: you need at least number_of_replicas + 1 nodes.
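A sketch of that check (node and replica counts are example values; in practice read them from _cat/nodes and the index settings):

```shell
nodes=3      # example: nodes in the cluster
replicas=2   # example: number_of_replicas on the index
if [ $(( replicas + 1 )) -gt "$nodes" ]; then
  echo "replicas will stay unassigned: need $(( replicas + 1 )) nodes, have $nodes"
else
  echo "cluster can host every shard copy"
fi
```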

Fix 8: Restore from Snapshot

If shard data is corrupted and can’t be recovered, restore from a snapshot:

# List available snapshots
curl -X GET "localhost:9200/_snapshot/my-backup/_all?pretty"

# Close the index before restoring
curl -X POST "localhost:9200/my-index/_close"

# Restore specific index
curl -X POST "localhost:9200/_snapshot/my-backup/snapshot-2024-03-10/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "my-index",
  "ignore_unavailable": true
}'

If you don’t have snapshots, you may need to delete the corrupted index and re-index the data from your primary data source:

# Last resort: delete and recreate
curl -X DELETE "localhost:9200/corrupted-index"

Set up automated snapshots to prevent this scenario:

# Register a snapshot repository
curl -X PUT "localhost:9200/_snapshot/my-backup" -H 'Content-Type: application/json' -d'
{
  "type": "fs",
  "settings": {
    "location": "/mnt/backups/elasticsearch"
  }
}'
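Registering the repository only makes snapshots possible; to actually automate them, Elasticsearch 7.4+ ships snapshot lifecycle management (SLM). A sketch — the policy name, schedule, and retention values are illustrative:

```shell
curl -X PUT "localhost:9200/_slm/policy/nightly-snapshots" -H 'Content-Type: application/json' -d'
{
  "schedule": "0 30 1 * * ?",
  "name": "<nightly-{now/d}>",
  "repository": "my-backup",
  "config": { "indices": ["*"] },
  "retention": { "expire_after": "30d", "min_count": 5, "max_count": 50 }
}'
```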

Still Not Working?

  • Check cluster settings overrides. Transient and persistent cluster settings override elasticsearch.yml. Run curl localhost:9200/_cluster/settings?pretty to see active overrides.

  • Look for shard allocation filters. Settings like index.routing.allocation.exclude._name can prevent shards from being assigned. Check with curl localhost:9200/my-index/_settings?pretty.

  • Verify network connectivity between nodes. Test with curl node-2:9200 from each node. Elasticsearch uses port 9200 for HTTP and 9300 for inter-node communication.

  • Check for pending cluster tasks. Run curl localhost:9200/_cluster/pending_tasks?pretty. A large queue indicates the master node is overwhelmed.

  • Monitor with _cat APIs. Use _cat/nodes, _cat/indices, _cat/shards, and _cat/allocation for quick cluster overview without parsing JSON.

  • Consider increasing cluster.routing.allocation.node_concurrent_recoveries from the default of 2 if recovery is too slow on a large cluster with fast disks.
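The recovery-concurrency setting in the last bullet can be raised through the cluster settings API (the value 4 is illustrative; watch disk and network I/O after changing it):

```shell
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.node_concurrent_recoveries": 4
  }
}'
```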


FixDevs

Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.
