ElasticSearch: Unassigned Shards, how to fix

elasticsearchmastersharding

I have an ES cluster with 4 nodes:

number_of_replicas: 1
search01 - master: false, data: false
search02 - master: true, data: true
search03 - master: false, data: true
search04 - master: false, data: true

I had to restart search03, and when it came back, it rejoined the cluster no problem, but left 7 unassigned shards laying about.

{
  "cluster_name" : "tweedle",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 4,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 15,
  "active_shards" : 23,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 7
}

Now my cluster is in yellow state. What is the best way to resolve this issue?

  • Delete (cancel) the shards?
  • Move the shards to another node?
  • Allocate the shards to the node?
  • Update 'number_of_replicas' to 2?
  • Something else entirely?

Interestingly, when a new index was added, that node started working on it and played nice with the rest of the cluster, it just left the unassigned shards laying about.

Follow on question: am I doing something wrong to cause this to happen in the first place? I don't have much confidence in a cluster that behaves this way when a node is restarted.

NOTE: If you're running a single node cluster for some reason, you might simply need to do the following:

curl -XPUT 'localhost:9200/_settings' -d '
{
    "index" : {
        "number_of_replicas" : 0
    }
}'

Best Answer

By default, Elasticsearch will re-assign shards to nodes dynamically. However, if you've disabled shard allocation (perhaps you did a rolling restart and forgot to re-enable it), you can re-enable shard allocation.

# v0.90.x and earlier
curl -XPUT 'localhost:9200/_settings' -d '{
    "index.routing.allocation.disable_allocation": false
}'

# v1.0+
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
    "transient" : {
        "cluster.routing.allocation.enable" : "all"
    }
}'

Elasticsearch will then reassign shards as normal. This can be slow, consider raising indices.recovery.max_bytes_per_sec and cluster.routing.allocation.node_concurrent_recoveries to speed it up.

If you're still seeing issues, something else is probably wrong, so look in your Elasticsearch logs for errors. If you see EsRejectedExecutionException your thread pools may be too small.

Finally, you can explicitly reassign a shard to a node with the reroute API.

# Suppose shard 4 of index "my-index" is unassigned, so you want to
# assign it to node search03:
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
    "commands": [{
        "allocate": {
            "index": "my-index",
            "shard": 4,
            "node": "search03",
            "allow_primary": 1
        }
    }]
}'
Related Topic