- All Implemented Interfaces:
Diffable<ClusterState>, Writeable, ChunkedToXContent
Conceptually immutable. In practice it has a few components, such as RoutingNodes, which are pure functions of the immutable state
but are expensive to compute, so they are built on demand when needed.
The Metadata portion is written to disk on each update so it persists across full-cluster restarts. The rest of this data is
maintained only in-memory and resets back to its initial state on a full-cluster restart, but it is held on all nodes so it persists
across master elections (and therefore is preserved in a rolling restart).
Updates are triggered by submitting tasks to the MasterService on the elected master, typically using a TransportMasterNodeAction to route a request to the master. On the master, the task is submitted via a queue obtained with ClusterService.createTaskQueue(java.lang.String, org.elasticsearch.common.Priority, org.elasticsearch.cluster.ClusterStateTaskExecutor<T>), which has an associated priority. Submitted tasks have an associated
timeout. Tasks are processed in priority order, so a flood of higher-priority tasks can starve lower-priority ones from running.
Therefore, avoid priorities other than Priority.NORMAL where possible. Tasks associated with client actions should typically have
a timeout, or otherwise be sensitive to client cancellations, to avoid surprises caused by the execution of stale tasks long after they
are submitted (since clients themselves tend to time out). In contrast, internal tasks can reasonably have an infinite timeout,
especially if a timeout would simply trigger a retry.
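The starvation risk described above can be illustrated with a minimal, self-contained sketch. It is a simplified model of priority-ordered processing, not the MasterService implementation: the Priority enum, the Task record, and the drain method are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Simplified model of priority-ordered task processing (an assumption-laden sketch,
// not the real MasterService).
class PriorityStarvationSketch {
    // Hypothetical priority levels; a lower ordinal is processed first.
    enum Priority { URGENT, HIGH, NORMAL, LOW }

    // A queued task: its priority, a submission sequence number, and a label.
    record Task(Priority priority, long seq, String name) {}

    // Runs at most maxToRun tasks, highest priority first, FIFO within one priority.
    static List<String> drain(List<Task> submitted, int maxToRun) {
        PriorityQueue<Task> queue = new PriorityQueue<>(
            Comparator.comparing(Task::priority).thenComparingLong(Task::seq));
        queue.addAll(submitted);
        List<String> executed = new ArrayList<>();
        while (executed.size() < maxToRun && !queue.isEmpty()) {
            executed.add(queue.poll().name());
        }
        return executed;
    }

    public static void main(String[] args) {
        List<Task> tasks = new ArrayList<>();
        long seq = 0;
        // The NORMAL task is submitted first...
        tasks.add(new Task(Priority.NORMAL, seq++, "normal-0"));
        // ...but a burst of URGENT tasks submitted later overtakes it.
        for (int i = 0; i < 3; i++) {
            tasks.add(new Task(Priority.URGENT, seq++, "urgent-" + i));
        }
        // With capacity for three tasks, the NORMAL task never runs.
        System.out.println(drain(tasks, 3)); // prints [urgent-0, urgent-1, urgent-2]
    }
}
```

In this model the NORMAL task only runs once the higher-priority work stops arriving, which is why keeping most work at a single priority avoids the problem.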
Tasks that share the same ClusterStateTaskExecutor instance are processed as a batch. Each batch of tasks yields a new ClusterState which is published to the cluster by ClusterStatePublisher.publish(org.elasticsearch.cluster.ClusterStatePublicationEvent, org.elasticsearch.action.ActionListener<java.lang.Void>, org.elasticsearch.cluster.coordination.ClusterStatePublisher.AckListener). Publication usually works by sending a diff,
computed via the Diffable interface, rather than the full state, although it will fall back to sending the full state if the
receiving node is new or it has missed out on an intermediate state for some reason. States and diffs are published using the transport
protocol, i.e. the Writeable interface and friends.
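The diff-or-full-state behaviour described above can be sketched as follows. This is a toy model under stated assumptions: the State and StateDiff records, the string-map payload, and the receive method are hypothetical and do not mirror the real Diffable or publication classes. The only idea taken from the text is that a diff is applied when the receiver holds the state the diff was computed against, and the full state is used otherwise.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of publishing a diff with a fallback to the full state (hypothetical types).
class DiffPublicationSketch {
    // A state is identified by a version and a UUID and carries some entries.
    record State(long version, String uuid, Map<String, String> entries) {}

    // A diff records which state it applies to and what changed.
    record StateDiff(String fromUuid, long toVersion, String toUuid,
                     Map<String, String> upserts, Set<String> deletes) {
        State apply(State previous) {
            if (!previous.uuid().equals(fromUuid)) {
                throw new IllegalStateException("diff does not apply to state " + previous.uuid());
            }
            Map<String, String> next = new HashMap<>(previous.entries());
            deletes.forEach(next::remove);
            next.putAll(upserts);
            return new State(toVersion, toUuid, Map.copyOf(next));
        }
    }

    // Computes the changes needed to turn previous into current.
    static StateDiff diff(State previous, State current) {
        Map<String, String> upserts = new HashMap<>();
        current.entries().forEach((k, v) -> {
            if (!v.equals(previous.entries().get(k))) {
                upserts.put(k, v);
            }
        });
        Set<String> deletes = new HashSet<>(previous.entries().keySet());
        deletes.removeAll(current.entries().keySet());
        return new StateDiff(previous.uuid(), current.version(), current.uuid(), upserts, deletes);
    }

    // Receiver side: apply the diff if it matches the local state, else use the full state.
    static State receive(State local, StateDiff diff, State fullState) {
        if (local != null && local.uuid().equals(diff.fromUuid())) {
            return diff.apply(local);
        }
        return fullState; // new node, or a node that missed an intermediate state
    }

    public static void main(String[] args) {
        State v1 = new State(1, "uuid-1", Map.of("x", "1", "y", "2"));
        State v2 = new State(2, "uuid-2", Map.of("x", "1", "z", "3"));
        StateDiff d = diff(v1, v2);
        System.out.println(receive(v1, d, v2).equals(v2));   // up-to-date node applies the diff
        System.out.println(receive(null, d, v2).equals(v2)); // new node takes the full state
    }
}
```

Both calls print true: the two paths converge on the same state, and the diff path only has to ship the changed entries.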
When committed, the new state is applied, which exposes it to the node via ClusterStateApplier and ClusterStateListener callbacks registered with the ClusterApplierService. The new state is also made available via ClusterService.state(). The appliers are notified (in no particular order) before ClusterService.state() is updated, and the
listeners are notified (in no particular order) afterwards. Cluster state updates run in sequence, one-by-one, so they can be a
performance bottleneck. See the JavaDocs on the linked classes and methods for more details.
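The ordering between appliers, the visible state, and listeners can be sketched with a small model. The ApplyOrderSketch class below is hypothetical and uses plain strings as states; it is not the ClusterApplierService. It iterates callbacks in registration order only for simplicity, whereas the text above makes no ordering promise within each group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model of the applier -> state update -> listener sequence.
class ApplyOrderSketch {
    private final List<Consumer<String>> appliers = new ArrayList<>();
    private final List<Consumer<String>> listeners = new ArrayList<>();
    private volatile String currentState = "initial";

    void addApplier(Consumer<String> applier) { appliers.add(applier); }

    void addListener(Consumer<String> listener) { listeners.add(listener); }

    // What other components observe when they ask for the current state.
    String state() { return currentState; }

    void applyCommitted(String newState) {
        // 1. Appliers run first; state() still returns the previous state here.
        for (Consumer<String> applier : appliers) {
            applier.accept(newState);
        }
        // 2. The new state becomes visible.
        currentState = newState;
        // 3. Listeners run last and already observe the new state via state().
        for (Consumer<String> listener : listeners) {
            listener.accept(newState);
        }
    }

    public static void main(String[] args) {
        ApplyOrderSketch service = new ApplyOrderSketch();
        service.addApplier(s -> System.out.println("applier got " + s + ", state() = " + service.state()));
        service.addListener(s -> System.out.println("listener got " + s + ", state() = " + service.state()));
        service.applyCommitted("v2");
    }
}
```

An applier that needs the incoming state should therefore use the value passed to the callback rather than re-reading the service's current state.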
Cluster state updates can be used to trigger various actions via a ClusterStateListener rather than using a timer.
Implements ChunkedToXContent to be exposed in REST APIs (e.g. GET _cluster/state and POST _cluster/reroute) and
to be indexed by monitoring, mostly just for diagnostics purposes. The XContent representation does not need to be 100% faithful
since we never reconstruct a cluster state from its XContent representation, but the more faithful it is the more useful it is for
diagnostics. Note that the XContent representation of the Metadata portion does have to be faithful (in Metadata.XContentContext.GATEWAY context) since this is how it persists across full cluster restarts.
Security-sensitive data such as passwords or private keys should not be stored in the cluster state, since the contents of the cluster state are exposed in various APIs.
-
Nested Class Summary
Nested Classes
Modifier and Type    Class    Description
static class
static interface
static enum
Nested classes/interfaces inherited from interface org.elasticsearch.common.io.stream.Writeable
Writeable.Reader<V>, Writeable.Writer<V>
Field Summary
Fields
Modifier and Type    Field    Description
static final ClusterState    EMPTY_STATE
static final String    UNKNOWN_UUID
static final long    UNKNOWN_VERSION
static final Version    VERSION_INTRODUCING_TRANSPORT_VERSIONS
Fields inherited from interface org.elasticsearch.common.xcontent.ChunkedToXContent
EMPTY
Constructor Summary
Constructors
Constructor    Description
ClusterState(long version, String stateUUID, ClusterState state)
ClusterState(ClusterName clusterName, long version, String stateUUID, Metadata metadata, RoutingTable routingTable, DiscoveryNodes nodes, Map<String, CompatibilityVersions> compatibilityVersions, ClusterFeatures clusterFeatures, ClusterBlocks blocks, Map<String, ClusterState.Custom> customs, boolean wasReadFromDiff, RoutingNodes routingNodes)
Method Summary
Modifier and Type    Method    Description
blocks()
static ClusterState.Builder    builder(ClusterName clusterName)
static ClusterState.Builder    builder(ClusterState state)
boolean    clusterRecovered()
copyAndUpdate(Consumer<ClusterState.Builder> updater)
copyAndUpdateMetadata(Consumer<Metadata.Builder> updater)
<T extends ClusterState.Custom> T    custom
<T extends ClusterState.Custom> T    custom
customs()
diff(ClusterState previousState): Returns a serializable object representing the differences between this and previousState.
getNodes()
getRoutingNodes: Returns a built (on demand) routing nodes view of the routing table.
long    getVersion()
boolean    hasMixedSystemIndexVersions()
void    initializeAsync(Executor executor): Initializes, in the background on the given executor, the data structures that are lazily computed for this instance.
metadata()
mutableRoutingNodes: Returns a fresh mutable copy of the routing nodes view.
nodes()
nodesIfRecovered: Returns the set of nodes that should be exposed to things like REST handlers that behave differently depending on the nodes in the cluster and their versions.
static Diff<ClusterState>    readDiffFrom(StreamInput in, DiscoveryNode localNode)
static ClusterState    readFrom(StreamInput in, DiscoveryNode localNode)
stateUUID: This stateUUID is automatically generated for each version of cluster state.
boolean    supersedes(ClusterState other): A cluster state supersedes another state if they are from the same master and the version of this state is higher than that of the other state.
long    term()
toString()
Iterator<? extends ToXContent>    toXContentChunked(ToXContent.Params outerParams): Create an iterator of ToXContent chunks for a REST response.
long    version()
void    writeTo(StreamOutput out): Write this into the StreamOutput.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.elasticsearch.common.xcontent.ChunkedToXContent
isFragment, toXContentChunked, toXContentChunkedV8
-
Field Details
-
EMPTY_STATE
-
UNKNOWN_UUID
-
UNKNOWN_VERSION
public static final long UNKNOWN_VERSION
-
VERSION_INTRODUCING_TRANSPORT_VERSIONS
-
-
Constructor Details
-
ClusterState
-
ClusterState
public ClusterState(ClusterName clusterName, long version, String stateUUID, Metadata metadata, RoutingTable routingTable, DiscoveryNodes nodes, Map<String, CompatibilityVersions> compatibilityVersions, ClusterFeatures clusterFeatures, ClusterBlocks blocks, Map<String, ClusterState.Custom> customs, boolean wasReadFromDiff, @Nullable RoutingNodes routingNodes)
-
-
Method Details
-
term
public long term() -
version
public long version() -
getVersion
public long getVersion() -
stateUUID
This stateUUID is automatically generated for each version of cluster state. It is used to make sure that we are applying diffs to the right previous state.
nodes
-
getNodes
-
nodesIfRecovered
Returns the set of nodes that should be exposed to things like REST handlers that behave differently depending on the nodes in the cluster and their versions. Specifically, if the cluster has properly formed then this is the nodes in the last-applied cluster state, but if the cluster has not properly formed then no nodes are returned.
- Returns:
- the nodes in the cluster if the cluster has properly formed, otherwise an empty set of nodes.
-
clusterRecovered
public boolean clusterRecovered() -
compatibilityVersions
-
hasMixedSystemIndexVersions
public boolean hasMixedSystemIndexVersions() -
getMinTransportVersion
- Returns:
- the minimum TransportVersion that will be used for all future intra-cluster node-to-node communications. This value only ever increases, so if cs.getMinTransportVersion().onOrAfter(v) is true once then it will remain true in the future.
There are some subtle exceptions:
- The "only ever increases" property is handled by the master node using the in-memory (ephemeral) part of the ClusterState only, so in theory a full restart of a mixed-version cluster may lose that state and allow some nodes to see this value decrease. For this to happen in practice requires some fairly unlucky timing during the initial master election. We tell users not to do this: if something breaks during a rolling upgrade then they should upgrade all remaining nodes to continue. But we do not enforce it.
- The "used for all node-to-node communications" property is false in a disordered upgrade (an upgrade to a semantically-newer but chronologically-older version) because for each connection between such nodes we will use TransportVersion.bestKnownVersion() to pick a transport version which is known by both endpoints. We tell users not to do disordered upgrades too, but do not enforce it.
Note also that node-to-node communications which are not intra-cluster (i.e. they are not between nodes in the same cluster) may sometimes use an earlier TransportVersion than this value. This includes remote-cluster communication, and communication with nodes that are just starting up or otherwise are attempting to join this cluster.
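As an illustration of why a value that only increases is convenient for gating behaviour, here is a minimal sketch. It assumes, purely for the example, that the minimum is the smallest version among the nodes in the cluster; SketchVersion is a hypothetical stand-in for TransportVersion, and the numeric ids are made up.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

// Hypothetical model: the cluster-wide minimum is taken over per-node versions.
class MinTransportVersionSketch {
    // Stand-in for a transport version; a larger id means a newer version.
    record SketchVersion(int id) {
        boolean onOrAfter(SketchVersion other) {
            return id >= other.id();
        }
    }

    // Assumption for this sketch: the minimum is the oldest version present.
    static SketchVersion minVersion(Collection<SketchVersion> nodeVersions) {
        return nodeVersions.stream()
            .min(Comparator.comparingInt(SketchVersion::id))
            .orElseThrow();
    }

    // A wire-format change introduced in `introducedIn` is safe to use once every
    // node understands it, i.e. once the minimum is on or after that version.
    static boolean canUse(SketchVersion introducedIn, Collection<SketchVersion> nodeVersions) {
        return minVersion(nodeVersions).onOrAfter(introducedIn);
    }

    public static void main(String[] args) {
        SketchVersion oldVersion = new SketchVersion(100);
        SketchVersion newVersion = new SketchVersion(110);
        // Mid-upgrade: one node is still on the old version.
        System.out.println(canUse(newVersion, List.of(oldVersion, newVersion))); // prints false
        // Upgrade finished: all nodes are on the new version.
        System.out.println(canUse(newVersion, List.of(newVersion, newVersion))); // prints true
    }
}
```

Because the minimum does not normally go down again, a check of this shape that passes once can be expected to keep passing, subject to the exceptions listed above.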
-
getMinSystemIndexMappingVersions
-
clusterFeatures
-
metadata
-
getMetadata
-
coordinationMetadata
-
routingTable
-
getRoutingTable
-
blocks
-
getBlocks
-
customs
-
getCustoms
-
custom
-
custom
-
getClusterName
-
getLastAcceptedConfiguration
-
getLastCommittedConfiguration
-
getVotingConfigExclusions
-
getRoutingNodes
Returns a built (on demand) routing nodes view of the routing table. -
mutableRoutingNodes
Returns a fresh mutable copy of the routing nodes view. -
initializeAsync
Initializes, in the background on the given executor, the data structures that are lazily computed for this instance.
- Parameters:
executor - executor to run initialization tasks on
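The combination of an on-demand derived view and optional background initialization can be sketched as below. This is a generic lazy-initialization pattern under assumptions, not the actual ClusterState code: the ShardRouting record, the grouping logic, and the builds counter are hypothetical and exist only to make the behaviour observable.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

// Hypothetical model of a derived view that is built lazily and can be pre-built in the background.
class LazyDerivedViewSketch {
    // Made-up, simplified shard assignment entry.
    record ShardRouting(String index, int shard, String nodeId) {}

    private final List<ShardRouting> routingTable;              // the immutable source data
    private volatile Map<String, List<ShardRouting>> routingNodes; // derived view, built lazily
    final AtomicInteger builds = new AtomicInteger();           // counts builds, for demonstration

    LazyDerivedViewSketch(List<ShardRouting> routingTable) {
        this.routingTable = List.copyOf(routingTable);
    }

    // Builds the per-node view on first use and caches it (double-checked locking).
    Map<String, List<ShardRouting>> getRoutingNodes() {
        Map<String, List<ShardRouting>> result = routingNodes;
        if (result == null) {
            synchronized (this) {
                result = routingNodes;
                if (result == null) {
                    builds.incrementAndGet();
                    result = routingTable.stream()
                        .collect(Collectors.groupingBy(ShardRouting::nodeId, TreeMap::new, Collectors.toList()));
                    routingNodes = result;
                }
            }
        }
        return result;
    }

    // Triggers the expensive build on the given executor so later callers find it ready.
    void initializeAsync(Executor executor) {
        if (routingNodes == null) {
            executor.execute(this::getRoutingNodes);
        }
    }

    public static void main(String[] args) {
        LazyDerivedViewSketch state = new LazyDerivedViewSketch(List.of(
            new ShardRouting("logs", 0, "node-a"),
            new ShardRouting("logs", 1, "node-b"),
            new ShardRouting("metrics", 0, "node-a")));
        state.initializeAsync(Runnable::run); // a same-thread executor, for the demo only
        System.out.println(state.getRoutingNodes().keySet()); // prints [node-a, node-b]
        System.out.println(state.builds.get());               // prints 1
    }
}
```

Since the view is a pure function of immutable data, it does not matter which thread builds it, only that it is built at most once.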
-
toString
-
supersedes
A cluster state supersedes another state if they are from the same master and the version of this state is higher than that of the other state. In essence, that means that all the changes from the other cluster state are also reflected by the current one.
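The rule can be written down as a small sketch. StateId and its masterNodeId field are hypothetical names for this example; treating a state with no known master as never superseding anything is an extra assumption of the sketch, not something stated above.

```java
// Hypothetical model of the "supersedes" relation between two cluster states.
class SupersedesSketch {
    // Only the two properties the rule depends on: who published the state, and its version.
    record StateId(String masterNodeId, long version) {
        boolean supersedes(StateId other) {
            return masterNodeId != null                      // assumption: unknown master never supersedes
                && masterNodeId.equals(other.masterNodeId()) // same master
                && version > other.version();                // strictly higher version
        }
    }

    public static void main(String[] args) {
        StateId older = new StateId("master-1", 7);
        StateId newer = new StateId("master-1", 8);
        StateId otherMaster = new StateId("master-2", 9);
        System.out.println(newer.supersedes(older));       // prints true
        System.out.println(older.supersedes(newer));       // prints false
        System.out.println(otherMaster.supersedes(older)); // prints false: different master
    }
}
```

Note that the relation is deliberately not defined across masters: a higher version from a different master says nothing about whether the other state's changes are included.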
-
toXContentChunked
Description copied from interface: ChunkedToXContent
Create an iterator of ToXContent chunks for a REST response. Each chunk is serialized with the same XContentBuilder and ToXContent.Params, which is also the same as the ToXContent.Params passed as the params argument. For best results, all chunks should be O(1) size. The last chunk in the iterator must always yield at least one byte of output. See also ChunkedToXContentHelper for some handy utilities.
Note that chunked response bodies cannot send deprecation warning headers once transmission has started, so implementations must check for deprecated feature use before returning.
- Specified by:
toXContentChunked in interface ChunkedToXContent
- Returns:
- iterator over chunks of ToXContent
-
builder
-
builder
-
copyAndUpdate
-
copyAndUpdateMetadata
-
diff
Description copied from interface: Diffable
Returns a serializable object representing the differences between this and previousState.
- Specified by:
diff in interface Diffable<ClusterState>
-
readDiffFrom
public static Diff<ClusterState> readDiffFrom(StreamInput in, DiscoveryNode localNode) throws IOException
- Throws:
IOException
-
readFrom
- Throws:
IOException
-
writeTo
Description copied from interface: Writeable
Write this into the StreamOutput.
- Specified by:
writeTo in interface Writeable
- Throws:
IOException
-