Class BlobStoreRepository

java.lang.Object
org.elasticsearch.common.component.AbstractLifecycleComponent
org.elasticsearch.repositories.blobstore.BlobStoreRepository
All Implemented Interfaces:
Closeable, AutoCloseable, LifecycleComponent, Releasable, Repository
Direct Known Subclasses:
FsRepository, MeteredBlobStoreRepository

public abstract class BlobStoreRepository extends AbstractLifecycleComponent implements Repository
BlobStore - based implementation of Snapshot Repository

This repository works with any BlobStore implementation. The blobStore could be (and is preferably) lazily initialized in createBlobStore().

For in depth documentation on how exactly implementations of this class interact with the snapshot functionality please refer to the documentation of the package org.elasticsearch.repositories.blobstore.
  • Field Details

  • Constructor Details

  • Method Details

    • getRepositoryDataBlobName

      public static String getRepositoryDataBlobName(long repositoryGeneration)
      Parameters:
      repositoryGeneration - The numeric generation of the RepositoryData blob.
      Returns:
      The name of the blob holding the corresponding RepositoryData.
    • doStart

      protected void doStart()
      Description copied from class: AbstractLifecycleComponent
      Start this component. Typically that means doing things like launching background processes and registering listeners on other components. Other components have been initialized by this point, but may not yet be started.

      If this method throws an exception then the startup process will fail, but this component will not be stopped before it is closed.

      This method is called while synchronized on AbstractLifecycleComponent.lifecycle. It is only called once in the lifetime of a component, although it may not be called at all if the startup process encountered some kind of fatal error, such as the failure of some other component to initialize or start.

      Specified by:
      doStart in class AbstractLifecycleComponent
    • doStop

      protected void doStop()
      Description copied from class: AbstractLifecycleComponent
      Stop this component. Typically that means doing the reverse of whatever AbstractLifecycleComponent.doStart() does.

      This method is called while synchronized on AbstractLifecycleComponent.lifecycle. It is only called once in the lifetime of a component, after calling AbstractLifecycleComponent.doStart(), although it will not be called at all if this component did not successfully start.

      Specified by:
      doStop in class AbstractLifecycleComponent
    • doClose

      protected void doClose()
      Description copied from class: AbstractLifecycleComponent
      Close this component. Typically that means doing the reverse of whatever happened during initialization, such as releasing resources acquired there.

      This method is called while synchronized on AbstractLifecycleComponent.lifecycle. It is called once in the lifetime of a component. If the component was started then it will be stopped before it is closed, and once it is closed it will not be started or stopped.

      Specified by:
      doClose in class AbstractLifecycleComponent
    • awaitIdle

      public void awaitIdle()
      Description copied from interface: Repository
      Block until all in-flight operations for this repository have completed. Must only be called after this instance has been closed by a call to stop Releasable.close(). Waiting for ongoing operations should be implemented here instead of in LifecycleComponent.stop() or Releasable.close() hooks of this interface as these are expected to be called on the cluster state applier thread (which must not block) if a repository is removed from the cluster. This method is intended to be called on node shutdown instead as a means to ensure no repository operations are leaked.
      Specified by:
      awaitIdle in interface Repository
    • cloneShardSnapshot

      public void cloneShardSnapshot(SnapshotId source, SnapshotId target, RepositoryShardId shardId, @Nullable ShardGeneration shardGeneration, ActionListener<ShardSnapshotResult> listener)
      Description copied from interface: Repository
      Clones a shard snapshot.
      Specified by:
      cloneShardSnapshot in interface Repository
      Parameters:
      source - source snapshot
      target - target snapshot
      shardId - shard id
      shardGeneration - shard generation in repo
      listener - listener to complete with new shard generation once clone has completed
    • canUpdateInPlace

      public boolean canUpdateInPlace(Settings updatedSettings, Set<String> ignoredSettings)
      Description copied from interface: Repository
      Check if this instances Settings can be changed to the provided updated settings without recreating the repository.
      Specified by:
      canUpdateInPlace in interface Repository
      Parameters:
      updatedSettings - new repository settings
      ignoredSettings - setting names to ignore even if changed
      Returns:
      true if the repository can be updated in place
    • updateState

      public void updateState(ClusterState state)
      Description copied from interface: Repository
      Update the repository with the incoming cluster state. This method is invoked from RepositoriesService.applyClusterState(org.elasticsearch.cluster.ClusterChangedEvent) and thus the same semantics as with ClusterStateApplier.applyClusterState(org.elasticsearch.cluster.ClusterChangedEvent) apply for the ClusterState that is passed here.
      Specified by:
      updateState in interface Repository
      Parameters:
      state - new cluster state
    • threadPool

      public ThreadPool threadPool()
    • getBlobStore

      protected BlobStore getBlobStore()
    • blobContainer

      protected BlobContainer blobContainer()
      maintains single lazy instance of BlobContainer
    • blobStore

      public BlobStore blobStore()
      Maintains single lazy instance of BlobStore. Public for testing.
    • createBlobStore

      protected abstract BlobStore createBlobStore() throws Exception
      Creates new BlobStore to read and write data.
      Throws:
      Exception
    • basePath

      public BlobPath basePath()
      Returns base path of the repository Public for testing.
    • isCompress

      protected final boolean isCompress()
      Returns true if metadata and snapshot files should be compressed
      Returns:
      true if compression is needed
    • chunkSize

      protected ByteSizeValue chunkSize()
      Returns data file chunk size.

      This method should return null if no chunking is needed.

      Returns:
      chunk size
    • getMetadata

      public RepositoryMetadata getMetadata()
      Description copied from interface: Repository
      Returns metadata about this repository.
      Specified by:
      getMetadata in interface Repository
    • stats

      public RepositoryStats stats()
      Description copied from interface: Repository
      Returns stats on the repository usage
      Specified by:
      stats in interface Repository
    • wrapWithWeakConsistencyProtection

      protected ActionListener<RepositoryData> wrapWithWeakConsistencyProtection(ActionListener<RepositoryData> listener)
      Some repositories (i.e. S3) run at extra risk of corruption when using the pre-7.6.0 repository format, against which we try and protect by adding some delays in between operations so that things have a chance to settle down. This method is the hook that allows the delete process to add this protection when necessary.
    • deleteSnapshots

      public void deleteSnapshots(Collection<SnapshotId> snapshotIds, long repositoryDataGeneration, IndexVersion minimumNodeVersion, ActionListener<RepositoryData> repositoryDataUpdateListener, Runnable onCompletion)
      Description copied from interface: Repository
      Deletes snapshots
      Specified by:
      deleteSnapshots in interface Repository
      Parameters:
      snapshotIds - snapshot ids to delete
      repositoryDataGeneration - the generation of the RepositoryData in the repository at the start of the deletion
      minimumNodeVersion - the minimum IndexVersion across the nodes in the cluster, with which the repository format must remain compatible
      repositoryDataUpdateListener - listener completed when the RepositoryData is updated, or when the process fails without changing the repository contents - in either case, it is now safe for the next operation on this repository to proceed.
      onCompletion - action executed on completion of the cleanup actions that follow a successful RepositoryData update; not called if repositoryDataUpdateListener completes exceptionally.
    • cleanup

      public void cleanup(long repositoryDataGeneration, IndexVersion repositoryFormatIndexVersion, ActionListener<DeleteResult> listener)
      Runs cleanup actions on the repository. Increments the repository state id by one before executing any modifications on the repository. TODO: Add shard level cleanups TODO: Add unreferenced index metadata cleanup
      • Deleting stale indices
      • Deleting unreferenced root level blobs
      Parameters:
      repositoryDataGeneration - Generation of RepositoryData at start of process
      repositoryFormatIndexVersion - Repository format version
      listener - Listener to complete when done
    • finalizeSnapshot

      public void finalizeSnapshot(FinalizeSnapshotContext finalizeSnapshotContext)
      Description copied from interface: Repository
      Finalizes snapshotting process

      This method is called on master after all shards are snapshotted.

      Specified by:
      finalizeSnapshot in interface Repository
      Parameters:
      finalizeSnapshotContext - finalization context
    • getSnapshotInfo

      public void getSnapshotInfo(Collection<SnapshotId> snapshotIds, boolean abortOnFailure, BooleanSupplier isCancelled, CheckedConsumer<SnapshotInfo,Exception> consumer, ActionListener<Void> listener)
      Description copied from interface: Repository
      Reads a collection of SnapshotInfo instances from the repository.
      Specified by:
      getSnapshotInfo in interface Repository
      Parameters:
      snapshotIds - The IDs of the snapshots whose SnapshotInfo instances should be retrieved.
      abortOnFailure - Whether to stop fetching further SnapshotInfo instances if a single fetch fails.
      isCancelled - Supplies whether the enclosing task is cancelled, which should stop fetching SnapshotInfo instances.
      consumer - A consumer for each SnapshotInfo retrieved. Called concurrently from multiple threads. If the consumer throws an exception and abortOnFailure is true then the fetching will stop.
      listener - If abortOnFailure is true and any operation fails then the failure is passed to this listener. Also completed exceptionally on cancellation. Otherwise, completed once all requested SnapshotInfo instances have been processed by the consumer.
    • getSnapshotGlobalMetadata

      public Metadata getSnapshotGlobalMetadata(SnapshotId snapshotId)
      Description copied from interface: Repository
      Returns global metadata associated with the snapshot.
      Specified by:
      getSnapshotGlobalMetadata in interface Repository
      Parameters:
      snapshotId - the snapshot id to load the global metadata from
      Returns:
      the global metadata about the snapshot
    • getSnapshotIndexMetaData

      public IndexMetadata getSnapshotIndexMetaData(RepositoryData repositoryData, SnapshotId snapshotId, IndexId index) throws IOException
      Description copied from interface: Repository
      Returns the index metadata associated with the snapshot.
      Specified by:
      getSnapshotIndexMetaData in interface Repository
      Parameters:
      repositoryData - current RepositoryData
      snapshotId - the snapshot id to load the index metadata from
      index - the IndexId to load the metadata from
      Returns:
      the index metadata about the given index for the given snapshot
      Throws:
      IOException
    • shardContainer

      public BlobContainer shardContainer(IndexId indexId, int shardId)
    • getSnapshotThrottleTimeInNanos

      public long getSnapshotThrottleTimeInNanos()
      Description copied from interface: Repository
      Returns snapshot throttle time in nanoseconds
      Specified by:
      getSnapshotThrottleTimeInNanos in interface Repository
    • getRestoreThrottleTimeInNanos

      public long getRestoreThrottleTimeInNanos()
      Description copied from interface: Repository
      Returns restore throttle time in nanoseconds
      Specified by:
      getRestoreThrottleTimeInNanos in interface Repository
    • startVerification

      public String startVerification()
      Description copied from interface: Repository
      Verifies repository on the master node and returns the verification token.

      If the verification token is not null, it's passed to all data nodes for verification. If it's null - no additional verification is required

      Specified by:
      startVerification in interface Repository
      Returns:
      verification token that should be passed to all Index Shard Repositories for additional verification or null
    • endVerification

      public void endVerification(String seed)
      Description copied from interface: Repository
      Called at the end of repository verification process.

      This method should perform all necessary cleanup of the temporary files created in the repository

      Specified by:
      endVerification in interface Repository
      Parameters:
      seed - verification request generated by Repository.startVerification() command
    • getRepositoryData

      public void getRepositoryData(Executor responseExecutor, ActionListener<RepositoryData> listener)
      Description copied from interface: Repository
      Fetches the RepositoryData and passes it into the listener. May completes the listener with a RepositoryException if there is an error in reading the repository data.
      Specified by:
      getRepositoryData in interface Repository
      Parameters:
      responseExecutor - Executor to use to complete the listener if not using the calling thread. Using EsExecutors.DIRECT_EXECUTOR_SERVICE means to complete the listener on the thread which ultimately resolved the RepositoryData, which might be a low-latency transport or cluster applier thread so make sure not to do anything slow or expensive in that case.
      listener - Listener which is either completed on the calling thread (if the RepositoryData is immediately available, e.g. from an in-memory cache), otherwise it is completed using responseExecutor.
    • isReadOnly

      public boolean isReadOnly()
      Description copied from interface: Repository
      Returns true if the repository supports only read operations
      Specified by:
      isReadOnly in interface Repository
      Returns:
      true if the repository is read/only
    • writeIndexGen

      protected void writeIndexGen(RepositoryData repositoryData, long expectedGen, IndexVersion version, Function<ClusterState,ClusterState> stateFilter, ActionListener<RepositoryData> listener)
      Writing a new index generation (root) blob is a three-step process. Typically, it starts from a stable state where the pending generation RepositoryMetadata.pendingGeneration() is equal to the safe generation RepositoryMetadata.generation(), but after a failure it may be that the pending generation starts out greater than the safe generation.
      1. We reserve ourselves a new root blob generation G, greater than RepositoryMetadata.pendingGeneration(), via a cluster state update which edits the RepositoryMetadata entry for this repository, increasing its pending generation to G without changing its safe generation.
      2. We write the updated RepositoryData to a new root blob with generation G.
      3. We mark the successful end of the update of the repository data with a cluster state update which edits the RepositoryMetadata entry for this repository again, increasing its safe generation to equal to its pending generation G.
      We use this process to protect against problems such as a master failover part-way through. If a new master is elected while we're writing the root blob with generation G then we will fail to update the safe repository generation in the final step, and meanwhile the new master will choose a generation greater than G for all subsequent root blobs so there is no risk that we will clobber its writes. See the package level documentation for org.elasticsearch.repositories.blobstore for more details.

      Note that a failure here does not imply that the process was unsuccessful or the repository is unchanged. Once we have written the new root blob the repository is updated from the point of view of any other clusters reading from it, and if we performed a full cluster restart at that point then we would also pick up the new root blob. Writing the root blob may succeed without us receiving a successful response from the repository, leading us to report that the write operation failed. Updating the safe generation may likewise succeed on a majority of master-eligible nodes which does not include this one, again leading to an apparent failure.

      We therefore cannot safely clean up apparently-dangling blobs after a failure here. Instead, we defer any cleanup until after the next successful root-blob write, which may happen on a different master node or possibly even in a different cluster.

      Parameters:
      repositoryData - RepositoryData to write
      expectedGen - expected repository generation at the start of the operation
      version - version of the repository metadata to write
      stateFilter - filter for the last cluster state update executed by this method
      listener - completion listener
    • snapshotShard

      public void snapshotShard(SnapshotShardContext context)
      Description copied from interface: Repository
      Creates a snapshot of the shard referenced by the given SnapshotShardContext.

      As snapshot process progresses, implementation of this method should update IndexShardSnapshotStatus object returned by SnapshotShardContext.status() and call IndexShardSnapshotStatus.ensureNotAborted() to see if the snapshot process should be aborted.

      Specified by:
      snapshotShard in interface Repository
      Parameters:
      context - snapshot shard context that must be completed via SnapshotShardContext.onResponse(org.elasticsearch.repositories.ShardSnapshotResult) or DelegatingActionListener.onFailure(java.lang.Exception)
    • snapshotFiles

      protected void snapshotFiles(SnapshotShardContext context, BlockingQueue<BlobStoreIndexShardSnapshot.FileInfo> filesToSnapshot, ActionListener<Collection<Void>> allFilesUploadedListener)
    • restoreShard

      public void restoreShard(Store store, SnapshotId snapshotId, IndexId indexId, ShardId snapshotShardId, RecoveryState recoveryState, ActionListener<Void> listener)
      Description copied from interface: Repository
      Restores snapshot of the shard.

      The index can be renamed on restore, hence different shardId and snapshotShardId are supplied.

      Specified by:
      restoreShard in interface Repository
      Parameters:
      store - the store to restore the index into
      snapshotId - snapshot id
      indexId - id of the index in the repository from which the restore is occurring
      snapshotShardId - shard id (in the snapshot)
      recoveryState - recovery state
      listener - listener to invoke once done
    • maybeRateLimitRestores

      public InputStream maybeRateLimitRestores(InputStream stream)
      Wrap the restore rate limiter (controlled by the repository setting `max_restore_bytes_per_sec` and the cluster setting `indices.recovery.max_bytes_per_sec`) around the given stream. Any throttling is reported to the given listener and not otherwise recorded in the value returned by getRestoreThrottleTimeInNanos().
    • maybeRateLimitRestores

      public InputStream maybeRateLimitRestores(InputStream stream, RateLimitingInputStream.Listener throttleListener)
      Wrap the restore rate limiter (controlled by the repository setting `max_restore_bytes_per_sec` and the cluster setting `indices.recovery.max_bytes_per_sec`) around the given stream. Any throttling is recorded in the value returned by getRestoreThrottleTimeInNanos().
    • maybeRateLimitSnapshots

      public InputStream maybeRateLimitSnapshots(InputStream stream)
      Wrap the snapshot rate limiter around the given stream. Any throttling is recorded in the value returned by getSnapshotThrottleTimeInNanos(). Note that speed is throttled by the repository setting `max_snapshot_bytes_per_sec` and, if recovery node bandwidth settings have been set, additionally by the `indices.recovery.max_bytes_per_sec` speed.
    • maybeRateLimitSnapshots

      public InputStream maybeRateLimitSnapshots(InputStream stream, RateLimitingInputStream.Listener throttleListener)
      Wrap the snapshot rate limiter around the given stream. Any throttling is recorded in the value returned by getSnapshotThrottleTimeInNanos(). Note that speed is throttled by the repository setting `max_snapshot_bytes_per_sec` and, if recovery node bandwidth settings have been set, additionally by the `indices.recovery.max_bytes_per_sec` speed.
    • getShardSnapshotStatus

      public IndexShardSnapshotStatus.Copy getShardSnapshotStatus(SnapshotId snapshotId, IndexId indexId, ShardId shardId)
      Description copied from interface: Repository
      Retrieve shard snapshot status for the stored snapshot
      Specified by:
      getShardSnapshotStatus in interface Repository
      Parameters:
      snapshotId - snapshot id
      indexId - the snapshotted index id for the shard to get status for
      shardId - shard id
      Returns:
      snapshot status
    • verify

      public void verify(String seed, DiscoveryNode localNode)
      Description copied from interface: Repository
      Verifies repository settings on data node.
      Specified by:
      verify in interface Repository
      Parameters:
      seed - value returned by Repository.startVerification()
      localNode - the local node information, for inclusion in verification errors
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • loadShardSnapshot

      public BlobStoreIndexShardSnapshot loadShardSnapshot(BlobContainer shardContainer, SnapshotId snapshotId)
      Loads information about shard snapshot
    • getBlobStoreIndexShardSnapshots

      public BlobStoreIndexShardSnapshots getBlobStoreIndexShardSnapshots(IndexId indexId, int shardId, @Nullable ShardGeneration shardGen) throws IOException
      Loads all available snapshots in the repository using the given generation for a shard. When shardGen is null it tries to load it using the BwC mode, listing the available index- blobs in the shard container.
      Throws:
      IOException
    • snapshotFile

      protected void snapshotFile(SnapshotShardContext context, BlobStoreIndexShardSnapshot.FileInfo fileInfo) throws IOException
      Snapshot individual file
      Parameters:
      fileInfo - file to snapshot
      Throws:
      IOException
    • supportURLRepo

      public boolean supportURLRepo()
    • hasAtomicOverwrites

      public boolean hasAtomicOverwrites()
      Returns:
      whether this repository performs overwrites atomically. In practice we only overwrite the `index.latest` blob so this is not very important, but the repository analyzer does test that overwrites happen atomically. It will skip those tests if the repository overrides this method to indicate that it does not support atomic overwrites.
    • getReadBufferSizeInBytes

      public int getReadBufferSizeInBytes()
    • getAnalysisFailureExtraDetail

      public String getAnalysisFailureExtraDetail()
      Returns:
      extra information to be included in the exception message emitted on failure of a repository analysis.
    • getUsageFeatures

      public final Set<String> getUsageFeatures()
      Specified by:
      getUsageFeatures in interface Repository
      Returns:
      a set of the names of the features that this repository instance uses, for reporting in the cluster stats for telemetry collection.
    • getExtraUsageFeatures

      protected Set<String> getExtraUsageFeatures()
      All blob-store repositories include the counts of read-only and read-write repositories in their telemetry. This method returns other features of the repositories in use.
      Returns:
      a set of the names of the extra features that this repository instance uses, for reporting in the cluster stats for telemetry collection.