Class AsyncTwoPhaseIndexer<JobPosition,JobStats extends IndexerJobStats>

java.lang.Object
org.elasticsearch.xpack.core.indexing.AsyncTwoPhaseIndexer<JobPosition,JobStats>
Type Parameters:
JobPosition - Type that defines a job position to be defined by the implementation.

public abstract class AsyncTwoPhaseIndexer<JobPosition,JobStats extends IndexerJobStats> extends Object
An abstract class that builds an index incrementally. A background job can be launched using maybeTriggerAsyncJob(long), it will create the index from the source index up to the last complete bucket that is allowed to be built (based on job position). Only one background job can run simultaneously and onFinish(org.elasticsearch.action.ActionListener<java.lang.Void>) is called when the job finishes. onStop() is called after the current search returns when the job is stopped early via a call to stop(). onFailure(Exception) is called if the job fails with an exception and onAbort() is called if the indexer is aborted while a job is running. The indexer must be started (start() to allow a background job to run when maybeTriggerAsyncJob(long) is called. stop() can be used to stop the background job without aborting the indexer. In a nutshell this is a 2 cycle engine: 1st it sends a query, 2nd it indexes documents based on the response, sends the next query, indexes, queries, indexes, ... until a condition lets the engine pause until the source provides new input.
  • Constructor Details

  • Method Details

    • getState

      public IndexerState getState()
      Get the current state of the indexer.
    • getPosition

      public JobPosition getPosition()
      Get the current position of the indexer.
    • getStats

      public JobStats getStats()
      Get the stats of this indexer.
    • start

      public IndexerState start()
      Sets the internal state to IndexerState.STARTED if the previous state was IndexerState.STOPPED. Setting the state to STARTED allows a job to run in the background when maybeTriggerAsyncJob(long) is called.
      Returns:
      The new state for the indexer (STARTED, INDEXING or ABORTING if the job was already aborted).
    • stop

      public IndexerState stop()
      Sets the internal state to IndexerState.STOPPING if an async job is running in the background, onStop() will be called when the background job detects that the indexer is stopped. If there is no job running when this function is called the returned state is IndexerState.STOPPED and onStop() will not be called.
      Returns:
      The new state for the indexer (STOPPED, STOPPING or ABORTING if the job was already aborted).
    • abort

      public boolean abort()
      Sets the internal state to IndexerState.ABORTING. It returns false if an async job is running in the background and in such case onAbort() will be called as soon as the background job detects that the indexer is aborted. If there is no job running when this function is called, it returns true and onAbort() will never be called.
      Returns:
      true if the indexer is aborted, false if a background job is running and abort is delayed.
    • maybeTriggerAsyncJob

      public boolean maybeTriggerAsyncJob(long now)
      Triggers a background job that builds the index asynchronously iff there is no other job that runs and the indexer is started (IndexerState.STARTED.
      Parameters:
      now - The current time in milliseconds (used to limit the job to complete buckets)
      Returns:
      true if a job has been triggered, false otherwise
    • triggerSaveState

      protected boolean triggerSaveState()
      Checks if the state should be persisted, if true doSaveState is called before continuing. Inherited classes can override this, to provide a better logic, when state should be saved.
      Returns:
      true if state should be saved, false if not.
    • rethrottle

      protected void rethrottle()
      Re-schedules the search request if necessary, this method can be called to apply a change in maximumRequestsPerSecond immediately
    • runSearchImmediately

      protected void runSearchImmediately()
      Re-schedules the current search request to run immediately, iff one is scheduled. Call this if you need the indexer to fast forward a scheduled(in case it's throttled) search once in order to complete a full cycle.
    • getTimeNanos

      protected long getTimeNanos()
    • getScheduledNextSearch

      protected org.elasticsearch.xpack.core.indexing.AsyncTwoPhaseIndexer.ScheduledRunnable getScheduledNextSearch()
    • getMaxDocsPerSecond

      protected float getMaxDocsPerSecond()
      Called to get max docs per second. To be overwritten if throttling is implemented, the default -1 turns off throttling.
      Returns:
      a float with max docs per second, -1 if throttling is off
    • getJobId

      protected abstract String getJobId()
      Called to get the Id of the job, used for logging.
      Returns:
      a string with the id of the job
    • doProcess

      protected abstract IterationResult<JobPosition> doProcess(SearchResponse searchResponse)
      Called to process a response from the 1 search request in order to turn it into a IterationResult.
      Parameters:
      searchResponse - response from the search phase.
      Returns:
      Iteration object to be passed to indexing phase.
    • onStart

      protected abstract void onStart(long now, ActionListener<Boolean> listener)
      Called at startup after job has been triggered using maybeTriggerAsyncJob(long) and the internal state is IndexerState.STARTED.
      Parameters:
      now - The current time in milliseconds passed through from maybeTriggerAsyncJob(long)
      listener - listener to call after done. The argument passed to the listener indicates if the indexer should continue or not true: continue execution as normal false: cease execution. This does NOT call onFinish
    • doNextSearch

      protected abstract void doNextSearch(long waitTimeInNanos, ActionListener<SearchResponse> nextPhase)
      Executes the next search and calls nextPhase with the response or the exception if an error occurs. In case the indexer is throttled waitTimeInNanos can be used as hint for doing a less resource hungry search.
      Parameters:
      waitTimeInNanos - Duration in nanoseconds the indexer has waited due to throttling
      nextPhase - Listener for the next phase
    • doNextBulk

      protected abstract void doNextBulk(BulkRequest request, ActionListener<BulkResponse> nextPhase)
      Executes the BulkRequest and calls nextPhase with the response or the exception if an error occurs.
      Parameters:
      request - The bulk request to execute
      nextPhase - Listener for the next phase
    • doSaveState

      protected abstract void doSaveState(IndexerState state, JobPosition position, Runnable next)
      Called periodically during the execution of a background job. Implementation should persists the state somewhere and continue the execution asynchronously using next.
      Parameters:
      state - The current state of the indexer
      position - The current position of the indexer
      next - Runnable for the next phase
    • onFailure

      protected abstract void onFailure(Exception exc)
      Called when a failure occurs in an async job causing the execution to stop. This is called before the internal state changes from the state in which the failure occurred.
      Parameters:
      exc - The exception
    • onFinish

      protected abstract void onFinish(ActionListener<Void> listener)
      Called when a background job finishes before the internal state changes from IndexerState.INDEXING back to IndexerState.STARTED.
      Parameters:
      listener - listener to call after done
    • afterFinishOrFailure

      protected void afterFinishOrFailure()
      Called after onFinish or after onFailure and all the following steps - in particular state persistence - are completed. This will be called before the internal state changes from IndexerState.INDEXING to IndexerState.STARTED or from IndexerState.STOPPING to IndexerState.STOPPED.
    • onStop

      protected void onStop()
      Called when the indexer is stopped. This is only called when the indexer is stopped via stop() as opposed to onFinish(ActionListener) which is called when the indexer's work is done.
    • onAbort

      protected abstract void onAbort()
      Called when a background job detects that the indexer is aborted causing the async execution to stop.
    • nextSearch

      protected void nextSearch()