Class InferenceOperator
java.lang.Object
org.elasticsearch.compute.operator.AsyncOperator<InferenceOperator.OngoingInferenceResult>
org.elasticsearch.xpack.esql.inference.InferenceOperator
- All Implemented Interfaces:
Closeable, AutoCloseable, Operator, org.elasticsearch.core.Releasable
- Direct Known Subclasses:
CompletionOperator,RerankOperator
public abstract class InferenceOperator
extends AsyncOperator<InferenceOperator.OngoingInferenceResult>
An abstract asynchronous operator that performs throttled bulk inference execution using an
InferenceResolver.
The InferenceOperator integrates with the compute framework and supports throttled bulk execution of inference requests. It
transforms each input Page into inference requests, executes them asynchronously, and converts the responses into a new Page.
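The three stages described above (page to requests, asynchronous execution, responses to a new page) can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the operator's implementation: SimplePage, toRequests, execute, and process are hypothetical stand-ins, and the "inference service" simply upper-cases its input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-ins only; none of these are Elasticsearch types.
class InferenceFlowSketch {
    // Stand-in for a Page: one text value per row.
    record SimplePage(List<String> rows) {}

    // Stage 1: turn every row of the input page into one request.
    static List<String> toRequests(SimplePage input) {
        List<String> requests = new ArrayList<>();
        for (String row : input.rows()) {
            requests.add("infer:" + row);
        }
        return requests;
    }

    // Stage 2: run one request asynchronously. The fake service upper-cases the payload.
    static CompletableFuture<String> execute(String request) {
        return CompletableFuture.supplyAsync(
            () -> request.substring("infer:".length()).toUpperCase(Locale.ROOT)
        );
    }

    // Stage 3: once every response has arrived, build the output page in input order.
    static CompletableFuture<SimplePage> process(SimplePage input) {
        List<CompletableFuture<String>> futures = toRequests(input).stream()
            .map(InferenceFlowSketch::execute)
            .toList();
        return CompletableFuture.allOf(futures.toArray(CompletableFuture[]::new))
            .thenApply(ignored -> new SimplePage(
                futures.stream().map(CompletableFuture::join).toList()
            ));
    }

    public static void main(String[] args) {
        SimplePage output = process(new SimplePage(List.of("a", "b"))).join();
        System.out.println(output.rows());
    }
}
```

Collecting the futures in a list before joining keeps the responses aligned with the input rows, regardless of the order in which the individual requests complete.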
-
Nested Class Summary
Nested Classes
Modifier and Type / Class / Description
static final record  InferenceOperator.OngoingInferenceResult
    Represents the result of an ongoing inference operation, including the original input page and the list of inference responses.
static interface  InferenceOperator.OutputBuilder
    An interface for accumulating inference responses and constructing a result Page.
Nested classes/interfaces inherited from class org.elasticsearch.compute.operator.AsyncOperator
AsyncOperator.Status
Nested classes/interfaces inherited from interface org.elasticsearch.compute.operator.Operator
Operator.OperatorFactory
-
Field Summary
Fields inherited from interface org.elasticsearch.compute.operator.Operator
MIN_TARGET_PAGE_SIZE, NOT_BLOCKED, TARGET_PAGE_SIZE
-
Constructor Summary
Constructors
Constructor / Description
InferenceOperator(DriverContext driverContext, BulkInferenceRunner bulkInferenceRunner, String inferenceId, int maxOutstandingPages)
    Constructs a new InferenceOperator.
-
Method Summary
Modifier and Type / Method / Description
protected BlockFactory  blockFactory()
    Returns the BlockFactory used to create output data blocks.
Page  getOutput()
    Returns the next available output page constructed from completed inference results.
protected String  inferenceId()
    Returns the inference model ID used for this operator.
protected abstract InferenceOperator.OutputBuilder  outputBuilder(Page input)
    Creates a new InferenceOperator.OutputBuilder instance used to build the output page.
protected void  performAsync(Page input, ActionListener<InferenceOperator.OngoingInferenceResult> listener)
    Initiates asynchronous inferences for the given input page.
protected void  releaseFetchedOnAnyThread(InferenceOperator.OngoingInferenceResult ongoingInferenceResult)
    Releases resources associated with an ongoing inference.
protected abstract BulkInferenceRequestIterator  requests(Page input)
    Converts the given input page into a sequence of inference requests.
Methods inherited from class org.elasticsearch.compute.operator.AsyncOperator
addInput, close, doClose, fetchFromBuffer, finish, isBlocked, isFinished, needsInput, releasePageOnAnyThread, status, status
-
Constructor Details
-
InferenceOperator
public InferenceOperator(DriverContext driverContext, BulkInferenceRunner bulkInferenceRunner, String inferenceId, int maxOutstandingPages)
Constructs a new InferenceOperator.
- Parameters:
driverContext - The driver context.
bulkInferenceRunner - The inference runner used to execute inference requests.
inferenceId - The ID of the inference model to use.
maxOutstandingPages - The maximum number of pages processed concurrently.
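To illustrate what a limit such as maxOutstandingPages means in practice, the sketch below caps the number of in-flight pages with a java.util.concurrent.Semaphore. It is a hypothetical illustration: the class and the names tryStartPage and pageCompleted are assumptions, and this page does not describe how AsyncOperator enforces the limit internally.

```java
import java.util.concurrent.Semaphore;

// Hypothetical illustration only; not the AsyncOperator implementation.
class OutstandingPagesSketch {
    private final Semaphore slots;

    OutstandingPagesSketch(int maxOutstandingPages) {
        // One permit per page that may be in flight at the same time.
        this.slots = new Semaphore(maxOutstandingPages);
    }

    // Claims a slot for a new page without blocking; false means the limit is reached.
    boolean tryStartPage() {
        return slots.tryAcquire();
    }

    // Frees the slot once the page's inference results have been handled.
    void pageCompleted() {
        slots.release();
    }

    public static void main(String[] args) {
        OutstandingPagesSketch throttle = new OutstandingPagesSketch(2);
        System.out.println(throttle.tryStartPage()); // true
        System.out.println(throttle.tryStartPage()); // true
        System.out.println(throttle.tryStartPage()); // false: two pages already in flight
        throttle.pageCompleted();
        System.out.println(throttle.tryStartPage()); // true: a slot was freed
    }
}
```

A larger limit allows more overlap between inference calls at the cost of holding more pages in memory at once.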
-
-
Method Details
-
blockFactory
protected BlockFactory blockFactory()
Returns the BlockFactory used to create output data blocks.
-
inferenceId
protected String inferenceId()
Returns the inference model ID used for this operator.
-
performAsync
protected void performAsync(Page input, ActionListener<InferenceOperator.OngoingInferenceResult> listener)
Initiates asynchronous inferences for the given input page.
- Specified by:
performAsync in class AsyncOperator<InferenceOperator.OngoingInferenceResult>
-
releaseFetchedOnAnyThread
protected void releaseFetchedOnAnyThread(InferenceOperator.OngoingInferenceResult ongoingInferenceResult)
Releases resources associated with an ongoing inference.
- Specified by:
releaseFetchedOnAnyThread in class AsyncOperator<InferenceOperator.OngoingInferenceResult>
-
getOutput
public Page getOutput()
Returns the next available output page constructed from completed inference results.
-
requests
protected abstract BulkInferenceRequestIterator requests(Page input)
Converts the given input page into a sequence of inference requests.
- Parameters:
input - The input page to process.
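A subclass supplies the page-to-request conversion. The sketch below shows the general idea with a lazy java.util.Iterator that emits one request per row. SketchRequest and the row representation are hypothetical; the actual BulkInferenceRequestIterator contract and request type are not documented on this page.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only; not the BulkInferenceRequestIterator API.
class RequestIteratorSketch {
    // Stand-in for an inference request: the target model plus one input value.
    record SketchRequest(String inferenceId, String input) {}

    // Lazily produces one request per row, in row order.
    static Iterator<SketchRequest> requests(String inferenceId, List<String> rows) {
        return new Iterator<>() {
            private int position = 0;

            @Override
            public boolean hasNext() {
                return position < rows.size();
            }

            @Override
            public SketchRequest next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return new SketchRequest(inferenceId, rows.get(position++));
            }
        };
    }

    public static void main(String[] args) {
        Iterator<SketchRequest> it = requests("my-model", List.of("first row", "second row"));
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

Producing requests lazily means a request object only exists once the caller is ready to send it, which pairs naturally with throttled execution.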
-
outputBuilder
protected abstract InferenceOperator.OutputBuilder outputBuilder(Page input)
Creates a new InferenceOperator.OutputBuilder instance used to build the output page.
- Parameters:
input - The corresponding input page used to generate the inference requests.
-
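The output builder role (accumulate responses, then construct a result page) can be sketched as follows. Everything here is hypothetical: SketchOutputBuilder, addResponse, and buildOutput are illustrative names rather than the methods of InferenceOperator.OutputBuilder, and the choice to append the responses as an extra column of the input page is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the InferenceOperator.OutputBuilder API.
class OutputBuilderSketch {
    // Stand-in for a Page: a list of columns, each holding one value per row.
    record SimplePage(List<List<String>> columns) {
        int rowCount() {
            return columns.isEmpty() ? 0 : columns.get(0).size();
        }
    }

    // Accumulates responses, then builds the result page.
    interface SketchOutputBuilder {
        void addResponse(String response);

        SimplePage buildOutput();
    }

    // Assumption: the output is the input page plus one column of responses.
    static SketchOutputBuilder outputBuilder(SimplePage input) {
        List<String> responses = new ArrayList<>();
        return new SketchOutputBuilder() {
            @Override
            public void addResponse(String response) {
                responses.add(response);
            }

            @Override
            public SimplePage buildOutput() {
                if (responses.size() != input.rowCount()) {
                    throw new IllegalStateException("expected one response per input row");
                }
                List<List<String>> columns = new ArrayList<>(input.columns());
                columns.add(List.copyOf(responses));
                return new SimplePage(columns);
            }
        };
    }

    public static void main(String[] args) {
        SimplePage input = new SimplePage(List.of(List.of("x", "y")));
        SketchOutputBuilder builder = outputBuilder(input);
        builder.addResponse("1");
        builder.addResponse("2");
        System.out.println(builder.buildOutput().columns());
    }
}
```

Creating the builder from the input page lets it check that the number of responses matches the number of input rows before the result is assembled.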