All Superinterfaces:
AutoCloseable, Closeable
-
Nested Class Summary
Nested Classes
InferenceService.DefaultConfigId
Method Summary
default boolean canStream(TaskType taskType)
    Checks the task type against the set of supported streaming tasks returned by supportedStreamingTasks().
default void checkModelConfig(Model model, ActionListener<Model> listener)
    Optionally test the new model configuration in the inference service.
void chunkedInfer(Model model, String query, List<String> input, Map<String, Object> taskSettings, InputType inputType, TimeValue timeout, ActionListener<List<ChunkedInference>> listener)
    Chunk long text.
default List<InferenceService.DefaultConfigId> defaultConfigIds()
    Get the Ids and task type of any default configurations provided by this service.
default void defaultConfigs(ActionListener<List<Model>> defaultsListener)
    Call the listener with the default model configurations defined by the service.
InferenceServiceConfiguration getConfiguration()
TransportVersion getMinimalSupportedVersion()
    Defines the version required across all clusters to use this service.
default boolean hideFromConfigurationApi()
    Whether this service should be hidden from the API.
void infer(Model model, String query, List<String> input, boolean stream, Map<String, Object> taskSettings, InputType inputType, TimeValue timeout, ActionListener<InferenceServiceResults> listener)
    Perform inference on the model.
default void init
String name()
default void onNodeStarted()
    Called after the Elasticsearch node has completed its start up.
Model parsePersistedConfig(String modelId, TaskType taskType, Map<String, Object> config)
    Parse model configuration from the config map from persisted storage and return the parsed Model.
Model parsePersistedConfigWithSecrets(String modelId, TaskType taskType, Map<String, Object> config, Map<String, Object> secrets)
    Parse model configuration from the config map from persisted storage and return the parsed Model.
void parseRequestConfig(String modelId, TaskType taskType, Map<String, Object> config, ActionListener<Model> parsedModelListener)
    Parse model configuration from the config map from a request and return the parsed Model.
void start(Model model, TimeValue timeout, ActionListener<Boolean> listener)
    Start or prepare the model for use.
default void stop(UnparsedModel unparsedModel, ActionListener<Boolean> listener)
    Stop the model deployment.
supportedStreamingTasks()
    The set of tasks where this service provider supports using the streaming API.
supportedTaskTypes()
    The task types supported by the service.
void unifiedCompletionInfer(Model model, UnifiedCompletionRequest request, TimeValue timeout, ActionListener<InferenceServiceResults> listener)
    Perform completion inference on the model using the unified schema.
default void updateModelsWithDynamicFields(List<Model> model, ActionListener<List<Model>> listener)
default Model updateModelWithChatCompletionDetails(Model model)
    Update a chat completion model's max tokens if required.
default Model updateModelWithEmbeddingDetails(Model model, int embeddingSize)
    Update a text embedding model's dimensions based on a provided embedding size and set the default similarity if required.
-
Method Details
-
init
-
name
String name()
parseRequestConfig
void parseRequestConfig(String modelId, TaskType taskType, Map<String, Object> config, ActionListener<Model> parsedModelListener)
Parse model configuration from the config map from a request and return the parsed Model. This requires that both the secrets and service settings be contained in the service_settings field. This function modifies the config map: fields are removed from the map as they are read. If the map contains an unrecognized configuration option, an ElasticsearchStatusException is thrown.
Parameters:
modelId - Model Id
taskType - The model task type
config - Configuration options including the secrets
parsedModelListener - A listener which will handle the resulting model or failure
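As a rough illustration of this contract, the sketch below shows the shape such an implementation might take: the service_settings field is consumed as it is read and any leftover keys cause the request to be rejected. The class, the buildModel helper, and the error message are hypothetical, and the Elasticsearch types are assumed to live in their usual packages (org.elasticsearch.inference, org.elasticsearch.action, org.elasticsearch.rest).

import java.util.Map;

import org.elasticsearch.ElasticsearchStatusException;
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.inference.Model;
import org.elasticsearch.inference.TaskType;
import org.elasticsearch.rest.RestStatus;

// Hypothetical sketch, not an existing service implementation.
public class ExampleRequestConfigParsing {

    @SuppressWarnings("unchecked")
    public void parseRequestConfig(
        String modelId,
        TaskType taskType,
        Map<String, Object> config,
        ActionListener<Model> parsedModelListener
    ) {
        try {
            // Secrets and service settings both arrive under service_settings; the field is removed as it is read.
            Map<String, Object> serviceSettings = (Map<String, Object>) config.remove("service_settings");

            // Any key still left in the map was not recognized, so the request is rejected.
            if (config.isEmpty() == false) {
                throw new ElasticsearchStatusException(
                    "Configuration contains unknown settings {}",
                    RestStatus.BAD_REQUEST,
                    config.keySet()
                );
            }

            parsedModelListener.onResponse(buildModel(modelId, taskType, serviceSettings));
        } catch (Exception e) {
            parsedModelListener.onFailure(e);
        }
    }

    // Hypothetical helper: the concrete Model subclass and its construction are service specific.
    private Model buildModel(String modelId, TaskType taskType, Map<String, Object> serviceSettings) {
        throw new UnsupportedOperationException("model construction is omitted from this sketch");
    }
}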
-
parsePersistedConfigWithSecrets
Model parsePersistedConfigWithSecrets(String modelId, TaskType taskType, Map<String, Object> config, Map<String, Object> secrets)
Parse model configuration from the config map from persisted storage and return the parsed Model. This requires that secrets and service settings be in two separate maps. This function modifies the config map: fields are removed from the map as they are read. If the map contains unrecognized configuration options, no error is thrown.
Parameters:
modelId - Model Id
taskType - The model task type
config - Configuration options
secrets - Sensitive configuration options (e.g. api key)
Returns:
The parsed Model
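By way of contrast with the request path above, a minimal sketch of the persisted path might look as follows: secrets arrive in their own map and leftover keys are ignored rather than rejected. The class, the field names (service_settings, secret_settings), and the buildModel helper are hypothetical.

import java.util.Map;

import org.elasticsearch.inference.Model;
import org.elasticsearch.inference.TaskType;

// Hypothetical sketch, not an existing service implementation.
public class ExamplePersistedConfigParsing {

    @SuppressWarnings("unchecked")
    public Model parsePersistedConfigWithSecrets(
        String modelId,
        TaskType taskType,
        Map<String, Object> config,
        Map<String, Object> secrets
    ) {
        // Fields are consumed (removed) from both maps as they are read; the field names here are illustrative.
        Map<String, Object> serviceSettings = (Map<String, Object>) config.remove("service_settings");
        Map<String, Object> secretSettings = (Map<String, Object>) secrets.remove("secret_settings");

        // Unlike the request path, anything left in the config map is simply ignored.
        return buildModel(modelId, taskType, serviceSettings, secretSettings);
    }

    // Hypothetical helper: the concrete Model subclass and its construction are service specific.
    private Model buildModel(
        String modelId,
        TaskType taskType,
        Map<String, Object> serviceSettings,
        Map<String, Object> secretSettings
    ) {
        throw new UnsupportedOperationException("model construction is omitted from this sketch");
    }
}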
-
parsePersistedConfig
Model parsePersistedConfig(String modelId, TaskType taskType, Map<String, Object> config)
Parse model configuration from the config map from persisted storage and return the parsed Model. This function modifies the config map: fields are removed from the map as they are read. If the map contains unrecognized configuration options, no error is thrown.
Parameters:
modelId - Model Id
taskType - The model task type
config - Configuration options
Returns:
The parsed Model
-
getConfiguration
InferenceServiceConfiguration getConfiguration()
hideFromConfigurationApi
default boolean hideFromConfigurationApi()
Whether this service should be hidden from the API. Should be used for services that are not ready to be used.
supportedTaskTypes
The task types supported by the service.
Returns:
Set of supported task types.
-
infer
void infer(Model model, @Nullable String query, List<String> input, boolean stream, Map<String, Object> taskSettings, InputType inputType, TimeValue timeout, ActionListener<InferenceServiceResults> listener)
Perform inference on the model.
Parameters:
model - The model
query - Inference query, mainly for re-ranking
input - Inference input
stream - Stream inference results
taskSettings - Settings in the request to override the model's defaults
inputType - For search, ingest etc
timeout - The timeout for the request
listener - Inference result listener
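A hedged call-site sketch for this method is shown below. The service and model variables are assumed to have been resolved elsewhere, and the InputType value, timeout, and package locations are illustrative assumptions rather than requirements of this page.

import java.util.List;
import java.util.Map;

import org.elasticsearch.action.ActionListener;
import org.elasticsearch.core.TimeValue;
import org.elasticsearch.inference.InferenceService;
import org.elasticsearch.inference.InferenceServiceResults;
import org.elasticsearch.inference.InputType;
import org.elasticsearch.inference.Model;

// Illustrative call site only; 'service' and 'model' come from elsewhere.
public class InferCallExample {

    public void runInference(InferenceService service, Model model) {
        service.infer(
            model,
            null,                           // query: only relevant for re-ranking
            List.of("some text to embed"),  // input
            false,                          // stream: ask for a single, non-streamed response
            Map.of(),                       // taskSettings: no per-request overrides
            InputType.INGEST,               // the input originates from an ingest path
            TimeValue.timeValueSeconds(30),
            ActionListener.wrap(
                (InferenceServiceResults results) -> {
                    // consume the inference results
                },
                e -> {
                    // handle the failure
                }
            )
        );
    }
}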
-
unifiedCompletionInfer
void unifiedCompletionInfer(Model model, UnifiedCompletionRequest request, TimeValue timeout, ActionListener<InferenceServiceResults> listener)
Perform completion inference on the model using the unified schema.
Parameters:
model - The model
request - Parameters for the request
timeout - The timeout for the request
listener - Inference result listener
-
chunkedInfer
void chunkedInfer(Model model, @Nullable String query, List<String> input, Map<String, Object> taskSettings, InputType inputType, TimeValue timeout, ActionListener<List<ChunkedInference>> listener)
Chunk long text.
Parameters:
model - The model
query - Inference query, mainly for re-ranking
input - Inference input
taskSettings - Settings in the request to override the model's defaults
inputType - For search, ingest etc
timeout - The timeout for the request
listener - Chunked Inference result listener
-
start
void start(Model model, TimeValue timeout, ActionListener<Boolean> listener)
Start or prepare the model for use.
Parameters:
model - The model
timeout - Start timeout
listener - The listener
-
stop
default void stop(UnparsedModel unparsedModel, ActionListener<Boolean> listener)
Stop the model deployment. The default action does nothing except acknowledge the request (true).
Parameters:
unparsedModel - The unparsed model configuration
listener - The listener
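The documented default behaviour can be sketched as a one-line acknowledgement, as below; the class name is hypothetical and the UnparsedModel package location is assumed.

import org.elasticsearch.action.ActionListener;
import org.elasticsearch.inference.UnparsedModel;

// Sketch of the documented default: acknowledge the stop request without doing any work.
public class StopDefaultSketch {

    public void stop(UnparsedModel unparsedModel, ActionListener<Boolean> listener) {
        listener.onResponse(true); // nothing to tear down for services without a local deployment
    }
}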
-
checkModelConfig
default void checkModelConfig(Model model, ActionListener<Model> listener)
Optionally test the new model configuration in the inference service. This function should be called when the model is first created; the default action is to do nothing.
Parameters:
model - The new model
listener - The listener
-
updateModelWithEmbeddingDetails
default Model updateModelWithEmbeddingDetails(Model model, int embeddingSize)
Update a text embedding model's dimensions based on a provided embedding size and set the default similarity if required. The default behaviour is to just return the model.
Parameters:
model - The original model without updated embedding details
embeddingSize - The embedding size to update the model with
Returns:
The model with updated embedding details
-
updateModelWithChatCompletionDetails
default Model updateModelWithChatCompletionDetails(Model model)
Update a chat completion model's max tokens if required. The default behaviour is to just return the model.
Parameters:
model - The original model without updated chat completion details
Returns:
The model with updated chat completion details
-
getMinimalSupportedVersion
TransportVersion getMinimalSupportedVersion()
Defines the version required across all clusters to use this service.
Returns:
TransportVersion specifying the version
-
supportedStreamingTasks
The set of tasks where this service provider supports using the streaming API.
Returns:
Set of supported task types. Defaults to empty.
-
canStream
Checks the task type against the set of supported streaming tasks returned by supportedStreamingTasks().
Parameters:
taskType - the task that supports streaming
Returns:
true if the taskType is supported
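The relationship between canStream(...) and supportedStreamingTasks() can be sketched as below. The Set<TaskType> return type and the choice of TaskType.COMPLETION are assumptions made for illustration; they are not spelled out on this page.

import java.util.EnumSet;
import java.util.Set;

import org.elasticsearch.inference.TaskType;

// Sketch of a service opting in to streaming for a single task type.
public class StreamingSupportSketch {

    // Assumed shape: a set of task types; an empty set would mean no streaming support at all.
    public Set<TaskType> supportedStreamingTasks() {
        return EnumSet.of(TaskType.COMPLETION); // hypothetical choice: stream only completion tasks
    }

    // Mirrors the documented check: a task can stream only if it is in the supported set.
    public boolean canStream(TaskType taskType) {
        return supportedStreamingTasks().contains(taskType);
    }
}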
-
defaultConfigIds
default List<InferenceService.DefaultConfigId> defaultConfigIds()
Get the Ids and task type of any default configurations provided by this service.
Returns:
Defaults
-
defaultConfigs
default void defaultConfigs(ActionListener<List<Model>> defaultsListener)
Call the listener with the default model configurations defined by the service.
Parameters:
defaultsListener - The listener
-
updateModelsWithDynamicFields
default void updateModelsWithDynamicFields(List<Model> model, ActionListener<List<Model>> listener)
-
onNodeStarted
default void onNodeStarted()
Called after the Elasticsearch node has completed its start up. This allows the service to perform initialization after the node's internals are set up (for example, that the internal ES client is ready for use).
-