Module org.elasticsearch.xcore
Class RecursiveChunker
java.lang.Object
org.elasticsearch.xpack.core.inference.chunking.RecursiveChunker
All Implemented Interfaces:
Chunker
Splits text into chunks recursively based on a list of separator regex strings.
The maximum chunk size is measured in words and is controlled by maxNumberWordsPerChunk.
For each separator, the chunker goes through the following process:
1. Split the text on each regex match of the separator.
2. For each chunk after the merge:
    1. Return it if it is within the maximum chunk size.
    2. Repeat the process using the next separator in the list if the chunk exceeds the maximum chunk size.
If no separators are left to try, the chunker runs the SentenceBoundaryChunker with the provided
maximum chunk size and no overlap.
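The process described above can be illustrated with a minimal, standalone sketch. This is not the RecursiveChunker implementation: the class name RecursiveChunkSketch, its method names, and the whitespace-based word count are all hypothetical, chosen only for illustration. Two simplifications are assumed: text that already fits within the limit is returned whole before any splitting, and a plain fixed-size word window stands in for the SentenceBoundaryChunker fallback.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the Elasticsearch RecursiveChunker.
class RecursiveChunkSketch {

    // Assumption: words are whitespace-delimited tokens.
    static int wordCount(String s) {
        String t = s.trim();
        return t.isEmpty() ? 0 : t.split("\\s+").length;
    }

    // Entry point: try the separators in order, starting with the first.
    static List<String> chunk(String text, List<String> separators, int maxWords) {
        if (maxWords <= 0) {
            throw new IllegalArgumentException("maxWords must be positive");
        }
        List<String> out = new ArrayList<>();
        chunk(text, separators, 0, maxWords, out);
        return out;
    }

    private static void chunk(String text, List<String> separators, int level,
                              int maxWords, List<String> out) {
        if (text.trim().isEmpty()) {
            return; // skip empty pieces produced by adjacent separator matches
        }
        if (wordCount(text) <= maxWords) {
            out.add(text.trim()); // within the maximum chunk size: keep as is
            return;
        }
        if (level >= separators.size()) {
            // No separators left. The documented class hands over to
            // SentenceBoundaryChunker; a word window is used here instead.
            out.addAll(fallbackSplit(text, maxWords));
            return;
        }
        // Split on every regex match of the current separator, then handle
        // each piece with the next separator in the list.
        for (String piece : Pattern.compile(separators.get(level)).split(text)) {
            chunk(piece, separators, level + 1, maxWords, out);
        }
    }

    // Stand-in fallback: consecutive windows of at most maxWords words, no overlap.
    static List<String> fallbackSplit(String text, int maxWords) {
        String[] words = text.trim().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < words.length; i += maxWords) {
            int end = Math.min(i + maxWords, words.length);
            out.add(String.join(" ", Arrays.copyOfRange(words, i, end)));
        }
        return out;
    }
}
```

Calling RecursiveChunkSketch.chunk(text, List.of("\n\n", "\n"), 3) first splits on blank lines, then on single newlines, and only falls back to the word window for pieces that still exceed three words. The sketch omits the merge step that the description refers to, so it may produce smaller chunks than the real class.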
Nested Class Summary
Nested classes/interfaces inherited from interface org.elasticsearch.xpack.core.inference.chunking.Chunker
Chunker.ChunkOffset
Constructor Summary
Constructors
Method Summary