Module org.elasticsearch.server
Class ES819TSDBDocValuesFormat
java.lang.Object
org.apache.lucene.codecs.DocValuesFormat
org.elasticsearch.index.codec.tsdb.es819.ES819TSDBDocValuesFormat
- All Implemented Interfaces:
org.apache.lucene.util.NamedSPILoader.NamedSPI
public class ES819TSDBDocValuesFormat
extends org.apache.lucene.codecs.DocValuesFormat
Evolved from ES87TSDBDocValuesFormat, with the following changes:
- Moved the numDocsWithField metadata statistic from SortedNumericEntry to NumericEntry. This allows numDocsWithField to always be summed during segment merging; otherwise it would have to be computed for each field on every segment merge.
- Moved the docsWithFieldOffset, docsWithFieldLength, jumpTableEntryCount, and denseRankPower metadata properties so that they follow the values metadata. This lets the jump table be stored after the values, which in turn allows iterating only once over the merged view of all values. If index sorting is active, merging a doc values field requires a merge sort, which can be very CPU intensive. The previous format always had to merge sort a doc values field multiple times, so doing the merge sort just once saves CPU resources.
- Version 1 adds block-wise compression to binary doc values. Each block contains a variable number of values so that all blocks are approximately the same size. Two parallel arrays map a given value's index to the block containing it: one holds the starting address of each block, and the other holds the starting value index of each block. Additional compression types may be added by creating a new mode in BinaryDVCompressionMode.
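The lookup through the two parallel arrays can be sketched as a binary search over the starting value indexes. This is a minimal illustration under stated assumptions, not the Elasticsearch implementation: the class BinaryBlockLookup, its field names, and its methods are hypothetical, and the arrays are held as plain long[] rather than in any on-disk encoding.

```java
import java.util.Arrays;

/** Hypothetical illustration of a value-index-to-block lookup; not the actual ES819 reader. */
final class BinaryBlockLookup {
    // Parallel arrays: entry i describes block i.
    private final long[] blockStartAddresses;    // byte offset at which block i begins
    private final long[] blockStartValueIndexes; // index of the first value stored in block i (sorted ascending)

    BinaryBlockLookup(long[] blockStartAddresses, long[] blockStartValueIndexes) {
        if (blockStartAddresses.length != blockStartValueIndexes.length) {
            throw new IllegalArgumentException("parallel arrays must have the same length");
        }
        this.blockStartAddresses = blockStartAddresses;
        this.blockStartValueIndexes = blockStartValueIndexes;
    }

    /** Returns the number of the block that contains the value at valueIndex. */
    int blockFor(long valueIndex) {
        if (valueIndex < 0 || blockStartValueIndexes.length == 0) {
            throw new IllegalArgumentException("no block for value index " + valueIndex);
        }
        int pos = Arrays.binarySearch(blockStartValueIndexes, valueIndex);
        // Exact hit: the value is the first one in block pos.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the containing
        // block is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** Returns the starting address of the block that contains the value. */
    long blockAddressFor(long valueIndex) {
        return blockStartAddresses[blockFor(valueIndex)];
    }

    /** Returns the position of the value within its block. */
    long offsetInBlock(long valueIndex) {
        return valueIndex - blockStartValueIndexes[blockFor(valueIndex)];
    }
}
```

With blocks starting at value indexes 0, 3, and 10, a lookup for value index 5 resolves to block 1 at offset 2 within that block, so only that block would need to be decompressed.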
-
Field Summary
Fields
Modifier and Type, Field, Description
- static final boolean BINARY_DV_COMPRESSION_FEATURE_FLAG
- static final int BLOCK_BYTES_THRESHOLD: These thresholds determine the size of a compressed binary block.
- static final int BLOCK_COUNT_THRESHOLD
- static final int NUMERIC_BLOCK_SIZE
- static final int ORDINAL_RANGE_ENCODING_BLOCK_SHIFT: The block shift used in DirectMonotonicWriter when encoding the start docs of each ordinal with ordinal range encoding.
- static final int ORDINAL_RANGE_ENCODING_MIN_DOC_PER_ORDINAL: The default minimum number of documents per ordinal required to use ordinal range encoding.
-
Constructor Summary
Constructors
Constructor, Description
- ES819TSDBDocValuesFormat(): Default constructor.
- ES819TSDBDocValuesFormat(int skipIndexIntervalSize, int minDocsPerOrdinalForRangeEncoding, boolean enableOptimizedMerge, BinaryDVCompressionMode binaryDVCompressionMode, boolean enablePerBlockCompression): Doc values fields format with specified skipIndexIntervalSize.
- ES819TSDBDocValuesFormat(BinaryDVCompressionMode binaryDVCompressionMode)
- ES819TSDBDocValuesFormat(BinaryDVCompressionMode binaryDVCompressionMode, boolean enablePerBlockCompression)
-
Method Summary
Modifier and Type, Method, Description
- org.apache.lucene.codecs.DocValuesConsumer fieldsConsumer(org.apache.lucene.index.SegmentWriteState state)
- org.apache.lucene.codecs.DocValuesProducer fieldsProducer(org.apache.lucene.index.SegmentReadState state)
Methods inherited from class org.apache.lucene.codecs.DocValuesFormat
availableDocValuesFormats, forName, getName, reloadDocValuesFormats, toString
-
Field Details
-
BINARY_DV_COMPRESSION_FEATURE_FLAG
public static final boolean BINARY_DV_COMPRESSION_FEATURE_FLAG
-
NUMERIC_BLOCK_SIZE
public static final int NUMERIC_BLOCK_SIZE
-
BLOCK_BYTES_THRESHOLD
public static final int BLOCK_BYTES_THRESHOLD
These thresholds determine the size of a compressed binary block. A new block is started once the uncompressed data in the current block reaches 128k, or once the block holds 1024 values. These values are a tradeoff between the high compression ratio and decompression speed of large blocks, and the ability of small blocks to avoid decompressing unneeded values.
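The two thresholds can be illustrated with a small partitioning sketch that closes a block as soon as either limit is hit. Everything here is an assumption made for illustration: the class BinaryBlockPartitioner is hypothetical, the constant values are inferred from the description above (with 128k read as 128 * 1024 bytes), and the real writer may check the thresholds at a different point.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of threshold-based block building; not the actual ES819 writer. */
final class BinaryBlockPartitioner {
    // Assumed values, inferred from the field description; the real constants may differ.
    static final int BLOCK_BYTES_THRESHOLD = 128 * 1024;
    static final int BLOCK_COUNT_THRESHOLD = 1024;

    /**
     * Groups values into blocks given each value's uncompressed length in bytes.
     * Returns the number of values placed in each block, in order.
     */
    static List<Integer> partition(int[] valueLengths) {
        List<Integer> valuesPerBlock = new ArrayList<>();
        long bytesInBlock = 0;
        int valuesInBlock = 0;
        for (int length : valueLengths) {
            bytesInBlock += length;
            valuesInBlock++;
            // Close the block once either threshold is reached.
            if (bytesInBlock >= BLOCK_BYTES_THRESHOLD || valuesInBlock >= BLOCK_COUNT_THRESHOLD) {
                valuesPerBlock.add(valuesInBlock);
                bytesInBlock = 0;
                valuesInBlock = 0;
            }
        }
        // Flush the final, possibly partial, block.
        if (valuesInBlock > 0) {
            valuesPerBlock.add(valuesInBlock);
        }
        return valuesPerBlock;
    }
}
```

Under these assumptions, many small values are cut into blocks by the count threshold, while a few large values are cut by the byte threshold, which is how blocks end up with a variable number of values but roughly similar sizes.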
-
BLOCK_COUNT_THRESHOLD
public static final int BLOCK_COUNT_THRESHOLD
-
ORDINAL_RANGE_ENCODING_MIN_DOC_PER_ORDINAL
public static final int ORDINAL_RANGE_ENCODING_MIN_DOC_PER_ORDINAL
The default minimum number of documents per ordinal required to use ordinal range encoding. If the average number of documents per ordinal is below this threshold, it is more efficient to encode doc values in blocks. A much smaller value may be used in tests to exercise ordinal range encoding more frequently.
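The threshold check described above can be sketched as a comparison between the average number of documents per ordinal and the configured minimum. This is only an illustration: the class OrdinalRangeEncodingChoice and its method are hypothetical, and the real format may apply further preconditions before choosing ordinal range encoding.

```java
/** Hypothetical illustration of the average-docs-per-ordinal check; not the actual ES819 logic. */
final class OrdinalRangeEncodingChoice {
    /**
     * Returns true when the average number of documents per ordinal is at least
     * minDocsPerOrdinal, and false when block encoding would be preferred instead.
     */
    static boolean useOrdinalRangeEncoding(long numDocsWithField, long numOrdinals, int minDocsPerOrdinal) {
        if (numOrdinals <= 0) {
            return false; // no ordinals, nothing to range-encode
        }
        // Integer division floors the average, which gives the same answer as the
        // exact average when comparing against a whole-number threshold.
        return numDocsWithField / numOrdinals >= minDocsPerOrdinal;
    }
}
```

For example, with an illustrative threshold of 512, a field with 10,000 documents spread over 10 ordinals averages 1,000 documents per ordinal and would qualify, while 1,000 documents over 10 ordinals would not.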
-
ORDINAL_RANGE_ENCODING_BLOCK_SHIFT
public static final int ORDINAL_RANGE_ENCODING_BLOCK_SHIFT
The block shift used in DirectMonotonicWriter when encoding the start docs of each ordinal with ordinal range encoding.
-
-
Constructor Details
-
ES819TSDBDocValuesFormat
public ES819TSDBDocValuesFormat()
Default constructor.
-
ES819TSDBDocValuesFormat
public ES819TSDBDocValuesFormat(BinaryDVCompressionMode binaryDVCompressionMode)
-
ES819TSDBDocValuesFormat
public ES819TSDBDocValuesFormat(BinaryDVCompressionMode binaryDVCompressionMode, boolean enablePerBlockCompression) -
ES819TSDBDocValuesFormat
public ES819TSDBDocValuesFormat(int skipIndexIntervalSize, int minDocsPerOrdinalForRangeEncoding, boolean enableOptimizedMerge, BinaryDVCompressionMode binaryDVCompressionMode, boolean enablePerBlockCompression)
Doc values fields format with specified skipIndexIntervalSize.
-
-
Method Details
-
fieldsConsumer
public org.apache.lucene.codecs.DocValuesConsumer fieldsConsumer(org.apache.lucene.index.SegmentWriteState state) throws IOException
- Specified by:
fieldsConsumer in class org.apache.lucene.codecs.DocValuesFormat
- Throws:
IOException
-
fieldsProducer
public org.apache.lucene.codecs.DocValuesProducer fieldsProducer(org.apache.lucene.index.SegmentReadState state) throws IOException
- Specified by:
fieldsProducer in class org.apache.lucene.codecs.DocValuesFormat
- Throws:
IOException
-