Module org.elasticsearch.server
Class ES819TSDBDocValuesFormat
java.lang.Object
org.apache.lucene.codecs.DocValuesFormat
org.elasticsearch.index.codec.tsdb.es819.ES819TSDBDocValuesFormat
- All Implemented Interfaces:
org.apache.lucene.util.NamedSPILoader.NamedSPI
public class ES819TSDBDocValuesFormat
extends org.apache.lucene.codecs.DocValuesFormat
Evolved from ES87TSDBDocValuesFormat, with the following changes:
- Moved the numDocsWithField metadata statistic from SortedNumericEntry to NumericEntry. This allows numDocsWithField to always be summed during segment merging; otherwise it would have to be computed for each field on every segment merge.
- Moved the docsWithFieldOffset, docsWithFieldLength, jumpTableEntryCount, and denseRankPower metadata properties so that they come after the values metadata. This lets the jump table be stored after the values, which allows iterating only once over the merged view of all values. If index sorting is active, merging a doc values field requires a merge sort, which can be very CPU intensive. The previous format always had to merge sort a doc values field multiple times, so doing the merge sort just once saves CPU resources.
- Version 1 adds block-wise compression to binary doc values. Each block contains a variable number of values so that each block is approximately the same size. To map a given value's index to the block containing the value, there are two parallel arrays: one holds the starting address of each block, and the other holds the starting value index of each block. Additional compression types may be added by creating a new mode in BinaryDVCompressionMode.
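The two parallel arrays described in the last item can be sketched as follows. This is a minimal illustration, not the format's actual reader: the class name, field names, and method names are hypothetical, and the lookup assumes the starting value indexes are strictly increasing.

```java
import java.util.Arrays;

// Hypothetical sketch of a block lookup over two parallel arrays.
class BinaryBlockIndexSketch {
    private final long[] blockStartAddress;    // byte offset at which each block starts
    private final long[] blockStartValueIndex; // index of the first value stored in each block

    public BinaryBlockIndexSketch(long[] blockStartAddress, long[] blockStartValueIndex) {
        if (blockStartAddress.length != blockStartValueIndex.length) {
            throw new IllegalArgumentException("arrays must have one entry per block");
        }
        this.blockStartAddress = blockStartAddress;
        this.blockStartValueIndex = blockStartValueIndex;
    }

    /** Returns the number of the block that contains the given value index. */
    public int blockFor(long valueIndex) {
        if (valueIndex < 0) {
            throw new IllegalArgumentException("valueIndex must be >= 0");
        }
        int pos = Arrays.binarySearch(blockStartValueIndex, valueIndex);
        // Exact hit: the value is the first one in block `pos`.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the value
        // belongs to the block just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** Returns the starting address of the block that contains the given value index. */
    public long addressFor(long valueIndex) {
        return blockStartAddress[blockFor(valueIndex)];
    }

    public static void main(String[] args) {
        // Three blocks holding values [0, 3), [3, 10), and [10, ...).
        BinaryBlockIndexSketch index = new BinaryBlockIndexSketch(
            new long[] { 0, 4096, 9000 },
            new long[] { 0, 3, 10 }
        );
        System.out.println(index.blockFor(2));    // 0
        System.out.println(index.blockFor(3));    // 1
        System.out.println(index.addressFor(12)); // 9000
    }
}
```

Because blocks hold a variable number of values, the block number cannot be derived by dividing the value index by a fixed block size, which is why a search over the starting value indexes is used in this sketch.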
-
Field Summary
Fields
Modifier and Type | Field | Description
static final int | BLOCK_BYTES_THRESHOLD | These thresholds determine the size of a compressed binary block.
static final int | BLOCK_COUNT_THRESHOLD |
static final int | ORDINAL_RANGE_ENCODING_BLOCK_SHIFT | The block shift used in DirectMonotonicWriter when encoding the start docs of each ordinal with ordinal range encoding.
static final int | ORDINAL_RANGE_ENCODING_MIN_DOC_PER_ORDINAL | The default minimum number of documents per ordinal required to use ordinal range encoding.
-
Constructor Summary
Constructors
Constructor | Description
ES819TSDBDocValuesFormat(int numericBlockShift) |
ES819TSDBDocValuesFormat(int skipIndexIntervalSize, int minDocsPerOrdinalForRangeEncoding, boolean enableOptimizedMerge, BinaryDVCompressionMode binaryDVCompressionMode, boolean enablePerBlockCompression) | Doc values fields format with specified skipIndexIntervalSize.
ES819TSDBDocValuesFormat(int skipIndexIntervalSize, int minDocsPerOrdinalForRangeEncoding, boolean enableOptimizedMerge, BinaryDVCompressionMode binaryDVCompressionMode, boolean enablePerBlockCompression, int numericBlockShift) |
ES819TSDBDocValuesFormat(BinaryDVCompressionMode binaryDVCompressionMode) |
ES819TSDBDocValuesFormat(BinaryDVCompressionMode binaryDVCompressionMode, boolean enablePerBlockCompression) |
-
Method Summary
Modifier and Type | Method | Description
org.apache.lucene.codecs.DocValuesConsumer | fieldsConsumer(org.apache.lucene.index.SegmentWriteState state) |
org.apache.lucene.codecs.DocValuesProducer | fieldsProducer(org.apache.lucene.index.SegmentReadState state) |
static ES819TSDBDocValuesFormat | getInstance(boolean useLargeNumericBlock) |
Methods inherited from class org.apache.lucene.codecs.DocValuesFormat
availableDocValuesFormats, forName, getName, reloadDocValuesFormats, toString
-
Field Details
-
BLOCK_BYTES_THRESHOLD
public static final int BLOCK_BYTES_THRESHOLD
These thresholds determine the size of a compressed binary block. A new block is started once the uncompressed data in the current block reaches 128k, or once the number of values in it reaches 1024. These values are a tradeoff between the high compression ratio and decompression speed of large blocks, and the ability of small blocks to avoid decompressing unneeded values.
-
BLOCK_COUNT_THRESHOLD
public static final int BLOCK_COUNT_THRESHOLD
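How the two block thresholds interact can be sketched as follows. This is an illustration under stated assumptions rather than the actual writer: the class and method names are hypothetical, the constant values are taken from the description of BLOCK_BYTES_THRESHOLD (reading "128k" as 128 * 1024 bytes), and closing a block right after the value that reaches a threshold is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: splits a sequence of binary values into blocks
// using a byte threshold and a value-count threshold.
class BinaryBlockSplitterSketch {
    // Assumed values, based on the field description (128k bytes, 1024 values).
    static final int BLOCK_BYTES_THRESHOLD = 128 * 1024;
    static final int BLOCK_COUNT_THRESHOLD = 1024;

    /** Returns the number of values placed in each block, given each value's length in bytes. */
    public static List<Integer> split(int[] valueLengths) {
        List<Integer> valuesPerBlock = new ArrayList<>();
        int bytesInBlock = 0;
        int valuesInBlock = 0;
        for (int length : valueLengths) {
            bytesInBlock += length;
            valuesInBlock++;
            // Close the block as soon as either threshold is reached.
            if (bytesInBlock >= BLOCK_BYTES_THRESHOLD || valuesInBlock >= BLOCK_COUNT_THRESHOLD) {
                valuesPerBlock.add(valuesInBlock);
                bytesInBlock = 0;
                valuesInBlock = 0;
            }
        }
        if (valuesInBlock > 0) {
            valuesPerBlock.add(valuesInBlock); // final, partially filled block
        }
        return valuesPerBlock;
    }

    public static void main(String[] args) {
        int[] smallValues = new int[2500];
        java.util.Arrays.fill(smallValues, 10); // many short values: the count threshold applies
        System.out.println(split(smallValues)); // [1024, 1024, 452]

        int[] largeValues = new int[5];
        java.util.Arrays.fill(largeValues, 64 * 1024); // few long values: the byte threshold applies
        System.out.println(split(largeValues)); // [2, 2, 1]
    }
}
```

Short values fill a block by count, long values fill it by size, so blocks stay roughly bounded in both dimensions.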
-
ORDINAL_RANGE_ENCODING_MIN_DOC_PER_ORDINAL
public static final int ORDINAL_RANGE_ENCODING_MIN_DOC_PER_ORDINAL
The default minimum number of documents per ordinal required to use ordinal range encoding. If the average number of documents per ordinal is below this threshold, it is more efficient to encode doc values in blocks. A much smaller value may be used in tests to exercise ordinal range encoding more frequently.
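The threshold check described here can be sketched as follows. The class, method, and parameter names are hypothetical, and the real format may apply further conditions before choosing ordinal range encoding; only the comparison against the average number of documents per ordinal comes from the description.

```java
// Hypothetical sketch of the "average docs per ordinal" check.
class OrdinalRangeDecisionSketch {
    /**
     * Returns true when the average number of documents per ordinal is at
     * least minDocsPerOrdinal, i.e. not below the threshold.
     */
    public static boolean meetsThreshold(long numDocsWithField, long numOrdinals, int minDocsPerOrdinal) {
        if (numOrdinals <= 0) {
            return false; // no ordinals, nothing to range-encode
        }
        // docs / ords >= min  is equivalent to  docs >= min * ords,
        // which avoids integer-division rounding.
        return numDocsWithField >= (long) minDocsPerOrdinal * numOrdinals;
    }

    public static void main(String[] args) {
        // The threshold value 50 is made up for illustration.
        System.out.println(meetsThreshold(1000, 10, 50));  // true: 100 docs per ordinal on average
        System.out.println(meetsThreshold(1000, 100, 50)); // false: 10 docs per ordinal on average
    }
}
```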
-
ORDINAL_RANGE_ENCODING_BLOCK_SHIFT
public static final int ORDINAL_RANGE_ENCODING_BLOCK_SHIFT
The block shift used in DirectMonotonicWriter when encoding the start docs of each ordinal with ordinal range encoding.
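To make "start docs of each ordinal" concrete, here is a minimal sketch of the idea, not the format's implementation. It assumes every document has a value, documents are ordered so that ordinals are non-decreasing, and every ordinal occurs at least once; all names are hypothetical. The start docs form a monotonically increasing sequence, which is the kind of data DirectMonotonicWriter encodes in blocks of 2^blockShift values. The shift value used below is made up for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch of ordinal range encoding: store the first doc of
// each ordinal instead of one ordinal per doc.
class OrdinalRangeSketch {
    /** Returns starts[ord] = first doc with that ordinal, plus a final sentinel equal to the doc count. */
    public static int[] startDocs(int[] ordPerDoc, int numOrds) {
        int[] starts = new int[numOrds + 1];
        int nextOrd = 0;
        for (int doc = 0; doc < ordPerDoc.length; doc++) {
            while (nextOrd <= ordPerDoc[doc]) {
                starts[nextOrd++] = doc; // first doc at which this ordinal appears
            }
        }
        while (nextOrd <= numOrds) {
            starts[nextOrd++] = ordPerDoc.length; // sentinel (and any trailing unused ordinals)
        }
        return starts;
    }

    /** Looks up the ordinal of a doc by binary search over the start docs. */
    public static int ordForDoc(int[] starts, int doc) {
        int numOrds = starts.length - 1;
        int pos = Arrays.binarySearch(starts, 0, numOrds, doc);
        return pos >= 0 ? pos : -pos - 2;
    }

    /** Number of blocks needed for `count` values when each block holds 2^blockShift values. */
    public static long blocksNeeded(long count, int blockShift) {
        return (count + (1L << blockShift) - 1) >>> blockShift;
    }

    public static void main(String[] args) {
        int[] ordPerDoc = { 0, 0, 0, 1, 1, 2, 2, 2, 2 };
        int[] starts = startDocs(ordPerDoc, 3);
        System.out.println(Arrays.toString(starts)); // [0, 3, 5, 9]
        System.out.println(ordForDoc(starts, 4));    // 1
        System.out.println(blocksNeeded(10, 2));     // 3 blocks of 4 values
    }
}
```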
-
-
Constructor Details
-
ES819TSDBDocValuesFormat
public ES819TSDBDocValuesFormat() -
ES819TSDBDocValuesFormat
public ES819TSDBDocValuesFormat(int numericBlockShift) -
ES819TSDBDocValuesFormat
public ES819TSDBDocValuesFormat(BinaryDVCompressionMode binaryDVCompressionMode) -
ES819TSDBDocValuesFormat
public ES819TSDBDocValuesFormat(BinaryDVCompressionMode binaryDVCompressionMode, boolean enablePerBlockCompression) -
ES819TSDBDocValuesFormat
public ES819TSDBDocValuesFormat(int skipIndexIntervalSize, int minDocsPerOrdinalForRangeEncoding, boolean enableOptimizedMerge, BinaryDVCompressionMode binaryDVCompressionMode, boolean enablePerBlockCompression)
Doc values fields format with specified skipIndexIntervalSize.
-
ES819TSDBDocValuesFormat
public ES819TSDBDocValuesFormat(int skipIndexIntervalSize, int minDocsPerOrdinalForRangeEncoding, boolean enableOptimizedMerge, BinaryDVCompressionMode binaryDVCompressionMode, boolean enablePerBlockCompression, int numericBlockShift)
-
-
Method Details
-
getInstance
public static ES819TSDBDocValuesFormat getInstance(boolean useLargeNumericBlock) -
fieldsConsumer
public org.apache.lucene.codecs.DocValuesConsumer fieldsConsumer(org.apache.lucene.index.SegmentWriteState state) throws IOException
- Specified by:
fieldsConsumer in class org.apache.lucene.codecs.DocValuesFormat
- Throws:
IOException
-
fieldsProducer
public org.apache.lucene.codecs.DocValuesProducer fieldsProducer(org.apache.lucene.index.SegmentReadState state) throws IOException
- Specified by:
fieldsProducer in class org.apache.lucene.codecs.DocValuesFormat
- Throws:
IOException
-