Class ES819TSDBDocValuesFormat

java.lang.Object
org.apache.lucene.codecs.DocValuesFormat
org.elasticsearch.index.codec.tsdb.es819.ES819TSDBDocValuesFormat
All Implemented Interfaces:
org.apache.lucene.util.NamedSPILoader.NamedSPI

public class ES819TSDBDocValuesFormat extends org.apache.lucene.codecs.DocValuesFormat
Evolved from ES87TSDBDocValuesFormat, with the following changes:
  • Moved the numDocsWithField metadata statistic from SortedNumericEntry to NumericEntry. This makes it possible to always sum numDocsWithField during segment merging; otherwise, numDocsWithField would have to be computed per field for each segment merge.
  • Moved the docsWithFieldOffset, docsWithFieldLength, jumpTableEntryCount, and denseRankPower metadata properties so that they follow the values metadata in the format. This lets the jump table be stored after the values, which allows iterating only once over the merged view of all values. If index sorting is active, merging a doc values field requires a merge sort, which can be very CPU intensive. The previous format always had to merge sort a doc values field multiple times, so doing the merge sort just once saves CPU resources.
  • Version 1 adds block-wise compression to binary doc values. Each block contains a variable number of values so that all blocks are approximately the same size. To map a given value's index to the block containing that value, two parallel arrays store the starting address and the starting value index of each block. Additional compression types may be added by creating a new mode in BinaryDVCompressionMode.
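
The lookup through the two parallel arrays described for Version 1 can be sketched as follows. This is an illustration only, not the format's actual reader code: the class name BlockLookupSketch, the method names, and the use of plain long[] arrays are assumptions made for the example, and a binary search is just one plausible way to locate the block.

```java
import java.util.Arrays;

// Hypothetical sketch; not the actual ES819 reader implementation.
final class BlockLookupSketch {

    // blockStartValueIndex[i] holds the index of the first value stored in block i.
    // The array is ascending and its first entry is 0.
    static int findBlock(long[] blockStartValueIndex, long valueIndex) {
        int pos = Arrays.binarySearch(blockStartValueIndex, valueIndex);
        // Exact hit: valueIndex is the first value of block pos.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the containing
        // block is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    // blockStartAddress[i] holds the byte offset at which block i begins.
    static long blockAddress(long[] blockStartValueIndex, long[] blockStartAddress, long valueIndex) {
        return blockStartAddress[findBlock(blockStartValueIndex, valueIndex)];
    }

    public static void main(String[] args) {
        long[] starts = {0, 700, 1724};        // made-up starting value indexes
        long[] addresses = {0, 40_000, 95_000}; // made-up starting addresses
        System.out.println(findBlock(starts, 699));                 // block 0
        System.out.println(blockAddress(starts, addresses, 1000)); // address of block 1
    }
}
```

Once the block is known, the position of the value inside it is valueIndex minus the block's starting value index, so only that one block needs to be decompressed.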
  • Field Details

    • BINARY_DV_COMPRESSION_FEATURE_FLAG

      public static final boolean BINARY_DV_COMPRESSION_FEATURE_FLAG
    • NUMERIC_BLOCK_SIZE

      public static final int NUMERIC_BLOCK_SIZE
    • BLOCK_BYTES_THRESHOLD

      public static final int BLOCK_BYTES_THRESHOLD
      BLOCK_BYTES_THRESHOLD and BLOCK_COUNT_THRESHOLD determine the size of a compressed binary block. A new block is started once the uncompressed data in the current block reaches 128k, or once the number of values reaches 1024. These values are a tradeoff between the high compression ratio and decompression speed of large blocks, and the ability of small blocks to avoid decompressing unneeded values.
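
      The two-threshold rule can be illustrated with a small sketch. It is not the format's writer code: the class BlockThresholdSketch and the method countBlocks are hypothetical, the constant values are assumptions read off the description above ("128k" is taken to mean 128 * 1024 bytes), and whether the real check happens before or after a value is appended is not stated here.

```java
// Hypothetical sketch; not the actual ES819 writer implementation.
final class BlockThresholdSketch {

    // Assumed values, derived from the prose above; the real constants may differ.
    static final int BLOCK_BYTES_THRESHOLD = 128 * 1024;
    static final int BLOCK_COUNT_THRESHOLD = 1024;

    // Counts how many blocks a sequence of values with the given
    // uncompressed lengths (in bytes) would be split into.
    static int countBlocks(int[] valueLengths) {
        int blocks = 0;
        int bytesInBlock = 0;
        int valuesInBlock = 0;
        for (int length : valueLengths) {
            bytesInBlock += length;
            valuesInBlock++;
            // Close the block as soon as either threshold is reached.
            if (bytesInBlock >= BLOCK_BYTES_THRESHOLD || valuesInBlock >= BLOCK_COUNT_THRESHOLD) {
                blocks++;
                bytesInBlock = 0;
                valuesInBlock = 0;
            }
        }
        // A trailing, partially filled block still has to be written.
        if (valuesInBlock > 0) {
            blocks++;
        }
        return blocks;
    }
}
```

      With many small values the count threshold closes each block; with a few large values the byte threshold does, which keeps blocks at roughly the same uncompressed size.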
    • BLOCK_COUNT_THRESHOLD

      public static final int BLOCK_COUNT_THRESHOLD
    • ORDINAL_RANGE_ENCODING_MIN_DOC_PER_ORDINAL

      public static final int ORDINAL_RANGE_ENCODING_MIN_DOC_PER_ORDINAL
      The default minimum number of documents per ordinal required to use ordinal range encoding. If the average number of documents per ordinal is below this threshold, it is more efficient to encode doc values in blocks. A much smaller value may be used in tests to exercise ordinal range encoding more frequently.
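
      The decision described above amounts to comparing the average number of documents per ordinal with the threshold. A minimal sketch, assuming the caller already knows the number of documents with the field and the number of distinct ordinals; the class and method names are hypothetical, and the real format may apply further conditions before choosing ordinal range encoding:

```java
// Hypothetical sketch; not the actual ES819 implementation.
final class OrdinalRangeDecisionSketch {

    // Returns true when the average number of documents per ordinal is at
    // least minDocsPerOrdinal. The comparison is written as a multiplication
    // to avoid the rounding of integer division.
    static boolean useOrdinalRangeEncoding(long numDocsWithField, long numOrdinals, int minDocsPerOrdinal) {
        if (numOrdinals <= 0) {
            return false; // no ordinals, nothing to range-encode
        }
        return numDocsWithField >= (long) minDocsPerOrdinal * numOrdinals;
    }
}
```

      Passing a small minDocsPerOrdinal, as a test might, makes the check succeed for far more fields.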
    • ORDINAL_RANGE_ENCODING_BLOCK_SHIFT

      public static final int ORDINAL_RANGE_ENCODING_BLOCK_SHIFT
      The block shift used in DirectMonotonicWriter when encoding the start docs of each ordinal with ordinal range encoding.
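
      To make "the start docs of each ordinal" concrete, the sketch below derives them from a per-document ordinal array and shows how a block shift translates into a block size. DirectMonotonicWriter groups values into blocks of 2^blockShift entries; everything else here is an assumption for illustration: the class and method names are hypothetical, the ordinals are assumed to be non-decreasing in doc order, and the trailing sentinel entry is a choice made for the example, not a statement about the on-disk layout.

```java
// Hypothetical sketch; not the actual ES819 implementation.
final class OrdinalStartDocsSketch {

    // Given one ordinal per document, in non-decreasing order, returns for
    // each ordinal the first doc that carries it, plus a final sentinel equal
    // to the number of docs. The result is monotonically non-decreasing,
    // which is the kind of sequence a monotonic encoder is designed for.
    static int[] startDocs(int[] ordinalPerDoc, int numOrdinals) {
        int[] starts = new int[numOrdinals + 1];
        int next = 0; // next ordinal whose start doc has not been recorded yet
        for (int doc = 0; doc < ordinalPerDoc.length; doc++) {
            while (next <= ordinalPerDoc[doc]) {
                starts[next++] = doc;
            }
        }
        while (next <= numOrdinals) {
            starts[next++] = ordinalPerDoc.length;
        }
        return starts;
    }

    // Number of values per block for a given block shift.
    static int valuesPerBlock(int blockShift) {
        return 1 << blockShift;
    }
}
```

      With the start docs in hand, the docs of ordinal i are the range from starts[i] (inclusive) to starts[i + 1] (exclusive).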
  • Constructor Details

    • ES819TSDBDocValuesFormat

      public ES819TSDBDocValuesFormat()
      Default constructor.
    • ES819TSDBDocValuesFormat

      public ES819TSDBDocValuesFormat(BinaryDVCompressionMode binaryDVCompressionMode)
    • ES819TSDBDocValuesFormat

      public ES819TSDBDocValuesFormat(BinaryDVCompressionMode binaryDVCompressionMode, boolean enablePerBlockCompression)
    • ES819TSDBDocValuesFormat

      public ES819TSDBDocValuesFormat(int skipIndexIntervalSize, int minDocsPerOrdinalForRangeEncoding, boolean enableOptimizedMerge, BinaryDVCompressionMode binaryDVCompressionMode, boolean enablePerBlockCompression)
      Creates a doc values format with the specified skipIndexIntervalSize.
  • Method Details

    • fieldsConsumer

      public org.apache.lucene.codecs.DocValuesConsumer fieldsConsumer(org.apache.lucene.index.SegmentWriteState state) throws IOException
      Specified by:
      fieldsConsumer in class org.apache.lucene.codecs.DocValuesFormat
      Throws:
      IOException
    • fieldsProducer

      public org.apache.lucene.codecs.DocValuesProducer fieldsProducer(org.apache.lucene.index.SegmentReadState state) throws IOException
      Specified by:
      fieldsProducer in class org.apache.lucene.codecs.DocValuesFormat
      Throws:
      IOException