Enum Class DataPartitioning

java.lang.Object
java.lang.Enum<DataPartitioning>
org.elasticsearch.compute.lucene.DataPartitioning
All Implemented Interfaces:
Serializable, Comparable<DataPartitioning>, Constable

public enum DataPartitioning extends Enum<DataPartitioning>
How we partition the data across Drivers. Each request forks into min(1.5 * cpus, partition_count) threads on the data node. More partitions let us bring more threads to bear on CPU-intensive tasks on the data node.
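
As a rough illustration of the fork formula above, the following sketch computes the thread count for a given CPU count and partition count. The class name ForkCountSketch and the method forkCount are hypothetical and not part of Elasticsearch, and truncating 1.5 * cpus to an integer is an assumption.

```java
public final class ForkCountSketch {
    // Hypothetical helper mirroring min(1.5 * cpus, partition_count).
    // Truncating 1.5 * cpus toward zero is an assumption, not documented behavior.
    public static int forkCount(int cpus, int partitionCount) {
        int cpuBound = (int) (1.5 * cpus);
        return Math.min(cpuBound, partitionCount);
    }

    public static void main(String[] args) {
        // Few partitions: the partition count is the limit.
        System.out.println(forkCount(8, 3));   // prints 3
        // Many partitions: 1.5 * cpus is the limit.
        System.out.println(forkCount(8, 40));  // prints 12
    }
}
```

With 8 CPUs, a SHARD-style partitioning of 3 shards can use at most 3 threads, while a finer partitioning with 40 partitions is bounded at 12.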
  • Enum Constant Details

    • AUTO

      public static final DataPartitioning AUTO
      Automatically select the data partitioning based on the query and index. Usually that's SEGMENT, but for small indices it's SHARD. When the additional overhead from DOC is fairly low, it picks DOC.
    • SHARD

      public static final DataPartitioning SHARD
      Make one partition per shard. This is generally the slowest option, but it has the lowest CPU overhead.
    • SEGMENT

      public static final DataPartitioning SEGMENT
      Partition on segment boundaries. This doesn't allow forking to as many CPUs as DOC, but it has much lower overhead.

      It packs segments smaller than LuceneSliceQueue.MAX_DOCS_PER_SLICE docs together into a partition. Larger segments get their own partition. Each slice contains no more than LuceneSliceQueue.MAX_SEGMENTS_PER_SLICE segments.
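
The packing rule described above might be sketched as a simple greedy pass over segment doc counts. This is not the Elasticsearch or Lucene implementation: the class SegmentPackingSketch, the method pack, and both limit parameters are hypothetical stand-ins for the LuceneSliceQueue constants, whose actual values are not stated here. Capping the packed doc total at the max-docs limit and treating a segment of exactly that size as "large" are also assumptions.

```java
import java.util.ArrayList;
import java.util.List;

public final class SegmentPackingSketch {
    // Groups segment doc counts into slices. Each inner list is one slice.
    public static List<List<Integer>> pack(List<Integer> segmentDocCounts,
                                           int maxDocsPerSlice,
                                           int maxSegmentsPerSlice) {
        List<List<Integer>> slices = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        int currentDocs = 0;
        for (int docs : segmentDocCounts) {
            // Large segments get a slice of their own.
            if (docs >= maxDocsPerSlice) {
                slices.add(List.of(docs));
                continue;
            }
            // Close the open slice if adding this segment would exceed either limit.
            boolean full = current.size() >= maxSegmentsPerSlice
                || currentDocs + docs > maxDocsPerSlice;
            if (full && !current.isEmpty()) {
                slices.add(current);
                current = new ArrayList<>();
                currentDocs = 0;
            }
            current.add(docs);
            currentDocs += docs;
        }
        if (!current.isEmpty()) {
            slices.add(current);
        }
        return slices;
    }

    public static void main(String[] args) {
        // Illustrative limits only: 250 docs and 2 segments per slice.
        System.out.println(pack(List.of(500, 100, 100, 100, 50), 250, 2));
        // prints [[500], [100, 100], [100, 50]]
    }
}
```

In the example, the 500-doc segment exceeds the doc limit and stands alone, and the remaining small segments are grouped two at a time because of the segment limit.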

    • DOC

      public static final DataPartitioning DOC
      Partitions into dynamic-sized slices to improve CPU utilization while keeping overhead low. This approach is more flexible than SEGMENT and works as follows:
      1. The slice size starts from a desired size based on task_concurrency but is capped at around LuceneSliceQueue.MAX_DOCS_PER_SLICE. This prevents poor CPU usage when matching documents are clustered together.
      2. For small and medium segments (less than five times the desired slice size), it uses a slightly different SEGMENT strategy, which also splits segments that are larger than the desired size. See IndexSearcher.slices(List, int, int, boolean).
      3. For very large segments, multiple segments are not combined into a single slice. This allows one driver to process an entire large segment until other drivers steal the work after finishing their own tasks. See LuceneSliceQueue.nextSlice(LuceneSlice).
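
The sizing and classification in steps 1 to 3 can be sketched as follows. The exact formula for the desired slice size is not given above, so dividing the total doc count evenly by task_concurrency (rounded up) is an assumption, as are the class DocSliceSizingSketch, its method names, and the maxDocsPerSlice parameter standing in for LuceneSliceQueue.MAX_DOCS_PER_SLICE. The text says the cap is approximate ("around"); the sketch applies it exactly for simplicity.

```java
public final class DocSliceSizingSketch {
    // Step 1 (assumed formula): an even share of docs per concurrent task, capped.
    public static int desiredSliceSize(int totalDocs, int taskConcurrency, int maxDocsPerSlice) {
        long evenShare = ((long) totalDocs + taskConcurrency - 1) / taskConcurrency; // ceiling division
        return (int) Math.min(evenShare, maxDocsPerSlice);
    }

    // Steps 2 and 3: segments under five times the desired size count as small or medium;
    // everything else is treated as very large.
    public static boolean isVeryLarge(int segmentDocs, int desiredSliceSize) {
        return segmentDocs >= 5L * desiredSliceSize;
    }

    public static void main(String[] args) {
        int desired = desiredSliceSize(1_000_000, 8, 250_000);
        System.out.println(desired);                       // prints 125000
        System.out.println(isVeryLarge(700_000, desired)); // prints true
        System.out.println(isVeryLarge(300_000, desired)); // prints false
    }
}
```

Under these assumptions, a 700,000-doc segment would not be combined with other segments into a slice, while a 300,000-doc segment would go through the SEGMENT-like path and be split into pieces of roughly the desired size.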
  • Method Details

    • values

      public static DataPartitioning[] values()
      Returns an array containing the constants of this enum class, in the order they are declared.
      Returns:
      an array containing the constants of this enum class, in the order they are declared
    • valueOf

      public static DataPartitioning valueOf(String name)
      Returns the enum constant of this class with the specified name. The string must match exactly an identifier used to declare an enum constant in this class. (Extraneous whitespace characters are not permitted.)
      Parameters:
      name - the name of the enum constant to be returned.
      Returns:
      the enum constant with the specified name
      Throws:
      IllegalArgumentException - if this enum class has no constant with the specified name
      NullPointerException - if the argument is null
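
A short usage sketch for values() and valueOf(String) follows. To keep it compilable without Elasticsearch on the classpath, it declares a local stand-in enum with the same four constants; the wrapper class DataPartitioningLookupSketch and the helper parseOrDefault are hypothetical and not part of the API.

```java
import java.util.Locale;

public final class DataPartitioningLookupSketch {
    // Stand-in for org.elasticsearch.compute.lucene.DataPartitioning (same constant names).
    public enum DataPartitioning { AUTO, SHARD, SEGMENT, DOC }

    // Hypothetical helper: valueOf requires an exact match, so normalize the input first
    // and fall back to a default instead of propagating the exceptions.
    public static DataPartitioning parseOrDefault(String raw, DataPartitioning fallback) {
        if (raw == null) {
            return fallback; // valueOf(null) would throw NullPointerException
        }
        try {
            return DataPartitioning.valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return fallback; // no constant with that name
        }
    }

    public static void main(String[] args) {
        // values() returns the constants in declaration order.
        for (DataPartitioning p : DataPartitioning.values()) {
            System.out.println(p);
        }
        System.out.println(parseOrDefault(" segment ", DataPartitioning.AUTO)); // prints SEGMENT
        System.out.println(parseOrDefault("bogus", DataPartitioning.AUTO));     // prints AUTO
    }
}
```

Calling valueOf directly with "segment" or " SEGMENT " would throw IllegalArgumentException, because the lookup is case-sensitive and does not trim whitespace.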