java.lang.Object
org.elasticsearch.xpack.core.security.authz.accesscontrol.DocumentSubsetBitsetCache
All Implemented Interfaces:
Closeable, AutoCloseable, org.apache.lucene.index.IndexReader.ClosedListener, org.apache.lucene.util.Accountable

public final class DocumentSubsetBitsetCache extends Object implements org.apache.lucene.index.IndexReader.ClosedListener, Closeable, org.apache.lucene.util.Accountable
This is a cache for BitSet instances that are used with the DocumentSubsetReader. It is bounded by memory size and access time.

DLS uses BitSet instances to track which documents should be visible to the user ("live") and which should not ("dead"). This means that there is a bit for each document in a Lucene index (ES shard). Consequently, an index with 10 million documents will use more than 1MB of bitset memory for every unique DLS query, and an index with 1 billion documents will use more than 100MB of memory per DLS query. Because DLS supports templating queries based on user metadata, there may be many distinct queries in use for each index, even if there is only a single active role.
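The figures above follow from storing one bit per document. A minimal sketch of that arithmetic, assuming a dense bitset backed by 64-bit words and ignoring object overhead (the class and method names are hypothetical and not part of Elasticsearch):

```java
public class BitsetMemoryEstimate {

    /** Approximate bytes for one bit per document, rounded up to whole 64-bit words. */
    public static long bytesForDocs(long numDocs) {
        long words = (numDocs + 63) / 64; // round up to a whole number of longs
        return words * Long.BYTES;
    }

    public static void main(String[] args) {
        System.out.println(bytesForDocs(10_000_000L));    // prints 1250000 (about 1.25MB)
        System.out.println(bytesForDocs(1_000_000_000L)); // prints 125000000 (about 125MB)
    }
}
```

The estimate applies per distinct DLS query, so templated queries that resolve differently for each user multiply it accordingly.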

The primary benefit of the cache is to avoid recalculating the "live docs" (visible documents) when a user performs multiple consecutive queries across one or more large indices. Given the memory examples above, the cache is only useful if it can hold at least 1 large (100MB or more) BitSet during a user's active session, and ideally should be capable of supporting multiple simultaneous users with distinct DLS queries.

For this reason the default memory usage (weight) for the cache is set to 10% of JVM heap (CACHE_SIZE_SETTING), so that it automatically scales with the size of the Elasticsearch deployment, and can provide benefit to most use cases without needing customisation. On a 32GB heap, a 10% cache would be 3.2GB, which is large enough to store BitSets representing roughly 25 billion docs.
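The 32GB example can be checked with a few lines of arithmetic. This is a sketch only, assuming decimal gigabytes (10^9 bytes) and exactly one bit per document; the class and method names are hypothetical:

```java
public class CacheCapacityEstimate {

    /** Bytes available to the cache when it is given a percentage of the heap. */
    public static long cacheBytes(long heapBytes, int percent) {
        return heapBytes / 100 * percent;
    }

    /** Number of documents that can be represented at one bit per document. */
    public static long docsRepresentable(long cacheBytes) {
        return cacheBytes * 8;
    }

    public static void main(String[] args) {
        long cache = cacheBytes(32_000_000_000L, 10);
        System.out.println(cache);                    // prints 3200000000 (3.2GB)
        System.out.println(docsRepresentable(cache)); // prints 25600000000 (about 25 billion docs)
    }
}
```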

However, because queries can be templated by user metadata and that metadata can change frequently, it is common for the effective lifetime of a single DLS query to be relatively short. We do not want to sacrifice 10% of heap to a cache that is storing BitSets that are no longer needed, so we set the TTL on this cache to be 2 hours (CACHE_TTL_SETTING). This time has been chosen so that it will retain BitSets that are in active use during a user's session, but not be an ongoing drain on memory.
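To illustrate how a weight limit and an access-based TTL interact, here is a simplified, self-contained sketch. It is not the Elasticsearch implementation: the class name, the explicit clock parameter, and the eviction loop are all assumptions made for illustration, and only entry weights are tracked rather than real BitSet values.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a cache bounded by total weight and by time since last access. */
public class WeightAndTtlCacheSketch<K> {

    private static final class Slot {
        final long weightBytes;
        long lastAccessMillis;

        Slot(long weightBytes, long lastAccessMillis) {
            this.weightBytes = weightBytes;
            this.lastAccessMillis = lastAccessMillis;
        }
    }

    private final long maxWeightBytes;
    private final long ttlMillis;
    private long currentWeightBytes;
    // accessOrder = true: iteration runs from least to most recently accessed
    private final LinkedHashMap<K, Slot> entries = new LinkedHashMap<>(16, 0.75f, true);

    public WeightAndTtlCacheSketch(long maxWeightBytes, long ttlMillis) {
        this.maxWeightBytes = maxWeightBytes;
        this.ttlMillis = ttlMillis;
    }

    /** Stores an entry of the given weight, then evicts until both bounds hold. */
    public synchronized void put(K key, long weightBytes, long nowMillis) {
        Slot old = entries.remove(key);
        if (old != null) {
            currentWeightBytes -= old.weightBytes;
        }
        entries.put(key, new Slot(weightBytes, nowMillis));
        currentWeightBytes += weightBytes;
        evict(nowMillis);
    }

    /** Returns true if the key is still cached; a hit refreshes its access time. */
    public synchronized boolean contains(K key, long nowMillis) {
        evict(nowMillis);
        Slot slot = entries.get(key); // get() also moves the entry to the most-recent end
        if (slot == null) {
            return false;
        }
        slot.lastAccessMillis = nowMillis;
        return true;
    }

    public synchronized long weightBytes() {
        return currentWeightBytes;
    }

    // Drops least recently accessed entries while they are expired or the cache is over weight.
    private void evict(long nowMillis) {
        Iterator<Map.Entry<K, Slot>> it = entries.entrySet().iterator();
        while (it.hasNext()) {
            Slot slot = it.next().getValue();
            boolean expired = nowMillis - slot.lastAccessMillis >= ttlMillis;
            boolean overWeight = currentWeightBytes > maxWeightBytes;
            if (!expired && !overWeight) {
                break;
            }
            currentWeightBytes -= slot.weightBytes;
            it.remove();
        }
    }
}
```

In this sketch an entry that keeps being read stays cached until the weight limit forces it out, while an entry that goes unread for the full TTL is dropped even when there is spare capacity, which is the trade-off the paragraph above describes.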

  • Constructor Details

    • DocumentSubsetBitsetCache

      public DocumentSubsetBitsetCache(Settings settings, ThreadPool threadPool)
  • Method Details

    • onClose

      public void onClose(org.apache.lucene.index.IndexReader.CacheKey indexKey)
      Specified by:
      onClose in interface org.apache.lucene.index.IndexReader.ClosedListener
    • close

      public void close()
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
    • clear

      public void clear(String reason)
    • ramBytesUsed

      public long ramBytesUsed()
      Specified by:
      ramBytesUsed in interface org.apache.lucene.util.Accountable
    • getBitSet

      @Nullable public org.apache.lucene.util.BitSet getBitSet(org.apache.lucene.search.Query query, org.apache.lucene.index.LeafReaderContext context) throws ExecutionException
      Obtain the BitSet for the given query in the given context. If there is a cached entry for that query and context, it will be returned. Otherwise, a new BitSet will be created and stored in the cache. The returned BitSet may be null (e.g. if the query has no results).
      Throws:
      ExecutionException
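The documented contract of getBitSet (return the cached entry if present, otherwise compute and store it, and possibly return null) can be sketched with plain JDK types. Everything here is an assumption for illustration: the generic key stands in for the (query, context) pair, java.util.BitSet stands in for the Lucene BitSet, and the Optional wrapper is just one way to remember a "no results" outcome. It does not describe how DocumentSubsetBitsetCache is actually implemented.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical get-or-compute sketch; K stands in for a (query, context) key. */
public class GetOrComputeBitSetSketch<K> {

    // ConcurrentHashMap cannot hold null values, so "no matching docs" is stored as Optional.empty()
    private final Map<K, Optional<BitSet>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger computations = new AtomicInteger();

    /** Returns the cached BitSet, computing and storing it on a miss; null means no matches. */
    public BitSet getBitSet(K key, Function<K, BitSet> compute) {
        Optional<BitSet> cached = cache.computeIfAbsent(key, k -> {
            computations.incrementAndGet();
            BitSet bits = compute.apply(k);
            return (bits == null || bits.isEmpty()) ? Optional.empty() : Optional.of(bits);
        });
        return cached.orElse(null);
    }

    /** Number of times the compute function has run, that is, the number of cache misses. */
    public int computations() {
        return computations.get();
    }
}
```

Callers of a method with this shape need to handle the null case explicitly, for example by treating it as "no documents are visible for this query in this segment".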
    • getSettings

      public static List<Setting<?>> getSettings()
    • usageStats

      public Map<String,Object> usageStats()