Module org.elasticsearch.server
Class IVFVectorsFormat
java.lang.Object
org.apache.lucene.codecs.KnnVectorsFormat
org.elasticsearch.index.codec.vectors.IVFVectorsFormat
- All Implemented Interfaces:
org.apache.lucene.util.NamedSPILoader.NamedSPI
public class IVFVectorsFormat
extends org.apache.lucene.codecs.KnnVectorsFormat
Codec format for Inverted File (IVF) vector indexes. The index partitions the vector space
into clusters and assigns each vector to a cluster, producing one posting list of vectors per
cluster. Each cluster is represented by its centroid.
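The cluster assignment described above can be illustrated with a minimal sketch. It assumes squared Euclidean distance and centroids that are already known; the class and method names (IvfAssignmentSketch, nearestCentroid, buildPostingLists) are hypothetical and are not part of IVFVectorsFormat, which may compute and organize clusters differently.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical helper, not the Elasticsearch implementation.
public class IvfAssignmentSketch {

    /** Squared Euclidean distance between two vectors of equal length. */
    static float squaredDistance(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the index of the centroid closest to the given vector. */
    static int nearestCentroid(float[] vector, float[][] centroids) {
        int best = 0;
        float bestDist = Float.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            float dist = squaredDistance(vector, centroids[c]);
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    /** Builds one posting list of vector ordinals per centroid. */
    static List<List<Integer>> buildPostingLists(float[][] vectors, float[][] centroids) {
        List<List<Integer>> postings = new ArrayList<>();
        for (int c = 0; c < centroids.length; c++) {
            postings.add(new ArrayList<>());
        }
        for (int ord = 0; ord < vectors.length; ord++) {
            postings.get(nearestCentroid(vectors[ord], centroids)).add(ord);
        }
        return postings;
    }

    public static void main(String[] args) {
        float[][] centroids = { {0f, 0f}, {10f, 10f} };
        float[][] vectors = { {1f, 0f}, {9f, 11f}, {0f, 2f}, {12f, 9f} };
        System.out.println(buildPostingLists(vectors, centroids));
    }
}
```

At search time, an index of this kind only needs to scan the posting lists of the centroids nearest to the query, rather than every vector.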
The vector quantization format used here is a per-vector optimized scalar quantization; see also
OptimizedScalarQuantizer.
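To make "per-vector scalar quantization" concrete, here is a minimal sketch in which each vector gets its own quantization interval. It uses plain min/max bounds as a simple stand-in; the interval optimization performed by OptimizedScalarQuantizer is not described on this page and is not reproduced here. The class PerVectorQuantizationSketch and its members are hypothetical.

```java
import java.util.Arrays;

// Illustrative only: plain min/max per-vector quantization, not OptimizedScalarQuantizer.
public class PerVectorQuantizationSketch {

    /** Quantized codes plus the per-vector interval needed to decode them. */
    static final class Quantized {
        final byte[] codes;
        final float lower;
        final float upper;

        Quantized(byte[] codes, float lower, float upper) {
            this.codes = codes;
            this.lower = lower;
            this.upper = upper;
        }
    }

    /** Maps each component to one of 2^bits levels spanning this vector's own [min, max]. */
    static Quantized quantize(float[] vector, int bits) {
        if (bits < 1 || bits > 8) {
            throw new IllegalArgumentException("bits must be in [1, 8]");
        }
        float lower = Float.POSITIVE_INFINITY;
        float upper = Float.NEGATIVE_INFINITY;
        for (float v : vector) {
            lower = Math.min(lower, v);
            upper = Math.max(upper, v);
        }
        int levels = (1 << bits) - 1;
        float step = (upper - lower) / levels;
        byte[] codes = new byte[vector.length];
        for (int i = 0; i < vector.length; i++) {
            // A constant vector has step == 0; every component maps to level 0.
            int code = step == 0f ? 0 : Math.round((vector[i] - lower) / step);
            codes[i] = (byte) Math.min(levels, Math.max(0, code));
        }
        return new Quantized(codes, lower, upper);
    }

    /** Reconstructs an approximation of the original vector from its codes and interval. */
    static float[] dequantize(Quantized q, int bits) {
        int levels = (1 << bits) - 1;
        float step = (q.upper - q.lower) / levels;
        float[] out = new float[q.codes.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = q.lower + (q.codes[i] & 0xFF) * step; // mask: codes are unsigned
        }
        return out;
    }

    public static void main(String[] args) {
        Quantized q = quantize(new float[] {0f, 1f, 2f, 3f}, 2);
        System.out.println(Arrays.toString(q.codes));
    }
}
```

Because the interval is stored per vector, a vector with a narrow value range keeps finer resolution than it would under a single global interval.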
The format is stored in three files:
.cenivf (centroid data) file
Stores the raw and quantized centroid vectors.
.clivf (cluster data) file
Stores the quantized vectors for each cluster, inline and in blocks. The docIds of the vectors are also stored.
.mivf (centroid metadata) file
Stores metadata, including the number of centroids and their offsets in the .clivf file.
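The role of the per-centroid offsets can be sketched as follows: a reader that knows the offsets can jump directly to one cluster's data without scanning the others. The byte layout below (an int count followed by int docIds) is an assumption made purely for illustration and is not the actual .clivf or .mivf encoding; ClusterOffsetsSketch and its members are hypothetical names.

```java
import java.nio.ByteBuffer;

// Illustrative only: an invented layout, not the real on-disk format.
public class ClusterOffsetsSketch {

    /** In-memory stand-in for the cluster data and its offset metadata. */
    static final class Layout {
        final byte[] clusterData; // plays the role of the cluster data file
        final long[] offsets;     // plays the role of the offsets kept in the metadata file

        Layout(byte[] clusterData, long[] offsets) {
            this.clusterData = clusterData;
            this.offsets = offsets;
        }
    }

    /** Writes each cluster as [count][docId...] and records where each cluster starts. */
    static Layout write(int[][] docIdsPerCluster) {
        int totalBytes = 0;
        for (int[] docIds : docIdsPerCluster) {
            totalBytes += Integer.BYTES * (1 + docIds.length);
        }
        ByteBuffer buf = ByteBuffer.allocate(totalBytes);
        long[] offsets = new long[docIdsPerCluster.length];
        for (int c = 0; c < docIdsPerCluster.length; c++) {
            offsets[c] = buf.position();
            buf.putInt(docIdsPerCluster[c].length);
            for (int docId : docIdsPerCluster[c]) {
                buf.putInt(docId);
            }
        }
        return new Layout(buf.array(), offsets);
    }

    /** Seeks straight to one cluster using its recorded offset and reads its docIds. */
    static int[] readCluster(Layout layout, int cluster) {
        ByteBuffer buf = ByteBuffer.wrap(layout.clusterData);
        buf.position((int) layout.offsets[cluster]);
        int count = buf.getInt();
        int[] docIds = new int[count];
        for (int i = 0; i < count; i++) {
            docIds[i] = buf.getInt();
        }
        return docIds;
    }

    public static void main(String[] args) {
        Layout layout = write(new int[][] { {3, 7}, {}, {1, 4, 9} });
        System.out.println(java.util.Arrays.toString(readCluster(layout, 2)));
    }
}
```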
-
Field Summary
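Fields
Modifier and Type    Field    Description
static final String    CENTROID_EXTENSION
static final String    CLUSTER_EXTENSION
static final int    DEFAULT_CENTROIDS_PER_PARENT_CLUSTER
static final int    DEFAULT_VECTORS_PER_CLUSTER
static final float    DYNAMIC_VISIT_RATIO
static final int    MAX_CENTROIDS_PER_PARENT_CLUSTER
static final int    MAX_VECTORS_PER_CLUSTER
static final int    MIN_CENTROIDS_PER_PARENT_CLUSTER
static final int    MIN_VECTORS_PER_CLUSTER
static final String    NAME
static final int    VERSION_CURRENT
static final int    VERSION_START
Fields inherited from class org.apache.lucene.codecs.KnnVectorsFormat
DEFAULT_MAX_DIMENSIONS, EMPTY
-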
Constructor Summary
Constructors
Constructor    Description
IVFVectorsFormat()    Constructs a format using the given graph construction parameters and scalar quantization.
IVFVectorsFormat(int vectorPerCluster, int centroidsPerParentCluster)
-
Method Summary
Modifier and Type    Method    Description
org.apache.lucene.codecs.KnnVectorsReader    fieldsReader(org.apache.lucene.index.SegmentReadState state)
org.apache.lucene.codecs.KnnVectorsWriter    fieldsWriter(org.apache.lucene.index.SegmentWriteState state)
int    getMaxDimensions(String fieldName)
String    toString()
Methods inherited from class org.apache.lucene.codecs.KnnVectorsFormat
availableKnnVectorsFormats, forName, getName, reloadKnnVectorsFormat
-
Field Details
-
NAME
public static final String NAME
-
CENTROID_EXTENSION
public static final String CENTROID_EXTENSION
-
CLUSTER_EXTENSION
public static final String CLUSTER_EXTENSION
-
VERSION_START
public static final int VERSION_START
-
VERSION_CURRENT
public static final int VERSION_CURRENT
-
DYNAMIC_VISIT_RATIO
public static final float DYNAMIC_VISIT_RATIO
-
DEFAULT_VECTORS_PER_CLUSTER
public static final int DEFAULT_VECTORS_PER_CLUSTER
-
MIN_VECTORS_PER_CLUSTER
public static final int MIN_VECTORS_PER_CLUSTER
-
MAX_VECTORS_PER_CLUSTER
public static final int MAX_VECTORS_PER_CLUSTER
-
DEFAULT_CENTROIDS_PER_PARENT_CLUSTER
public static final int DEFAULT_CENTROIDS_PER_PARENT_CLUSTER
-
MIN_CENTROIDS_PER_PARENT_CLUSTER
public static final int MIN_CENTROIDS_PER_PARENT_CLUSTER
-
MAX_CENTROIDS_PER_PARENT_CLUSTER
public static final int MAX_CENTROIDS_PER_PARENT_CLUSTER
-
-
Constructor Details
-
IVFVectorsFormat
public IVFVectorsFormat(int vectorPerCluster, int centroidsPerParentCluster)
-
IVFVectorsFormat
public IVFVectorsFormat()
Constructs a format using the given graph construction parameters and scalar quantization.
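The MIN_ and MAX_ constants listed above suggest that the two constructor arguments are range-checked, although this page does not state that or give the constant values. The following sketch shows what such a check could look like, under that assumption; requireInRange is a hypothetical helper, and the bounds used in main are placeholders, not the real values of MIN_VECTORS_PER_CLUSTER or MAX_VECTORS_PER_CLUSTER.

```java
// Illustrative only: a hypothetical range check, not code from IVFVectorsFormat.
public class ClusterParameterCheckSketch {

    /** Returns value if it lies in [min, max]; otherwise throws IllegalArgumentException. */
    static int requireInRange(String name, int value, int min, int max) {
        if (value < min || value > max) {
            throw new IllegalArgumentException(
                name + " must be between " + min + " and " + max + "; got: " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        // Placeholder bounds for illustration; the real constant values are not shown on this page.
        int placeholderMin = 64;
        int placeholderMax = 65536;
        System.out.println(requireInRange("vectorPerCluster", 384, placeholderMin, placeholderMax));
    }
}
```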
-
-
Method Details
-
fieldsWriter
public org.apache.lucene.codecs.KnnVectorsWriter fieldsWriter(org.apache.lucene.index.SegmentWriteState state) throws IOException
- Specified by:
fieldsWriter in class org.apache.lucene.codecs.KnnVectorsFormat
- Throws:
IOException
-
fieldsReader
public org.apache.lucene.codecs.KnnVectorsReader fieldsReader(org.apache.lucene.index.SegmentReadState state) throws IOException
- Specified by:
fieldsReader in class org.apache.lucene.codecs.KnnVectorsFormat
- Throws:
IOException
-
getMaxDimensions
public int getMaxDimensions(String fieldName)
- Specified by:
getMaxDimensions in class org.apache.lucene.codecs.KnnVectorsFormat
-
toString
public String toString()
-