Module org.elasticsearch.server
Class HierarchicalKMeans
java.lang.Object
org.elasticsearch.index.codec.vectors.cluster.HierarchicalKMeans
An implementation of the hierarchical k-means algorithm that partitions data better than naive k-means.
-
Field Summary
Fields
Modifier and Type     Field                          Description
static final float    DEFAULT_SOAR_LAMBDA
static final int      MAXK
static final int      MAX_ITERATIONS_DEFAULT
static final int      SAMPLES_PER_CLUSTER_DEFAULT
-
Constructor Summary
Constructors
Constructor    Description
HierarchicalKMeans(int dimension)
HierarchicalKMeans(int dimension, int maxIterations, int samplesPerCluster, int clustersPerNeighborhood, float soarLambda)
-
Method Summary
Modifier and Type    Method    Description
KMeansResult    cluster(org.apache.lucene.index.FloatVectorValues vectors, int targetSize)    Clusters, or more precisely partitions, the set of vectors by starting with a rough number of partitions and then recursively refining them. Lastly, a pass is made to adjust nearby neighborhoods and add an extra assignment per vector to nearby neighborhoods.
-
Field Details
-
MAXK
public static final int MAXK
-
MAX_ITERATIONS_DEFAULT
public static final int MAX_ITERATIONS_DEFAULT
-
SAMPLES_PER_CLUSTER_DEFAULT
public static final int SAMPLES_PER_CLUSTER_DEFAULT
-
DEFAULT_SOAR_LAMBDA
public static final float DEFAULT_SOAR_LAMBDA
-
-
Constructor Details
-
HierarchicalKMeans
public HierarchicalKMeans(int dimension)
-
HierarchicalKMeans
public HierarchicalKMeans(int dimension, int maxIterations, int samplesPerCluster, int clustersPerNeighborhood, float soarLambda)
-
-
Method Details
-
cluster
public KMeansResult cluster(org.apache.lucene.index.FloatVectorValues vectors, int targetSize) throws IOException
Clusters, or more precisely partitions, the set of vectors by starting with a rough number of partitions and then recursively refining them. Lastly, a pass is made to adjust nearby neighborhoods and add an extra assignment per vector to nearby neighborhoods.
Parameters:
vectors - the vectors to cluster
targetSize - the rough number of vectors that should be attached to a cluster
Returns:
the centroids, the vector assignments, and the SOAR (spilled from nearby neighborhoods) assignments
Throws:
IOException - if vectors is inaccessible
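The recursive refinement that cluster describes can be illustrated with a small, self-contained sketch. This is not the Elasticsearch implementation: the class HierarchicalKMeansSketch, its partition, dist and mean helpers, the two-way split at each level, and the farthest-point seeding are all assumptions chosen for brevity. The final pass that adjusts neighborhoods and adds SOAR assignments is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration only; hypothetical names, not the Elasticsearch code.
class HierarchicalKMeansSketch {

    // Squared Euclidean distance between two vectors of equal length.
    static float dist(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Component-wise mean of the vectors selected by ids (ids must be non-empty).
    static float[] mean(float[][] vectors, List<Integer> ids) {
        float[] m = new float[vectors[0].length];
        for (int id : ids) {
            for (int i = 0; i < m.length; i++) {
                m[i] += vectors[id][i];
            }
        }
        for (int i = 0; i < m.length; i++) {
            m[i] /= ids.size();
        }
        return m;
    }

    // Recursively splits ids in two until each partition holds at most targetSize vectors.
    static List<List<Integer>> partition(float[][] vectors, List<Integer> ids,
                                         int targetSize, int maxIterations) {
        List<List<Integer>> result = new ArrayList<>();
        if (ids.size() <= Math.max(1, targetSize)) {
            result.add(ids);
            return result;
        }
        // Seed two centroids: the first vector and the vector farthest from it.
        float[] c0 = vectors[ids.get(0)].clone();
        int far = ids.get(0);
        for (int id : ids) {
            if (dist(vectors[id], c0) > dist(vectors[far], c0)) {
                far = id;
            }
        }
        float[] c1 = vectors[far].clone();

        List<Integer> left = new ArrayList<>();
        List<Integer> right = new ArrayList<>();
        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: each vector goes to its nearer centroid.
            left = new ArrayList<>();
            right = new ArrayList<>();
            for (int id : ids) {
                if (dist(vectors[id], c0) <= dist(vectors[id], c1)) {
                    left.add(id);
                } else {
                    right.add(id);
                }
            }
            if (left.isEmpty() || right.isEmpty()) {
                break;
            }
            // Update step: move each centroid to the mean of its members.
            c0 = mean(vectors, left);
            c1 = mean(vectors, right);
        }
        if (left.isEmpty() || right.isEmpty()) {
            // Degenerate case (e.g. identical vectors): split in half so recursion terminates.
            left = new ArrayList<>(ids.subList(0, ids.size() / 2));
            right = new ArrayList<>(ids.subList(ids.size() / 2, ids.size()));
        }
        result.addAll(partition(vectors, left, targetSize, maxIterations));
        result.addAll(partition(vectors, right, targetSize, maxIterations));
        return result;
    }

    public static void main(String[] args) {
        float[][] vectors = {{0f, 0f}, {0f, 1f}, {1f, 0f}, {10f, 10f}, {10f, 11f}, {11f, 10f}};
        List<Integer> ids = new ArrayList<>(List.of(0, 1, 2, 3, 4, 5));
        // prints [[0, 1, 2], [3, 4, 5]]
        System.out.println(partition(vectors, ids, 3, 10));
    }
}
```

In this sketch targetSize acts as a hard upper bound on partition size, whereas the documentation above describes it only as a rough number of vectors per cluster.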
-