Class ESVectorUtil

java.lang.Object
org.elasticsearch.simdvec.ESVectorUtil

public class ESVectorUtil extends Object
  • Constructor Summary

    Constructors
    Constructor
    Description
    ESVectorUtil()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static int
    andBitCount(byte[] a, byte[] b)
    AND bit count computed over signed bytes.
    static void
    calculateOSQGridPoints(float[] target, int[] quantize, int points, float[] pts)
    Calculate the grid points for optimized-scalar quantization
    static float
    calculateOSQLoss(float[] target, float lowerInterval, float upperInterval, int points, float norm2, float lambda, int[] quantize)
    Calculate the loss for optimized-scalar quantization for the given parameters
    static void
    centerAndCalculateOSQStatsDp(float[] target, float[] centroid, float[] centered, float[] stats)
    Center the target vector and calculate the optimized-scalar quantization statistics
    static void
    centerAndCalculateOSQStatsEuclidean(float[] target, float[] centroid, float[] centered, float[] stats)
    Center the target vector and calculate the optimized-scalar quantization statistics
    static ES91Int4VectorsScorer
    getES91Int4VectorsScorer(org.apache.lucene.store.IndexInput input, int dimension)
     
    static ES91OSQVectorsScorer
    getES91OSQVectorsScorer(org.apache.lucene.store.IndexInput input, int dimension)
     
    static ES92Int7VectorsScorer
    getES92Int7VectorsScorer(org.apache.lucene.store.IndexInput input, int dimension)
     
    static long
    ipByteBinByte(byte[] q, byte[] d)
     
    static int
    ipByteBit(byte[] q, byte[] d)
    Compute the inner product of two vectors, where the query vector is a byte vector and the document vector is a bit vector.
    static float
    ipFloatBit(float[] q, byte[] d)
    Compute the inner product of two vectors, where the query vector is a float vector and the document vector is a bit vector.
    static float
    ipFloatByte(float[] q, byte[] d)
    Compute the inner product of two vectors, where the query vector is a float vector and the document vector is a byte vector.
    static int
    quantizeVectorWithIntervals(float[] vector, int[] destination, float lowInterval, float upperInterval, byte bit)
    Optimized-scalar quantization of the provided vector to the provided destination array.
    static float
    soarDistance(float[] v1, float[] centroid, float[] originalResidual, float soarLambda, float rnorm)
    Calculates the soar distance for a vector and a centroid
    static void
    soarDistanceBulk(float[] v1, float[] c0, float[] c1, float[] c2, float[] c3, float[] originalResidual, float soarLambda, float rnorm, float[] distances)
    Bulk computation of the soar distance for a vector to four centroids
    static void
    squareDistanceBulk(float[] q, float[] v0, float[] v1, float[] v2, float[] v3, float[] distances)
    Bulk computation of square distances between a query vector and four vectors. The result is stored in the provided distances array.
    static void
    subtract(float[] v1, float[] v2, float[] result)
    Calculates the difference between two vectors and stores the result in a third vector.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • ESVectorUtil

      public ESVectorUtil()
  • Method Details

    • getES91OSQVectorsScorer

      public static ES91OSQVectorsScorer getES91OSQVectorsScorer(org.apache.lucene.store.IndexInput input, int dimension) throws IOException
      Throws:
      IOException
    • getES91Int4VectorsScorer

      public static ES91Int4VectorsScorer getES91Int4VectorsScorer(org.apache.lucene.store.IndexInput input, int dimension) throws IOException
      Throws:
      IOException
    • getES92Int7VectorsScorer

      public static ES92Int7VectorsScorer getES92Int7VectorsScorer(org.apache.lucene.store.IndexInput input, int dimension) throws IOException
      Throws:
      IOException
    • ipByteBinByte

      public static long ipByteBinByte(byte[] q, byte[] d)
    • ipByteBit

      public static int ipByteBit(byte[] q, byte[] d)
      Compute the inner product of two vectors, where the query vector is a byte vector and the document vector is a bit vector. This returns the sum of the query vector values, using the document vector as a mask. Bits are matched to bytes in "big endian" order: the most significant bit of each document byte corresponds to the first of its eight query values. For example, if the byte vector is [1, 2, 3, 4, 5, 6, 7, 8] and the bit vector is [0b10000000], the inner product will be 1.
      Parameters:
      q - the query vector
      d - the document vector
      Returns:
      the inner product of the two vectors
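
      The masking rule above can be illustrated with a plain scalar loop. This is only a sketch of the documented semantics, not the library's implementation; the class name IpByteBitSketch and the length check are assumptions made for the example.

```java
// Hypothetical reference sketch: NOT the Elasticsearch implementation.
final class IpByteBitSketch {

    // Sums q[i] for every set bit of d, reading each byte of d from its
    // most significant bit ("big endian" bit order).
    static int ipByteBit(byte[] q, byte[] d) {
        if (q.length != d.length * Byte.SIZE) {
            // Assumption for this sketch: one query value per document bit.
            throw new IllegalArgumentException("q must hold 8 values per byte of d");
        }
        int sum = 0;
        for (int i = 0; i < q.length; i++) {
            // Byte i / 8 holds the bit; bit 7 of that byte maps to the first value.
            int bit = (d[i >> 3] >> (7 - (i & 7))) & 1;
            sum += q[i] * bit;
        }
        return sum;
    }

    public static void main(String[] args) {
        byte[] q = {1, 2, 3, 4, 5, 6, 7, 8};
        byte[] d = {(byte) 0b10000000};
        System.out.println(ipByteBit(q, d)); // prints 1
    }
}
```

      The float variant, ipFloatBit, follows the same masking rule with a float accumulator.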
    • ipFloatBit

      public static float ipFloatBit(float[] q, byte[] d)
      Compute the inner product of two vectors, where the query vector is a float vector and the document vector is a bit vector. This returns the sum of the query vector values, using the document vector as a mask. Bits are matched to floats in "big endian" order: the most significant bit of each document byte corresponds to the first of its eight query values. For example, if the float vector is [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0] and the bit vector is [0b10000000], the inner product will be 1.0.
      Parameters:
      q - the query vector
      d - the document vector
      Returns:
      the inner product of the two vectors
    • ipFloatByte

      public static float ipFloatByte(float[] q, byte[] d)
      Compute the inner product of two vectors, where the query vector is a float vector and the document vector is a byte vector.
      Parameters:
      q - the query vector
      d - the document vector
      Returns:
      the inner product of the two vectors
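
      For orientation, a float-by-byte inner product can be written as a simple scalar loop. The sketch below assumes the document bytes are interpreted as signed Java bytes and that both arrays have the same length; the class name IpFloatByteSketch is hypothetical, and the real method may be vectorized.

```java
// Hypothetical reference sketch: NOT the Elasticsearch implementation.
final class IpFloatByteSketch {

    // Dot product of a float query with a byte document vector.
    static float ipFloatByte(float[] q, byte[] d) {
        if (q.length != d.length) {
            // Assumption for this sketch: equal dimensions are required.
            throw new IllegalArgumentException("dimensions differ");
        }
        float sum = 0f;
        for (int i = 0; i < q.length; i++) {
            sum += q[i] * d[i]; // d[i] is widened as a signed value
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] q = {1f, 2f, 0.5f};
        byte[] d = {2, -1, 4};
        System.out.println(ipFloatByte(q, d)); // 1*2 + 2*(-1) + 0.5*4 = 2.0
    }
}
```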
    • andBitCount

      public static int andBitCount(byte[] a, byte[] b)
      AND bit count computed over signed bytes. Copied from Lucene's XOR implementation.
      Parameters:
      a - bytes containing a vector
      b - bytes containing another vector, of the same dimension
      Returns:
      the value of the AND bit count of the two vectors
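
      As a rough illustration, an AND bit count is the number of bit positions set in both inputs. The scalar sketch below uses Integer.bitCount and masks each byte to avoid counting sign-extension bits; the class name AndBitCountSketch is hypothetical and the real method may process wider words at a time.

```java
// Hypothetical reference sketch: NOT the Elasticsearch implementation.
final class AndBitCountSketch {

    // Counts the bits that are set in both a and b.
    static int andBitCount(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimensions differ");
        }
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            // Mask with 0xFF so a negative byte contributes only its 8 bits.
            count += Integer.bitCount((a[i] & b[i]) & 0xFF);
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] a = {(byte) 0b11110000, (byte) 0xFF};
        byte[] b = {(byte) 0b10100101, (byte) 0x0F};
        System.out.println(andBitCount(a, b)); // 2 + 4 = 6
    }
}
```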
    • calculateOSQLoss

      public static float calculateOSQLoss(float[] target, float lowerInterval, float upperInterval, int points, float norm2, float lambda, int[] quantize)
      Calculate the loss for optimized-scalar quantization for the given parameters
      Parameters:
      target - The vector being quantized, assumed to be centered
      lowerInterval - The lower interval value for which to calculate the loss
      upperInterval - The upper interval value for which to calculate the loss
      points - the quantization points
      norm2 - The norm squared of the target vector
      lambda - The lambda parameter for controlling anisotropic loss calculation
      quantize - array to store the computed quantize vector.
      Returns:
      The loss for the given parameters
    • calculateOSQGridPoints

      public static void calculateOSQGridPoints(float[] target, int[] quantize, int points, float[] pts)
      Calculate the grid points for optimized-scalar quantization
      Parameters:
      target - The vector being quantized, assumed to be centered
      quantize - The quantize vector which should have at least the target vector length
      points - the quantization points
      pts - The array to store the grid points, must be of length 5
    • centerAndCalculateOSQStatsEuclidean

      public static void centerAndCalculateOSQStatsEuclidean(float[] target, float[] centroid, float[] centered, float[] stats)
      Center the target vector and calculate the optimized-scalar quantization statistics
      Parameters:
      target - The vector being quantized
      centroid - The centroid of the target vector
      centered - The destination of the centered vector, will be overwritten
      stats - The array to store the statistics, must be of length 5
    • centerAndCalculateOSQStatsDp

      public static void centerAndCalculateOSQStatsDp(float[] target, float[] centroid, float[] centered, float[] stats)
      Center the target vector and calculate the optimized-scalar quantization statistics
      Parameters:
      target - The vector being quantized
      centroid - The centroid of the target vector
      centered - The destination of the centered vector, will be overwritten
      stats - The array to store the statistics, must be of length 6
    • subtract

      public static void subtract(float[] v1, float[] v2, float[] result)
      Calculates the difference between two vectors and stores the result in a third vector.
      Parameters:
      v1 - the first vector
      v2 - the second vector
      result - the result vector, must be the same length as the input vectors
    • soarDistance

      public static float soarDistance(float[] v1, float[] centroid, float[] originalResidual, float soarLambda, float rnorm)
      Calculates the soar distance for a vector and a centroid
      Parameters:
      v1 - the vector
      centroid - the centroid
      originalResidual - the residual with the actually nearest centroid
      soarLambda - the lambda parameter
      rnorm - distance to the nearest centroid
      Returns:
      the soar distance
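
      The page does not state the formula. The sketch below assumes a commonly used SOAR-style formulation: the squared distance to the candidate centroid plus soarLambda times the squared projection of the new residual onto originalResidual, divided by rnorm. Both the formula and the class name SoarDistanceSketch are assumptions, so treat this as an illustration rather than the library's definition.

```java
// Hypothetical reference sketch: the formula is an ASSUMPTION, not confirmed
// by this page, and this is NOT the Elasticsearch implementation.
final class SoarDistanceSketch {

    static float soarDistance(float[] v1, float[] centroid, float[] originalResidual,
                              float soarLambda, float rnorm) {
        if (v1.length != centroid.length || v1.length != originalResidual.length) {
            throw new IllegalArgumentException("dimensions differ");
        }
        float dsq = 0f;  // squared distance from v1 to the candidate centroid
        float proj = 0f; // projection of the new residual onto the original one
        for (int i = 0; i < v1.length; i++) {
            float diff = v1[i] - centroid[i];
            dsq += diff * diff;
            proj += diff * originalResidual[i];
        }
        // Assumed penalty: residuals parallel to the original one cost more.
        return dsq + soarLambda * proj * proj / rnorm;
    }

    public static void main(String[] args) {
        float[] v = {1f, 2f};
        float[] c = {0f, 0f};
        float[] residual = {1f, 0f};
        System.out.println(soarDistance(v, c, residual, 1f, 1f)); // 5 + 1 = 6.0
    }
}
```

      Under this assumption, setting soarLambda to 0 reduces the result to the plain squared distance.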
    • quantizeVectorWithIntervals

      public static int quantizeVectorWithIntervals(float[] vector, int[] destination, float lowInterval, float upperInterval, byte bit)
      Optimized-scalar quantization of the provided vector to the provided destination array.
      Parameters:
      vector - the vector to quantize
      destination - the array to store the result
      lowInterval - the minimum value; lower values in the original array will be replaced by this value
      upperInterval - the maximum value; larger values in the original array will be replaced by this value
      bit - the number of bits to use for quantization, must be between 1 and 8
      Returns:
      the sum of all the elements of the resulting quantized vector
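
      To make the parameters concrete, here is a minimal sketch of interval-based scalar quantization. It assumes a uniform grid of 2^bit levels between lowInterval and upperInterval with round-to-nearest assignment; that grid, the class name QuantizeSketch and the argument check are assumptions for illustration, not a statement of how the library computes it.

```java
// Hypothetical reference sketch: NOT the Elasticsearch implementation.
final class QuantizeSketch {

    // Clamps each value to [lowInterval, upperInterval], maps it onto
    // (2^bit - 1) equal steps, and returns the sum of the assigned levels.
    static int quantizeVectorWithIntervals(float[] vector, int[] destination,
                                           float lowInterval, float upperInterval, byte bit) {
        if (bit < 1 || bit > 8) {
            throw new IllegalArgumentException("bit must be between 1 and 8");
        }
        float nSteps = (1 << bit) - 1;
        float step = (upperInterval - lowInterval) / nSteps;
        int sum = 0;
        for (int i = 0; i < vector.length; i++) {
            float clamped = Math.min(Math.max(vector[i], lowInterval), upperInterval);
            int level = Math.round((clamped - lowInterval) / step);
            destination[i] = level;
            sum += level;
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] vector = {-1f, 0.9f, 2.2f, 5f};
        int[] destination = new int[vector.length];
        int sum = quantizeVectorWithIntervals(vector, destination, 0f, 3f, (byte) 2);
        // With 2 bits over [0, 3] the step is 1, so the levels are 0, 1, 2, 3.
        System.out.println(java.util.Arrays.toString(destination) + " sum=" + sum);
    }
}
```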
    • squareDistanceBulk

      public static void squareDistanceBulk(float[] q, float[] v0, float[] v1, float[] v2, float[] v3, float[] distances)
      Bulk computation of square distances between a query vector and four vectors. The result is stored in the provided distances array.
      Parameters:
      q - the query vector
      v0 - the first vector
      v1 - the second vector
      v2 - the third vector
      v3 - the fourth vector
      distances - an array to store the computed square distances, must have length 4
      Throws:
      IllegalArgumentException - if the dimensions of the vectors do not match or if the distances array does not have length 4
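
      The contract above (four squared distances written into a length-4 array, with argument validation) can be sketched as a single pass over the query. The class name SquareDistanceBulkSketch is hypothetical; the real method is presumably optimized, and this only mirrors the documented behaviour.

```java
// Hypothetical reference sketch: NOT the Elasticsearch implementation.
final class SquareDistanceBulkSketch {

    static void squareDistanceBulk(float[] q, float[] v0, float[] v1, float[] v2,
                                   float[] v3, float[] distances) {
        if (distances.length != 4) {
            throw new IllegalArgumentException("distances must have length 4");
        }
        if (q.length != v0.length || q.length != v1.length
            || q.length != v2.length || q.length != v3.length) {
            throw new IllegalArgumentException("vector dimensions differ");
        }
        float d0 = 0f, d1 = 0f, d2 = 0f, d3 = 0f;
        // One pass over q accumulates all four squared distances.
        for (int i = 0; i < q.length; i++) {
            float a = q[i] - v0[i];
            float b = q[i] - v1[i];
            float c = q[i] - v2[i];
            float d = q[i] - v3[i];
            d0 += a * a;
            d1 += b * b;
            d2 += c * c;
            d3 += d * d;
        }
        distances[0] = d0;
        distances[1] = d1;
        distances[2] = d2;
        distances[3] = d3;
    }

    public static void main(String[] args) {
        float[] out = new float[4];
        squareDistanceBulk(new float[] {0f, 0f}, new float[] {3f, 4f}, new float[] {1f, 0f},
            new float[] {0f, 0f}, new float[] {1f, 1f}, out);
        System.out.println(java.util.Arrays.toString(out)); // [25.0, 1.0, 0.0, 2.0]
    }
}
```

      Reading the query once for all four vectors is a plausible reason for offering a bulk variant, though this page does not state the motivation.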
    • soarDistanceBulk

      public static void soarDistanceBulk(float[] v1, float[] c0, float[] c1, float[] c2, float[] c3, float[] originalResidual, float soarLambda, float rnorm, float[] distances)
      Bulk computation of the soar distance for a vector to four centroids
      Parameters:
      v1 - the vector
      c0 - the first centroid
      c1 - the second centroid
      c2 - the third centroid
      c3 - the fourth centroid
      originalResidual - the residual with the actually nearest centroid
      soarLambda - the lambda parameter
      rnorm - distance to the nearest centroid
      distances - an array to store the computed soar distances, must have length 4