Interface BlockLoaderExpression

All Known Implementing Classes:
CosineSimilarity, DotProduct, Hamming, L1Norm, L2Norm, Length, MvMax, MvMin, VectorSimilarityFunction

public interface BlockLoaderExpression
Expression that can be "pushed" into value loading. Most of the time we load values into Blocks and then run the expressions on them, but sometimes it's worth short-circuiting this process and running the expression in the tight loop we use for loading values.
  • V_COSINE(vector, [constant_vector]) - a vector is ~512 floats and V_COSINE is one double. Better yet, we can use the search index to find the distance without even looking at all the values.
  • ST_CENTROID(shape) - shapes can be quite large. Centroids are just one point.
  • LENGTH(string) - strings can be quite long, but string length is always an int. For more fun, keywords are usually stored using a dictionary, and it's fairly easy to optimize running LENGTH once per dictionary entry.
  • MV_COUNT(anything) - counts are always integers.
  • MV_MIN and MV_MAX - loads a single value instead of multivalued fields.

See the docs for EsqlScalarFunction for how this optimization fits in with all the other optimizations we've implemented.
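
The LENGTH case above can be illustrated with a minimal, self-contained sketch. It does not use any Elasticsearch or Lucene classes: DictionaryLengthSketch, lengthsPerDoc, and the array-based dictionary layout are hypothetical stand-ins for a dictionary-encoded keyword column. It only shows the shape of the idea, which is to run the expression once per dictionary entry and then look the result up for each document.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a dictionary-encoded keyword column; not Elasticsearch code. */
final class DictionaryLengthSketch {
    /**
     * Computes LENGTH for every document.
     *
     * @param dictionary the distinct terms of the column
     * @param ordinals   for each document, the index of its term in {@code dictionary}
     */
    static int[] lengthsPerDoc(String[] dictionary, int[] ordinals) {
        // Run the expression once per dictionary entry, not once per document.
        int[] lengthByOrd = new int[dictionary.length];
        for (int ord = 0; ord < dictionary.length; ord++) {
            String term = dictionary[ord];
            lengthByOrd[ord] = term.codePointCount(0, term.length());
        }
        // The per-document loop is now a plain array lookup.
        int[] result = new int[ordinals.length];
        for (int doc = 0; doc < ordinals.length; doc++) {
            result[doc] = lengthByOrd[ordinals[doc]];
        }
        return result;
    }

    public static void main(String[] args) {
        String[] dictionary = { "error", "info", "warn" };
        int[] ordinals = { 1, 1, 0, 2, 1 };
        // prints [4, 4, 5, 4, 4]
        System.out.println(Arrays.toString(lengthsPerDoc(dictionary, ordinals)));
    }
}
```

With three dictionary entries and five documents the expression runs three times instead of five. The gap grows with the number of documents that share a term.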

How to implement

  1. Implement some block loaders
  2. Unit test the block loaders
  3. Plug the BlockLoader into the field mapper
  4. Implement this interface
  5. Add to PushExpressionToLoadIT
  6. Maybe add to csv-spec tests
  7. Get some performance numbers and open a PR

Implement some block loaders

Implement a BlockLoader for each fused code path. There's going to be a BlockLoader per <FUNCTION> x <type> x <storage mechanism>. Examples:

  1. Utf8CodePointsFromOrdsBlockLoader is for LENGTH x keyword x doc_values.
  2. MvMaxLongsFromDocValuesBlockLoader is for MV_MAX x long x doc_values.
  3. MvMaxBytesRefsFromOrdsBlockLoader is for MV_MAX x (keyword|ip) x doc_values.

If you wanted to push all loads for a function applied to a field type, you'd need to optimize every storage path, which could include:

  1. doc_values
  2. stored
  3. _source
  4. Funky synthetic _source cases
  5. Using the search index

Unless you have a good reason to do otherwise, it's generally fine to start with doc_values. And it might be fine to only implement this fusion for doc_values. Usually, loading stored fields and loading from _source is so slow that this optimization won't buy you much speed proportionally. But this is only a rule of thumb. The first extraction push down we implemented violates the rule! It pushed directly to the search index for vector fields.
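
One way to picture the <FUNCTION> x <type> x <storage mechanism> matrix is as a set of supported combinations, with everything else falling back to the regular, non-fused load. The sketch below is purely illustrative: FusedPathSketch, Path, canFuse, and the enums are hypothetical names, not part of Elasticsearch. The set mirrors the three example loaders listed above and deliberately covers doc_values only.

```java
import java.util.Set;

/** Hypothetical model of the function x type x storage matrix; not Elasticsearch code. */
final class FusedPathSketch {
    enum Function { LENGTH, MV_MAX }
    enum FieldType { KEYWORD, LONG, IP }
    enum Storage { DOC_VALUES, STORED, SOURCE }

    /** One fused code path, which is one BlockLoader in the terms used above. */
    record Path(Function function, FieldType type, Storage storage) {}

    // Mirrors the three example loaders; the (keyword|ip) loader covers two types.
    static final Set<Path> FUSED = Set.of(
        new Path(Function.LENGTH, FieldType.KEYWORD, Storage.DOC_VALUES),
        new Path(Function.MV_MAX, FieldType.LONG, Storage.DOC_VALUES),
        new Path(Function.MV_MAX, FieldType.KEYWORD, Storage.DOC_VALUES),
        new Path(Function.MV_MAX, FieldType.IP, Storage.DOC_VALUES)
    );

    /** Returns true when a fused path exists; callers fall back to the regular load otherwise. */
    static boolean canFuse(Function function, FieldType type, Storage storage) {
        return FUSED.contains(new Path(function, type, storage));
    }

    public static void main(String[] args) {
        System.out.println(canFuse(Function.LENGTH, FieldType.KEYWORD, Storage.DOC_VALUES)); // prints true
        System.out.println(canFuse(Function.LENGTH, FieldType.KEYWORD, Storage.SOURCE));     // prints false
    }
}
```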

Note: The Object.toString()s are important in these classes. We expose them over the profile API and use them for tests later on.

Unit test the block loaders

Build a randomized unit test that

  1. loads random data
  2. loads using both your new BlockLoader and the non-fused loader
  3. compares the results

See the test for Utf8CodePointsFromOrdsBlockLoader for an example. These tests are usually quite parameterized to make sure we cover things like:

These unit tests cover a ton of different configurations quickly, and we know that we're using the loader.
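
The load, load-both-ways, compare pattern can be sketched without any Elasticsearch test infrastructure. Everything in the block below is a hypothetical stand-in: RandomizedFusedLoaderCheck, loadThenMax, and fusedMax are invented for illustration, and the sketch assumes each document's values arrive sorted ascending, so a fused MV_MAX only has to read the last one. A real test would use the project's randomized testing framework and real BlockLoaders.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical randomized comparison of a fused and a non-fused path; not Elasticsearch code. */
final class RandomizedFusedLoaderCheck {
    /** Non-fused path: materialize every value of the document, then evaluate MV_MAX on it. */
    static long loadThenMax(long[] sortedValues) {
        long[] loaded = Arrays.copyOf(sortedValues, sortedValues.length);
        long max = Long.MIN_VALUE;
        for (long value : loaded) {
            max = Math.max(max, value);
        }
        return max;
    }

    /** Fused path: values are assumed sorted ascending per document, so read only the last. */
    static long fusedMax(long[] sortedValues) {
        return sortedValues[sortedValues.length - 1];
    }

    /** Builds one document with 1 to 5 random values, sorted ascending. */
    static long[] randomDoc(Random random) {
        long[] values = new long[1 + random.nextInt(5)];
        for (int i = 0; i < values.length; i++) {
            values[i] = random.nextLong();
        }
        Arrays.sort(values);
        return values;
    }

    /** Loads random documents both ways and counts how often the results differ. */
    static int countMismatches(long seed, int docCount) {
        Random random = new Random(seed);
        int mismatches = 0;
        for (int doc = 0; doc < docCount; doc++) {
            long[] values = randomDoc(random);
            if (loadThenMax(values) != fusedMax(values)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        long seed = System.nanoTime();
        // Print the seed so that a failing run can be reproduced.
        System.out.println("seed=" + seed + " mismatches=" + countMismatches(seed, 10_000));
    }
}
```

Printing the seed is the important design choice: a randomized test is only useful if a failure can be replayed.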

Plug the BlockLoader into the field mapper

You must implement:

Implement this interface

Implement BlockLoaderExpression. Generally it's enough to check whether the function is being applied to a FieldAttribute and do something like:


         if (field instanceof FieldAttribute f && f.dataType() == DataType.KEYWORD) {
             return new PushedBlockLoaderExpression(f, BlockLoaderFunctionConfig.Function.WHATEVER);
         }
         return null;
 

The rules system will check MappedFieldType.supportsBlockLoaderConfig(org.elasticsearch.index.mapper.blockloader.BlockLoaderFunctionConfig, org.elasticsearch.index.mapper.MappedFieldType.FieldExtractPreference) for you. See the docs for tryPushToFieldLoading(org.elasticsearch.xpack.esql.stats.SearchStats) for more on how to implement it.

Add to PushExpressionToLoadIT

Add a case or two to PushExpressionToLoadIT to prove that we've plugged everything in properly. These tests make sure that we're really loading the data using your new BlockLoader. This is where your nice Object.toString()s come into play. That string is the key into the profile map that shows that your new BlockLoader is plugged in.
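
To make the role of toString concrete, here is a small stand-alone sketch. It does not reproduce the real profile API or the real loader descriptions: ProfileKeySketch, FusedLengthLoader, the "FusedLengthLoader[field]" format, and the map of counts are all hypothetical. It only shows why a stable, descriptive toString lets a test assert which loader ran.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of using toString as a profile key; not Elasticsearch code. */
final class ProfileKeySketch {
    /** Hypothetical fused loader; only its toString matters here. */
    static final class FusedLengthLoader {
        private final String field;

        FusedLengthLoader(String field) {
            this.field = field;
        }

        @Override
        public String toString() {
            // A stable, descriptive string that a test can search for.
            return "FusedLengthLoader[" + field + "]";
        }
    }

    /** Stand-in for profile output: loader description mapped to how many times it was used. */
    static Map<String, Integer> profile(Object... loadersUsed) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Object loader : loadersUsed) {
            counts.merge(loader.toString(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> profile = profile(new FusedLengthLoader("message"), new FusedLengthLoader("message"));
        // prints {FusedLengthLoader[message]=2}
        System.out.println(profile);
    }
}
```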

Maybe add to csv-spec tests

Look for your function in the csv-spec tests and make sure there are cases that contain your function processing each data type you are pushing. For each type, make sure the function processes the results of:

  • ROW - these won't use your new code
  • FROM - these will use your new code
  • STATS or another function - these won't use your new code

It's fairly likely we already have tests for all these cases. They are part of our standard practice for adding functions, but there are a lot of them, and we may have forgotten some. And, without the pushdown you are implementing, they are mostly there for healthy paranoia around rules and a hedge against mistakes implementing optimizations in the future. Like the optimization you are implementing now!

Anyway, once there are plenty of these tests you should run them via the ESQL unit tests and via the single-node integration tests. These tests don't prove that your new BlockLoaders are plugged in. You have PushExpressionToLoadIT for that. Instead, they prove that, when your new BlockLoader is plugged in, it produces correct output. So, just like your unit test, but integrated with the entire rest of the world.

Get some performance numbers and open a PR

Now that you can be pretty sure everything is plugged in and working, you can get some performance numbers. It's generally good to start with a quick and dirty bash script. It should show you a performance improvement, and you can use the profile API as a final proof that everything is plugged in. Once that looks right you should generally be ok to open a PR. Attach the results of your bash script to prove that it's faster.

Next, look for a rally track that should improve with your PR. If you find one, and it's in the nightlies already, then you have a choice:

  • Run the rally tests right now to get better numbers
  • Wait for the nightlies to run after merging

If the quick and dirty perf testing looked good, you are probably safe waiting on the nightlies. You should look for them on benchmarks.elastic.co.

If there isn't already a rally operation then you should add one like this PR. How you add one of these and how you get it into the nightlies and whether it should be in the nightlies is outside the scope of this document.