Class EsqlScalarFunction
- All Implemented Interfaces:
NamedWriteable, Writeable, Resolvable, EvaluatorMapper
- Direct Known Subclasses:
Atan2, Case, Chunk, CIDRMatch, Clamp, ClampMax, ClampMin, Coalesce, Concat, Contains, CopySign, DateParse, Decay, EndsWith, EsqlConfigurationFunction, ExtractHistogramComponent, FromAggregateMetricDouble, Greatest, Hash, HistogramPercentile, Hypot, In, IpPrefix, Least, Left, Locate, Log, MvAppend, MvPercentile, MvPSeriesWeightedSum, MvSlice, MvSort, MvZip, NetworkDirection, Pow, Repeat, Replace, Right, Round, RoundTo, Scalb, StartsWith, Substring, ToIp, UnaryScalarFunction
ScalarFunction is a Function that makes one output value per
input row. It operates on a whole Page of inputs at a time, building
a Block of results.
You see them in the language everywhere:
| EVAL foo_msg = CONCAT("foo ", message)
| EVAL foo_msg = a + b
| WHERE STARTS_WITH(a, "rabbit")
| WHERE a == b
| STATS AGG BY ----> a + b <---- this is a scalar
| STATS AGG(----> a + b <---- this is a scalar)
Let's work the example of CONCAT("foo ", message). It's called with a Page
of inputs and resolves both of its parameters, yielding a constant block containing
"foo " and a Block of strings containing message. It can expect to receive
thousands of message values in that block. Then it builds and returns the block
"foo <message>".
foo | message | result
--- | ------- | ----------
foo | bar | foo bar
foo | longer | foo longer
... a thousand rows ...
foo | baz | foo baz
It does this once per input Page.
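As a rough sketch of the shape of that work, assume plain Java arrays stand in for a Page and its Blocks. The class and method names below are hypothetical and are not part of ESQL; the point is only that the constant is resolved once and the loop touches nothing but per-row data.

```java
import java.util.Arrays;

// Hypothetical stand-in for page-at-a-time CONCAT; a String[] plays the role of a Block.
final class ConcatPageSketch {
    // The constant argument is resolved once; the loop only touches per-row data.
    static String[] concat(String constantPrefix, String[] messages) {
        String[] result = new String[messages.length];
        for (int row = 0; row < messages.length; row++) {
            result[row] = constantPrefix + messages[row];
        }
        return result;
    }

    public static void main(String[] args) {
        String[] page = { "bar", "longer", "baz" };
        System.out.println(Arrays.toString(concat("foo ", page)));
    }
}
```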
We have a guide for writing these in the javadoc for
org.elasticsearch.xpack.esql.expression.function.scalar.
Optimizations
Scalars are a huge part of the language, and we have a ton of different classes of optimizations for them that exist on a performance spectrum:
 Better        Load Less and
than O(rows)    Run Faster                Run Faster                   Page-at-a-time      Tuple-at-a-time
    |----------------|-------------------------|------------------------------|-------------------|
    ^  ^  ^     ^    ^      ^                  ^    ^   ^     ^     ^      ^  ^                   ^
   CF LT ET    FP   BL     MBL                SE   NO SIMD   RR    VD   EVAL EVE                CASE
CF: Constant Folding
| EVAL a = CONCAT("some ", "words")
The fastest way to run a scalar, now and forever, is to run it at compile time. Turn it into a constant and propagate it throughout the query. This is called "constant folding" and all scalars, when their arguments are constants, are "folded" to a constant.
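A minimal sketch of the idea, assuming a toy expression tree. The Expr, Literal, FieldRef, and Concat types here are hypothetical illustrations, not ESQL's own Expression classes: an expression is foldable when all of its inputs are, and a foldable expression is replaced by a literal before any data is read.

```java
// Hypothetical mini expression tree used only to illustrate constant folding.
final class ConstantFoldingSketch {
    interface Expr {
        boolean foldable();
        Object fold();
    }

    static final class Literal implements Expr {
        final Object value;
        Literal(Object value) { this.value = value; }
        public boolean foldable() { return true; }
        public Object fold() { return value; }
    }

    static final class FieldRef implements Expr {
        final String name;
        FieldRef(String name) { this.name = name; }
        public boolean foldable() { return false; }
        public Object fold() { throw new IllegalStateException("not foldable: " + name); }
    }

    static final class Concat implements Expr {
        final Expr left;
        final Expr right;
        Concat(Expr left, Expr right) { this.left = left; this.right = right; }
        // Foldable only when every argument is foldable.
        public boolean foldable() { return left.foldable() && right.foldable(); }
        public Object fold() { return String.valueOf(left.fold()) + right.fold(); }
    }

    // Replace a foldable expression with a literal at plan time; leave the rest alone.
    static Expr foldIfPossible(Expr e) {
        return e.foldable() ? new Literal(e.fold()) : e;
    }

    public static void main(String[] args) {
        Expr folded = foldIfPossible(new Concat(new Literal("some "), new Literal("words")));
        System.out.println(folded.fold());
    }
}
```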
LT: Lucene's TopN
FROM index METADATA _score
| WHERE title:"cat"
| SORT _score DESC
| LIMIT 10
FROM index
| EVAL distance = ST_DISTANCE(point, "POINT(12.5683 55.6761)")
| SORT distance ASC
| LIMIT 10
Fundamentally, Lucene is a tuple-at-a-time engine that flows the min-competitive
sort key back into the index iteration process, allowing it to skip huge swaths of
documents. It has quite a few optimizations that soften the blow of being
tuple-at-a-time, so these days "push to a Lucene TopN" is the fastest way you are going
to run a scalar function. For that to work, the function has to be a SORT key, all the
filters have to be pushable to Lucene, and Lucene has to know how to run the function
natively. See PushTopNToSource.
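To make "min-competitive" concrete, here is a small sketch under a loud assumption: the sort keys arrive in blocks, and each block carries a precomputed maximum. That per-block maximum is a hypothetical stand-in for whatever index structure allows skipping; this is not Lucene's API. Once the heap holds n keys, its smallest entry is the min-competitive key, and any block whose maximum cannot beat it is skipped without being visited.

```java
import java.util.PriorityQueue;

// Hypothetical sketch: blocks of sort keys, each with a precomputed maximum.
final class MinCompetitiveSketch {
    /**
     * Keeps the n largest keys in `top` (a min-heap) and returns how many
     * blocks had to be visited.
     */
    static int collectTopN(double[][] blocks, double[] blockMax, int n, PriorityQueue<Double> top) {
        if (n <= 0) {
            return 0;
        }
        int visited = 0;
        for (int b = 0; b < blocks.length; b++) {
            // Once the heap is full, its smallest entry is the min-competitive key.
            if (top.size() == n && blockMax[b] <= top.peek()) {
                continue; // nothing in this block can enter the top n
            }
            visited++;
            for (double key : blocks[b]) {
                if (top.size() < n) {
                    top.add(key);
                } else if (key > top.peek()) {
                    top.poll();
                    top.add(key);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        double[][] blocks = { { 5, 9, 7 }, { 1, 2, 3 }, { 8, 10, 4 } };
        double[] blockMax = { 9, 3, 10 };
        PriorityQueue<Double> top = new PriorityQueue<>();
        int visited = collectTopN(blocks, blockMax, 2, top);
        System.out.println("visited " + visited + " blocks, top = " + top);
    }
}
```

In the example in main, the middle block is never visited because its maximum is below the smallest key already in the heap.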
ET: Engine TopN (HYPOTHETICAL)
FROM index METADATA _score
| WHERE title:"cat"
| WHERE a < j + LENGTH(candy) // <--- anything un-pushable
| SORT _score DESC
| LIMIT 10
If ESQL's TopNOperator exposed the min-competitive information (see above), and
we fed it back into the lucene query operators then we too could do better than
O(matching_rows) for queries sorting on the results of a scalar. This is like
LT but without as many limitations. Lucene has a 20-year head start on us
optimizing TopN, so we should continue to use it when we can.
See issue.
BL: Push to BlockLoader
FROM index
| EVAL s = V_COSINE(dense_vector, [0, 1, 2])
| SORT s desc
| LIMIT 10
FROM index
| STATS SUM(LENGTH(message)) // Length is pushed to the BlockLoader
Some functions can take advantage of the on-disk structures to run very fast and should be
"fused" into field loading using BlockLoaderExpression. Functions like V_COSINE
can use the vector search index to compute the result. Functions like MV_MIN can
use the doc_values encoding mechanism to save a ton of work. Functions like the
upcoming ST_SIMPLIFY benefit from this by saving huge numbers of allocations even
if they can't link into the doc_values format. We do this by building a
BlockLoader for each FUNCTION x FIELD_TYPE x storage mechanism combination
so we can get as much speed as possible.
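A toy illustration of what "fusing" buys, assuming values are stored as UTF-8 bytes and that LENGTH counts code points. Both are assumptions of this sketch, and FusedLengthSketch is a hypothetical name rather than a real BlockLoader. The unfused path materializes a String per row and then measures it; the fused path computes the length straight from the stored bytes and never allocates the String.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; assumes each stored value is valid UTF-8.
final class FusedLengthSketch {
    // Unfused: load the value, then run the function on it.
    static int[] loadThenLength(byte[][] stored) {
        int[] result = new int[stored.length];
        for (int i = 0; i < stored.length; i++) {
            String value = new String(stored[i], StandardCharsets.UTF_8); // allocation per row
            result[i] = value.codePointCount(0, value.length());
        }
        return result;
    }

    // Fused: compute the length while "loading", without building the String.
    static int[] loadLength(byte[][] stored) {
        int[] result = new int[stored.length];
        for (int i = 0; i < stored.length; i++) {
            int codePoints = 0;
            for (byte b : stored[i]) {
                if ((b & 0xC0) != 0x80) { // count every byte that is not a continuation byte
                    codePoints++;
                }
            }
            result[i] = codePoints;
        }
        return result;
    }

    public static void main(String[] args) {
        byte[][] stored = { "hello".getBytes(StandardCharsets.UTF_8), "".getBytes(StandardCharsets.UTF_8) };
        System.out.println(java.util.Arrays.toString(loadLength(stored)));
    }
}
```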
MBL: Push to a "mother ship" BlockLoader (HYPOTHETICAL)
FROM index
| STATS SUM(LENGTH(message)), // All of these are pushed to a single BlockLoader
SUM(SUBSTRING(message, 0, 4)),
BY trail = SUBSTRING(message, 10, 3)
Pushing functions to a BlockLoader can involve building a ton
of distinct BlockLoaders. Which involves a ton of code and testing and, well, work.
But it's worth it if you are applying a single function to a field and every single cycle
counts. Both of these cry out for a more OO-style solution where you build a "mother ship"
BlockLoader that operates on, say, FIELD_TYPE x storage mechanism and
then runs a list of FUNCTION operations. In some cases this is a bad idea, which
is why we haven't built it yet. But in plenty of cases it's fine. And, sometimes, we should
be fine skipping the special purpose block loader in favor of the mother ship. We'd spend
a few more cycles on each load, but the maintenance advantage is likely worth it for some
functions.
EVAL: Page-at-a-time evaluation
ESQL evaluates whole pages at once, generally walking a couple of arrays in parallel building a result array. This makes which bits are the "hot path" very obvious - they are the loops that walk these arrays. We put the "slower" stuff outside those loops:
- scratch allocations
- profiling
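A minimal sketch of that layout, using a hypothetical string-reversing function on plain arrays (not a real ESQL evaluator). The scratch buffer and the timing calls happen once per page; the loop body does nothing but per-row work.

```java
// Hypothetical sketch: the "slower" work happens once per page, outside the hot loop.
final class HotLoopSketch {
    long totalNanos; // coarse, per-page profiling counter

    String[] reverseAll(String[] values) {
        long start = System.nanoTime();              // profiling: once per page
        StringBuilder scratch = new StringBuilder(); // scratch: allocated once per page
        String[] result = new String[values.length];
        for (int i = 0; i < values.length; i++) {    // hot path
            scratch.setLength(0);
            result[i] = scratch.append(values[i]).reverse().toString();
        }
        totalNanos += System.nanoTime() - start;
        return result;
    }

    public static void main(String[] args) {
        HotLoopSketch sketch = new HotLoopSketch();
        System.out.println(String.join(",", sketch.reverseAll(new String[] { "abc", "de" })));
    }
}
```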
VD: Vector Dispatch
In Elasticsearch it's normal for fields to sometimes be null or multivalued.
There are no constraints on the schema preventing this and, as a search engine, it's
pretty normal to model things as multivalued fields. We rarely know that a field can
only be single-valued when we're planning a query.
It's much faster to run a scalar when we know that all of its inputs
are single valued and non-null. So every scalar function that uses the code generation
keyed by the Evaluator, ConvertEvaluator, and MvEvaluator
annotations builds two paths:
- The slower "Block" path that supports nulls and multivalued fields
- The faster "Vector" path that supports only single-valued, non-null fields
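The two paths can be sketched roughly like this. IntBlock is a hypothetical stand-in for a Block of ints and is not the generated evaluator code; in this sketch a position that is null or multivalued on either side simply produces null. The dispatch is a single check per page, after which the fast path runs with no per-position branching.

```java
// Hypothetical sketch of dispatching between a "Block" path and a "Vector" path.
final class VectorDispatchSketch {
    static final class IntBlock {
        final int[] dense;    // set when every position holds exactly one non-null value
        final int[][] sparse; // otherwise: null entry = null, longer arrays = multivalued

        IntBlock(int[] dense) { this.dense = dense; this.sparse = null; }
        IntBlock(int[][] sparse) { this.dense = null; this.sparse = sparse; }

        int positions() { return dense != null ? dense.length : sparse.length; }

        // Values at a position, or null when the position is null.
        int[] get(int p) { return dense != null ? new int[] { dense[p] } : sparse[p]; }
    }

    static IntBlock add(IntBlock lhs, IntBlock rhs) {
        if (lhs.dense != null && rhs.dense != null) {
            return new IntBlock(addVectors(lhs.dense, rhs.dense)); // fast path
        }
        return new IntBlock(addBlocks(lhs, rhs)); // slow path
    }

    // Vector path: no null checks, no value counts, just a tight loop.
    static int[] addVectors(int[] lhs, int[] rhs) {
        int[] result = new int[lhs.length];
        for (int p = 0; p < lhs.length; p++) {
            result[p] = lhs[p] + rhs[p];
        }
        return result;
    }

    // Block path: every position has to be inspected before it can be used.
    static int[][] addBlocks(IntBlock lhs, IntBlock rhs) {
        int[][] result = new int[lhs.positions()][];
        for (int p = 0; p < result.length; p++) {
            int[] l = lhs.get(p);
            int[] r = rhs.get(p);
            if (l == null || r == null || l.length != 1 || r.length != 1) {
                continue; // leave the position null
            }
            result[p] = new int[] { l[0] + r[0] };
        }
        return result;
    }

    public static void main(String[] args) {
        IntBlock sum = add(new IntBlock(new int[] { 1, 2 }), new IntBlock(new int[] { 10, 20 }));
        System.out.println(java.util.Arrays.toString(sum.dense));
    }
}
```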
NO: Native Ordinal Evaluation
FROM index
| STATS MAX(foo) BY TO_UPPER(verb)
keyword and ip fields load their byte[] shaped values as a
lookup table, called "ordinals" because Lucene uses that word for it. Some of our functions,
like TO_UPPER, process the lookup table itself instead of processing each position.
This is especially important when grouping on the field because the hashing done by the
aggregation code also operates on the lookup table.
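A small sketch of the idea, with a hypothetical OrdinalBlock made of a dictionary plus one ordinal per position (not the real block classes). The function runs once per distinct value, and the per-position ordinals are reused untouched.

```java
import java.util.Locale;

// Hypothetical sketch of running a function on the lookup table instead of each position.
final class OrdinalSketch {
    static final class OrdinalBlock {
        final String[] dictionary; // distinct values
        final int[] ordinals;      // one dictionary index per position

        OrdinalBlock(String[] dictionary, int[] ordinals) {
            this.dictionary = dictionary;
            this.ordinals = ordinals;
        }

        String get(int position) { return dictionary[ordinals[position]]; }
    }

    // Work is proportional to the dictionary size, not the number of positions.
    static OrdinalBlock toUpper(OrdinalBlock in) {
        String[] upper = new String[in.dictionary.length];
        for (int i = 0; i < upper.length; i++) {
            upper[i] = in.dictionary[i].toUpperCase(Locale.ROOT);
        }
        return new OrdinalBlock(upper, in.ordinals);
    }

    public static void main(String[] args) {
        OrdinalBlock verbs = new OrdinalBlock(new String[] { "get", "put" }, new int[] { 0, 1, 0, 0 });
        System.out.println(toUpper(verbs).get(1));
    }
}
```

One caveat of this sketch: if two dictionary entries map to the same result, the output dictionary holds duplicates, which a consumer that hashes on the lookup table would have to tolerate or merge.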
SE: Sorted Execution
FROM index
| STATS SUM(MV_DEDUPE(file_size))
Some functions can operate on multivalued fields much faster if their inputs are sorted. And
inputs loaded from doc_values are sorted by default. Sometimes even sorted AND
deduplicated. We store this information on each block in Block.MvOrdering.
NOTE: Functions that can take advantage of this sorting also tend to be NOOPs for single-valued inputs. So they benefit hugely from "Vector Dispatch".
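Here is a sketch of how a dedupe over one multivalued position might branch on that kind of ordering hint. The Ordering enum is a hypothetical simplification loosely modeled on the idea of Block.MvOrdering, not a copy of it. Already deduplicated input is returned as is, sorted input needs one pass comparing neighbors, and only unordered input pays for a set.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch of using an ordering hint to pick a cheaper dedupe.
final class SortedDedupeSketch {
    enum Ordering { UNORDERED, SORTED, SORTED_AND_DEDUPLICATED }

    static long[] dedupe(long[] values, Ordering ordering) {
        switch (ordering) {
            case SORTED_AND_DEDUPLICATED:
                return values; // nothing to do
            case SORTED:
                return dedupeSorted(values);
            default:
                return dedupeUnordered(values);
        }
    }

    // Sorted input: duplicates are adjacent, so one pass and no hashing.
    static long[] dedupeSorted(long[] values) {
        if (values.length == 0) {
            return values;
        }
        long[] out = new long[values.length];
        int n = 0;
        out[n++] = values[0];
        for (int i = 1; i < values.length; i++) {
            if (values[i] != values[i - 1]) {
                out[n++] = values[i];
            }
        }
        return Arrays.copyOf(out, n);
    }

    // Unordered input: fall back to a set, keeping first-seen order.
    static long[] dedupeUnordered(long[] values) {
        Set<Long> seen = new LinkedHashSet<>();
        for (long v : values) {
            seen.add(v);
        }
        long[] out = new long[seen.size()];
        int n = 0;
        for (long v : seen) {
            out[n++] = v;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(dedupe(new long[] { 1, 1, 2, 5, 5 }, Ordering.SORTED)));
    }
}
```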
SIMD: Single Instruction Multiple Data instructions
FROM index
| STATS MAX(lhs + rhs)
Through a combination of "Page-at-a-time evaluation" and "Vector Dispatch", we often end up with at least one path that can be turned into a sequence of SIMD instructions. These are about as fast as you can go and still be O(matching_rows). A lot of scalars don't lend themselves perfectly to SIMD, but we make sure those that do can take that route.
RR: Range Rewrite
FROM index
| STATS COUNT(*) BY DATE_TRUNC(1 DAY, @timestamp)
Functions like DATE_TRUNC can be quite slow, especially when they are using a
time zone. They can be much faster if they know the range of dates they are operating on.
And we do know that on the data node! We use that information to rewrite the possibly-slow
DATE_TRUNC to the always fast ROUND_TO, which rounds down to fixed rounding
points.
At the moment this is only done for DATE_TRUNC which is a very common function,
but is technically possible for anything that could benefit from knowing the range up front.
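A rough sketch of the rewrite, assuming the minimum and maximum timestamps are known up front and that min is not greater than max. RoundToSketch and its methods are hypothetical; the real planner rule is not shown here. The time-zone-aware work happens once, when the day boundaries are computed, and each row then only needs a binary search over those fixed points. Clamping values below the first point to the first point is a choice of this sketch.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: precompute rounding points for a known range, then round by lookup.
final class RoundToSketch {
    // Day boundaries (epoch millis) covering [minMillis, maxMillis] in the given zone.
    static long[] dayPoints(long minMillis, long maxMillis, ZoneId zone) {
        List<Long> points = new ArrayList<>();
        ZonedDateTime day = Instant.ofEpochMilli(minMillis).atZone(zone).truncatedTo(ChronoUnit.DAYS);
        while (day.toInstant().toEpochMilli() <= maxMillis) {
            points.add(day.toInstant().toEpochMilli());
            day = day.plusDays(1); // calendar-aware step, done once per point
        }
        return points.stream().mapToLong(Long::longValue).toArray();
    }

    // Rounds down to the largest point <= value; points must be sorted ascending.
    static long roundTo(long value, long[] points) {
        int idx = Arrays.binarySearch(points, value);
        if (idx >= 0) {
            return points[idx];
        }
        int insertion = -idx - 1;
        return points[Math.max(insertion - 1, 0)]; // sketch choice: clamp below the first point
    }

    public static void main(String[] args) {
        long[] points = dayPoints(90_000_000L, 200_000_000L, ZoneOffset.UTC);
        System.out.println(roundTo(100_000_000L, points));
    }
}
```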
FP: Filter Pushdown
FROM index
| STATS COUNT(*) BY DATE_TRUNC(1 DAY, @timestamp)
If the "Range Rewrite" optimization works, we can sometimes further push the resulting
ROUND_TO into a sequence of filters. If you are just counting
documents then this can use the LuceneCountOperator which can count the number of
matching documents directly from the cache, technically being faster than
O(num_hits), but only in ideal circumstances. If we can't push the count then it's
still very very fast. See PR.
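To illustrate why a sequence of range filters can beat visiting every hit, here is a sketch in which a sorted array of timestamps stands in for the index. That stand-in is an assumption of the sketch; it is not how LuceneCountOperator works internally. Each rounding point becomes a filter of the form [point, nextPoint), and each count comes from two binary searches rather than a scan.

```java
// Hypothetical sketch: one range filter per rounding point, counted without scanning rows.
final class FilterPushdownSketch {
    // First index whose value is >= key in a sorted array.
    static int lowerBound(long[] sorted, long key) {
        int lo = 0;
        int hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // counts[i] is the number of timestamps in [points[i], points[i + 1]);
    // the last bucket is open-ended.
    static long[] countPerBucket(long[] sortedTimestamps, long[] points) {
        long[] counts = new long[points.length];
        for (int i = 0; i < points.length; i++) {
            int from = lowerBound(sortedTimestamps, points[i]);
            int to = i + 1 < points.length ? lowerBound(sortedTimestamps, points[i + 1]) : sortedTimestamps.length;
            counts[i] = to - from;
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] counts = countPerBucket(new long[] { 10, 15, 20, 20, 35 }, new long[] { 10, 20, 30 });
        System.out.println(java.util.Arrays.toString(counts));
    }
}
```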
EVE: Expensive Variable Evaluator
FROM index
| EVAL ts = DATE_PARSE(SUBSTRING(message, 1, 10), date_format_from_the_index)
Functions like DATE_PARSE need to build something "expensive" per input row, like
a DateFormatter. But, often, the expensive thing is constant. In the example above
the date format comes from the index, but that's quite contrived. These functions generally
run in the form:
FROM index
| EVAL ts = DATE_PARSE(SUBSTRING(message, 1, 10), "ISO8601")
These generally have special case evaluators that don't construct the format for each row. The others are "expensive variable evaluators" and we avoid them when we can.
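The difference between the two shapes can be sketched with java.time, used here as a stand-in for ESQL's own date formatting; DateParseSketch is a hypothetical name. With a constant format the formatter is built once per page. With a per-row format it has to be rebuilt inside the loop.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical sketch of constant-format versus variable-format date parsing.
final class DateParseSketch {
    // Constant format: the expensive formatter is built once, outside the loop.
    static LocalDate[] parseConstantFormat(String[] values, String pattern) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern, Locale.ROOT);
        LocalDate[] result = new LocalDate[values.length];
        for (int i = 0; i < values.length; i++) {
            result[i] = LocalDate.parse(values[i], formatter);
        }
        return result;
    }

    // Variable format: the formatter is rebuilt for every row.
    static LocalDate[] parseVariableFormat(String[] values, String[] patterns) {
        LocalDate[] result = new LocalDate[values.length];
        for (int i = 0; i < values.length; i++) {
            DateTimeFormatter formatter = DateTimeFormatter.ofPattern(patterns[i], Locale.ROOT);
            result[i] = LocalDate.parse(values[i], formatter);
        }
        return result;
    }

    public static void main(String[] args) {
        LocalDate[] parsed = parseConstantFormat(new String[] { "2023-10-05" }, "yyyy-MM-dd");
        System.out.println(parsed[0]);
    }
}
```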
CASE: CASE is evaluated row-by-row
FROM index
| EVAL f = CASE(d > 0, n / d, 0)
FROM index
| EVAL f = COALESCE(d, 1 / j)
CASE and COALESCE short circuit. In the top example above, that
means we don't run n / d unless d > 0. That prevents us from
emitting warnings for dividing by 0. In the second example, we don't run 1 / j
unless d is null. In the worst case, we manage this by running row-by-row,
which is super slow, especially because the engine was designed
for page-at-a-time execution.
In the best case COALESCE can see that an input is either all-null or
all-non-null. Then it never falls back to row-by-row evaluation and is quite fast.
CASE has a similar optimization: For each incoming Page, if the
condition evaluates to a constant, then it executes the corresponding "arm"
Page-at-a-time. Also! If the "arms" are "fast" and can't throw warnings, then
CASE can execute "eagerly" - evaluating all three arguments and just
plucking values back and forth. The "eager" CASE evaluator is effectively
the same as any other page-at-a-time evaluator.
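The lazy and eager strategies can be sketched side by side. This is a hypothetical illustration on int arrays, not the real CASE evaluators. The lazy form evaluates only the selected arm for each row, so n / d never runs when d is 0. The eager form takes arms that were already evaluated for the whole page and just picks between them, which is only safe when neither arm can fail or warn.

```java
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

// Hypothetical sketch of lazy (row-by-row) versus eager (page-at-a-time) CASE.
final class CaseSketch {
    // Lazy: per row, test the condition and run only the chosen arm.
    static int[] lazyCase(int rows, IntPredicate condition, IntUnaryOperator whenTrue, IntUnaryOperator whenFalse) {
        int[] result = new int[rows];
        for (int row = 0; row < rows; row++) {
            result[row] = condition.test(row) ? whenTrue.applyAsInt(row) : whenFalse.applyAsInt(row);
        }
        return result;
    }

    // Eager: both arms were computed for the whole page; just pluck values.
    static int[] eagerCase(boolean[] condition, int[] whenTrue, int[] whenFalse) {
        int[] result = new int[condition.length];
        for (int row = 0; row < condition.length; row++) {
            result[row] = condition[row] ? whenTrue[row] : whenFalse[row];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] n = { 10, 7, 9 };
        int[] d = { 2, 0, 3 };
        // CASE(d > 0, n / d, 0): the division is skipped on the row where d is 0.
        int[] f = lazyCase(n.length, row -> d[row] > 0, row -> n[row] / d[row], row -> 0);
        System.out.println(java.util.Arrays.toString(f));
    }
}
```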
Nested Class Summary
Nested classes/interfaces inherited from class org.elasticsearch.xpack.esql.core.expression.Expression
Expression.TypeResolution
Nested classes/interfaces inherited from interface org.elasticsearch.xpack.esql.evaluator.mapper.EvaluatorMapper
EvaluatorMapper.ToEvaluator
Nested classes/interfaces inherited from interface org.elasticsearch.common.io.stream.Writeable
Writeable.Reader<V>, Writeable.Writer<V>
Field Summary
Fields inherited from class org.elasticsearch.xpack.esql.core.expression.function.scalar.ScalarFunction
MAX_BYTES_REF_RESULT_SIZE
Fields inherited from class org.elasticsearch.xpack.esql.core.tree.Node
TO_STRING_MAX_WIDTH
Constructor Summary
Modifier | Constructor
--------- | -----------
protected | EsqlScalarFunction(Source source)
protected | EsqlScalarFunction(Source source, List<Expression> fields)
Method Summary
Methods inherited from class org.elasticsearch.xpack.esql.core.expression.function.Function
arguments, equals, functionName, hashCode, nodeString, nullable
Methods inherited from class org.elasticsearch.xpack.esql.core.expression.Expression
canonical, canonicalize, childrenResolved, dataType, foldable, propertiesToString, references, resolved, resolveType, semanticEquals, semanticHash, toString, typeResolved
Methods inherited from class org.elasticsearch.xpack.esql.core.tree.Node
anyMatch, children, collect, collect, collect, collectFirstChildren, collectLeaves, doCollectFirst, forEachDown, forEachDown, forEachDownMayReturnEarly, forEachProperty, forEachPropertyDown, forEachPropertyOnly, forEachPropertyUp, forEachUp, forEachUp, info, nodeName, nodeProperties, replaceChildren, replaceChildrenSameSize, source, sourceLocation, sourceText, transformChildren, transformDown, transformDown, transformDown, transformNodeProps, transformPropertiesDown, transformPropertiesOnly, transformPropertiesUp, transformUp, transformUp, transformUp
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.elasticsearch.xpack.esql.evaluator.mapper.EvaluatorMapper
fold, toEvaluator
Methods inherited from interface org.elasticsearch.common.io.stream.NamedWriteable
getWriteableName
Constructor Details
protected EsqlScalarFunction(Source source)
protected EsqlScalarFunction(Source source, List<Expression> fields)
Method Details
fold
- Overrides:
fold in class Expression