Class TranslateTimeSeriesAggregate
java.lang.Object
org.elasticsearch.xpack.esql.rule.Rule<Aggregate,LogicalPlan>
org.elasticsearch.xpack.esql.optimizer.rules.logical.OptimizerRules.OptimizerRule<Aggregate>
org.elasticsearch.xpack.esql.optimizer.rules.logical.TranslateTimeSeriesAggregate
Time-series aggregation is special because it must be computed per time series, regardless of the grouping keys.
The grouping keys for that computation must be `_tsid` or a pair of `_tsid` and `time_bucket`. To support user-defined grouping keys,
we first execute the rate aggregation using the time-series keys, then perform another aggregation on
the resulting rate using the user-specified keys.
This class translates the aggregates in time-series aggregations into standard aggregates. This approach avoids introducing new plans and operators specifically for time-series aggregations.
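The two-pass idea described above can be sketched in plain Java. This is only an illustration of the data flow, not the rule's implementation: the `Sample` record, its field names, and the simplified rate formula (increase divided by elapsed time, ignoring counter resets) are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TwoPassRateSketch {
    /** One sample of a counter metric; the record and its fields are hypothetical. */
    record Sample(String tsid, String host, long timestampSeconds, double counter) {}

    /** First pass: one rate per time series, i.e. grouped by tsid only. */
    static Map<String, Double> ratePerSeries(List<Sample> samples) {
        Map<String, List<Sample>> bySeries = new HashMap<>();
        for (Sample s : samples) {
            bySeries.computeIfAbsent(s.tsid(), k -> new ArrayList<>()).add(s);
        }
        Map<String, Double> rates = new HashMap<>();
        for (Map.Entry<String, List<Sample>> e : bySeries.entrySet()) {
            List<Sample> series = e.getValue();
            series.sort((a, b) -> Long.compare(a.timestampSeconds(), b.timestampSeconds()));
            Sample first = series.get(0);
            Sample last = series.get(series.size() - 1);
            long elapsed = last.timestampSeconds() - first.timestampSeconds();
            // Simplified rate (assumption): increase over elapsed time, no counter-reset handling.
            rates.put(e.getKey(), elapsed == 0 ? 0.0 : (last.counter() - first.counter()) / elapsed);
        }
        return rates;
    }

    /** Second pass: max of the per-series rates, grouped by the user key (host). */
    static Map<String, Double> maxRateByHost(List<Sample> samples) {
        Map<String, Double> rates = ratePerSeries(samples);
        // Carry the host of each series into the second pass, similar to VALUES(host) BY _tsid.
        Map<String, String> hostOfSeries = new HashMap<>();
        for (Sample s : samples) {
            hostOfSeries.put(s.tsid(), s.host());
        }
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Double> e : rates.entrySet()) {
            result.merge(hostOfSeries.get(e.getKey()), e.getValue(), Math::max);
        }
        return result;
    }
}
```

The sketch assumes each series belongs to exactly one host, which is why a single carried value per `tsid` is enough to regroup in the second pass.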
Examples:
TS k8s | STATS max(rate(request))
becomes
TS k8s
| STATS rate_$1 = rate(request) BY _tsid
| STATS max(rate_$1)
TS k8s | STATS max(rate(request)) BY host
becomes
TS k8s
| STATS rate_$1=rate(request), VALUES(host) BY _tsid
| STATS max(rate_$1) BY host=`VALUES(host)`
TS k8s | STATS avg(rate(request)) BY host
becomes
TS k8s
| STATS rate_$1=rate(request), VALUES(host) BY _tsid
| STATS sum(rate_$1), count(rate_$1) BY host=`VALUES(host)`
| EVAL `avg(rate(request))` = `sum(rate_$1)` / `count(rate_$1)`
| KEEP `avg(rate(request))`, host
TS k8s | STATS avg(rate(request)) BY host, bucket(@timestamp, 1minute)
becomes
TS k8s
| EVAL `bucket(@timestamp, 1minute)`=datetrunc(@timestamp, 1minute)
| STATS rate_$1=rate(request), VALUES(host) BY _tsid,`bucket(@timestamp, 1minute)`
| STATS sum=sum(rate_$1), count(rate_$1) BY host=`VALUES(host)`, `bucket(@timestamp, 1minute)`
| EVAL `avg(rate(request))` = `sum(rate_$1)` / `count(rate_$1)`
| KEEP `avg(rate(request))`, host, `bucket(@timestamp, 1minute)`
Non-rate aggregates will be rewritten as a pair of to_partial and from_partial aggregates, where the `to_partial` aggregates will be executed in the first pass and always produce an intermediate output regardless of the aggregate mode. The `from_partial` aggregates will be executed on the second pass and always receive intermediate output produced by `to_partial`.
Examples:
TS k8s | STATS max(rate(request)), max(memory_used) becomes:
TS k8s
| STATS rate_$1=rate(request), $p1=to_partial(max(memory_used)) BY _tsid
| STATS max(rate_$1), `max(memory_used)` = from_partial($p1, max($_))
TS k8s | STATS max(rate(request)), avg(memory_used) BY host
becomes
TS k8s
| STATS rate_$1=rate(request), $p1=to_partial(sum(memory_used)), $p2=to_partial(count(memory_used)), VALUES(host) BY _tsid
| STATS max(rate_$1), $sum=from_partial($p1, sum($_)), $count=from_partial($p2, count($_)) BY host=`VALUES(host)`
| EVAL `avg(memory_used)` = $sum / $count
| KEEP `max(rate(request))`, `avg(memory_used)`, host
TS k8s | STATS min(memory_used), sum(rate(request)) BY pod, bucket(@timestamp, 5m)
becomes
TS k8s
| EVAL `bucket(@timestamp, 5m)` = datetrunc(@timestamp, '5m')
| STATS rate_$1=rate(request), $p1=to_partial(min(memory_used)), VALUES(pod) BY _tsid, `bucket(@timestamp, 5m)`
| STATS sum(rate_$1), `min(memory_used)` = from_partial($p1, min($_)) BY pod=`VALUES(pod)`, `bucket(@timestamp, 5m)`
| KEEP `min(memory_used)`, `sum(rate_$1)`, pod, `bucket(@timestamp, 5m)`
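The `to_partial` / `from_partial` split for `avg(memory_used) BY host` can be pictured with a minimal Java sketch. It is an illustration under assumptions, not the ES|QL implementation: the `Sample` and `SumCount` types and the method names are hypothetical, and the intermediate state is modeled as a plain sum and count.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartialAvgSketch {
    /** One gauge sample; the record and its fields are hypothetical. */
    record Sample(String tsid, String host, double memoryUsed) {}

    /** Intermediate state for avg: a running sum and a count (hypothetical type). */
    record SumCount(double sum, long count) {
        SumCount merge(SumCount other) {
            return new SumCount(sum + other.sum, count + other.count);
        }
    }

    /** First pass, playing the role of to_partial: intermediate state per time series. */
    static Map<String, SumCount> toPartial(List<Sample> samples) {
        Map<String, SumCount> partials = new HashMap<>();
        for (Sample s : samples) {
            partials.merge(s.tsid(), new SumCount(s.memoryUsed(), 1), SumCount::merge);
        }
        return partials;
    }

    /** Second pass, playing the role of from_partial: merge states per host, then divide. */
    static Map<String, Double> avgByHost(List<Sample> samples) {
        Map<String, SumCount> partials = toPartial(samples);
        // Carry the host of each series along, similar to VALUES(host) BY _tsid.
        Map<String, String> hostOfSeries = new HashMap<>();
        for (Sample s : samples) {
            hostOfSeries.put(s.tsid(), s.host());
        }
        Map<String, SumCount> merged = new HashMap<>();
        for (Map.Entry<String, SumCount> e : partials.entrySet()) {
            merged.merge(hostOfSeries.get(e.getKey()), e.getValue(), SumCount::merge);
        }
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, SumCount> e : merged.entrySet()) {
            // Corresponds to the final EVAL step: avg = sum / count.
            result.put(e.getKey(), e.getValue().sum() / e.getValue().count());
        }
        return result;
    }
}
```

Merging sums and counts, rather than averaging per-series averages, keeps the result equal to the average over all samples of a host even when series contribute different numbers of samples.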
{agg}_over_time time-series aggregations will be rewritten in a similar way:
TS k8s | STATS sum(max_over_time(memory_usage)) BY host, bucket(@timestamp, 1minute)
becomes
FROM k8s
| STATS max_over_time_$1 = max(memory_usage), host_values=VALUES(host) BY _tsid, time_bucket=bucket(@timestamp, 1minute)
| STATS sum(max_over_time_$1) BY host_values, time_bucket
TS k8s | STATS sum(avg_over_time(memory_usage)) BY host, bucket(@timestamp, 1minute)
becomes
FROM k8s
| STATS avg_over_time_$1 = avg(memory_usage), host_values=VALUES(host) BY _tsid, time_bucket=bucket(@timestamp, 1minute)
| STATS sum(avg_over_time_$1) BY host_values, time_bucket
TS k8s | STATS max(rate(post_requests) + rate(get_requests)) BY host, bucket(@timestamp, 1minute)
becomes
FROM k8s
| STATS rate_$1=rate(post_requests), rate_$2=rate(get_requests), host_values=VALUES(host) BY _tsid, time_bucket=bucket(@timestamp, 1minute)
| STATS max(rate_$1 + rate_$2) BY host_values, time_bucket
Field Summary
Constructor Summary
Constructors -
Method Summary
Methods inherited from class org.elasticsearch.xpack.esql.optimizer.rules.logical.OptimizerRules.OptimizerRule
apply
Constructor Details
TranslateTimeSeriesAggregate
public TranslateTimeSeriesAggregate()
Method Details
rule
Specified by:
rule in class OptimizerRules.OptimizerRule<Aggregate>