Class TranslateTimeSeriesAggregate


public final class TranslateTimeSeriesAggregate extends OptimizerRules.OptimizerRule<Aggregate>
Time-series aggregation is special because it must be computed per time series, regardless of the grouping keys. The keys must be `_tsid` or a pair of `_tsid` and `time_bucket`. To support user-defined grouping keys, we first execute the rate aggregation using the time-series keys, then perform another aggregation on the resulting rate using the user-specified keys.

This class translates the aggregates in time-series aggregations into standard aggregates. This approach avoids introducing new plans and operators specifically for time-series aggregations.
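
As a rough illustration of the two-pass idea described above, the following Python sketch simulates `max(rate(request)) BY host` over an in-memory list of samples. It is not the optimizer rule itself: the sample data, the function names, and the simplified rate (counter delta over elapsed time, with no counter-reset handling) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical samples: (tsid, host, timestamp_seconds, counter_value).
SAMPLES = [
    ("ts-a", "host-1", 0, 0.0),
    ("ts-a", "host-1", 10, 50.0),
    ("ts-b", "host-1", 0, 0.0),
    ("ts-b", "host-1", 10, 20.0),
    ("ts-c", "host-2", 0, 0.0),
    ("ts-c", "host-2", 10, 30.0),
]


def simple_rate(points):
    """Simplified rate (an assumption): counter delta over elapsed time.

    Ignores counter resets and needs at least two distinct timestamps.
    """
    points = sorted(points)
    (t0, v0), (t1, v1) = points[0], points[-1]
    return (v1 - v0) / (t1 - t0)


def max_rate_by_host(samples):
    # First pass: group by _tsid, compute the rate, and carry the host along
    # (the role VALUES(host) plays in the rewritten plan).
    per_series = defaultdict(list)
    host_of = {}
    for tsid, host, ts, value in samples:
        per_series[tsid].append((ts, value))
        host_of[tsid] = host
    rates = {tsid: simple_rate(points) for tsid, points in per_series.items()}

    # Second pass: aggregate the per-series rates by the user-defined key.
    result = {}
    for tsid, rate in rates.items():
        host = host_of[tsid]
        result[host] = max(result.get(host, rate), rate)
    return result
```

Calling `max_rate_by_host(SAMPLES)` first reduces each series to a single rate and only then applies `max` per host, which mirrors the order of the two `STATS` commands in the rewritten queries.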

Examples:

 TS k8s | STATS max(rate(request))

 becomes

 TS k8s
 | STATS rate_$1 = rate(request) BY _tsid
 | STATS max(rate_$1)

 TS k8s | STATS max(rate(request)) BY host

 becomes

 TS k8s
 | STATS rate_$1=rate(request), VALUES(host) BY _tsid
 | STATS max(rate_$1) BY host=`VALUES(host)`

 TS k8s | STATS avg(rate(request)) BY host

 becomes

 TS k8s
 | STATS rate_$1=rate(request), VALUES(host) BY _tsid
 | STATS sum(rate_$1), count(rate_$1) BY host=`VALUES(host)`
 | EVAL `avg(rate(request))` = `sum(rate_$1)` / `count(rate_$1)`
 | KEEP `avg(rate(request))`, host

 TS k8s | STATS avg(rate(request)) BY host, bucket(@timestamp, 1minute)

 becomes

 TS k8s
 | EVAL  `bucket(@timestamp, 1minute)`=datetrunc(@timestamp, 1minute)
 | STATS rate_$1=rate(request), VALUES(host) BY _tsid,`bucket(@timestamp, 1minute)`
 | STATS sum(rate_$1), count(rate_$1) BY host=`VALUES(host)`, `bucket(@timestamp, 1minute)`
 | EVAL `avg(rate(request))` = `sum(rate_$1)` / `count(rate_$1)`
 | KEEP `avg(rate(request))`, host, `bucket(@timestamp, 1minute)`
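
The `avg` examples above split the average into a `sum` and a `count` in the second pass and divide them in a later `EVAL`. A minimal Python sketch of that decomposition follows, assuming the first pass has already produced one rate per series; the function name and row layout are hypothetical.

```python
from collections import defaultdict


def avg_rate_by_host_and_bucket(rows):
    """rows: iterable of (host, bucket, per_series_rate), i.e. first-pass output."""
    # Corresponds to: STATS sum(rate_$1), count(rate_$1) BY host, bucket
    sums = defaultdict(float)
    counts = defaultdict(int)
    for host, bucket, rate in rows:
        sums[(host, bucket)] += rate
        counts[(host, bucket)] += 1
    # Corresponds to: EVAL avg = sum / count
    return {key: sums[key] / counts[key] for key in sums}
```

For example, the rows `("host-1", 0, 5.0)` and `("host-1", 0, 2.0)` yield an average of `3.5` for the key `("host-1", 0)`.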
 
Non-rate aggregates will be rewritten as a pair of `to_partial` and `from_partial` aggregates. The `to_partial` aggregates are executed in the first pass and always produce intermediate output, regardless of the aggregate mode. The `from_partial` aggregates are executed in the second pass and always receive the intermediate output produced by `to_partial`. Examples:

 TS k8s | STATS max(rate(request)), max(memory_used)

 becomes

 TS k8s
 | STATS rate_$1=rate(request), $p1=to_partial(max(memory_used)) BY _tsid
 | STATS max(rate_$1), `max(memory_used)` = from_partial($p1, max($_))

 TS k8s | STATS max(rate(request)), avg(memory_used) BY host

 becomes

 TS k8s
 | STATS rate_$1=rate(request), $p1=to_partial(sum(memory_used)), $p2=to_partial(count(memory_used)), VALUES(host) BY _tsid
 | STATS max(rate_$1), $sum=from_partial($p1, sum($_)), $count=from_partial($p2, count($_)) BY host=`VALUES(host)`
 | EVAL `avg(memory_used)` = $sum / $count
 | KEEP `max(rate(request))`, `avg(memory_used)`, host

 TS k8s | STATS min(memory_used), sum(rate(request)) BY pod, bucket(@timestamp, 5m)

 becomes

 TS k8s
 | EVAL `bucket(@timestamp, 5m)` = datetrunc(@timestamp, '5m')
 | STATS rate_$1=rate(request), $p1=to_partial(min(memory_used)), VALUES(pod) BY _tsid, `bucket(@timestamp, 5m)`
 | STATS sum(rate_$1), `min(memory_used)` = from_partial($p1, min($_)) BY pod=`VALUES(pod)`, `bucket(@timestamp, 5m)`
 | KEEP `min(memory_used)`, `sum(rate_$1)`, pod, `bucket(@timestamp, 5m)`
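
To make the `to_partial` / `from_partial` split more concrete, here is a hedged Python sketch of `avg(memory_used) BY host`. The two functions are illustrative stand-ins named after the aggregates above, not the actual implementations, and the `(sum, count)` state layout and sample data are assumptions.

```python
from collections import defaultdict

# Hypothetical samples: (tsid, host, memory_used).
SAMPLES = [
    ("ts-a", "host-1", 10.0),
    ("ts-a", "host-1", 20.0),
    ("ts-a", "host-1", 30.0),
    ("ts-b", "host-1", 100.0),
    ("ts-c", "host-2", 7.0),
]


def to_partial(samples):
    """First pass: one intermediate (sum, count, host) state per _tsid."""
    partials = {}
    for tsid, host, value in samples:
        total, count, _ = partials.get(tsid, (0.0, 0, host))
        partials[tsid] = (total + value, count + 1, host)
    return partials


def from_partial(partials):
    """Second pass: merge intermediate states by host, then finalize the average."""
    merged = defaultdict(lambda: [0.0, 0])
    for total, count, host in partials.values():
        merged[host][0] += total
        merged[host][1] += count
    return {host: total / count for host, (total, count) in merged.items()}
```

In this sketch, `from_partial(to_partial(SAMPLES))` gives `40.0` for `host-1`, the same value as averaging the four raw samples directly. Averaging the per-series averages instead would give `60.0`, which is why the first pass emits intermediate state rather than a finished result.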

 {agg}_over_time time-series aggregations are rewritten in a similar way:

 TS k8s | STATS sum(max_over_time(memory_usage)) BY host, bucket(@timestamp, 1minute)

 becomes

 FROM k8s
 | STATS max_over_time_$1 = max(memory_usage), host_values=VALUES(host) BY _tsid, time_bucket=bucket(@timestamp, 1minute)
 | STATS sum(max_over_time_$1) BY host_values, time_bucket


 TS k8s | STATS sum(avg_over_time(memory_usage)) BY host, bucket(@timestamp, 1minute)

 becomes

 FROM k8s
 | STATS avg_over_time_$1 = avg(memory_usage), host_values=VALUES(host) BY _tsid, time_bucket=bucket(@timestamp, 1minute)
 | STATS sum(avg_over_time_$1) BY host_values, time_bucket
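
The `{agg}_over_time` rewrite can be sketched the same way. The Python below simulates `sum(max_over_time(memory_usage)) BY host, bucket(@timestamp, 1minute)`; the sample data, the function name, and the bucketing by integer seconds are assumptions for illustration only.

```python
from collections import defaultdict

BUCKET_SECONDS = 60  # stands in for bucket(@timestamp, 1minute)

# Hypothetical samples: (tsid, host, timestamp_seconds, memory_usage).
SAMPLES = [
    ("ts-a", "host-1", 5, 10.0),
    ("ts-a", "host-1", 30, 40.0),
    ("ts-a", "host-1", 70, 15.0),
    ("ts-b", "host-1", 10, 25.0),
    ("ts-b", "host-1", 65, 5.0),
    ("ts-c", "host-2", 20, 8.0),
]


def sum_of_max_over_time(samples):
    # First pass: max(memory_usage) BY _tsid, time_bucket; host is carried along.
    inner = {}
    for tsid, host, ts, value in samples:
        bucket = ts - ts % BUCKET_SECONDS
        key = (tsid, host, bucket)
        inner[key] = max(inner.get(key, value), value)

    # Second pass: sum the per-series maxima BY host, time_bucket.
    outer = defaultdict(float)
    for (_tsid, host, bucket), series_max in inner.items():
        outer[(host, bucket)] += series_max
    return dict(outer)
```

With the samples above, the bucket starting at `0` for `host-1` sums the maxima of `ts-a` (`40.0`) and `ts-b` (`25.0`), so each series contributes exactly one value per bucket to the outer aggregate.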

 TS k8s | STATS max(rate(post_requests) + rate(get_requests)) BY host, bucket(@timestamp, 1minute)

 becomes

 FROM k8s
 | STATS rate_$1=rate(post_requests), rate_$2=rate(get_requests), host_values=VALUES(host) BY _tsid, time_bucket=bucket(@timestamp, 1minute)
 | STATS max(rate_$1 + rate_$2) BY host_values, time_bucket