Package org.elasticsearch.xpack.esql.expression.function.scalar


Functions that take a row of data and produce a row of data without holding any state between rows. This includes both the ScalarFunction subclass to link into the ESQL core infrastructure and the EvalOperator.ExpressionEvaluator implementation to run the actual function.
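
The split described above can be pictured with a small standalone sketch. It does not use the real ESQL classes: RowEvaluator is a hypothetical stand-in for the role EvalOperator.ExpressionEvaluator plays, DoubleIt is a made-up function, and the loop is written by hand only for illustration. The point is that the per-row logic is a pure method that holds no state between rows, and the evaluator just applies it to each row in turn.

```java
import java.util.Arrays;

// Hypothetical stand-in for the role EvalOperator.ExpressionEvaluator plays:
// turn a column of input rows into a column of output rows.
interface RowEvaluator {
    long[] eval(long[] input);
}

// Hypothetical function class; a real one would extend ScalarFunction.
final class DoubleIt {
    // The per-row logic: a pure function of its argument, no fields touched.
    static long process(long value) {
        return value * 2;
    }

    // Written by hand here purely for illustration.
    static RowEvaluator toEvaluator() {
        return input -> {
            long[] output = new long[input.length];
            for (int row = 0; row < input.length; row++) {
                output[row] = process(input[row]);
            }
            return output;
        };
    }
}

final class DoubleItDemo {
    public static void main(String[] args) {
        long[] result = DoubleIt.toEvaluator().eval(new long[] { 1, 2, 3 });
        System.out.println(Arrays.toString(result)); // prints [2, 4, 6]
    }
}
```

Because process reads nothing but its argument, rows can be evaluated in any order or batch size without changing the result.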

Guide to adding a new function

Adding functions is fairly easy and should be fun! This is a step-by-step list of how to do it.

  1. Fork the Elasticsearch repo.
  2. Clone your fork locally.
  3. Add Elastic’s remote. It should look a little like:
    
     [remote "elastic"]
     url = git@github.com:elastic/elasticsearch.git
     fetch = +refs/heads/*:refs/remotes/elastic/*
     [remote "nik9000"]
     url = git@github.com:nik9000/elasticsearch.git
     fetch = +refs/heads/*:refs/remotes/nik9000/*
             
  4. Feel free to use git as a scratch pad. We're going to squash all commits before merging and will only keep the PR subject line and description in the commit message.
  5. Open Elasticsearch in IntelliJ.
  6. Run the csv tests (see x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/CsvTests.java) from within IntelliJ or, alternatively, via Gradle: ./gradlew :x-pack:plugin:esql:test --tests "org.elasticsearch.xpack.esql.CsvTests". IntelliJ will take a few minutes to compile everything, but the test itself should take only a few seconds. This is a fast path to running ESQL’s integration tests.
  7. Pick one of the csv-spec files in x-pack/plugin/esql/qa/testFixtures/src/main/resources/ and add a test for the function you want to write. These files are roughly themed but there isn’t a strong guiding principle in the organization.
  8. Rerun the CsvTests and watch your new test fail. Yay, TDD doing its job.
  9. Find a function in this package similar to the one you are working on and copy it to build yours. There’s some ceremony required in each function class to make it constant foldable and return the right types. Take a stab at these, but don’t worry too much about getting it right. Your function might extend one of several abstract base classes. All of those are fine for this guide, but some might have special instructions called out later. Known good base classes:
  10. There are also methods annotated with Evaluator that contain the actual inner implementation of the function. They are usually named "process" or "processInts" or "processBar". Modify those to look right and run the CsvTests again. This should generate an EvalOperator.ExpressionEvaluator implementation calling the method annotated with Evaluator. To make it work with IntelliJ, also click Build->Recompile 'FunctionName.java'. Please commit the generated evaluator before submitting your PR.

    NOTE: The function you copied may have a method annotated with ConvertEvaluator or MvEvaluator instead of Evaluator. Those do similar things and the instructions should still work for you regardless. If your function contains an implementation of EvalOperator.ExpressionEvaluator written by hand then please stop and ask for help. This is not a good first function.

    NOTE 2: Regardless of which annotation is on your "process" method you can learn more about the options for generating code from the javadocs on those annotations.

  11. Once your evaluator is generated you can have your function return it, generally by implementing EvaluatorMapper.toEvaluator(org.elasticsearch.xpack.esql.evaluator.mapper.EvaluatorMapper.ToEvaluator). It’s possible that your abstract base class implements that function and will need you to implement something else:
  12. Add your function to EsqlFunctionRegistry. This links it into the language and META FUNCTIONS.
  13. Implement serialization for your function by implementing NamedWriteable.getWriteableName(), Writeable.writeTo(org.elasticsearch.common.io.stream.StreamOutput), and a deserializing constructor. Then add a NamedWriteableRegistry.Entry constant and register it. To register it, look for a method like ScalarFunctionWritables.getNamedWriteables() in your function’s class hierarchy. Keep going up until you hit a function with that name. Then add your new "ENTRY" constant to the list it returns.
  14. Rerun the CsvTests. They should find your function and maybe even pass. Add a few more tests in the csv-spec tests. They run quickly, so it isn’t a big deal to have half a dozen of them per function. In fact, it’s useful to add more complex combinations of things here, just to catch any accidental strange interactions. For example, have your function take its input from an index like FROM employees | EVAL foo=MY_FUNCTION(emp_no). It’s probably a good idea to have your function passed as a parameter to another function like EVAL foo=MOST(0, MY_FUNCTION(emp_no)). And it is likely useful to try the reverse like EVAL foo=MY_FUNCTION(MOST(languages + 10000, emp_no)).
  15. Now it’s time to make a unit test! The infrastructure for these is under some flux at the moment, but it’s good to extend AbstractScalarFunctionTestCase. All of these tests are parameterized, so expect to spend some time finding good parameters. Also add serialization tests that extend AbstractExpressionSerializationTests<>. And also add type error tests that extend ErrorsForCasesWithoutExamplesTestCase.
  16. Once you are happy with the tests run the auto formatter: ./gradlew -p x-pack/plugin/esql/ spotlessApply
  17. Now you can run all of the ESQL tests like CI: ./gradlew -p x-pack/plugin/esql/ test
  18. We need to tag the release in which the function becomes available so we can generate docs in the next step! On the constructor of your function class you very likely have an annotation @FunctionInfo. Add the attribute appliesTo with availability information. For example, a GA function available in 9.2.0 would be tagged as { @FunctionAppliesTo(lifeCycle = FunctionAppliesToLifecycle.GA, version = "9.2.0") }
  19. Now it’s time to generate some docs! Actually, running the tests in the example above should have done it for you. The generated files are
    • docs/reference/query-languages/esql/_snippets/functions/description/myfunction.md
    • docs/reference/query-languages/esql/_snippets/functions/examples/myfunction.md
    • docs/reference/query-languages/esql/_snippets/functions/layout/myfunction.md
    • docs/reference/query-languages/esql/_snippets/functions/parameters/myfunction.md
    • docs/reference/query-languages/esql/_snippets/functions/types/myfunction.md
    • docs/reference/query-languages/esql/kibana/definition/functions/myfunction.json
    • docs/reference/query-languages/esql/kibana/docs/functions/myfunction.md
    Make sure to commit them. Add a reference to docs/reference/query-languages/esql/_snippets/functions/layout/myfunction.md in the function list docs. There are plenty of examples of how to reference those files. For example, if you are writing a Math function, you will want to list it in docs/reference/query-languages/esql/functions-operators/math-functions.md.

    You can generate the docs for just your function by running ./gradlew :x-pack:plugin:esql:test -Dtests.class='*SinTests'. It’s just running your new unit test. You should see something like:

    
     > Task :x-pack:plugin:esql:test
     ESQL Docs: Only files related to [sin.md], patching them into place
             
  20. Build the docs by cloning the docs repo and running:
    
     ../docs/build_docs --doc docs/reference/index.md --open --chunk 1
              
    from the elasticsearch directory. The first time you run the docs build it does a bunch of things with Docker to get itself ready. Hopefully you can sit back and watch the show. It won’t need to do it a second time unless some poor soul updates the Dockerfile in the docs repo.
  21. When it finishes building it'll open a browser window. Go to the functions page to see your function in the list and follow its link to get to the page you built. Make sure it looks ok.
  22. Let’s finish up the code by making the tests backwards compatible. Since this is a new feature we just have to convince the tests not to run in a cluster that includes older versions of Elasticsearch. We do that with a capability on the REST handler. ESQL has a ton of capabilities so we list them all in EsqlCapabilities. Add a new one for your function. Now add something like required_capability: my_function to all of your csv-spec tests. Run those csv-spec tests as integration tests to double check that they run on the main branch.

    NOTE: You may notice tests gated based on Elasticsearch version. This was the old way of doing things. Now, we use specific capabilities for each function.
  23. Open the PR. The subject and description of the PR are important because those'll turn into the commit message we see in the commit history. Good PR descriptions make me very happy. But functions don’t need an essay.
  24. Add the >enhancement and :Analytics/ES|QL tags if you are able. Request a review if you can, probably from one of the folks that GitHub proposes to you.
  25. CI might fail for random looking reasons. The first thing you should do is merge main into your PR branch. That’s usually just:
    
     git checkout main && git pull elastic main && git checkout mybranch && git merge main
             
    Don’t worry about the commit message. It'll get squashed away in the merge.
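
As a reference for step 13, the shape of the serialization work (a name, a writeTo method, and a deserializing constructor that reads fields in the order they were written) can be sketched without any Elasticsearch classes. Everything here is an assumption made for illustration: java.io.DataOutputStream and DataInputStream stand in for StreamOutput and StreamInput, and MyFunction with its single int field is hypothetical. Real functions serialize their child expressions rather than a plain int.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical function with one int field standing in for a real argument.
final class MyFunction {
    static final String WRITEABLE_NAME = "MyFunction"; // hypothetical name
    final int scale;

    MyFunction(int scale) {
        this.scale = scale;
    }

    // Deserializing constructor: reads fields in the order writeTo wrote them.
    MyFunction(DataInputStream in) throws IOException {
        this(in.readInt());
    }

    // Stand-in for Writeable.writeTo: DataOutputStream replaces StreamOutput here.
    void writeTo(DataOutputStream out) throws IOException {
        out.writeInt(scale);
    }

    String getWriteableName() {
        return WRITEABLE_NAME;
    }
}

final class RoundTripDemo {
    // Serialize to bytes and read back, the property a serialization test checks.
    static MyFunction roundTrip(MyFunction original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                original.writeTo(out);
            }
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return new MyFunction(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        MyFunction copy = roundTrip(new MyFunction(7));
        System.out.println(copy.getWriteableName() + " scale=" + copy.scale); // prints MyFunction scale=7
    }
}
```

The serialization tests from step 15 exercise this same round trip against the real stream classes, so a mismatch between writeTo and the deserializing constructor tends to show up there first.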