Module org.elasticsearch.server
Package org.elasticsearch.index.analysis
Interface TokenFilterFactory
- All Known Subinterfaces:
NormalizingTokenFilterFactory
- All Known Implementing Classes:
AbstractTokenFilterFactory, HunspellTokenFilterFactory, ShingleTokenFilterFactory, ShingleTokenFilterFactory.Factory, StopTokenFilterFactory
public interface TokenFilterFactory
Field Summary
Modifier and Type | Field | Description
static final TokenFilterFactory | IDENTITY_FILTER | A TokenFilterFactory that does no filtering to its TokenStream
Method Summary
Modifier and Type | Method | Description
default boolean | breaksFastVectorHighlighter() | Does this analyzer mess up the OffsetAttributes in such a way as to break the FastVectorHighlighter? If this is true then the FastVectorHighlighter will attempt to work around the broken offsets.
org.apache.lucene.analysis.TokenStream | create(org.apache.lucene.analysis.TokenStream tokenStream) |
default AnalysisMode | getAnalysisMode() | Get the AnalysisMode this filter is allowed to be used in.
default TokenFilterFactory | getChainAwareTokenFilterFactory(IndexService.IndexCreationContext context, TokenizerFactory tokenizer, List<CharFilterFactory> charFilters, List<TokenFilterFactory> previousTokenFilters, Function<String, TokenFilterFactory> allFilters) | Rewrite the TokenFilterFactory to take into account the preceding analysis chain, or refer to other TokenFilterFactories. If the token filter is part of the definition of a ReloadableCustomAnalyzer, this function is called twice: once at index creation with IndexService.IndexCreationContext.CREATE_INDEX, and then later with IndexService.IndexCreationContext.RELOAD_ANALYZERS on shard recovery.
default String | getResourceName() | Get the name of the resource that this filter is based on.
default TokenFilterFactory | getSynonymFilter() | Return a version of this TokenFilterFactory appropriate for synonym parsing. Filters that should not be applied to synonyms (for example, those that produce multiple tokens) should throw an exception.
String | name() |
default org.apache.lucene.analysis.TokenStream | normalize(org.apache.lucene.analysis.TokenStream tokenStream) | Normalize a tokenStream for use in multi-term queries. The default implementation is a no-op.
Field Details
IDENTITY_FILTER
static final TokenFilterFactory IDENTITY_FILTER
A TokenFilterFactory that does no filtering to its TokenStream
Method Details
name
String name()
create
org.apache.lucene.analysis.TokenStream create(org.apache.lucene.analysis.TokenStream tokenStream)
normalize
default org.apache.lucene.analysis.TokenStream normalize(org.apache.lucene.analysis.TokenStream tokenStream)
Normalize a tokenStream for use in multi-term queries. The default implementation is a no-op.
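The relationship between create and normalize can be illustrated with a minimal, self-contained sketch. It does not use the real Elasticsearch or Lucene types: SimpleTokenFilterFactory, LowercaseFactory and StopFactory are hypothetical stand-ins, and a token stream is modeled as a plain List<String>. It assumes that a one-to-one filter such as lowercasing would reuse its create logic for normalization, while a filter that removes tokens would keep the no-op default.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Simplified sketch: a "token stream" is modeled as List<String>, not Lucene's TokenStream.
final class NormalizeSketch {

    /** Hypothetical, cut-down stand-in for TokenFilterFactory. */
    interface SimpleTokenFilterFactory {
        String name();

        List<String> create(List<String> tokens);

        // As documented above, the default normalize is a no-op.
        default List<String> normalize(List<String> tokens) {
            return tokens;
        }
    }

    /** Lowercasing maps one token to one token, so it is reused for normalization. */
    static final class LowercaseFactory implements SimpleTokenFilterFactory {
        @Override
        public String name() {
            return "lowercase_sketch";
        }

        @Override
        public List<String> create(List<String> tokens) {
            return tokens.stream()
                .map(token -> token.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
        }

        @Override
        public List<String> normalize(List<String> tokens) {
            return create(tokens);
        }
    }

    /** Stop-word removal drops tokens, so this factory keeps the no-op normalize. */
    static final class StopFactory implements SimpleTokenFilterFactory {
        private static final Set<String> STOP_WORDS = Set.of("the", "a");

        @Override
        public String name() {
            return "stop_sketch";
        }

        @Override
        public List<String> create(List<String> tokens) {
            return tokens.stream()
                .filter(token -> !STOP_WORDS.contains(token))
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        SimpleTokenFilterFactory lowercase = new LowercaseFactory();
        SimpleTokenFilterFactory stop = new StopFactory();
        System.out.println(lowercase.normalize(List.of("Elast*")));  // [elast*]
        System.out.println(stop.create(List.of("the", "index")));    // [index]
        System.out.println(stop.normalize(List.of("the", "index"))); // [the, index]
    }
}
```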
breaksFastVectorHighlighter
default boolean breaksFastVectorHighlighter()
Does this analyzer mess up the OffsetAttributes in such a way as to break the FastVectorHighlighter? If this is true then the FastVectorHighlighter will attempt to work around the broken offsets.
getChainAwareTokenFilterFactory
default TokenFilterFactory getChainAwareTokenFilterFactory(IndexService.IndexCreationContext context, TokenizerFactory tokenizer, List<CharFilterFactory> charFilters, List<TokenFilterFactory> previousTokenFilters, Function<String, TokenFilterFactory> allFilters)
Rewrite the TokenFilterFactory to take into account the preceding analysis chain, or refer to other TokenFilterFactories.
If the token filter is part of the definition of a ReloadableCustomAnalyzer, this function is called twice: once at index creation with IndexService.IndexCreationContext.CREATE_INDEX, and then later with IndexService.IndexCreationContext.RELOAD_ANALYZERS on shard recovery. The IndexService.IndexCreationContext.RELOAD_ANALYZERS context should be used to load expensive resources on a generic thread pool. See SynonymGraphFilterFactory for an example of how this context is used.
- Parameters:
context - the IndexCreationContext for the underlying index
tokenizer - the TokenizerFactory for the preceding chain
charFilters - any CharFilterFactories for the preceding chain
previousTokenFilters - a list of TokenFilterFactories in the preceding chain
allFilters - access to previously defined TokenFilterFactories
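A rough sketch of what "taking the preceding analysis chain into account" can mean is shown here. It uses hypothetical stand-in types rather than the real Elasticsearch API: token streams are List<String>, the tokenizer and charFilters parameters are omitted, the Context enum models only the two contexts named above, and the "return this" default is an assumption. ReplaceFactory is an invented example filter that runs its rule keys through the preceding filters so that they match the tokens it will actually receive. The sketch only uses previousTokenFilters; context and allFilters are accepted but ignored.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Simplified sketch: token streams are List<String>; tokenizer and charFilters are omitted.
final class ChainAwareSketch {

    // Only the two contexts named in the text are modeled here.
    enum Context { CREATE_INDEX, RELOAD_ANALYZERS }

    /** Hypothetical, cut-down stand-in for TokenFilterFactory. */
    interface SimpleTokenFilterFactory {
        List<String> create(List<String> tokens);

        // Assumption: a factory that does not care about the chain returns itself.
        default SimpleTokenFilterFactory getChainAwareTokenFilterFactory(
                Context context,
                List<SimpleTokenFilterFactory> previousTokenFilters,
                Function<String, SimpleTokenFilterFactory> allFilters) {
            return this;
        }
    }

    /** Replaces whole tokens according to fixed rules, e.g. "Quick" -> "fast". */
    static final class ReplaceFactory implements SimpleTokenFilterFactory {
        private final Map<String, String> rules;

        ReplaceFactory(Map<String, String> rules) {
            this.rules = rules;
        }

        @Override
        public List<String> create(List<String> tokens) {
            // Design choice of this sketch: the raw factory is unusable until specialized.
            throw new IllegalStateException("specialize with getChainAwareTokenFilterFactory first");
        }

        @Override
        public SimpleTokenFilterFactory getChainAwareTokenFilterFactory(
                Context context,
                List<SimpleTokenFilterFactory> previousTokenFilters,
                Function<String, SimpleTokenFilterFactory> allFilters) {
            // Run every rule key through the preceding filters so that the keys
            // match the tokens this filter will actually receive.
            Map<String, String> rewritten = new HashMap<>();
            for (Map.Entry<String, String> rule : rules.entrySet()) {
                List<String> key = List.of(rule.getKey());
                for (SimpleTokenFilterFactory previous : previousTokenFilters) {
                    key = previous.create(key);
                }
                if (key.size() == 1) {
                    rewritten.put(key.get(0), rule.getValue());
                }
            }
            return tokens -> tokens.stream()
                .map(token -> rewritten.getOrDefault(token, token))
                .collect(Collectors.toList());
        }
    }

    /** Builds a lowercase + replace chain and runs two tokens through it. */
    static List<String> demo() {
        SimpleTokenFilterFactory lowercase = tokens -> tokens.stream()
            .map(token -> token.toLowerCase(Locale.ROOT))
            .collect(Collectors.toList());
        SimpleTokenFilterFactory aware = new ReplaceFactory(Map.of("Quick", "fast"))
            .getChainAwareTokenFilterFactory(Context.CREATE_INDEX, List.of(lowercase), name -> null);
        return aware.create(lowercase.create(List.of("Quick", "Fox")));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [fast, fox]
    }
}
```

Without the rewrite, the rule key "Quick" would never match, because the preceding lowercase filter has already turned the token into "quick".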
getSynonymFilter
default TokenFilterFactory getSynonymFilter()
Return a version of this TokenFilterFactory appropriate for synonym parsing. Filters that should not be applied to synonyms (for example, those that produce multiple tokens) should throw an exception.
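The contract described for getSynonymFilter might look like the following sketch. All types are hypothetical stand-ins (List<String> instead of a TokenStream, SimpleTokenFilterFactory instead of the real interface), the "return this" default is an assumption, and the choice of IllegalArgumentException and its message is illustrative only, since the text above says just that an exception should be thrown.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch: token streams are List<String>, not Lucene's TokenStream.
final class SynonymFilterSketch {

    /** Hypothetical, cut-down stand-in for TokenFilterFactory. */
    interface SimpleTokenFilterFactory {
        String name();

        List<String> create(List<String> tokens);

        // Assumption: filters that are safe for synonym parsing return themselves.
        default SimpleTokenFilterFactory getSynonymFilter() {
            return this;
        }
    }

    /** Emits each token plus every adjacent pair, so one input yields several outputs. */
    static final class BigramFactory implements SimpleTokenFilterFactory {
        @Override
        public String name() {
            return "bigram_sketch";
        }

        @Override
        public List<String> create(List<String> tokens) {
            List<String> out = new ArrayList<>(tokens);
            for (int i = 0; i + 1 < tokens.size(); i++) {
                out.add(tokens.get(i) + " " + tokens.get(i + 1));
            }
            return out;
        }

        @Override
        public SimpleTokenFilterFactory getSynonymFilter() {
            // Exception type and message are illustrative choices for this sketch.
            throw new IllegalArgumentException(
                "Token filter [" + name() + "] cannot be used to parse synonyms");
        }
    }

    /** Builds the chain used to parse synonym rules, failing fast on unsuitable filters. */
    static List<SimpleTokenFilterFactory> synonymChain(List<SimpleTokenFilterFactory> filters) {
        List<SimpleTokenFilterFactory> chain = new ArrayList<>();
        for (SimpleTokenFilterFactory filter : filters) {
            chain.add(filter.getSynonymFilter());
        }
        return chain;
    }

    public static void main(String[] args) {
        SimpleTokenFilterFactory bigram = new BigramFactory();
        try {
            synonymChain(List.of(bigram));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```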
getAnalysisMode
default AnalysisMode getAnalysisMode()
Get the AnalysisMode this filter is allowed to be used in. The default is AnalysisMode.ALL. Instances need to override this method to define their own restrictions.
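As a hedged illustration of how such a restriction could be declared and checked, the sketch below uses a hypothetical Mode enum and SimpleTokenFilterFactory interface instead of the real AnalysisMode and TokenFilterFactory. Only ALL is taken from the text above; the constant names INDEX_TIME and SEARCH_TIME, the UpdateableFactory class and the allowed helper are assumptions made for the example.

```java
// Simplified sketch with stand-in types; not the real Elasticsearch classes.
final class AnalysisModeSketch {

    // ALL comes from the text above; INDEX_TIME and SEARCH_TIME are assumed names.
    enum Mode { INDEX_TIME, SEARCH_TIME, ALL }

    /** Hypothetical, cut-down stand-in for TokenFilterFactory. */
    interface SimpleTokenFilterFactory {
        String name();

        // As documented above, filters are allowed everywhere unless they say otherwise.
        default Mode getAnalysisMode() {
            return Mode.ALL;
        }
    }

    /** A hypothetical filter that restricts itself to search-time analysis. */
    static final class UpdateableFactory implements SimpleTokenFilterFactory {
        @Override
        public String name() {
            return "updateable_sketch";
        }

        @Override
        public Mode getAnalysisMode() {
            return Mode.SEARCH_TIME;
        }
    }

    /** Returns true if the filter's restriction permits use in the requested mode. */
    static boolean allowed(SimpleTokenFilterFactory filter, Mode requested) {
        Mode mode = filter.getAnalysisMode();
        return mode == Mode.ALL || mode == requested;
    }

    public static void main(String[] args) {
        SimpleTokenFilterFactory plain = () -> "plain_sketch";
        System.out.println(allowed(plain, Mode.INDEX_TIME));                    // true
        System.out.println(allowed(new UpdateableFactory(), Mode.SEARCH_TIME)); // true
        System.out.println(allowed(new UpdateableFactory(), Mode.INDEX_TIME));  // false
    }
}
```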
getResourceName
default String getResourceName()
Get the name of the resource that this filter is based on. Used to reload analyzers when this resource changes. For an example, see SynonymGraphTokenFilterFactory#getResourceName().
- Returns:
the name of the resource that this filter was loaded from, if any
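One way a caller could use the resource name to decide which filters to rebuild is sketched below. The types are hypothetical stand-ins, the null default for filters without a resource is an assumption, and ResourceBackedFactory, filtersToReload and the resource name "my_synonyms" are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch with stand-in types; not the real Elasticsearch classes.
final class ResourceReloadSketch {

    /** Hypothetical, cut-down stand-in for TokenFilterFactory. */
    interface SimpleTokenFilterFactory {
        String name();

        // Assumption: filters that are not backed by a resource return null.
        default String getResourceName() {
            return null;
        }
    }

    /** A hypothetical filter loaded from a named resource, such as a synonym set. */
    static final class ResourceBackedFactory implements SimpleTokenFilterFactory {
        private final String name;
        private final String resource;

        ResourceBackedFactory(String name, String resource) {
            this.name = name;
            this.resource = resource;
        }

        @Override
        public String name() {
            return name;
        }

        @Override
        public String getResourceName() {
            return resource;
        }
    }

    /** Names of the filters that would need rebuilding when the given resource changes. */
    static List<String> filtersToReload(List<SimpleTokenFilterFactory> filters, String changedResource) {
        List<String> result = new ArrayList<>();
        for (SimpleTokenFilterFactory filter : filters) {
            if (changedResource.equals(filter.getResourceName())) {
                result.add(filter.name());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SimpleTokenFilterFactory synonyms = new ResourceBackedFactory("syn_sketch", "my_synonyms");
        SimpleTokenFilterFactory lowercase = () -> "lowercase_sketch";
        List<SimpleTokenFilterFactory> filters = List.of(synonyms, lowercase);
        System.out.println(filtersToReload(filters, "my_synonyms")); // [syn_sketch]
    }
}
```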