ArangoDB v3.4 reached End of Life (EOL) and is no longer supported.
ArangoSearch Scorers
ArangoSearch Scorers are special functions that let you sort documents from a view by their score with respect to the analyzed fields.
Details about their usage in AQL can be found in the ArangoSearch SORT section.
- BM25: order results based on the BM25 algorithm
- TFIDF: order results based on the TFIDF algorithm
BM25() - Best Matching 25 Algorithm
IResearch provides a ‘bm25’ scorer implementing the BM25 algorithm. Optionally, the free parameters k and b of the algorithm, which are typically used for advanced tuning, can be specified as floating-point numbers.
BM25(doc, k, b)
- doc (document): must be emitted by FOR doc IN someView
- k (number, optional): calibrates the text term frequency scaling, the default is 1.2. A k value of 0 corresponds to a binary model (no term frequency), and a large value corresponds to using raw term frequency.
- b (number, optional): determines the scaling by the total text length, the default is 0.75.
  - b = 1 corresponds to fully scaling the term weight by the total text length
  - b = 0 corresponds to no length normalization
At the extreme values of the coefficient b, BM25 turns into the ranking functions known as BM11 (for b = 1) and BM15 (for b = 0).
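The effect of k and b can be illustrated with a minimal Python sketch of the term-frequency component of the textbook BM25 formula. This is an illustration under stated assumptions, not IResearch's actual implementation: the function name bm25_tf_weight and its arguments are hypothetical, and the inverse document frequency factor is left out so that only the two free parameters are in play.

```python
def bm25_tf_weight(tf, doc_len, avg_doc_len, k=1.2, b=0.75):
    """Term-frequency component of the textbook BM25 formula.

    Hypothetical helper for illustration only; the IDF factor is omitted
    and the exact formula used by IResearch may differ.
    """
    if tf <= 0:
        return 0.0
    # Length normalization: 1.0 when b = 0 (BM15-like, no normalization),
    # doc_len / avg_doc_len when b = 1 (BM11-like, full normalization).
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    # k controls saturation: k = 0 yields 1.0 for any matching term
    # (binary model), a very large k approaches tf / length_norm.
    return tf * (k + 1.0) / (tf + k * length_norm)


if __name__ == "__main__":
    # Compare a short and a long document for several parameter choices.
    for k, b in [(1.2, 0.75), (0.0, 0.75), (1.2, 0.0), (1.2, 1.0)]:
        short = bm25_tf_weight(tf=3, doc_len=50, avg_doc_len=100, k=k, b=b)
        long_ = bm25_tf_weight(tf=3, doc_len=500, avg_doc_len=100, k=k, b=b)
        print(f"k={k} b={b} short={short:.3f} long={long_:.3f}")
```

With b = 0 the short and the long document receive the same weight, while with b = 1 the longer document is penalized in proportion to its length relative to the average.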
TFIDF() - Term Frequency – Inverse Document Frequency Algorithm
Sorts documents using the term frequency–inverse document frequency algorithm.
TFIDF(doc, withNorms)
- doc (document): must be emitted by FOR doc IN someView
- withNorms (bool, optional): specifies whether norms should be used (via with-norms), the default is false
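To give a rough idea of what enabling norms changes, here is a small Python sketch of a textbook TF-IDF score with optional length normalization. The function tfidf_score, its arguments, the smoothed IDF variant, and the 1/sqrt(field length) norm are all assumptions chosen for illustration; the source does not state which exact formulas IResearch uses.

```python
import math


def tfidf_score(tf, docs_with_term, total_docs, field_len, with_norms=False):
    """Textbook TF-IDF score with optional length normalization.

    Hypothetical helper for illustration only; IResearch may use
    different TF, IDF, and norm definitions.
    """
    if tf <= 0:
        return 0.0
    # Smoothed inverse document frequency (assumed variant).
    idf = math.log((total_docs + 1) / (docs_with_term + 1)) + 1.0
    score = tf * idf
    if with_norms:
        # Assumed Lucene-style norm: matches in longer fields count less.
        score /= math.sqrt(field_len)
    return score


if __name__ == "__main__":
    for field_len in (50, 500):
        plain = tfidf_score(2, 10, 1000, field_len)
        normed = tfidf_score(2, 10, 1000, field_len, with_norms=True)
        print(f"field_len={field_len} plain={plain:.3f} normed={normed:.3f}")
```

Without norms the score depends only on how often the term occurs and how rare it is; with norms, the same match in a shorter field ranks higher.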