Let's discuss the upcoming FlexSearch v0.8 here: https://github.com/nextapps-de/flexsearch/discussions/415
Build | File | CDN |
flexsearch.bundle.js | Download | https://rawcdn.githack.com/nextapps-de/flexsearch/0.7.31/dist/flexsearch.bundle.js |
flexsearch.light.js | Download | https://rawcdn.githack.com/nextapps-de/flexsearch/0.7.31/dist/flexsearch.light.js |
flexsearch.compact.js | Download | https://rawcdn.githack.com/nextapps-de/flexsearch/0.7.31/dist/flexsearch.compact.js |
flexsearch.es5.js * | Download | https://rawcdn.githack.com/nextapps-de/flexsearch/0.7.31/dist/flexsearch.es5.js |
ES6 Modules | Download | The /dist/module/ folder of this GitHub repository |
Feature | flexsearch.bundle.js | flexsearch.compact.js | flexsearch.light.js |
Presets | ✓ | ✓ | - |
Async Search | ✓ | ✓ | - |
Workers (Web + Node.js) | ✓ | - | - |
Contextual Indexes | ✓ | ✓ | ✓ |
Index Documents (Field-Search) | ✓ | ✓ | - |
Document Store | ✓ | ✓ | - |
Partial Matching | ✓ | ✓ | ✓ |
Relevance Scoring | ✓ | ✓ | ✓ |
Auto-Balanced Cache by Popularity | ✓ | - | - |
Tags | ✓ | - | - |
Suggestions | ✓ | ✓ | - |
Phonetic Matching | ✓ | ✓ | - |
Customizable Charset/Language (Matcher, Encoder, Tokenizer, Stemmer, Filter, Split, RTL) | ✓ | ✓ | ✓ |
Export / Import Indexes | ✓ | - | - |
File Size (gzip) | 6.8 kb | 5.3 kb | 2.9 kb |
Rank | Library | Memory | Query (Single Term) | Query (Multi Term) | Query (Long) | Query (Dupes) | Query (Not Found) |
1 | FlexSearch | 17 | 7084129 | 1586856 | 511585 | 2017142 | 3202006 |
2 | JSii | 27 | 6564 | 158149 | 61290 | 95098 | 534109 |
3 | Wade | 424 | 20471 | 78780 | 16693 | 225824 | 213754 |
4 | JS Search | 193 | 8221 | 64034 | 10377 | 95830 | 167605 |
5 | Elasticlunr.js | 646 | 5412 | 7573 | 2865 | 23786 | 13982 |
6 | BulkSearch | 1021 | 3069 | 3141 | 3333 | 3265 | 21825569 |
7 | MiniSearch | 24348 | 4406 | 10945 | 72 | 39989 | 17624 |
8 | bm25 | 15719 | 1429 | 789 | 366 | 884 | 1823 |
9 | Lunr.js | 2219 | 255 | 271 | 272 | 266 | 267 |
10 | FuzzySearch | 157373 | 53 | 38 | 15 | 32 | 43 |
11 | Fuse | 7641904 | 6 | 2 | 1 | 2 | 3 |
Option | Values | Description | Default |
preset | "memory" "performance" "match" "score" "default" | The configuration profile as a shortcut or as a base for your custom settings. | "default" |
tokenize | "strict" "forward" "reverse" "full" | The indexing mode (tokenizer). Choose one of the built-ins or pass a custom tokenizer function. | "strict" |
cache | Boolean Number | Enables or disables the cache and/or sets the capacity of cached entries. When a number is passed as the limit, the cache automatically balances stored entries by their popularity. Note: when just "true" is used, the cache has no limit and grows unbounded. | false |
resolution | Number | Sets the scoring resolution (default: 9). | 9 |
context | Boolean Context Options | Enable/Disable contextual indexing. When passing "true" as value it will take the default values for the context. | false |
optimize | Boolean | When enabled it uses a memory-optimized stack flow for the index. | true |
boost | function(arr, str, int) => float | A custom boost function used when indexing contents. The function has the signature Function(words[], term, index) => Float. Its three parameters are an array of all words, the current term, and the index at which the term is placed in the word array. You can apply your own calculation, e.g. the occurrences of a term, and return this factor (<1 lowers relevance, >1 increases relevance). Note: this feature is currently limited to the tokenizer "strict". | null |
Language-specific Options and Encoding: | | | |
charset | Charset Payload String (key) | Provide a custom charset payload or pass one of the keys of built-in charsets. | "latin" |
language | Language Payload String (key) | Provide a custom language payload or pass in language shorthand flag (ISO-3166) of built-in languages. | null |
encode | false "default" "simple" "balance" "advanced" "extra" function(str) => [words] | The encoding type. Choose one of the built-ins or pass a custom encoding function. | "default" |
stemmer | false String Function | | false |
filter | false String Function | | false |
matcher | false String Function | | false |
Additional Options for Document Indexes: | | | |
worker | Boolean | Enable/Disable and set count of running worker threads. | false |
document | Document Descriptor | Includes definitions for the document index and storage. | |
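The boost option in the table above takes a function with the signature (words[], term, index) => Float. The following is a minimal sketch of such a function, assuming a simple frequency-based rule; the name `boostByFrequency` and the concrete factors (0.25 per extra occurrence, capped at 2) are hypothetical choices made for illustration, not values taken from the library.

```javascript
// Hypothetical boost function following the documented signature
// (words[], term, index) => float. The weighting rule is an assumption.
function boostByFrequency(words, term, index) {
  // `index` (position of the term in `words`) is part of the signature
  // but is not used by this particular rule.
  let occurrences = 0;
  for (const word of words) {
    if (word === term) occurrences++;
  }
  // One occurrence stays neutral (1.0); every further occurrence
  // adds 0.25, capped at a factor of 2.
  return Math.min(2, 1 + 0.25 * (occurrences - 1));
}

// Options object carrying the boost function. The table notes that
// boost is limited to the "strict" tokenizer.
const boostedOptions = { tokenize: "strict", boost: boostByFrequency };
```

A return value below 1 would lower the relevance of a term instead; this sketch only ever returns factors between 1 and 2.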
Option | Values | Description | Default |
resolution | Number | Sets the scoring resolution for the context (default: 1). | 1 |
depth | false Number | Enable/Disable contextual indexing and also set the contextual distance of relevance. Depth is the maximum number of words/tokens a term may be away to still be considered relevant. | 1 |
bidirectional | Boolean | Sets bidirectional search result. If enabled and the source text contains "red hat", it will be found for queries "red hat" and "hat red". | true |
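The index options and the context options from the two tables above combine into a single options object. The sketch below only builds that object; the concrete values (a cache limit of 100, a depth of 2) are arbitrary examples, and the commented-out lines show how it would presumably be handed to the library once it is loaded.

```javascript
// Example options object built from the option names in the tables above.
// The values are illustrative assumptions, not recommendations.
const indexOptions = {
  tokenize: "strict",   // index whole words
  cache: 100,           // numeric limit: cache is balanced by popularity
  resolution: 9,        // scoring resolution of the index
  optimize: true,       // memory-optimized stack flow
  context: {
    resolution: 1,      // scoring resolution of the context
    depth: 2,           // terms up to 2 tokens apart count as related
    bidirectional: true // "red hat" also matches the query "hat red"
  }
};

// Assumed usage with the library available:
// const { Index } = require("flexsearch");
// const index = new Index(indexOptions);
```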
Option | Values | Description | Default |
id | String | | "id" |
tag | false String | | "tag" |
index | String Array<String> Array<Object> | | |
store | Boolean String Array<String> | | false |
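A document descriptor built from the four options above could look like the following sketch. The field names `title` and `content` and the sample document are hypothetical; only the keys `id`, `tag`, `index` and `store` come from the table.

```javascript
// Document descriptor using the option names from the table above.
// "title" and "content" are made-up field names for this example.
const documentOptions = {
  document: {
    id: "id",                    // field that holds the document ID
    tag: "tag",                  // field that holds the tag(s)
    index: ["title", "content"], // fields that get indexed
    store: ["title"]             // fields kept in the document store
  }
};

// A sample document matching the descriptor.
const sampleDoc = {
  id: 1,
  tag: "news",
  title: "Release notes",
  content: "Details about the latest release"
};

// Assumed usage with the library available:
// const { Document } = require("flexsearch");
// const docIndex = new Document(documentOptions);
// docIndex.add(sampleDoc);
```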
Option | Values | Description | Default |
split | false RegExp String | The rule to split words when using a non-custom tokenizer (built-ins, e.g. "forward"). Use a string/char or a regular expression. | /[\W_]+/ |
rtl | Boolean | Enables Right-To-Left encoding. | false |
encode | function(str) => [words] | The custom encoding function. | /lang/latin/default.js |
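A custom encoder only has to map a string to an array of words, as the signature function(str) => [words] states. The sketch below is one possible encoder, assuming you want lower-casing, removal of diacritics and the split rule /[\W_]+/ from the table; the name `customEncode` is hypothetical and the function is not a copy of any built-in encoder.

```javascript
// Hypothetical custom encoder: string in, array of words out.
function customEncode(str) {
  return str
    .toLowerCase()
    // Decompose accented characters, then drop the combining marks,
    // so that "ö" becomes "o" before splitting.
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    // Split on any run of non-word characters or underscores.
    .split(/[\W_]+/)
    // Remove empty strings produced by leading/trailing separators.
    .filter(Boolean);
}

// Options object passing the function through the encode option.
const encoderOptions = { encode: customEncode };
```

Stripping the diacritics before splitting matters here: in JavaScript, \W also matches non-ASCII letters, so splitting "björn" directly on /[\W_]+/ would cut the word at the "ö".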
Option | Values | Description |
stemmer | false String Function | Disable or pass in language shorthand flag (ISO-3166) or a custom object. |
filter | false String Function | Disable or pass in language shorthand flag (ISO-3166) or a custom array. |
matcher | false String Function | Disable or pass in language shorthand flag (ISO-3166) or a custom array. |
Option | Values | Description | Default |
limit | number | Sets the limit of results. | 100 |
offset | number | Apply offset (skip items). | 0 |
suggest | Boolean | Enables suggestions in results. | false |
Option | Values | Description | Default |
index | String Array<String> Array<Object> | Sets the document fields which should be searched. When no field is set, all fields will be searched. Custom options per field are also supported. | |
tag | String Array<String> | Sets the tags by which the results should be filtered. | false |
enrich | Boolean | Enrich IDs from the results with the corresponding documents. | false |
bool | "and" "or" | Sets the logical operator used when searching through multiple fields or tags. | "or" |
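The search options from the two tables above are passed as a second argument next to the query. The sketch below only assembles the option objects; the field names, the tag value and the numbers are hypothetical, and the commented-out calls assume an `index` and a `docIndex` instance that are not created here.

```javascript
// Search options for a plain index: third page of 10 results,
// with suggestions enabled. Values are illustrative.
const indexSearchOptions = { limit: 10, offset: 20, suggest: true };

// Search options for a document index. "title", "content" and "news"
// are made-up names for this example.
const documentSearchOptions = {
  index: ["title", "content"], // restrict the search to these fields
  tag: "news",                 // tag option from the table above
  enrich: true,                // return stored documents with the IDs
  bool: "and",                 // combine fields with a logical AND
  limit: 10
};

// Assumed usage with existing instances:
// index.search("query", indexSearchOptions);
// docIndex.search("query", documentSearchOptions);
```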
Option | Description | Example (indexed parts of "foobar") | Memory Factor (n = length of word) |
"strict" | index whole words | "foobar" | * 1 |
"forward" | incrementally index words in forward direction | "fo", "foob" | * n |
"reverse" | incrementally index words in both directions | "ar", "obar" | * 2n - 1 |
"full" | index every possible combination | "oba", "oob" | * n * (n - 1) |
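To make the four tokenizer modes more concrete, the following library-free sketch lists which partials of a single word each mode would cover. It is an illustration of the descriptions in the table, not FlexSearch's internal implementation; the helper name `partials` is hypothetical.

```javascript
// Illustrative only: enumerate the partials of one word per tokenizer mode.
function partials(word, mode) {
  const out = new Set();
  const n = word.length;
  switch (mode) {
    case "strict": // the whole word only
      out.add(word);
      break;
    case "forward": // every prefix: f, fo, foo, ...
      for (let end = 1; end <= n; end++) out.add(word.slice(0, end));
      break;
    case "reverse": // every prefix plus every suffix
      for (let end = 1; end <= n; end++) out.add(word.slice(0, end));
      for (let start = 1; start < n; start++) out.add(word.slice(start));
      break;
    case "full": // every contiguous substring
      for (let start = 0; start < n; start++) {
        for (let end = start + 1; end <= n; end++) {
          out.add(word.slice(start, end));
        }
      }
      break;
    default:
      throw new Error("unknown mode: " + mode);
  }
  return [...out];
}

console.log(partials("foobar", "forward"));
console.log(partials("foobar", "reverse").length);
```

For "foobar" (n = 6) the sketch yields 1, 6 and 11 partials for "strict", "forward" and "reverse", in line with the factors 1, n and 2n - 1. For "full" it simply enumerates distinct substrings and does not try to reproduce the table's estimate.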
Option | Description | False-Positives | Compression |
false | Turn off encoding | no | 0% |
"default" | Case-insensitive encoding | no | 0% |
"simple" | Case-insensitive encoding, charset normalizations | no | ~ 3% |
"balance" | Case-insensitive encoding, charset normalizations, literal transformations | no | ~ 30% |
"advanced" | Case-insensitive encoding, charset normalizations, literal transformations, phonetic normalizations | no | ~ 40% |
"extra" | Case-insensitive encoding, charset normalizations, literal transformations, phonetic normalizations, Soundex transformations | yes | ~ 65% |
function() | Pass custom encoding via function(string):[words] | | |
Field | Category | Description |
encode | charset | The encoder function. Has to return an array of separated words (or an empty string). |
rtl | charset | A boolean property which indicates right-to-left encoding. |
filter | language | Filters are also known as "stopwords"; they completely exclude words from being indexed. |
stemmer | language | A stemmer removes word endings and is a kind of "partial normalization". A word ending is only matched when the word is longer than the matched partial. |
matcher | language | A matcher replaces all occurrences of a given string regardless of its position and is also a kind of "partial normalization". |
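The three language fields described above can be sketched without the library. In the example below, the data shapes (an array for the filter, plain objects mapping from a string to its replacement for stemmer and matcher), the sample entries, the order of the steps and the helper name `applyLanguage` are all assumptions made for illustration.

```javascript
// Hypothetical language payload for the sketch.
const filter = ["and", "the"];               // stopwords to drop
const stemmer = { ational: "ate", ing: "" }; // ending -> replacement
const matcher = { ph: "f" };                 // substring -> replacement

// Apply filter, matcher and stemmer to an already encoded word list.
function applyLanguage(words) {
  const stopwords = new Set(filter);
  const out = [];
  for (let word of words) {
    // Filter: stopwords are never indexed.
    if (stopwords.has(word)) continue;
    // Matcher: replace every occurrence, regardless of position.
    for (const [from, to] of Object.entries(matcher)) {
      word = word.split(from).join(to);
    }
    // Stemmer: replace an ending only if the word is longer than it.
    for (const [ending, to] of Object.entries(stemmer)) {
      if (word.length > ending.length && word.endsWith(ending)) {
        word = word.slice(0, -ending.length) + to;
        break;
      }
    }
    out.push(word);
  }
  return out;
}

console.log(applyLanguage(["the", "philosophy", "indexing", "ing"]));
```

The last input word shows the length rule from the stemmer description: "ing" is not longer than the ending "ing", so it is left unchanged.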
Query | default | simple | advanced | extra |
björn | yes | yes | yes | yes |
björ | yes | yes | yes | yes |
bjorn | no | yes | yes | yes |
bjoern | no | no | yes | yes |
philipp | no | no | yes | yes |
filip | no | no | yes | yes |
björnphillip | no | yes | yes | yes |
meier | no | no | yes | yes |
björn meier | no | no | yes | yes |
meier fhilip | no | no | yes | yes |
byorn mair | no | no | no | yes |
(false positives) | no | no | no | yes |
Modifier | Memory Impact * | Performance Impact ** | Matching Impact ** | Scoring Impact ** |
resolution | +1 (per level) | +1 (per level) | 0 | +2 (per level) |
depth | +4 (per level) | -1 (per level) | -10 + depth | +10 |
minlength | -2 (per level) | +2 (per level) | -3 (per level) | +2 (per level) |
bidirectional | -2 | 0 | +3 | -1 |
fastupdate | +1 | +10 (update, remove) | 0 | 0 |
optimize: true | -7 | -1 | 0 | -3 |
encoder: "icase" | 0 | 0 | 0 | 0 |
encoder: "simple" | -2 | -1 | +2 | 0 |
encoder: "advanced" | -3 | -2 | +4 | 0 |
encoder: "extra" | -5 | -5 | +6 | 0 |
encoder: "soundex" | -6 | -2 | +8 | 0 |
tokenize: "strict" | 0 | 0 | 0 | 0 |
tokenize: "forward" | +3 | -2 | +5 | 0 |
tokenize: "reverse" | +5 | -4 | +7 | 0 |
tokenize: "full" | +8 | -5 | +10 | 0 |
document index | +3 (per field) | -1 (per field) | 0 | 0 |
document tags | +1 (per tag) | -1 (per tag) | 0 | 0 |
store: true | +5 (per document) | 0 | 0 | 0 |
store: [fields] | +1 (per field) | 0 | 0 | 0 |
cache: true | +10 | +10 | 0 | 0 |
cache: 100 | +1 | +9 | 0 | 0 |
type of ids: number | 0 | 0 | 0 | 0 |
type of ids: string | +3 | -3 | 0 | 0 |