OP here, I wanted to add one technical note I forgot to mention in the post.
The prefix search uses the same suffix array as the substring search. This approach might also be useful for other search libraries that rely on suffix arrays. It can improve the search experience with minimal additional effort.
Happy to discuss the implementation details if anyone’s curious!
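For readers wondering how a prefix query can reuse a substring index, here is a minimal TypeScript sketch of one common way to do it. It is not taken from the library: the separator trick, the naive index construction, and all names (`SEP`, `buildIndex`, `findSubstring`, `findPrefix`) are assumptions made for illustration. Each term is stored with a leading separator character, so a prefix query becomes a substring query for separator + query.

```typescript
// Hypothetical sketch, not the library's actual code.
// SEP is assumed never to occur inside a term or a query.
const SEP = "\u0000";

interface SuffixIndex {
  text: string;       // all terms concatenated, each preceded by SEP
  suffixes: number[]; // start positions of all suffixes, sorted lexicographically
  termAt: number[];   // termAt[pos] = index of the term that covers text[pos]
  terms: string[];
}

function buildIndex(terms: string[]): SuffixIndex {
  let text = "";
  const termAt: number[] = [];
  terms.forEach((term, i) => {
    const piece = SEP + term;
    for (let k = 0; k < piece.length; k++) termAt.push(i);
    text += piece;
  });
  // Naive construction by sorting full suffixes; fine for a demo,
  // a real index would use a faster algorithm.
  const suffixes = Array.from({ length: text.length }, (_, i) => i);
  suffixes.sort((a, b) => {
    const sa = text.slice(a);
    const sb = text.slice(b);
    return sa < sb ? -1 : sa > sb ? 1 : 0;
  });
  return { text, suffixes, termAt, terms };
}

// Substring search: binary search for the first suffix that starts with
// the query, then walk forward while suffixes still start with it.
function findSubstring(index: SuffixIndex, query: string): string[] {
  const { text, suffixes, termAt, terms } = index;
  if (query.length === 0) return [];
  let lo = 0;
  let hi = suffixes.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    const head = text.slice(suffixes[mid], suffixes[mid] + query.length);
    if (head < query) lo = mid + 1;
    else hi = mid;
  }
  const hits = new Set<number>();
  for (let i = lo; i < suffixes.length && text.startsWith(query, suffixes[i]); i++) {
    hits.add(termAt[suffixes[i]]);
  }
  // Return matching terms in their original order.
  return Array.from(hits).sort((a, b) => a - b).map((i) => terms[i]);
}

// Prefix search reuses the same index: a term starts with `query`
// exactly when SEP + query occurs as a substring of the text.
function findPrefix(index: SuffixIndex, query: string): string[] {
  return findSubstring(index, SEP + query);
}
```

With the terms "apple", "pineapple", "application" and "maple", `findSubstring(index, "apple")` finds "apple" and "pineapple", while `findPrefix(index, "app")` finds only "apple" and "application". The only extra cost over plain substring search is one separator character per term.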
μFuzzy has a great comparison project that could serve as a reference for all fuzzy search implementations. My fuzzy searcher (v1) is already included and will soon be updated to v2 (PR is open).
I will caveat that the demo really only tests a specific set of options for each lib, chosen to closely match what uFuzzy does, and you can only adjust the uFuzzy options in the UI. So do your own testing :)
Hi, thank you for your questions. Unfortunately there are no comparisons yet. The gif is simply a looped screen recording created with Camtasia. Best regards!
Thank you for your comment! It is indeed a problem in plain fuzzy search libraries (like this one) that substring matches can receive a lower quality than unequal strings of similar length. One solution is to implement a higher-level search controller that queries both the fuzzy searcher and a suffix array searcher. The controller then mixes the matches and returns the best ones across the two searchers. One can even add more searchers, e.g. a phonetic one. With the correct parameters this approach works well.
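The controller idea can be sketched in a few lines of TypeScript. This is only an illustration under stated assumptions, not the library's API: the `Match` and `Searcher` types, `combineSearchers`, the toy `substringSearcher` (a linear scan standing in for a suffix array searcher) and its scoring formula are all hypothetical. It also assumes every searcher reports a quality on the same 0 to 1 scale, which is where the "correct parameters" come in.

```typescript
// Hypothetical sketch, not the library's actual API.
interface Match {
  entity: string;
  quality: number; // assumed comparable across searchers, 0 to 1
}

type Searcher = (query: string) => Match[];

// Queries every searcher, keeps the best quality per entity,
// and returns the topN matches overall.
function combineSearchers(searchers: Searcher[], topN: number): Searcher {
  return (query) => {
    const best = new Map<string, Match>();
    for (const search of searchers) {
      for (const match of search(query)) {
        const current = best.get(match.entity);
        if (!current || match.quality > current.quality) {
          best.set(match.entity, match);
        }
      }
    }
    return Array.from(best.values())
      .sort((a, b) => b.quality - a.quality || a.entity.localeCompare(b.entity))
      .slice(0, topN);
  };
}

// Toy stand-in for a suffix array searcher. The scoring is an arbitrary
// assumption: any substring hit gets at least 0.5, more if the query
// covers a larger share of the term.
function substringSearcher(terms: string[]): Searcher {
  return (query) => {
    const q = query.toLowerCase();
    if (q.length === 0) return [];
    return terms
      .filter((term) => term.toLowerCase().includes(q))
      .map((term) => ({ entity: term, quality: 0.5 + 0.5 * (q.length / term.length) }));
  };
}
```

A fuzzy searcher that rates "Heidelberg" poorly for the query "berg" would be passed in alongside `substringSearcher(terms)`; the controller then surfaces the substring hit with the higher of the two qualities. Adding a phonetic searcher would just mean one more entry in the array.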
Thank you for your kind feedback! That's a great idea. I have implemented a memento object that can be used for transferring the serialized state of the searcher. The intent of the implementation was to transfer the searcher between a web worker and the main thread. You could try to serve the Memento from the server and store it in an index db. You may have a look at the web worker example I provide in the repository.
> You could try to serve the Memento from the server and store it in an index db
I am not sure what you mean by "store it in an index db", but I was thinking about using the searcher on a static website (no real backend, only a fileserver serving pre-generated HTML files). So if I understand you correctly, for this to work I would have to cache the Memento in local storage and load it on every page load or search request.
Unfortunately the index would change over time, so one would have to detect this somehow and regenerate the Memento as well.
Sorry for the confusion. I think you could generate the memento each time you compile your blog into HTML. The memento can be stored as a JSON file and served statically by your fileserver. When a user visits your page, retrieve the memento from the server and initialize the searcher with it. This way you avoid indexing content at runtime.
As a bonus, you could cache the memento in local storage or session storage.
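A rough TypeScript sketch of that flow, including one way to notice that the index has changed. Everything here is an assumption for illustration: the memento is treated as opaque JSON, and the names `KeyValueStore`, `CACHE_KEY`, `readCache`, `writeCache`, `loadMemento`, as well as the idea of a `version` string written by the build step (for example a content hash), are hypothetical and not part of the library.

```typescript
// Hypothetical sketch, not the library's actual API.
// The subset of the Web Storage interface that is needed here;
// localStorage and sessionStorage both satisfy it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface CachedMemento {
  version: string;  // assumed to be written by the build step
  memento: unknown; // opaque serialized searcher state
}

const CACHE_KEY = "searcher-memento";

// Returns the cached memento, or undefined if nothing usable is cached
// (missing entry, corrupt JSON, or a version from an older build).
function readCache(store: KeyValueStore, version: string): unknown {
  const raw = store.getItem(CACHE_KEY);
  if (raw === null) return undefined;
  try {
    const cached = JSON.parse(raw) as CachedMemento;
    return cached.version === version ? cached.memento : undefined;
  } catch {
    return undefined;
  }
}

function writeCache(store: KeyValueStore, version: string, memento: unknown): void {
  try {
    store.setItem(CACHE_KEY, JSON.stringify({ version, memento }));
  } catch {
    // Storage may be full or disabled; caching is only an optimization.
  }
}

// Uses the cache when the version matches, otherwise downloads the
// statically served JSON file and caches it for the next page load.
async function loadMemento(
  url: string,
  version: string,
  store: KeyValueStore,
  fetchJson: (url: string) => Promise<unknown>,
): Promise<unknown> {
  const cached = readCache(store, version);
  if (cached !== undefined) return cached;
  const memento = await fetchJson(url);
  writeCache(store, version, memento);
  return memento;
}
```

In a browser one would pass `localStorage` as the store and something like `(u) => fetch(u).then((r) => r.json())` as `fetchJson`, with the version string embedded into each generated page at build time. Because a rebuilt site ships a new version, stale cache entries are ignored automatically. Storage quotas are limited, so a large memento may not fit, in which case the sketch simply falls back to downloading it each time.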