SQLite is used by Dash to search through docset indexes. Originally, Dash used
queries that were fast enough, but became increasingly slow as more docsets were added.
SQLite FTS is amazingly fast, but allows only prefix matches (the query starts a word)
by default. For Dash, I needed to persuade it to also perform contains matches (the query
appears anywhere in a term) or suffix matches (the query ends a term).
How it works
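It’s simple: for each term I want to be able to search, I store all of its suffixes. A prefix match on any stored suffix is then a contains match on the original term, and an exact match on a stored suffix is a suffix match.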
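As a small illustration of what "all of its suffixes" means, here is a sketch in Python. The helper name `suffixes_of` is made up for this example and is not taken from Dash.

```python
from typing import List


def suffixes_of(term: str) -> List[str]:
    """Return every non-empty suffix of term, longest first."""
    return [term[i:] for i in range(len(term))]


print(suffixes_of("term"))  # ['term', 'erm', 'rm', 'm']
```

A term of length n produces n suffixes, which is why the stored data grows much faster than the list of terms itself.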
First of all, the table structure:
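A minimal sketch of such a table, using Python's `sqlite3` module and assuming the SQLite build has FTS4 enabled. The table name `searchIndex` and the single `suffixes` column are assumptions made for illustration and may not match Dash's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one FTS4 column that holds a term followed by
# all of its suffixes, separated by spaces.
conn.execute("CREATE VIRTUAL TABLE searchIndex USING fts4(suffixes)")

# FTS4 creates the virtual table plus several shadow tables behind it.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)
```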
Add the term:
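Adding a term could look roughly like this. It is a sketch under the same assumptions as before: a hypothetical `searchIndex` FTS4 table with one `suffixes` column, and a made-up helper `suffix_text` that joins the suffixes with spaces so that the default tokenizer treats each suffix as its own word.

```python
import sqlite3


def suffix_text(term):
    """Space-separated suffixes of term, longest first."""
    return " ".join(term[i:] for i in range(len(term)))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE searchIndex USING fts4(suffixes)")

# Store the term together with all of its suffixes in one row.
conn.execute(
    "INSERT INTO searchIndex (suffixes) VALUES (?)", (suffix_text("term"),)
)

stored = conn.execute("SELECT suffixes FROM searchIndex").fetchone()[0]
print(stored)  # term erm rm m
```

The full term is always the first word of the row, so it can be recovered from a search result without a second column.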
Search using suffix queries:
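A suffix search can be sketched as an exact-word `MATCH` against the stored suffixes: a term ends with the query only if one of its suffixes is exactly the query. The table layout and the helper names `suffix_text` and `ends_with` are assumptions for this example, and the query is assumed to be a single lowercase alphanumeric word with no FTS operators in it.

```python
import sqlite3


def suffix_text(term):
    """Space-separated suffixes of term, longest first."""
    return " ".join(term[i:] for i in range(len(term)))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE searchIndex USING fts4(suffixes)")
for term in ("string", "ring", "ingest"):
    conn.execute(
        "INSERT INTO searchIndex (suffixes) VALUES (?)", (suffix_text(term),)
    )


def ends_with(conn, query):
    """Return the terms that end with query (exact match on one suffix)."""
    rows = conn.execute(
        "SELECT suffixes FROM searchIndex WHERE suffixes MATCH ?", (query,)
    )
    # The first word of each row is the original term.
    return sorted(row[0].split(" ", 1)[0] for row in rows)


print(ends_with(conn, "ing"))  # ['ring', 'string']
```

Here `ingest` is not returned, because none of its suffixes is exactly `ing`.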
Or contains queries:
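A contains search can be sketched as an FTS prefix query (`query*`) against the stored suffixes: a term contains the query only if one of its suffixes starts with it. As before, the schema and the helpers `suffix_text` and `contains` are illustrative assumptions, and the query is assumed to be a single lowercase alphanumeric word.

```python
import sqlite3


def suffix_text(term):
    """Space-separated suffixes of term, longest first."""
    return " ".join(term[i:] for i in range(len(term)))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE searchIndex USING fts4(suffixes)")
for term in ("string", "ring", "ingest", "table"):
    conn.execute(
        "INSERT INTO searchIndex (suffixes) VALUES (?)", (suffix_text(term),)
    )


def contains(conn, query):
    """Return the terms that contain query (prefix match on one suffix)."""
    rows = conn.execute(
        "SELECT suffixes FROM searchIndex WHERE suffixes MATCH ?",
        (query + "*",),
    )
    # The first word of each row is the original term.
    return sorted(row[0].split(" ", 1)[0] for row in rows)


print(contains(conn, "rin"))  # ['ring', 'string']
print(contains(conn, "in"))   # ['ingest', 'ring', 'string']
```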
The only downside I could find was that the database got too large. To avoid this, I compress
the data (a term plus all of its suffixes) into just its actual term.
The compress and uncompress functions behave in this way:
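The sketch below shows one way such a pair of functions could behave and be wired in. It assumes FTS4's `compress=` and `uncompress=` table options are the hook being used, which the text does not state outright. The function names `compress_suffixes` and `uncompress_suffixes`, the helper `suffix_text`, and the single-column schema are all made up for this example.

```python
import sqlite3


def suffix_text(term):
    """Space-separated suffixes of term, longest first."""
    return " ".join(term[i:] for i in range(len(term)))


def compress(text):
    """Keep only the first word, which is the full term."""
    return None if text is None else text.split(" ", 1)[0]


def uncompress(stored):
    """Rebuild the full suffix list from the stored term."""
    return None if stored is None else suffix_text(stored)


print(compress("term erm rm m"))  # term
print(uncompress("term"))         # term erm rm m

conn = sqlite3.connect(":memory:")
conn.create_function("compress_suffixes", 1, compress)
conn.create_function("uncompress_suffixes", 1, uncompress)
conn.execute(
    "CREATE VIRTUAL TABLE searchIndex USING fts4("
    "suffixes, compress=compress_suffixes, uncompress=uncompress_suffixes)"
)
conn.execute(
    "INSERT INTO searchIndex (suffixes) VALUES (?)", (suffix_text("term"),)
)

# What the content shadow table holds (column 0 is the docid).
on_disk = conn.execute("SELECT * FROM searchIndex_content").fetchone()[1]

# Searches still see every suffix, because the index is built from the
# uncompressed text and reads go through uncompress.
read_back = conn.execute(
    "SELECT suffixes FROM searchIndex WHERE suffixes MATCH 'er*'"
).fetchone()[0]
```

In this sketch the content table keeps only the term, while the full-text index still contains every suffix. Note that the compress function is applied to every column of the table, which is why the example uses a single column.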
This compression reduces the database size to what it would be if only the actual terms were added (without all the suffixes).
Searching over 1,110,381 terms (in 102 docsets) using contains queries:
I chose SQLite FTS because I was already familiar with SQLite and I also needed to work around some Dash-specific edge cases (e.g. how symbols are treated).
Depending on your project, these may be suitable alternatives: