Two searches, one merged result
Korely runs a keyword search and a meaning-based search at the same time, then merges the two rankings into one ordered result list. The merging algorithm today is Reciprocal Rank Fusion. This page covers how it works, the other fusion methods in the family, and when each one fits.
Concept
Two helpers, one shortlist
Picture walking into a library with two helpers. The first one checks the catalog: they take your exact words and find every book that contains them in the title or description. The second one knows what each book is actually about: they walk you to the shelves that match the topic, even when your words and the book titles do not overlap perfectly.
Both helpers hand you a shortlist of ten books. The books that appear high on both lists are the ones you should pick up first. That last move (merging the two lists into one ordered pile) is what fusion does.
Hybrid search runs this every time you type into the Korely search bar. The catalog helper is keyword search. The topic helper is vector search. The merging step is the fusion algorithm.
The fusion today
The fusion method Korely uses today
Korely uses Reciprocal Rank Fusion (RRF), an algorithm described by Cormack, Clarke and Büttcher in their 2009 SIGIR paper. The intuition is simple: only the position of a result in each list matters, not the raw score. A book ranked first by either helper gets a big bump. With the standard constant described below, a book ranked tenth by both helpers still beats one that appears on only one list, even in first place.
Concretely, each helper produces a ranked list. RRF hands every document a contribution from each list equal to 1 / (k + rank), where k is a small constant (Korely uses the standard value of 60). The contributions are summed across lists and the documents are re-sorted by total score.
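The formula above can be sketched in a few lines of Python. This is a minimal illustration, not Korely's actual code: the function name, the note ids, and the tie-breaking rule are assumptions made for the example, and ranks are taken to start at 1 for the top result.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids (best first in each list).

    Each list contributes 1 / (k + rank) to a document's total, with
    rank starting at 1. Returns (doc_id, score) pairs, best first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by total score, highest first; break ties by id so the
    # output is deterministic (the tie-break rule is an assumption).
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical shortlists from the two helpers.
keyword_hits = ["note-a", "note-b", "note-c"]
vector_hits = ["note-c", "note-a", "note-d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Here note-a comes out on top because it sits near the top of both lists, while note-b and note-d, each seen by only one helper, fall to the bottom.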
RRF is a common default in hybrid search systems because it does not care about score scales. Keyword search and vector search produce numbers on completely different scales (one is a BM25 score, the other is a cosine similarity). RRF ignores those entirely. Picture two referees giving figure-skating scores out of different number ranges: instead of averaging the scores, you combine the ranks each referee gave. Same idea.
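A small experiment makes the scale point concrete. The sketch below uses made-up BM25 and cosine numbers and hypothetical helper names. It compares adding raw scores directly with fusing by rank, then multiplies the keyword scores by 100 to show that the rank-based result does not move.

```python
def ranking_from_scores(scores):
    """Order doc ids best-first by score (higher is better)."""
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def rrf(rankings, k=60):
    """Rank-only fusion: sum 1 / (k + rank) over the lists."""
    totals = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            totals[doc_id] = totals.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(totals, key=lambda d: (-totals[d], d))


def naive_score_sum(score_maps):
    """Add raw scores with no normalisation (shown as a counter-example)."""
    totals = {}
    for scores in score_maps:
        for doc_id, score in scores.items():
            totals[doc_id] = totals.get(doc_id, 0.0) + score
    return sorted(totals, key=lambda d: (-totals[d], d))


# Made-up scores: BM25 is unbounded, cosine similarity is at most 1.
bm25 = {"a": 12.4, "b": 9.1, "c": 3.3}
cosine = {"b": 0.91, "c": 0.88, "d": 0.80}

fused_by_raw_sum = naive_score_sum([bm25, cosine])
fused_by_rank = rrf([ranking_from_scores(bm25), ranking_from_scores(cosine)])

# Rescaling one side changes the raw numbers but not the ranks.
bm25_rescaled = {doc_id: score * 100 for doc_id, score in bm25.items()}
fused_after_rescale = rrf(
    [ranking_from_scores(bm25_rescaled), ranking_from_scores(cosine)]
)

print(fused_by_raw_sum)     # the keyword side dominates the raw sum
print(fused_by_rank)        # both sides count equally
print(fused_after_rescale)  # same order as fused_by_rank
```

In the raw sum, the larger BM25 numbers swamp the cosine values, so the keyword ranking effectively wins outright. The rank-based fusion treats both helpers equally and is unaffected by the rescaling.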
In the same family
Other fusion methods in the same family
RRF is one option on the shelf. Three others come up regularly in retrieval literature.
- CombSUM and CombMNZ (Fox and Shaw, TREC-2, 1993). The older score-based cousins of RRF. CombSUM adds the normalised scores from each helper. CombMNZ does the same and multiplies by how many helpers returned the document. Sensitive to score normalisation, where RRF is not. Picture averaging the figure-skating scores directly: only works if both referees use the same scale.
- Weighted linear combination of scores. The simple "give vector search a 70% weight and keyword search a 30% weight" approach used in some systems like Elasticsearch hybrid queries. Easy to explain, easy to tune, but you have to pick the weights yourself. Picture two referees where you decide in advance which one you trust more.
- Cross-encoder reranking (e.g. BGE-reranker-v2-m3, BAAI, Apache 2.0). Not strictly a fusion method, but the most common "stage two" alternative. The first stage retrieves a pool of candidates with any method. A small neural model then reads each (query, passage) pair and scores them one by one. Higher accuracy, much higher latency. Picture a senior librarian who personally reads the first ten books before handing them back to you in a new order.
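The score-based methods in this list can be sketched as follows. This is an illustrative version under stated assumptions: min-max normalisation is one common choice (the methods themselves do not mandate it), the scores are made up, and the function names are hypothetical. The weighted linear combination is shown as CombSUM with per-helper weights.

```python
def min_max(scores):
    """Rescale scores to [0, 1]; one common normalisation choice."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def comb_sum(score_maps, weights=None):
    """CombSUM: add normalised scores. Passing weights turns this into
    a weighted linear combination."""
    weights = weights or [1.0] * len(score_maps)
    totals = {}
    for weight, scores in zip(weights, score_maps):
        for doc_id, s in min_max(scores).items():
            totals[doc_id] = totals.get(doc_id, 0.0) + weight * s
    return totals


def comb_mnz(score_maps):
    """CombMNZ: CombSUM multiplied by how many helpers returned the doc."""
    sums = comb_sum(score_maps)
    return {
        doc_id: total * sum(doc_id in scores for scores in score_maps)
        for doc_id, total in sums.items()
    }


def ranked(totals):
    """Doc ids ordered best-first by fused score."""
    return sorted(totals, key=lambda d: (-totals[d], d))


# Made-up scores from the two helpers.
bm25 = {"a": 12.0, "b": 9.0, "c": 3.0}
cosine = {"b": 0.9, "c": 0.8, "d": 0.5}

sum_order = ranked(comb_sum([bm25, cosine]))
mnz_order = ranked(comb_mnz([bm25, cosine]))
weighted_order = ranked(comb_sum([bm25, cosine], weights=[0.3, 0.7]))

print(sum_order, mnz_order, weighted_order)
```

With these numbers, CombSUM ranks "a" second on the strength of its keyword score alone. CombMNZ and the 30/70 weighting both move "c" ahead of it, for different reasons: CombMNZ rewards appearing in both lists, and the weighting favours the vector side.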
Korely uses RRF as the default because it is simple, robust, and fast enough that searching a vault of ten thousand notes completes in milliseconds. If a future release adds cross-encoder reranking on top, RRF is the first stage that feeds it.
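The two-stage shape described here can be sketched with a pluggable scorer. Everything in this example is hypothetical: `rerank` and `toy_overlap_score` are illustrative names, and the word-overlap scorer is only a stand-in so the example runs without a model. A real second stage would put a cross-encoder behind the same `score_pair` interface.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    query: str,
    candidates: Sequence[Tuple[str, str]],
    score_pair: Callable[[str, str], float],
    top_n: int = 10,
) -> List[str]:
    """Re-order the first top_n (doc_id, text) candidates by pair score.

    The candidates are assumed to arrive best-first from stage one
    (for example, from RRF). Ties keep their first-stage order because
    Python's sort is stable.
    """
    pool = list(candidates[:top_n])
    scored = sorted(pool, key=lambda item: -score_pair(query, item[1]))
    return [doc_id for doc_id, _ in scored]


def toy_overlap_score(query: str, passage: str) -> float:
    """Stand-in scorer: number of query words found in the passage.
    A cross-encoder would read the full (query, passage) pair instead."""
    return float(len(set(query.lower().split()) & set(passage.lower().split())))


# Hypothetical first-stage output, best first.
first_stage = [
    ("n1", "Weekly meeting notes"),
    ("n2", "Hybrid search with rank fusion"),
    ("n3", "Rank fusion explained"),
]

reranked = rerank("rank fusion search", first_stage, toy_overlap_score)
print(reranked)
```

The cost profile follows from the loop: the scorer is called once per candidate, which is why a neural scorer adds latency and why the pool is capped with `top_n`.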
Inside Korely
Inside Korely, end to end
When you type into the search bar, this is what happens:
- Keyword side. SQLite's built-in full text search engine (FTS5) reads the live query, with prefix matching turned on, and returns the top matches. Typing "graph" already surfaces "GraphRAG" the moment you press space.
- Meaning side. The same query is passed through the local embedding model (currently Nomic embed v1.5), which produces a 768-number coordinate. The sqlite-vec extension finds the nearest neighbours in the same SQLite database file.
- Fusion. Both ranked lists go through RRF. The merged list is what you see in the search results.
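The three steps above can be approximated end to end with Python's built-in sqlite3 module. This is a rough sketch rather than Korely's implementation. It assumes the local SQLite build includes FTS5, and the table name, sample notes, and helper names are invented. The meaning side is replaced by a brute-force cosine search over made-up three-number vectors, standing in for the embedding model and sqlite-vec.

```python
import math
import sqlite3

NOTES = {
    1: "GraphRAG combines knowledge graphs with retrieval",
    2: "Grocery list for the weekend",
    3: "Notes on graph databases and traversal",
}
# Made-up 3-number vectors standing in for real 768-number embeddings.
TOY_VECTORS = {1: [0.9, 0.1, 0.0], 2: [0.0, 0.1, 0.9], 3: [0.7, 0.3, 0.1]}


def keyword_search(db, text, limit=10):
    """FTS5 search with prefix matching: each word becomes "word"*."""
    terms = " ".join('"' + t.replace('"', '""') + '"*' for t in text.split())
    if not terms:
        return []
    rows = db.execute(
        "SELECT rowid FROM notes_fts WHERE notes_fts MATCH ? "
        "ORDER BY rank LIMIT ?",
        (terms, limit),
    )
    return [row[0] for row in rows]


def vector_search(query_vector, limit=10):
    """Brute-force cosine similarity (stand-in for a vector index)."""
    def cosine(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        return dot / (math.hypot(*u) * math.hypot(*v))

    ordered = sorted(
        TOY_VECTORS, key=lambda i: (-cosine(query_vector, TOY_VECTORS[i]), i)
    )
    return ordered[:limit]


def rrf(rankings, k=60):
    """Sum 1 / (k + rank) across lists and sort best-first."""
    totals = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            totals[doc_id] = totals.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(totals, key=lambda d: (-totals[d], d))


db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE notes_fts USING fts5(body)")
db.executemany("INSERT INTO notes_fts(rowid, body) VALUES (?, ?)", NOTES.items())

keyword_hits = keyword_search(db, "graph")      # prefix match on "graph"
vector_hits = vector_search([1.0, 0.0, 0.0])    # made-up query vector
fused = rrf([keyword_hits, vector_hits])
print(keyword_hits, vector_hits, fused)
```

The prefix query matches both the GraphRAG note and the graph databases note, while the grocery note is reached only through the vector side and therefore lands last in the fused list.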
The whole pipeline runs in milliseconds on a vault of ten thousand notes. Search updates as you type. No outbound calls, no API key needed, no cost per query. The fusion happens on your CPU, the same one you are reading this on.
Frequently asked
What is hybrid search?
Hybrid search is the technique of running a keyword query and a meaning-based query in parallel, then combining their rankings into one result list. Korely uses it to find notes by both the exact words you typed and the meaning you had in mind.
Why not just use vector search alone?
Vector search is great at meaning but weak at unique identifiers like project codes, file names, and proper nouns. Keyword search catches those reliably. Running both in parallel and fusing the results covers both cases.
What fusion method does Korely use today?
Reciprocal Rank Fusion (RRF), a rank-only fusion method introduced by Cormack, Clarke and Büttcher in 2009. It is simple to implement, independent of score scales, and works well as a default.
Could Korely use a different fusion method?
Yes. CombSUM and CombMNZ are the older score-based alternatives. Cross-encoder reranking is the more recent (and more expensive) alternative that adds a small model to score each query-passage pair. Each fits a different priority.
Try the search yourself
Free forever for the local vault. Keyword plus meaning, fused on your CPU.