Fuzzy-Match Scoring in Ruby, with Highlights

We were building a ⌘K command palette for Levelbrook Workspace, our public Hotwire demo, and hit a wall that was smaller than it looked. The palette needs to take whatever a user half-types — amf, srcmatch, newproj — and do two things at once: rank the candidate list so the best match floats to the top, and tell the UI which characters matched so it can bold them. That second requirement is the one that turns a list of strings into something that feels like a real fuzzy finder. And it is precisely the requirement Ruby’s existing gems don’t serve.

Why the existing gems didn’t fit

The mature Ruby fuzzy libraries solve a different problem. amatch and fuzzy_match are record-linkage tools — Levenshtein, Dice, Jaro-Winkler — built to answer “how similar are these two strings?” for deduping customer records or matching misspelled names. They return a similarity number, but they don’t model the subsequence behavior you want in a picker, where typing gemfile should match app/Gemfile by skipping over the path. Other gems answer only the boolean question — does this candidate match at all? — which is enough to filter a list but tells you nothing about ordering or highlighting.

What we wanted is the behavior of fzf: type a few characters, get a ranked list where matches at word boundaries and path components rank above matches buried mid-token, with the matched letters lit up. fzf gets that ranking from fzy’s scoring algorithm, which is also what fzf-for-js uses in the browser. There was a clean, well-documented algorithm sitting right there — just no faithful Ruby implementation of it. So we wrote one.

To be upfront about what this is: fzy_score is a port, not an invention. The scoring constants are copied verbatim from fzy’s config.def.h, and the dynamic program is a faithful translation of fzy’s match() and match_positions(). The value isn’t a novel algorithm — it’s bringing fzy’s exact, reference-matching ranking to Ruby behind a clean API, with zero dependencies. Credit goes to John Hawthorn (fzy) and Junegunn Choi (fzf); we just translated.

The API

There are three entry points, and they layer on top of each other. The cheapest is a plain score:

require "fzy_score"

FzyScore.score("amf", "app/models/foo.rb")  # => 3.58  (higher is better)
FzyScore.score("zzz", "app/models/foo.rb")  # => -Infinity  (no match)
FzyScore.score("abc", "abc")                # => Infinity  (exact match)

score returns a Float. A non-match returns SCORE_MIN (negative infinity) so it always sorts last; an exact case-insensitive match returns SCORE_MAX (positive infinity) so it always sorts first. Everything real lives in between.

When you need to highlight, reach for match, which hands back a small Match struct carrying both the score and the indices that matched:

m = FzyScore.match("amu", "app/models/user.rb")
m.score       # => Float
m.positions   # => [0, 4, 11]   indices into the haystack to highlight
m.matched?    # => true  (i.e. score > SCORE_MIN)

# Light up the matched characters in a terminal:
def highlight(haystack, positions)
  set = positions.to_set
  haystack.each_char.with_index.map { |c, i|
    set.include?(i) ? "\e[33m#{c}\e[0m" : c
  }.join
end
puts highlight("app/models/user.rb", m.positions)

Those positions are the whole point. With them, the same data drives both ranking and rendering — a Turbo Frame can wrap each matched index in a <mark>, a terminal picker can color them, a React dropdown can bold them. The struct’s matched? predicate is just sugar over score > SCORE_MIN, so you never have to remember which infinity means “no.”

And when you have a whole list to rank, filter does it in one call, best-first:

files = ["spec/match_spec.rb", "src/match.rb", "README.md"]

FzyScore.filter("srcmatch", files)
# => [["src/match.rb", 6.97, nil]]   (non-matches dropped, sorted best-first)

# Ask for positions when you intend to highlight:
FzyScore.filter("srcmatch", files, positions: true)
# => [["src/match.rb", 6.97, [0,1,2,4,5,6,7,8]]]

# Ranking objects, not strings? Pass a key extractor — the object comes back.
people = [{ name: "Alice" }, { name: "Bob" }, { name: "Albert" }]
FzyScore.filter("al", people, key: ->(p) { p[:name] })
# => [[{name: "Albert"}, ...], [{name: "Alice"}, ...]]

Each row is [candidate, score, positions], non-matches are dropped, and the sort is stable: when two candidates tie on score, the one that came first in the input wins. That stability matters more than it sounds — in a picker, items jumping around on a tie reads as a bug, and a stable sort gives you the predictable, “respects my original order” behavior users expect. The key: extractor lets you rank arbitrary objects while still getting the original object back in each row, so you don’t have to maintain a parallel lookup table.

How the scoring actually works

The algorithm is fzy’s modified Smith–Waterman dynamic program, and it has two layers. First, a cheap O(n) pre-filter, exposed as FzyScore.match?, which just walks the needle through the haystack checking that every character appears, in order, case-insensitively. If that fails there’s no match at all and the expensive part never runs. filter uses this internally to skip the dynamic program for candidates that can’t possibly match.

When a candidate does pass the pre-filter, the dynamic program scores it. The core idea is that a matched character earns a bonus based on the character that precedes it — because matches at meaningful boundaries should rank higher than matches buried inside a word. The bonus table is taken straight from fzy:

Situation	Bonus / penalty
Consecutive characters	`+1.0`
First char after `/` (path component)	`+0.9`
First char after `-` `_` space (word start)	`+0.8`
Uppercase after lowercase (camelCase)	`+0.7`
First char after `.` (file extension)	`+0.6`
Leading / trailing gap	`-0.005` per char
Inner gap	`-0.01` per char

So amu against app/models/user.rb scores well because the m lands right after a slash and the u right after a slash too — the matched characters sit at path boundaries, which is exactly where a human reading the path would expect them. Gaps cost a little, consecutive runs are worth the most, and the constants are tuned (by fzy’s author, not us) so the ranking feels right across real file lists. Worth noting: the bonus is computed once per haystack position, up front, by walking the string and recording for each character what kind of boundary precedes it. The first character is treated as if it were preceded by a slash — so a needle that matches the very start of a candidate gets the full path-component bonus, which is why prefix matches tend to win, exactly as they do in fzf.

There’s a deliberate split in the implementation that mirrors fzy. The score-only path keeps just two rolling rows of the matrix — current and previous — because that’s all you need to compute the final number, which keeps memory flat regardless of candidate length. The positions path keeps the full matrix and then backtracks through it to recover which haystack index each needle character matched. You only pay for the full matrix when you actually asked for positions, which is why score and filter-without-positions stay lean and you opt into the extra work only when you’re going to highlight.

Two guard rails round it out. Candidates longer than MATCH_MAX_LEN (1024 characters) are treated as non-matches rather than scored — fzy’s way of refusing to let one pathologically long entry stall the whole UI. And as noted, an exact case-insensitive match short-circuits straight to SCORE_MAX without running the dynamic program at all.

What it is, honestly

fzy_score is brand new — v0.1.0. It is pure Ruby, has zero runtime dependencies, ships under the MIT license, and has 27 tests passing across Ruby 3.0 through 3.4 in CI. It is a faithful port of fzy, and the only reason we extracted it into a gem is the one that started this post: we needed it for our own demo’s command palette, found Ruby had no good option, and figured the next person building a palette or a quick-open or a CLI picker would rather gem install it than re-derive a dynamic program from a C header. No downloads to brag about yet, no stars, no users but us — just a small, correct, well-tested tool doing one job.

That last part is the part we care about most. A scoring algorithm is the kind of code that’s easy to get almost right and hard to get exactly right — an off-by-one in the backtrack, a bonus applied to the wrong character, a tie broken the wrong way, and the ranking quietly drifts from the reference tool in ways that are maddening to debug from the outside. The point of porting fzy verbatim, constants and all, is that the test suite can assert the behavior matches the implementation everyone already trusts, rather than asking you to trust our judgment about what “good ranking” means. When the answer is already known, the honest engineering move is to match it, not to reinvent it — and then to test that you did.

Install it

It’s on RubyGems and takes no setup:

# add it to a project
bundle add fzy_score

# or install it directly
gem install fzy_score

The source lives at github.com/tachyurgy/fzy_score (MIT) — issues and PRs welcome. If you’re building a fuzzy finder in Ruby and want the same ranking your users already know from fzf, that’s exactly what this is for.