Compiling Bash Globs into Ruby Regexps
If you write JavaScript, you have probably used picomatch or minimatch without thinking twice about it. You hand it a glob pattern — src/**/*.js, say — and it compiles that pattern into a regular expression you can run against any string. Not files on disk: any string. A route path, an S3 object key, a queue topic, a Git branch name, a line out of a log. The glob is just a compact, familiar pattern language, and picomatch turns it into something you can keep and reuse.
Ruby has no clean counterpart. It ships Dir.glob, which walks the actual filesystem, and File.fnmatch, which does a one-shot match against a path-like string but hands you nothing reusable: no Regexp you can inspect, combine, or store. Its brace handling needs the File::FNM_EXTGLOB flag, it has no extended-glob syntax at all, and although File::FNM_DOTMATCH and File::FNM_PATHNAME toggle its dotfile and separator rules, the separator itself is always “/”. The moment your “paths” aren’t files (and in a Rails app they very often aren’t), you’re left hand-rolling a regex and hoping you got the escaping right.
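To make the gap concrete, here is a short look at the standard library on its own. It uses only File.fnmatch and a hand-written Regexp; the hand_rolled name is ours, purely for illustration.

```ruby
# File.fnmatch answers true or false; it never returns a Regexp you could keep.
File.fnmatch("*.rb", "foo.rb")                              # => true

# Braces are treated as literal text unless File::FNM_EXTGLOB is passed.
File.fnmatch("*.{jpg,png}", "cat.png")                      # => false
File.fnmatch("*.{jpg,png}", "cat.png", File::FNM_EXTGLOB)   # => true

# The usual fallback: a Regexp written and escaped by hand.
hand_rolled = /\A[^\/]*\.(?:jpg|png)\z/
hand_rolled.match?("cat.png")                               # => true
hand_rolled.match?("dir/cat.png")                           # => false
```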
So we built the missing piece: picoglob, a small gem that compiles bash-style glob patterns into reusable Ruby Regexp objects. Pure Ruby, zero dependencies, MIT-licensed. It is a tiny utility. But where its value lives — and where this kind of utility usually breaks — is in the edge cases, and those are worth writing down.
What it looks like
The surface is exactly what you’d hope for. Compile a pattern, get a matcher, run it against whatever strings you have:
require "picoglob"
# Match S3 keys without ever touching the filesystem
g = Picoglob.new("uploads/**/*.{jpg,png}")
g.match?("uploads/2026/06/cat.png") # => true
g.match?("uploads/cat.png") # => true (** can be zero segments)
g.match?("uploads/cat.gif") # => false
# It's just a Regexp underneath — inspect it, store it, reuse it
Picoglob.to_regexp("*.rb")
# => /\A(?:(?!\.)[^\/]*)\.rb\z/
A compiled Picoglob::Matcher also implements ===, so it drops straight into a case:
case branch
when Picoglob.new("release/*") then deploy_to(:production)
when Picoglob.new("feature/**") then deploy_to(:preview)
else skip_deploy
end
And there are the convenience one-shots, Picoglob.match?(pattern, string) and Picoglob.filter(pattern, list), for when you don’t need to hold onto the matcher. In a hot path, compile once and match many times: the Regexp is built at construction time, so reusing the matcher means you pay the parse cost exactly once.
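For readers curious how an object ends up usable in a case statement, the sketch below shows the general shape: hold one Regexp and delegate === to match?. SketchMatcher is a hypothetical stand-in written for this post, not picoglob’s source, and the regexp for release/* is hand-written and simplified.

```ruby
# Hypothetical stand-in for a compiled matcher (not picoglob's actual class).
class SketchMatcher
  attr_reader :regexp

  def initialize(regexp)
    @regexp = regexp            # built once, reused for every match
  end

  def match?(string)
    @regexp.match?(string)
  end

  # case/when calls ===, so aliasing it to match? is all a case needs
  alias_method :===, :match?

  def filter(strings)
    strings.select { |s| match?(s) }
  end
end

# Hand-written, simplified regexp standing in for the glob "release/*"
RELEASE = SketchMatcher.new(/\Arelease\/[^\/]*\z/)

def target_for(branch)
  case branch
  when RELEASE then :production
  else :skip
  end
end

target_for("release/1.4")                              # => :production
RELEASE.filter(["release/1.4", "release/a/b", "main"]) # => ["release/1.4"]
```

Keeping the matcher in a constant is one simple way to follow the compile-once advice.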
The design: a recursive-descent compiler
Under the hood, picoglob is a small recursive-descent compiler. It walks the pattern character by character, and each construct emits a fragment of regular-expression source. The top-level result is wrapped in \A...\z so the match is anchored to the whole string — a glob describes the entire key, not a substring of it — and then handed to Regexp.new.
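A stripped-down version of that loop can be sketched in a few lines. TinyGlob below is a hypothetical toy that understands only *, ?, backslash escapes, and literals, with the separator fixed at “/” and no dotfile rule; it illustrates the character-by-character approach, not picoglob’s real compiler.

```ruby
# Hypothetical toy compiler: each construct appends a fragment of regexp source.
module TinyGlob
  def self.compile(pattern)
    out = +""
    i = 0
    while i < pattern.length
      ch = pattern[i]
      case ch
      when "*" then out << "[^/]*"          # run of non-separator characters
      when "?" then out << "[^/]"           # exactly one non-separator character
      when "\\"
        i += 1
        raise ArgumentError, "dangling backslash" if i >= pattern.length
        out << Regexp.escape(pattern[i])    # the escaped character is a literal
      else
        out << Regexp.escape(ch)            # ordinary literal
      end
      i += 1
    end
    Regexp.new("\\A#{out}\\z")              # anchor to the whole string
  end
end

rb = TinyGlob.compile("*.rb")
rb.match?("foo.rb")       # => true
rb.match?("lib/foo.rb")   # => false
rb.match?("foo.rbx")      # => false (anchored at both ends)
```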
The supported syntax is the bash vocabulary you already know:
| Pattern | Meaning |
|---|---|
| `*` | any run of non-separator characters |
| `**` | globstar: any run of characters, including separators |
| `?` | exactly one non-separator character |
| `[abc]` `[a-z]` | character class |
| `[!abc]` `[^abc]` | negated character class |
| `{a,b,c}` | brace alternation |
| `{1..5}` | numeric range expansion |
| `@(a\|b)` | exactly one of (extglob) |
| `?(a\|b)` `*(a\|b)` `+(a\|b)` | zero-or-one / zero-or-more / one-or-more (extglob) |
| `!(a\|b)` | anything except (extglob) |
| `\*` | a literal `*` (a backslash escapes any metacharacter) |
Two of those entries — braces and extglobs — aren’t flat. {a,b{c,d}} nests; image.@(jp?(e)g|png) nests an extglob inside an extglob. Picoglob handles them by sub-compiling: when it hits a brace alternative or an extglob branch, it recursively runs the same compiler over that fragment with the same options. There’s no special case for “how deep” — the recursion is the implementation, so arbitrary nesting works for free, and a numeric range like {1..5} simply expands to (?:1|2|3|4|5) before the alternation logic ever sees it.
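The sub-compiling idea can be sketched for braces alone. BraceSketch is a hypothetical module written for this post: it treats everything outside braces as a literal and assumes ascending numeric ranges, so it shows the recursion rather than picoglob’s full behavior.

```ruby
# Hypothetical sketch of sub-compiling: braces only, everything else literal.
module BraceSketch
  # Compile a fragment to regexp source, recursing into each brace body.
  def self.compile(fragment)
    out = +""
    i = 0
    while i < fragment.length
      if fragment[i] == "{"
        close = matching_brace(fragment, i)
        out << alternation(fragment[(i + 1)...close])
        i = close + 1
      else
        out << Regexp.escape(fragment[i])
        i += 1
      end
    end
    out
  end

  # Index of the "}" that closes the "{" found at index open.
  def self.matching_brace(text, open)
    depth = 0
    (open...text.length).each do |j|
      depth += 1 if text[j] == "{"
      depth -= 1 if text[j] == "}"
      return j if depth.zero?
    end
    raise ArgumentError, "unbalanced brace in #{text.inspect}"
  end

  def self.alternation(body)
    if (m = body.match(/\A(\d+)\.\.(\d+)\z/))
      # numeric range: expanded before any alternation logic runs
      parts = (m[1].to_i..m[2].to_i).map(&:to_s)
    else
      # split on depth-0 commas, then run the same compiler on each branch
      parts = split_top_level(body).map { |alt| compile(alt) }
    end
    "(?:#{parts.join('|')})"
  end

  def self.split_top_level(body)
    parts = []
    current = +""
    depth = 0
    body.each_char do |ch|
      depth += 1 if ch == "{"
      depth -= 1 if ch == "}"
      if ch == "," && depth.zero?
        parts << current
        current = +""
      else
        current << ch
      end
    end
    parts << current
  end
end

BraceSketch.compile("{a,b{c,d}}")   # => "(?:a|b(?:c|d))"
BraceSketch.compile("v{1..5}")      # => "v(?:1|2|3|4|5)"
```

Nothing in the sketch counts nesting levels beyond finding the matching brace; depth falls out of compile calling itself.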
Where the real bugs live
Anyone can map * to .* and call it a glob library. The reason a glob library is worth being a library — rather than a one-line regex you write inline — is that the shell’s actual semantics have several quiet rules that are easy to get subtly wrong, and getting them wrong produces matches that look right until the one input that doesn’t. Three of them are worth calling out.
1. * doesn’t cross the separator; ** does
A single * matches a run of characters within one path segment — it stops at the separator. So *.rb matches foo.rb but not lib/foo.rb. The globstar ** is the one that crosses separators. In regex terms, * compiles to [^/]* (non-separators only) while ** compiles to .*. Conflate them and your pattern silently matches across directory boundaries it shouldn’t.
Picoglob.match?("*.rb", "lib/foo.rb") # => false (single * stops at "/")
Picoglob.match?("**/*.rb", "lib/foo.rb") # => true (globstar crosses it)
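The same contrast can be reproduced with hand-written regexps, which makes the failure mode of the naive mapping easy to see. These are illustrative fragments only; they leave out the dotfile rule.

```ruby
# "*.rb" written by hand two ways
single_star = /\A[^\/]*\.rb\z/    # * kept inside one segment
naive_star  = /\A.*\.rb\z/        # * mistakenly mapped to .*

single_star.match?("foo.rb")       # => true
single_star.match?("lib/foo.rb")   # => false
naive_star.match?("lib/foo.rb")    # => true (crosses the separator: the bug)
```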
2. **/ matches zero or more whole segments
This is the rule people get wrong most often. The shell-standard **/ form should match zero or more whole path segments, which means src/**/*.rb has to match both src/app/user.rb and — crucially — src/foo.rb, where there are no intermediate directories at all. A naive globstar that demands at least one segment will quietly miss the top-level file, and that’s a bug you find in production when a file at the root of a tree mysteriously doesn’t get picked up.
Picoglob handles it by swallowing the trailing separator after ** and emitting a group shaped like (?:segment/)* — a repetition that can match the empty string. Zero segments is just zero iterations of that group:
g = Picoglob.new("src/**/*.rb")
g.match?("src/app/models/user.rb") # => true
g.match?("src/foo.rb") # => true (zero intermediate segments)
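Written out by hand, the difference between the two shapes looks roughly like this. Both regexps are our own simplified renderings of src/**/*.rb with the dotfile rule omitted, not output copied from picoglob.

```ruby
# **/ as a repeatable group versus a naive ** followed by a mandatory "/"
zero_or_more = /\Asrc\/(?:[^\/]+\/)*[^\/]*\.rb\z/
at_least_one = /\Asrc\/.*\/[^\/]*\.rb\z/

zero_or_more.match?("src/app/models/user.rb")   # => true
zero_or_more.match?("src/foo.rb")               # => true  (zero iterations of the group)
at_least_one.match?("src/app/models/user.rb")   # => true
at_least_one.match?("src/foo.rb")               # => false (misses the top-level file)
```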
3. The leading-dot rule
In the shell, a wildcard at the start of a segment won’t match a hidden dotfile — * doesn’t expand to .bashrc unless you ask it to. Picoglob mirrors that: at a segment start (the beginning of the string, or right after a separator), a leading *, ?, or [ emits a negative lookahead (?!\.) so it won’t consume a leading dot. The dot: true option turns that protection off when you actually want to match hidden entries.
Picoglob.match?("*", ".hidden") # => false (dotfile protected)
Picoglob.match?("*", ".hidden", dot: true) # => true
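The lookahead is small enough to try in isolation. The regexps below are hand-written approximations of what the two settings imply for * and */*, shown only to illustrate where the guard sits.

```ruby
# "*" at a segment start, with and without the dotfile guard
guarded   = /\A(?!\.)[^\/]*\z/   # roughly what dot: false implies
unguarded = /\A[^\/]*\z/         # roughly what dot: true implies

guarded.match?(".hidden")     # => false
guarded.match?("visible")     # => true
unguarded.match?(".hidden")   # => true

# The guard repeats after every separator: "*/*" by hand
nested = /\A(?!\.)[^\/]*\/(?!\.)[^\/]*\z/
nested.match?("dave/profile")    # => true
nested.match?("dave/.profile")   # => false
```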
None of these three rules is hard once you’ve named it. The point is that you have to know to name them — and that the only way the library earns trust is by getting all three right at once, which is what the test suite exists to nail down.
Options, and an extglob in anger
The compiler takes four options, and they’re the levers that make it useful off the filesystem. separator defaults to "/" but can be anything — set it to "." and you can glob over dotted names like config keys or Java package paths. dot (default false) controls the leading-dot rule above. extglob (default true) toggles the @()/?()/*()/+()/!() constructs. nocase (default false) compiles the Regexp with the case-insensitive flag.
# Glob over a dotted namespace by changing the separator
Picoglob.match?("app.*.timeout", "app.cache.timeout", separator: ".")
# => true (the inner * won't cross a ".")
# Extglob: match a couple of image extensions, with optional "e" in jpeg
Picoglob.match?("photo.@(jp?(e)g|png)", "photo.jpeg") # => true
Picoglob.match?("photo.@(jp?(e)g|png)", "photo.gif") # => false
# Case-insensitive
Picoglob.match?("*.RB", "foo.rb", nocase: true) # => true
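How might options like separator and nocase reach the generated Regexp? One plausible route is sketched below: escape the separator into the negated character class, and pass Regexp::IGNORECASE when asked. star_only_regexp is a hypothetical helper that handles only * and assumes a single-character separator; picoglob’s internals may well differ.

```ruby
# Hypothetical helper: "*"-only patterns over a configurable separator.
def star_only_regexp(pattern, separator: "/", nocase: false)
  sep = Regexp.escape(separator)            # safe to place inside [...]
  # Split on "*" (keeping trailing empties), escape the literal pieces,
  # and rejoin them with a class that excludes the separator.
  body = pattern.split("*", -1)
                .map { |literal| Regexp.escape(literal) }
                .join("[^#{sep}]*")
  Regexp.new("\\A#{body}\\z", nocase ? Regexp::IGNORECASE : 0)
end

dotted = star_only_regexp("app.*.timeout", separator: ".")
dotted.match?("app.cache.timeout")         # => true
dotted.match?("app.cache.redis.timeout")   # => false (* will not cross ".")

star_only_regexp("*.RB", nocase: true).match?("foo.rb")   # => true
star_only_regexp("*.RB").match?("foo.rb")                 # => false
```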
Finally, the compiler is strict about malformed input rather than silently guessing. An unbalanced brace, an unterminated character class, an extglob that’s never closed, or a dangling backslash at the end of the pattern all raise Picoglob::ParseError. A pattern is data, and data is sometimes wrong; a parser that fails loudly on bad input is far easier to debug than one that compiles a quietly-incorrect regex and matches the wrong things at runtime.
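A fail-loudly check of this kind can be sketched with a small stack. GlobSyntaxError and validate_glob! are hypothetical names invented for this example; the validator covers only the four failure modes listed above, simplifies bash’s character-class rules, and says nothing about how picoglob words its own errors.

```ruby
# Hypothetical validator (not picoglob's parser): raises on the first problem.
class GlobSyntaxError < ArgumentError; end

CLOSERS = { "[" => "]", "{" => "}", "(" => ")" }.freeze

def validate_glob!(pattern)
  pending = []      # stack of constructs still waiting for their closer
  i = 0
  while i < pattern.length
    ch = pattern[i]
    if ch == "\\"
      raise GlobSyntaxError, "dangling backslash" if i == pattern.length - 1
      i += 1                                   # skip the escaped character
    elsif pending.last == "["
      pending.pop if ch == "]"                 # inside a class only "]" matters
    elsif ch == "[" || ch == "{"
      pending.push(ch)
    elsif ch == "(" && i.positive? && "@?*+!".include?(pattern[i - 1])
      pending.push(ch)                         # extglob opener such as @( or !(
    elsif ch == CLOSERS[pending.last]
      pending.pop
    elsif ch == "}"
      raise GlobSyntaxError, "unbalanced brace"
    end
    i += 1
  end
  raise GlobSyntaxError, "unterminated #{pending.last.inspect}" unless pending.empty?
  pattern
end

validate_glob!("photo.@(jp?(e)g|png)")   # returns the pattern unchanged

begin
  validate_glob!("{a,b")
rescue GlobSyntaxError => e
  e.message                              # => "unterminated \"{\""
end
```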
What this small gem is really about
picoglob is, by design, a small thing: one compiler, four options, a fistful of public methods. It is brand new — v0.1.0, pure Ruby, zero dependencies, MIT-licensed, with 42 tests passing across Ruby 3.0 through 3.4 in CI. We’re not going to pretend it has a download counter worth bragging about; it doesn’t yet. What it has is the thing that actually matters in a utility like this: the edge cases are right, and there are tests that prove they stay right.
That is the part that mirrors the work we do on client engagements. Most of the code that breaks in production isn’t the headline feature — it’s the small, “obviously simple” utility in the middle of the path that everyone assumed was correct. A glob matcher that crosses a separator it shouldn’t, or skips the file at the root of a tree, or matches a dotfile nobody intended, is exactly the kind of quiet defect that survives a code review and surfaces three weeks later as “sometimes the deploy skips a file.” The careful version costs an afternoon more up front and saves the incident. We’d rather spend the afternoon.
Install it
picoglob is on its way to RubyGems. Once it’s published, add it the usual way:
# with Bundler
bundle add picoglob
# or directly
gem install picoglob
The source is on GitHub under the MIT license — the full supported-syntax table, the options, and the test suite are all there. If you’ve been hand-rolling a glob-to-regex conversion in a Ruby project, this is the small dependency that lets you stop.