Retry Backoff You Can Actually Test

Exactly-once delivery is a lie you tell yourself before the network reminds you otherwise. Any call that leaves the process — an HTTP request, a database connection, an SQS poll, an S3 put — can fail transiently, and the standard answer is “retry with exponential backoff.” That advice is correct and incomplete. Naive exponential backoff makes things worse under load: every client that failed at the same instant also backs off by the same amount and retries at the same instant, so they collide again, in lockstep. That’s a thundering herd, and doubling the wait between collisions doesn’t break the synchronization — it just spaces the collisions further apart.

The fix is jitter: randomize each client’s wait so the herd de-synchronizes. AWS’s well-known article “Exponential Backoff And Jitter” benchmarks three jitter strategies and shows, with numbers, that the jittered variants complete the same work in far fewer total calls than un-jittered exponential backoff. So jitter is not optional. But the moment you add it, you’ve put a random number generator in the middle of your reliability code, and that is exactly where retry logic becomes untestable and starts to flake.
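To make the lockstep problem concrete, here is a small stdlib-only sketch. It does not use backofflite; the helper names (plain_wait, jittered_wait, retry_times) are made up for illustration, and it assumes five clients that all fail at the same instant.

```python
import random

def plain_wait(attempt, base=0.1, factor=2.0):
    # Deterministic: every client computes the same wait for the same attempt.
    return base * factor ** (attempt - 1)

def jittered_wait(attempt, rng, base=0.1, factor=2.0):
    # Randomized: each client draws its own wait from [0, exp].
    return rng.uniform(0.0, plain_wait(attempt, base, factor))

def retry_times(delay_fn, attempts=4):
    # Cumulative time of each retry, measured from a shared failure at t=0.
    t, times = 0.0, []
    for n in range(1, attempts + 1):
        t += delay_fn(n)
        times.append(t)
    return times

clients = 5
lockstep = [retry_times(plain_wait) for _ in range(clients)]
spread = [
    # Each client gets its own seeded generator, bound as a default argument.
    retry_times(lambda n, rng=random.Random(seed): jittered_wait(n, rng))
    for seed in range(clients)
]

# Without jitter, all clients share one schedule and collide on every retry.
print(len({tuple(times) for times in lockstep}))   # 1
# With jitter, every client ends up with its own schedule.
print(len({tuple(times) for times in spread}))     # 5
```

Doubling the base or the factor changes the numbers in the first schedule, but not the fact that there is only one of it.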

That tension is what backofflite is built to resolve. Pure standard library, zero runtime dependencies, MIT-licensed. It does two things and keeps them deliberately apart.

The design: pure policy, impure execution

The central idea is a clean seam between two responsibilities that most retry libraries tangle together:

Policy — compute a delay schedule, including jitter. These are pure functions of (attempt, rng). No I/O, no clock, no globals.
Execution — a thin retry loop, iterator, and decorator on top. The one impure thing it does — time.sleep — is injectable everywhere.

That separation is the whole reason it’s testable. The math never touches the outside world, so you can run it in a test and check the answer. The execution layer’s only side effect is sleeping, and you can hand it a fake sleeper. tenacity is powerful and does far more than this, but its jitter is awkward to assert on in a unit test; backofflite trades breadth for a property that matters more in reliability code — you can prove what it does.
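The seam can be sketched in a few lines of plain Python. This is an illustration of the idea, not backofflite’s source: delay_for and run_schedule are hypothetical names, and the policy shown is a full-jitter formula with assumed defaults.

```python
import random

def delay_for(attempt, rng, base=0.1, factor=2.0, cap=10.0):
    """Policy: a pure function of (attempt, rng). No I/O, no clock, no globals."""
    return min(cap, rng.uniform(0.0, base * factor ** (attempt - 1)))

def run_schedule(max_attempts, rng, sleeper):
    """Execution: the only side effect is calling the injected sleeper."""
    for attempt in range(1, max_attempts + 1):
        sleeper(delay_for(attempt, rng))

# Production code would pass time.sleep; a test can pass a list's append.
recorded = []
run_schedule(3, random.Random(7), recorded.append)

# Recompute the schedule independently from the same seed and compare.
expected_rng = random.Random(7)
expected = [delay_for(n, expected_rng) for n in (1, 2, 3)]
print(recorded == expected)   # True
```

Nothing in this sketch waits on a real clock, which is why the comparison on the last line can be exact.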

The strategies

A strategy computes the delay for a given attempt. The attempt argument is 1-indexed, and every strategy accepts an optional cap that clamps the result. In the table below, exp = base * factor**(attempt-1):

Strategy	Formula	Jitter
`Constant(base)`	`base`	none
`Linear(base)`	`base * attempt`	none
`Exponential(base, factor)`	`base * factor**(attempt-1)`	none
`Fibonacci(base)`	`base * fib(attempt)`	none
`FullJitter(base, factor)`	`uniform(0, exp)`	AWS Full
`EqualJitter(base, factor)`	`exp/2 + uniform(0, exp/2)`	AWS Equal
`DecorrelatedJitter(base, cap)`	`min(cap, uniform(base, prev*3))`	AWS Decorrelated
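As a quick check of the four deterministic formulas, the sketch below evaluates each for attempts 1 through 5 with base = 1.0. The function names are illustrative rather than the library’s, cap is left out for brevity, and the Fibonacci indexing (fib(1) = fib(2) = 1) is an assumption, since the table does not spell it out.

```python
def fib(n):
    # Assumes fib(1) == fib(2) == 1; the indexing is not stated in the table.
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a

def constant_delay(base, attempt):
    return base

def linear_delay(base, attempt):
    return base * attempt

def exponential_delay(base, attempt, factor=2.0):
    return base * factor ** (attempt - 1)

def fibonacci_delay(base, attempt):
    return base * fib(attempt)

attempts = range(1, 6)   # attempts are 1-indexed
print([constant_delay(1.0, n) for n in attempts])     # [1.0, 1.0, 1.0, 1.0, 1.0]
print([linear_delay(1.0, n) for n in attempts])       # [1.0, 2.0, 3.0, 4.0, 5.0]
print([exponential_delay(1.0, n) for n in attempts])  # [1.0, 2.0, 4.0, 8.0, 16.0]
print([fibonacci_delay(1.0, n) for n in attempts])    # [1.0, 1.0, 2.0, 3.0, 5.0]
```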

The three jittered strategies are the AWS family, and they make different trade-offs:

Full Jitter — uniform(0, exp). Spreads retries uniformly across the entire exponential window. In AWS’s benchmarks this gives the best contention behavior (fewest total calls), at the cost of the occasional very short wait.
Equal Jitter — exp/2 + uniform(0, exp/2). Keeps half the window fixed as a guaranteed minimum wait and jitters the other half. The middle ground when you want de-synchronization but also a floor under each delay.
Decorrelated Jitter — min(cap, uniform(base, prev*3)). Stateful: each delay is a random walk off the previous sleep rather than the attempt index, starting from base. Low client-side variance with good throughput. backofflite threads the previous value for you across a run, so you never manage that state by hand.
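The three formulas translate almost directly into Python. What follows is a sketch under assumptions, not the library’s implementation: the function names and default values are invented here, and the cap is applied to the final result in each case.

```python
import random

def full_jitter_delay(attempt, rng, base=0.1, factor=2.0, cap=10.0):
    exp = base * factor ** (attempt - 1)
    return min(cap, rng.uniform(0.0, exp))

def equal_jitter_delay(attempt, rng, base=0.1, factor=2.0, cap=10.0):
    exp = base * factor ** (attempt - 1)
    # Half the window is a fixed floor, the other half is jittered.
    return min(cap, exp / 2 + rng.uniform(0.0, exp / 2))

def decorrelated_delays(count, rng, base=0.1, cap=10.0):
    # Stateful: each delay is drawn relative to the previous one,
    # starting from base, so the state lives inside the generator.
    prev = base
    for _ in range(count):
        prev = min(cap, rng.uniform(base, prev * 3))
        yield prev

rng = random.Random(42)
for n in (1, 2, 3, 4):
    exp = 0.1 * 2.0 ** (n - 1)
    assert 0.0 <= full_jitter_delay(n, rng) <= exp        # anywhere in the window
    assert exp / 2 <= equal_jitter_delay(n, rng) <= exp   # never below the floor

walk = list(decorrelated_delays(6, rng))
assert all(0.1 <= d <= 10.0 for d in walk)                # bounded by base and cap
```

Writing the decorrelated variant as a generator is one way to keep the previous value out of the caller’s hands; a class holding prev would work equally well.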

Each strategy is available both as a class (FullJitter, Exponential, …) and as a lowercase factory function (full_jitter, exponential, …) — use whichever reads better at the call site.

Why “testable” is the headline feature

Here is the part that makes the design pay off. Pass a seeded random.Random and the entire schedule is deterministic — you can assert the exact floats, because reproducing the same draws independently is the contract:

import random
from backofflite import FullJitter

def test_full_jitter_is_exactly_reproducible():
    seed = 42
    s = FullJitter(base=0.1, factor=2.0, cap=10.0)

    rng = random.Random(seed)
    got = [s.delay(n, rng=rng) for n in (1, 2, 3, 4)]

    # Reproduce the same draws independently — this is the contract.
    ref = random.Random(seed)
    expected = [min(10.0, ref.uniform(0.0, 0.1 * 2.0 ** (n - 1)))
                for n in (1, 2, 3, 4)]

    assert got == expected

This is the test you cannot write against a library that reaches for the global RNG internally. Because the randomness is injected, the schedule is a value you can compare; the test isn’t “assert it’s roughly exponential and probably within bounds,” it’s “assert it equals exactly this.”

The same discipline applies to time. The sleeper is injectable, so no test ever sleeps — instead of waiting on a real clock, you hand the retry loop a fake sleeper and assert the sequence of delays it received:

from backofflite import retry, Constant

def test_my_fetch_retries_three_times():
    slept = []                                   # fake sleeper, no real waiting
    calls = {"n": 0}

    @retry(Constant(0.5), max_attempts=3,
           exceptions=(ConnectionError,), sleeper=slept.append)
    def flaky():
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError
        return "ok"

    assert flaky() == "ok"
    assert calls["n"] == 3
    assert slept == [0.5, 0.5]                    # exact sleep sequence asserted

That test runs in microseconds and never flakes, because there is no real clock and no unseeded randomness anywhere in the path. Reliability code that you can’t test confidently tends to rot — people stop trusting it, wrap it in their own ad-hoc retry, and now you have two. Making the schedule assertable is what keeps the retry layer honest as the codebase grows.

Three ways to use it

The execution layer offers the same policy through three interfaces, depending on how much control you want.

1. As a schedule iterator (pure)

When you just want the numbers — to log them, to plot them, to feed them to something else — Backoff.delays() returns a deterministic list[float]:

import random
from backofflite import Backoff, FullJitter

bo = Backoff(FullJitter(base=0.1, factor=2.0, cap=10.0),
             max_attempts=5, rng=random.Random(42))

for delay in bo.delays():
    print(delay)            # the same five floats every run

2. As an attempt loop

When you want to own the try/except yourself, Backoff.attempts() yields Attempt objects that each know their precomputed delay and can sleep it:

from backofflite import Backoff, Exponential

bo = Backoff(Exponential(0.2, factor=2.0, cap=5.0), max_attempts=5)

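for attempt in bo.attempts():
    try:
        do_thing()           # placeholder for your own call
        break
    except TransientError:   # placeholder for your own transient-failure type
        if attempt.last:
            raise
        attempt.backoff()    # sleeps this attempt's computed delay
        # attempt.number is 1-indexed, attempt.delay is the float,
        # attempt.last is true on the final attempt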
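For readers curious how an attempt iterator of this shape could be built, here is one possible sketch. AttemptSketch and attempts_sketch are hypothetical names, and nothing here is taken from backofflite’s source; it only mirrors the three attributes and the backoff() method used above.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Sequence

@dataclass(frozen=True)
class AttemptSketch:
    number: int                       # 1-indexed attempt counter
    delay: float                      # delay computed up front for this attempt
    last: bool                        # True only on the final attempt
    sleeper: Callable[[float], None]  # injected; time.sleep in production

    def backoff(self) -> None:
        self.sleeper(self.delay)

def attempts_sketch(delays: Sequence[float],
                    sleeper: Callable[[float], None]) -> Iterator[AttemptSketch]:
    total = len(delays)
    for i, delay in enumerate(delays, start=1):
        yield AttemptSketch(i, delay, i == total, sleeper)

slept = []
outcomes = iter([False, False, True])   # fail twice, then succeed
for attempt in attempts_sketch([0.2, 0.4, 0.8], slept.append):
    if next(outcomes):
        break
    if attempt.last:
        raise RuntimeError("gave up")
    attempt.backoff()

print(slept)   # [0.2, 0.4]
```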

3. As a decorator

For the common case, the @retry decorator wraps a function and handles the loop:

import logging

from backofflite import retry, FullJitter

log = logging.getLogger(__name__)

@retry(
    FullJitter(0.1, cap=5.0),
    max_attempts=4,
    exceptions=(ConnectionError,),               # only these are retried
    on_retry=lambda exc, n, delay: log.warning("retry %s after %.3fs", n, delay),
)
def fetch():
    ...

The decorator’s semantics are worth stating precisely, because the defaults are where retry libraries most often surprise you:

It calls the function and, on success, returns its result immediately.
It catches only the configured exceptions (default (Exception,)). Anything not in that tuple propagates immediately with no retry — you don’t want to retry a ValueError from a bug as if it were a network blip.
It makes at most max_attempts tries in total, sleeping the computed delay between them via the injectable sleeper (defaults to time.sleep).
It calls the optional on_retry(exc, attempt_number, delay) hook before each retry sleep — a clean place to emit a metric or a log line.
When attempts are exhausted, it re-raises the original last exception, not a wrapper. Your caller’s except ConnectionError still works; you don’t have to learn a new exception type to handle a failed retry.

A fresh schedule is computed per call, so each invocation of the decorated function gets its own sequence. If you pass a seeded random.Random, that sequence is reproducible too.
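The semantics listed above fit in a short decorator. The version below is a minimal sketch written for this article, not backofflite’s code: retry_sketch is a hypothetical name, and it takes a plain delay_fn(attempt) callable where the real decorator takes a strategy object.

```python
import functools
import time

def retry_sketch(delay_fn, max_attempts, exceptions=(Exception,),
                 sleeper=time.sleep, on_retry=None):
    """Hypothetical decorator; delay_fn(attempt) returns the wait in seconds."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)     # success returns at once
                except exceptions as exc:          # anything else propagates
                    if attempt == max_attempts:
                        raise                      # original exception, unwrapped
                    delay = delay_fn(attempt)
                    if on_retry is not None:
                        on_retry(exc, attempt, delay)   # hook runs before the sleep
                    sleeper(delay)
        return wrapper
    return decorate

slept, seen = [], []

@retry_sketch(lambda n: 0.5 * n, max_attempts=3,
              exceptions=(ConnectionError,), sleeper=slept.append,
              on_retry=lambda exc, n, delay: seen.append((n, delay)))
def always_down():
    raise ConnectionError("still down")

try:
    always_down()
except ConnectionError as exc:
    print(type(exc).__name__, slept, seen)
    # ConnectionError [0.5, 1.0] [(1, 0.5), (2, 1.0)]
```

Three attempts produce two sleeps, and the caller sees the original ConnectionError rather than a wrapper type.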

What this small library is really about

backofflite is intentionally small: a handful of strategies, one Backoff binder, one decorator. It’s brand new — v0.1.0, pure stdlib, MIT-licensed, with 32 tests passing across Python 3.9 through 3.12. We’re not going to point at a download count; there isn’t one worth pointing at yet. What it has is the property that actually matters in reliability code: the behavior is pinned down by tests that assert exact schedules and exact sleep sequences, and those tests run instantly without ever waiting on a clock.

That instinct — make the impure part injectable so the logic underneath becomes a pure, assertable function — is one we bring to client work constantly. Retry and backoff code is precisely the kind of thing that everyone writes once, inline, and never tests, because “you’d have to actually wait for it to retry.” Then it misfires under real load — too aggressive and it amplifies an outage, too timid and it gives up on a recoverable blip — and nobody can reproduce it. Pulling the math out where a seeded test can hold it still is an afternoon of work that turns the least-trusted code in the system into the most-verified.

Install it

pip install backofflite

No runtime dependencies. The source is on GitHub under the MIT license — the full strategy table, the AWS jitter formulas, the three usage modes, and the test suite are all there. If you’ve been hand-rolling a retry loop you’ve never written a test for, this is the small dependency that lets you finally write one.