A Sans-I/O SSE Parser for Python
If you have consumed a streaming response from an LLM in Python, you have parsed Server-Sent Events — probably without noticing, because a client library did it for you. The format is the one every major provider uses to stream tokens: a text/event-stream response body where each event is a few field: value lines followed by a blank line. It looks like the kind of thing you could parse with response.iter_lines() and a couple of if statements. The first time you try, it works. The tenth time, in production, it drops half a token because a multibyte emoji landed across two TCP reads.
The format is small but genuinely fiddly, and that fiddliness is exactly the kind of thing that should be written once, gotten right, and never re-implemented. So we pulled the parser out into its own library: sansio-sse. Pure standard library, zero runtime dependencies, MIT-licensed. It does no networking at all. That last part is the whole design, and it deserves an explanation.
What “sans-I/O” means
A sans-I/O library implements a protocol as a pure state machine and performs none of its own input or output. You feed it bytes; it hands you parsed results. It never opens a socket, never reads a file, never knows or cares where the bytes came from. This is the design philosophy behind h11 (HTTP/1.1) and h2 (HTTP/2): keep the protocol logic in one place, decoupled from the transport, so it can be tested exhaustively and reused everywhere.
The Python SSE landscape today does the opposite. sseclient-py is built around a requests response. httpx-sse is built around httpx. Both bundle a correct-enough parser inside a transport adapter, which means the parsing logic — the part that’s actually hard — is welded to a client you may not be using. If you are on aiohttp, or reading from a raw socket, or replaying a captured stream from a file, you re-implement the parser. And re-implementing it is exactly where the bugs come from.
sansio-sse is just the parser. Bring your own bytes:
from sansio_sse import SSEParser

parser = SSEParser()
for event in parser.feed(b"event: greeting\ndata: hello\n\n"):
    print(event.event, "->", event.data)  # greeting -> hello
The same parser drops onto any transport unchanged. The only thing that varies is the loop that hands it chunks:
import httpx
from sansio_sse import SSEParser

parser = SSEParser()
with httpx.stream("GET", "https://example.com/stream") as r:
    for raw in r.iter_bytes():  # requests, aiohttp, a socket: same shape
        for event in parser.feed(raw):
            handle(event)  # handle() stands in for your own callback
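Replaying a captured stream from disk follows the same pattern. As a rough sketch (the helper name `chunks_from_file` and the deliberately tiny default chunk size are illustrative assumptions, not part of sansio-sse), a generator like this yields raw byte chunks that could be handed to `parser.feed()` in place of `r.iter_bytes()`:

```python
from typing import Iterator


def chunks_from_file(path: str, chunk_size: int = 7) -> Iterator[bytes]:
    """Yield a captured stream from disk in small, arbitrary byte chunks.

    A small, odd chunk_size is intentional: it forces chunk boundaries to
    land mid-line and mid-character, the way a real network read can.
    """
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                return
            yield chunk
```

Because the generator knows nothing about SSE, it works equally well as a test fixture: any parser that survives a 7-byte chunk size on a real capture has had its boundary handling exercised.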
The motivating case: a streaming LLM response
The reason this matters right now is that every major model provider (OpenAI, Anthropic, Google) streams completions as text/event-stream. Each token (or small group of tokens) arrives as its own SSE event carrying a JSON payload. How the stream ends varies by provider; OpenAI's chat completions API sends a literal [DONE] sentinel as the final data value. Consuming one is exactly the feed() loop above:
import json

import httpx
from sansio_sse import SSEParser

parser = SSEParser()
done = False
with httpx.stream(
    "POST", "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},  # api_key: your own key
    json={"model": "gpt-4o", "stream": True, "messages": [...]},
) as response:
    for raw in response.iter_bytes():
        for event in parser.feed(raw):
            if event.data == "[DONE]":
                done = True
                break
            chunk = json.loads(event.data)
            delta = chunk["choices"][0]["delta"]
            # content may be missing or null; print nothing in that case
            print(delta.get("content") or "", end="")
        if done:  # the break above only leaves the inner loop
            break
What makes the LLM case the honest stress test is that it exercises every awkward reality of a live byte stream at once: a JSON payload that arrives across several TCP reads, a line terminator that lands exactly on a chunk boundary, and an emoji in the model’s output whose UTF-8 encoding gets split between two byte chunks. A parser that works on a single in-memory string but mishandles any of those will look correct in a unit test and corrupt output in production.
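To make the split-emoji case concrete, here is a small standard-library illustration of what happens when each chunk is decoded on its own. No SSE library is involved, and the split point is chosen by hand for the example:

```python
rocket = "\U0001F680".encode("utf-8")    # the rocket emoji: bytes f0 9f 9a 80
first, second = rocket[:2], rocket[2:]   # a chunk boundary inside the character

# Strict per-chunk decoding fails outright on the incomplete sequence.
try:
    first.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Lenient per-chunk decoding "works" but silently replaces the emoji
# with U+FFFD replacement characters.
mangled = (
    first.decode("utf-8", errors="replace")
    + second.decode("utf-8", errors="replace")
)

print(strict_failed)              # True
print("\U0001F680" in mangled)    # False
```

Either outcome is a bug from the caller's point of view: one crashes the stream, the other quietly corrupts the output.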
The spec is the easy part to underestimate
sansio-sse implements the WHATWG HTML event-stream interpretation algorithm exactly. The rules are individually simple and collectively easy to get subtly wrong:
| Rule | Behavior |
|---|---|
| Line splitting | Lines end on \r\n, \r, or \n. |
| BOM | A single leading U+FEFF at the start of the stream is stripped once. |
| Comment | A line starting with : is ignored entirely. |
| No colon | The whole line is the field name; the value is empty. |
| field: value | Exactly one leading space after the colon is removed. |
| event | Sets the event type for the next dispatched event. |
| data | Appends the value plus "\n" to the data buffer. |
| id | Sets the last event id, unless the value contains a NUL, in which case the field is ignored. |
| retry | Parsed as integer milliseconds only if all ASCII digits; otherwise ignored. |
| Blank line | Dispatches an event — but only if the data buffer is non-empty. |
| Trailing newline | A single trailing \n is stripped from the data before dispatch. |
| Reset | After dispatch, data and event type reset; lastEventId persists. |
Several of these are the kind of rule you only discover you got wrong via a confusing bug report. The id-with-embedded-NUL rule, for instance, exists so a corrupted id can’t poison reconnection; a naive parser stores it anyway and sends a garbage Last-Event-ID header on the next connect. The “blank line dispatches only with non-empty data” rule means a comment-only keep-alive (: ping followed by a blank line) correctly produces nothing, where a sloppy parser emits a spurious empty event your handler then has to defend against. And lastEventId persisting across events — while the data and event-type buffers reset — is the single rule that makes resumable streams work at all.
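The field rules can be read as a small state machine over lines that have already been split and decoded. The following is a minimal sketch of those rules under stated assumptions, not sansio-sse's actual source: the names `Event` and `interpret` are made up for illustration, and attaching `retry` to the next dispatched event and then clearing it is a guess at one reasonable behavior.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Event(NamedTuple):  # hypothetical stand-in, not the library's class
    event: str
    data: str
    id: str                # last event id in effect at dispatch
    retry: Optional[int]


def interpret(lines: Iterable[str]) -> Iterator[Event]:
    """Apply the SSE field rules to already-split, already-decoded lines."""
    data_buf = ""          # accumulated data, each value followed by "\n"
    event_type = ""        # empty means the default type, "message"
    last_id = ""           # persists across dispatches
    retry = None           # assumption: reported once, then cleared
    for line in lines:
        if line == "":
            # Blank line: dispatch only if some data was seen.
            if data_buf:
                # Drop the single trailing "\n" before dispatch.
                yield Event(event_type or "message", data_buf[:-1], last_id, retry)
            data_buf, event_type, retry = "", "", None
        elif line.startswith(":"):
            continue       # comment line, ignored entirely
        else:
            # No colon: the whole line is the name and the value is "".
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]          # exactly one leading space
            if name == "event":
                event_type = value
            elif name == "data":
                data_buf += value + "\n"
            elif name == "id":
                if "\x00" not in value:    # ids containing NUL are ignored
                    last_id = value
            elif name == "retry":
                if value.isascii() and value.isdigit():
                    retry = int(value)


sample = [
    ": ping", "",                               # keep-alive: dispatches nothing
    "id: 7", "event: greeting", "data: hello", "data: world", "",
    "id: bad\x00id", "data", "",                # NUL id ignored; bare "data" line
    "retry: 15x", "retry: 3000", "data:  two spaces", "",
]
events = list(interpret(sample))
print(len(events))   # 3
```

Note that the second and third events still carry id "7": the rejected id did not overwrite it, and the reset after dispatch did not clear it.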
The API surface that exposes all of this is deliberately tiny:
from __future__ import annotations  # lets "str | None" work on Python 3.9

from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSentEvent:
    event: str = "message"      # the event type; default is "message"
    data: str = ""              # data lines joined by "\n"
    id: str | None = None       # last event id in effect at dispatch
    retry: int | None = None    # reconnection time in ms, if specified


class SSEParser:
    def feed(self, chunk: str | bytes) -> list[ServerSentEvent]: ...

    @property
    def last_event_id(self) -> str: ...
The event is a frozen dataclass, so it’s hashable and can’t be mutated out from under you. There’s also an iter_sse(chunks) convenience for when you just want to pour an iterable of chunks in and get an iterator of events out.
Where the correctness actually lives: chunk boundaries
Everything above is the part you can read off the spec. The part that makes this worth being a library is that feed() gets called with whatever the transport happened to hand you, and the boundaries between calls fall in arbitrary, hostile places. SSEParser buffers across calls and handles three boundary traps that account for most real-world SSE bugs.
1. A line split between two feed() calls
The obvious one: a field or its value is cut in half by a chunk boundary. feed("data: hel") followed by feed("lo\n\n") has to produce a single event with data="hello", not two broken half-events. The parser holds an incomplete trailing line in an internal buffer and only emits a line once it has actually seen a terminator.
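The buffering idea fits in a few lines. This is an illustrative sketch only: `LineBuffer` is a hypothetical name, and to keep the focus on the trailing-line buffer it treats LF as the only terminator, which a real SSE parser cannot do.

```python
class LineBuffer:
    """Hold an incomplete trailing line until its terminator arrives.

    Simplification: only LF is treated as a line terminator here.
    """

    def __init__(self) -> None:
        self._pending = ""   # text seen so far for the unterminated line

    def feed(self, text: str) -> list[str]:
        # Everything before the last LF is complete; the remainder
        # (possibly empty) waits for the next call.
        *complete, self._pending = (self._pending + text).split("\n")
        return complete


buf = LineBuffer()
print(buf.feed("data: hel"))   # []
print(buf.feed("lo\n\n"))      # ['data: hello', '']
```

The first call returns nothing because no terminator has been seen yet; the second returns the reassembled line plus the blank line that would trigger dispatch.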
2. The CRLF-split-across-chunks trap
This is the classic one, and it’s the bug a from-scratch parser almost always ships with. SSE accepts \r\n as a single line terminator. But if a chunk ends with \r and the next chunk begins with \n, a naive parser sees the \r as one terminator and the \n as another — which in SSE means a spurious blank line, which means it dispatches an event early, splitting one event into two. sansio-sse remembers a trailing \r and absorbs a leading \n on the next chunk so a CRLF straddling the boundary counts as exactly one terminator:
# "\r" ends chunk one, "\n" begins chunk two: ONE terminator, not two.
parser = SSEParser()
events = parser.feed("data: a\r")
events += parser.feed("\ndata: b\n\n")
assert len(events) == 1
assert events[0].data == "a\nb" # NOT a premature dispatch of "a"
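One way to get this behavior is a single boolean remembered between calls. The sketch below is an assumption about how such a splitter could be written, not a copy of the library's internals; `LineSplitter` and its attribute names are invented for the example.

```python
class LineSplitter:
    """Split text on CRLF, CR, or LF across arbitrary chunk boundaries."""

    def __init__(self) -> None:
        self._partial = ""      # text of the current, unterminated line
        self._skip_lf = False   # True if the previous chunk ended with CR

    def feed(self, text: str) -> list[str]:
        lines = []
        if self._skip_lf and text:
            self._skip_lf = False
            if text[0] == "\n":
                text = text[1:]   # second half of a CRLF split across chunks
        start = 0
        i = 0
        while i < len(text):
            ch = text[i]
            if ch == "\n" or ch == "\r":
                lines.append(self._partial + text[start:i])
                self._partial = ""
                if ch == "\r":
                    if i + 1 == len(text):
                        self._skip_lf = True   # LF may follow in the next chunk
                    elif text[i + 1] == "\n":
                        i += 1                 # CRLF inside a single chunk
                start = i + 1
            i += 1
        self._partial += text[start:]
        return lines


splitter = LineSplitter()
lines = splitter.feed("data: a\r") + splitter.feed("\ndata: b\n\n")
print(lines)   # ['data: a', 'data: b', '']
```

The line ending in a bare CR is emitted immediately, so nothing is delayed waiting for a byte that may never come; only the decision about a following LF is deferred.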
3. A multibyte character split across byte chunks
When you feed bytes, the parser decodes them as UTF-8 incrementally, using codecs.getincrementaldecoder. If a chunk ends in the middle of a multibyte code point (the first byte of a two-byte é, or part of a four-byte 🚀), the decoder holds the partial byte sequence until the rest arrives, instead of raising or emitting a replacement character. This is precisely the failure mode that corrupts an LLM’s emoji or non-Latin output, and it’s invisible until it isn’t:
# "café" — the 'é' is two bytes (0xC3 0xA9); split between them.
raw = "data: café\n\n".encode("utf-8")
idx = raw.index(b"\xc3\xa9")
parser = SSEParser()
events = parser.feed(raw[:idx + 1]) # includes only 0xC3
events += parser.feed(raw[idx + 1:]) # starts with 0xA9
assert events[0].data == "café" # reassembled, not mangled
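The underlying mechanism can be observed with the standard library alone. This standalone snippet uses only `codecs.getincrementaldecoder`; the one-byte-at-a-time chunking is chosen for the demonstration as the worst case, not something a transport would normally do.

```python
import codecs

# An incremental decoder keeps incomplete byte sequences between calls.
decoder = codecs.getincrementaldecoder("utf-8")()

raw = "data: caf\u00e9 \U0001F680\n\n".encode("utf-8")

# Feed one byte at a time: every multibyte character gets split.
pieces = [decoder.decode(raw[i:i + 1]) for i in range(len(raw))]
text = "".join(pieces)

print(text == "data: caf\u00e9 \U0001F680\n\n")   # True
print(repr(pieces[raw.index(b"\xc3")]))           # ''
```

The lone 0xC3 byte yields an empty string rather than an error or a replacement character; the é appears only once 0xA9 arrives.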
These aren’t hypotheticals chosen to look clever; each one corresponds directly to a test in the suite, because each one is a bug we’d rather catch in CI than in a customer’s log. The library ships with 45 tests covering the spec rules, every line-ending combination, the BOM (including a BOM split across two byte chunks), and the chunk-boundary cases above, all passing on Python 3.9 through 3.12.
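The split-BOM case is a useful one to picture, because it combines incremental decoding with a strip-once rule. A hedged sketch of how that combination might look (the class name `BomAwareDecoder` is hypothetical, and this is not claimed to be how sansio-sse structures it):

```python
import codecs


class BomAwareDecoder:
    """Decode UTF-8 incrementally and drop one leading U+FEFF, if present."""

    def __init__(self) -> None:
        # The plain "utf-8" codec keeps the BOM as U+FEFF in its output.
        self._decoder = codecs.getincrementaldecoder("utf-8")()
        self._at_start = True   # still waiting for the first decoded character

    def feed(self, chunk: bytes) -> str:
        text = self._decoder.decode(chunk)
        if self._at_start and text:
            self._at_start = False
            if text[0] == "\ufeff":
                text = text[1:]  # strip the BOM exactly once
        return text


dec = BomAwareDecoder()
# The three BOM bytes (ef bb bf) arrive in three separate chunks.
out = dec.feed(b"\xef") + dec.feed(b"\xbb") + dec.feed(b"\xbfdata: x\n\n")
print(repr(out))   # 'data: x\n\n'
```

The start-of-stream flag is cleared only once real text has been decoded, so empty results from partial BOM bytes do not use up the one-time strip.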
Why we built it this way
sansio-sse is brand new — v0.1.0, pure stdlib, MIT-licensed. We’re not going to wave a download counter at you; it doesn’t have one worth mentioning yet. What it has is the thing that actually matters in a parser: the boundary cases are right, and there are tests that keep them right.
The design choice behind it, separating the protocol from the transport, is the same instinct we bring to client work. The expensive bugs are almost never in the headline feature; they’re in the “obviously simple” glue everyone assumed was correct. An SSE parser that dispatches one event as two on a CRLF boundary, or drops an emoji on a byte split, is a defect that survives review and surfaces weeks later as “streaming sometimes garbles output.” Writing the state machine once, in isolation, with a test for every edge, is what keeps that bug from ever being yours to debug under load.
Install it
pip install sansio-sse
No runtime dependencies. The source is on GitHub under the MIT license — the full spec-rule table, the API, and the test suite are all there. If you’ve been hand-rolling SSE parsing inside a client adapter, this is the small dependency that lets you stop.