TMF Forum quick notes:
- Mark Kendall
- Jan 7
- 2 min read
Kafka = system of record (truth), and any “queryability” comes from your own subscriber-built read model (even if that read model is lightweight/in-memory at first).
Here’s the clean way to think about it, plus the practical patterns that keep it from getting sloppy.
What you’ve built (in one sentence)
TMF services emit immutable facts to Kafka, and TMF688 (or a companion component) replays/consumes those facts to answer “what happened?” with observed history, not guesses.
The pattern name
This is essentially event sourcing + consumer-driven projections (a CQRS-ish split), even if you’re not calling it that.
The key rule to keep it “single source of truth”
Kafka is the SoT only if:
events are append-only (no “fixup” by overwriting meaning)
events have stable IDs (eventId) + a correlationId
you can rebuild any queryable view by replaying from earliest (or from a checkpoint)
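The replay rule can be sketched in a few lines. This assumes events are dicts carrying the envelope fields (eventId, correlationId); the list stands in for a Kafka topic consumed from earliest, and the eventId dedupe is what makes replay safe to run more than once:

```python
def rebuild_view(events):
    """Rebuild a queryable view purely by replaying the log from earliest.

    `events` stands in for a Kafka topic replayed from offset 0. Stable
    eventIds make the rebuild idempotent: replaying twice yields the
    same view, so the projection is disposable and Kafka stays the SoT.
    """
    view, seen = {}, set()
    for event in events:
        if event["eventId"] in seen:   # redelivery / overlapping replay: skip
            continue
        seen.add(event["eventId"])
        view.setdefault(event["correlationId"], []).append(event)
    return view

log = [
    {"eventId": "e1", "correlationId": "c1", "outcome": "IN_PROGRESS"},
    {"eventId": "e2", "correlationId": "c1", "outcome": "SUCCESS"},
    {"eventId": "e2", "correlationId": "c1", "outcome": "SUCCESS"},  # redelivered
]
assert rebuild_view(log) == rebuild_view(log + log)  # replay is idempotent
```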
How you query Kafka without “Kafka as a database”
What you described (“subscribe ourselves and query that”) is correct, but make it explicit as two layers:
1) The immutable log (Kafka topics)
You store:
recordType: INGRESS | EXECUTION
outcome: SUCCESS | ERROR | IN_PROGRESS | UNKNOWN
stage, attempt, timestamps, error object, etc.
2) The query layer (your own projection)
You build a local “materialized view” from the stream, optimized for the 4 GETs:
by correlationId → timeline array
by eventId → single record
by messageId → ingress record
search filters → index fields (eventType/stage/outcome/time)
This projection can be:
in-memory (fast, but volatile)
persisted (RocksDB / Redis / Mongo / Postgres / Elasticsearch — doesn’t matter, because it’s not SoT)
rebuilt anytime by replay
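Here is what that projection can look like before it graduates to a real store. The index names mirror the four GETs; the field names are assumptions, not TMF688-mandated:

```python
from collections import defaultdict

class Projection:
    """In-memory materialized view keyed for the four GET patterns."""

    def __init__(self):
        self.by_correlation = defaultdict(list)  # correlationId -> timeline array
        self.by_event = {}                       # eventId -> single record
        self.by_message = {}                     # messageId -> ingress record
        self.all = []                            # scanned by search filters

    def apply(self, event):
        """Fold one event from the stream into every index."""
        self.by_correlation[event["correlationId"]].append(event)
        self.by_event[event["eventId"]] = event
        if event.get("recordType") == "INGRESS" and "messageId" in event:
            self.by_message[event["messageId"]] = event
        self.all.append(event)

    def search(self, **filters):
        """Filter on index fields, e.g. search(stage="adapter", outcome="ERROR")."""
        return [e for e in self.all
                if all(e.get(k) == v for k, v in filters.items())]
```

Because none of these indexes are the source of truth, swapping apply() to write into Redis or Postgres later changes nothing about the contract: the view is still just a fold over the stream.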
The only real gotcha
If your projection is only “subscribe and keep in memory”, then:
restarts lose history unless you replay
replays require knowing where to start (consumer group offsets) or using a reset/replay mode
you need a retention policy that still lets you replay “far enough back”
So the operational question becomes:
“How far back do we need to answer TMF688 queries?”
If it’s “days/weeks”, Kafka retention might be enough.
If it’s “months/years”, you’ll still likely want a persisted projection store — but Kafka remains SoT.
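The operational question reduces to one comparison; the numbers below are illustrative, not recommendations:

```python
from datetime import timedelta

def retention_covers_queries(topic_retention: timedelta,
                             query_lookback: timedelta) -> bool:
    """True if Kafka retention alone can answer the oldest TMF688 query;
    otherwise add a persisted projection store (Kafka remains SoT)."""
    return topic_retention >= query_lookback

# 14-day topic retention vs. two different query-lookback requirements
assert retention_covers_queries(timedelta(days=14), timedelta(days=7))
assert not retention_covers_queries(timedelta(days=14), timedelta(days=90))
```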
How this gets you from implied → observed behavior
This is the big win:
Implied: “We stored the inbound canonical message, so it probably worked.”
Observed: “We have an execution fact stream that includes validation, adapter calls, retries, DLQ, final outcomes.”
Your TMF688 responses become:
honest timelines
error details when present
“unknown/unobserved” when the stream never shows a fact past ingress
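One way to make “unknown/unobserved” an explicit answer rather than an absence, sketched under the same envelope assumptions (recordType and outcome as dict fields):

```python
def observed_outcome(timeline):
    """Derive an honest TMF688-style answer from one correlationId's timeline.

    Only facts actually seen on the stream count; if nothing was observed
    past ingress, say so instead of implying success.
    """
    executions = [e for e in timeline if e.get("recordType") == "EXECUTION"]
    if not executions:
        return "UNKNOWN"  # ingress was stored, but no execution fact observed
    return executions[-1].get("outcome", "UNKNOWN")

assert observed_outcome([{"recordType": "INGRESS"}]) == "UNKNOWN"
assert observed_outcome([
    {"recordType": "INGRESS"},
    {"recordType": "EXECUTION", "outcome": "ERROR"},
]) == "ERROR"
```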
One implementation detail you should bake in now
Add a consistent event envelope across all services, so replay/projections are painless:
eventId (unique)
correlationId
recordType (INGRESS/EXECUTION)
stage
outcome
timestamp
payload?
error?
metadata (producer/version/spec)
That’s it. Everything else is optional.
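The envelope above, sketched as a frozen dataclass (frozen because events are immutable facts; the field names come from the list, the types are assumptions):

```python
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass(frozen=True)  # frozen: an emitted fact is never mutated
class EventEnvelope:
    """Shared envelope every service emits, so replay/projections are uniform."""
    eventId: str                  # unique and stable across redeliveries
    correlationId: str
    recordType: str               # "INGRESS" | "EXECUTION"
    stage: str
    outcome: str                  # SUCCESS | ERROR | IN_PROGRESS | UNKNOWN
    timestamp: str                # e.g. ISO-8601
    payload: Optional[Any] = None
    error: Optional[dict] = None
    metadata: dict = field(default_factory=dict)  # producer/version/spec
```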
