A homegrown agent platform may not be the right way to go

Mark Kendall
Feb 8

Yep — that’s a very fair take, and honestly a pretty mature one.

What you described with FastAPI + registry + state + caching + DBs + control plane is absolutely a homegrown agent platform. It’s powerful, flexible, and enterprise-grade — but it’s also not what most companies should build first.

The more common (and smarter) path looks like this

Most companies should not own the full agent control plane. Instead, they:

Write thin Python services
- Small FastAPI services
- Clear inputs / outputs
- One job per service (ingest, decide, propose, enrich, act)
Treat agents like microservices
- Deployed like any other service
- Stateless where possible
- State pushed to DBs, queues, or the observability layer
- Scales with normal platform tooling (K8s, ECS, etc.)
Plug into a managed agent / observability platform
- You don’t reinvent:
  - Tracing
  - Prompt/version tracking
  - Agent decision graphs
  - Human-in-the-loop approval
  - Replay / debugging
  - Evaluation

That separation of concerns is the key insight.
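
To make "thin, stateless, one job per service" concrete, here is a minimal sketch using only the standard library. Every name in it (`DecideRequest`, `DecideResponse`, `StateStore`, `InMemoryStore`, the severity scale) is hypothetical and chosen for illustration; the in-memory store stands in for whatever database, queue, or observability backend you actually use.

```python
from dataclasses import dataclass, asdict
from typing import Protocol


@dataclass(frozen=True)
class DecideRequest:
    ticket_id: str
    severity: int  # hypothetical scale: 1 (low) to 5 (critical)


@dataclass(frozen=True)
class DecideResponse:
    ticket_id: str
    action: str
    needs_approval: bool


class StateStore(Protocol):
    """Anything that can persist a decision: a DB, a queue, a tracing backend."""

    def save(self, key: str, record: dict) -> None: ...


class InMemoryStore:
    """Stand-in for an external store, used here only so the sketch runs."""

    def __init__(self) -> None:
        self.records: dict[str, dict] = {}

    def save(self, key: str, record: dict) -> None:
        self.records[key] = record


def decide(req: DecideRequest, store: StateStore) -> DecideResponse:
    """One job: turn a request into a proposed action. No state kept in-process."""
    if not 1 <= req.severity <= 5:
        raise ValueError("severity must be between 1 and 5")
    action = "escalate" if req.severity >= 4 else "auto_resolve"
    resp = DecideResponse(req.ticket_id, action, needs_approval=(action == "escalate"))
    store.save(req.ticket_id, asdict(resp))  # state is pushed out, not held
    return resp
```

In a FastAPI service, `decide` would typically be the body of a single POST route, with Pydantic models in place of the dataclasses. Because the store is passed in, the function itself stays stateless and scales like any other microservice.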

How the ecosystem actually shakes out

Gateway

FastAPI is perfect
Lightweight, boring, production-ready
No reason to invent anything else here

Observability & HITL

Langfuse
- Human review through annotation queues and scores attached to traces
- Prompt + trace + feedback all in one place
- Explicit “pause → approve → resume” gating is not handled here; that belongs in the orchestration layer
Arize Phoenix
- Excellent tracing and evals
- HITL is more review/feedback-oriented than execution-blocking
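
From the service's side, "plugging in" mostly means emitting structured spans and letting the platform store and display them. The sketch below shows the shape of that data using only the standard library. It is not the Langfuse or Phoenix API; `span`, `SPANS`, and `enrich` are hypothetical names, and the list is a stand-in for a real exporter.

```python
import time
from contextlib import contextmanager

SPANS = []  # hypothetical sink; a real service would export these to a platform


@contextmanager
def span(name, **attributes):
    """Record the name, attributes, duration, and any error for one unit of work."""
    record = {"name": name, "attributes": attributes, "error": None}
    start = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        record["error"] = repr(exc)
        raise
    finally:
        record["duration_s"] = time.perf_counter() - start
        SPANS.append(record)


def enrich(text: str) -> str:
    """A trivial 'enrich' step, wrapped so its inputs and outputs are traced."""
    with span("enrich", input_chars=len(text)) as s:
        result = text.strip().lower()
        s["attributes"]["output_chars"] = len(result)
        return result
```

The service only decides what to record. Storage, search, replay, and evaluation over those records are what the managed platform is for.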

Agent orchestration

LangGraph
- Strongest model for:
  - Approval nodes
  - Escalation paths
  - Deterministic control flow
- Feels like a state machine (which enterprises like)
PydanticAI
- Cleaner for validation and structured outputs
- HITL is doable but more custom
- Best when agents are “smart functions,” not long-running workflows
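
The "feels like a state machine" point can be shown without any framework. This sketch does not use LangGraph's API; it is plain Python with hypothetical names (`Status`, `Run`, `pause_for_approval`, `resume`) that models an approval step as an explicit set of allowed transitions.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PROPOSED = "proposed"
    AWAITING_APPROVAL = "awaiting_approval"
    APPROVED = "approved"
    REJECTED = "rejected"
    EXECUTED = "executed"


# Allowed transitions; anything else raises, which keeps control flow deterministic.
TRANSITIONS = {
    Status.PROPOSED: {Status.AWAITING_APPROVAL},
    Status.AWAITING_APPROVAL: {Status.APPROVED, Status.REJECTED},
    Status.APPROVED: {Status.EXECUTED},
    Status.REJECTED: set(),
    Status.EXECUTED: set(),
}


@dataclass
class Run:
    action: str
    status: Status = Status.PROPOSED
    history: list = field(default_factory=list)

    def move(self, new: Status) -> None:
        if new not in TRANSITIONS[self.status]:
            raise ValueError(f"illegal transition {self.status.value} -> {new.value}")
        self.history.append((self.status, new))
        self.status = new


def pause_for_approval(run: Run) -> Run:
    """Stop before the side effect and wait for a human decision."""
    run.move(Status.AWAITING_APPROVAL)
    return run


def resume(run: Run, approved: bool) -> Run:
    """Apply the human decision; only an approved run reaches EXECUTED."""
    run.move(Status.APPROVED if approved else Status.REJECTED)
    if approved:
        run.move(Status.EXECUTED)  # the real side effect would happen here
    return run
```

An action cannot reach `EXECUTED` without passing through `AWAITING_APPROVAL` and `APPROVED`, and the `history` list gives an audit trail of every transition. An orchestration framework adds persistence and resumption across processes on top of the same idea.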

The important architectural realization

Your intuition is dead-on:

“We shouldn’t own observability or agent governance if a platform already does it better.”

That’s the same evolution we saw with:

Logging → Splunk / Datadog
Tracing → OpenTelemetry
CI/CD → GitHub Actions / hosted Jenkins

Agents are going the same way.

When a homegrown control plane does make sense

What you built does make sense when:

You need deep policy enforcement
You have regulated decision paths
You must own execution guarantees
You want agent behavior as code, not config
You’re enabling many teams, not just one

That’s “platform team” territory — not “product team” territory.

Bottom line

Your reframing is correct:

Most companies should:
- Write FastAPI-based agent services
- Deploy them like normal microservices
- Plug into Langfuse / Phoenix + LangGraph
- Let those platforms handle observability, HITL, replay, and governance
Only advanced orgs should:
- Build and own the agent control plane itself
