
A homegrown agent platform may not be the right way to go
- Mark Kendall
- Feb 8
- 2 min read
Yep — that’s a very fair take, and honestly a pretty mature one.
What you described with FastAPI + registry + state + caching + DBs + control plane is absolutely a homegrown agent platform. It’s powerful, flexible, and enterprise-grade — but it’s also not what most companies should build first.
The more common (and smarter) path looks like this
Most companies should not own the full agent control plane. Instead, they:
Write thin Python services
Small FastAPI services
Clear inputs / outputs
One job per service (ingest, decide, propose, enrich, act)
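As a rough illustration of "clear inputs / outputs" and "one job per service", here is a minimal sketch of an enrich step written as a pure function with typed request and response objects. The names (`EnrichRequest`, `EnrichResponse`, `enrich`) and the ticket example are hypothetical, not from any particular codebase. In a FastAPI service a function like this would sit behind a single POST route; the route wiring is left out so the sketch runs with the standard library alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnrichRequest:
    """Input contract for the (hypothetical) enrich service."""
    ticket_id: str
    text: str


@dataclass(frozen=True)
class EnrichResponse:
    """Output contract: derived metadata only, no decisions."""
    ticket_id: str
    word_count: int
    is_urgent: bool


def enrich(request: EnrichRequest) -> EnrichResponse:
    """Do one job: derive metadata from the ticket text, with no side effects."""
    words = request.text.split()
    return EnrichResponse(
        ticket_id=request.ticket_id,
        word_count=len(words),
        # Strip common punctuation so "Urgent:" still matches "urgent".
        is_urgent=any(w.lower().strip(".,:;!?") == "urgent" for w in words),
    )
```

Because the function only maps one typed value to another, a separate "decide" or "act" service can consume its output without knowing how it was produced.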
Treat agents like microservices
Deployed like any other service
Stateless where possible
State pushed to DBs, queues, or the observability layer
Scales with normal platform tooling (K8s, ECS, etc.)
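The "stateless where possible, state pushed to DBs" idea can be sketched as a handler that keeps nothing in process memory and reads and writes everything through a store interface. `StateStore`, `InMemoryStore`, and `handle_event` are hypothetical names; the in-memory class is only a stand-in for whatever database or cache a real deployment would use.

```python
from typing import Protocol


class StateStore(Protocol):
    """Minimal interface the handler needs from an external store."""
    def get(self, key: str) -> int: ...
    def put(self, key: str, value: int) -> None: ...


class InMemoryStore:
    """Local stand-in for an external store; not suitable for production."""
    def __init__(self) -> None:
        self._data: dict[str, int] = {}

    def get(self, key: str) -> int:
        return self._data.get(key, 0)

    def put(self, key: str, value: int) -> None:
        self._data[key] = value


def handle_event(store: StateStore, conversation_id: str) -> int:
    """Stateless handler: the turn count lives in the store, not in the process."""
    turns = store.get(conversation_id) + 1
    store.put(conversation_id, turns)
    return turns
```

Any replica can serve any request because the handler holds no state of its own. The read-then-write here is not atomic, so a real store would need an atomic increment or a transaction.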
Plug into a managed agent / observability platform
You don’t reinvent:
Tracing
Prompt/version tracking
Agent decision graphs
Human-in-the-loop approval
Replay / debugging
Evaluation
That separation of concerns is the key insight.
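One way to picture that separation is a narrow reporting boundary: the service wraps its functions in a small decorator and hands each event to a sink, and the sink is where a managed platform's SDK would plug in. This is a sketch under assumptions; `traced`, `Sink`, and the event fields are hypothetical and do not mirror any specific vendor API.

```python
import functools
import time
from typing import Any, Callable

# Hypothetical sink type: a real deployment would forward events to the
# observability platform's own SDK instead of a local list.
Sink = Callable[[dict[str, Any]], None]


def traced(sink: Sink, name: str):
    """Wrap a function so each call reports name, status, and duration to `sink`."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            status = "error"  # overwritten only if the call returns normally
            try:
                result = fn(*args, **kwargs)
                status = "ok"
                return result
            finally:
                sink({
                    "name": name,
                    "status": status,
                    "duration_s": time.perf_counter() - start,
                })
        return wrapper
    return decorator


events: list[dict[str, Any]] = []


@traced(events.append, "decide")
def decide(score: float) -> str:
    return "approve" if score >= 0.5 else "escalate"
```

The service code only knows about the sink, so swapping platforms means changing one function rather than every agent.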
How the ecosystem actually shakes out
Gateway
FastAPI is perfect
Lightweight, boring, production-ready
No reason to invent anything else here
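Observability & HITL
Tools in this layer take different approaches to human-in-the-loop (HITL) review.
Some provide:
Native human-in-the-loop support
Explicit “pause → approve → resume” flows
Prompt + trace + feedback all in one place
Others provide:
Excellent tracing and evals
HITL that is more review/feedback-oriented than execution-blocking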
Agent orchestration
Two styles show up here as well.
One style is the strongest model for:
Approval nodes
Escalation paths
Deterministic control flow
It feels like a state machine (which enterprises like)
The other style is:
Cleaner for validation and structured outputs
Workable for HITL, but more custom
Best when agents are “smart functions,” not long-running workflows
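The "pause → approve → resume" flow and the state-machine feel described above can be sketched in a few lines. This is a hand-rolled illustration, not how any named framework implements it; `State`, `Run`, `submit`, and `resume` are hypothetical names.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    PROPOSED = "proposed"
    AWAITING_APPROVAL = "awaiting_approval"
    EXECUTED = "executed"
    REJECTED = "rejected"


@dataclass
class Run:
    """One agent action moving through an explicit approval node."""
    action: str
    state: State = State.PROPOSED
    history: list = field(default_factory=list)

    def _move(self, new_state: State) -> None:
        # Record every transition so the run can be audited later.
        self.history.append((self.state, new_state))
        self.state = new_state

    def submit(self) -> None:
        """Pause: hand the proposed action to a human reviewer."""
        if self.state is not State.PROPOSED:
            raise ValueError(f"cannot submit from {self.state.value}")
        self._move(State.AWAITING_APPROVAL)

    def resume(self, approved: bool) -> None:
        """Resume after review: execute on approval, otherwise reject."""
        if self.state is not State.AWAITING_APPROVAL:
            raise ValueError(f"cannot resume from {self.state.value}")
        self._move(State.EXECUTED if approved else State.REJECTED)
```

Every transition is explicit and invalid ones raise, which is what makes the control flow deterministic and easy to audit.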
The important architectural realization
Your intuition is dead-on:
“We shouldn’t own observability or agent governance if a platform already does it better.”
That’s the same evolution we saw with:
Logging → Splunk / Datadog
Tracing → OpenTelemetry
CI/CD → GitHub Actions / Jenkins SaaS
Agents are going the same way.
When a homegrown control plane does make sense
What you built does make sense when:
You need deep policy enforcement
You have regulated decision paths
You must own execution guarantees
You want agent behavior as code, not config
You’re enabling many teams, not just one
That’s “platform team” territory — not “product team” territory.
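To make "deep policy enforcement" and "agent behavior as code, not config" concrete, here is a minimal sketch in which each policy is an ordinary function checked before an action runs. The names (`Action`, `Policy`, `max_refund`, `enforce`) and the refund limit are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Action:
    kind: str
    amount: float


# A policy returns a reason string when it blocks an action, or None when it allows it.
Policy = Callable[[Action], Optional[str]]


def max_refund(limit: float) -> Policy:
    """Build a policy that blocks refunds above `limit`."""
    def check(action: Action) -> Optional[str]:
        if action.kind == "refund" and action.amount > limit:
            return f"refund exceeds limit of {limit}"
        return None
    return check


def enforce(action: Action, policies: list[Policy]) -> list[str]:
    """Run every policy and return the list of violations (empty means allowed)."""
    return [reason for p in policies if (reason := p(action)) is not None]
```

Because policies are plain code, they can be unit-tested, versioned, and reviewed like any other change, which is the kind of ownership a platform team takes on.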
Bottom line
Your reframing is correct:
Most companies should:
Write FastAPI-based agent services
Deploy them like normal microservices
Plug into Langfuse / Phoenix + LangGraph
Let those platforms handle observability, HITL, replay, and governance
Only advanced orgs should:
Build and own the agent control plane itself