
# Python Agent Control Plane – Clean Architecture Blueprint

By Mark Kendall, Feb 8


## Purpose

Design a durable, enterprise-grade agent platform that avoids past orchestration failures (BizTalk-style opacity) while enabling safe, scalable AI-driven workflows.


---


## Core Principles (Non-Negotiable)


1. **Agents are products**

- Versioned

- Tested

- Observable

- Owned


2. **LLMs reason, code executes**

- No direct side effects without deterministic code paths


3. **State is explicit**

- Every run has IDs, steps, inputs, outputs

- Replayable and inspectable


4. **Everything is diffable**

- Git is the source of truth

- UI is a view, not authority


5. **Failure is designed**

- Timeouts, retries, fallbacks, escalation paths
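Principle 2 can be made concrete with a small sketch. It assumes the model only ever emits a proposed action as plain data, and that a registry of deterministic handlers decides whether anything runs. The names `ProposedAction`, `HANDLERS`, `handler`, and `execute` are hypothetical and used only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class ProposedAction:
    """What the model may emit: a handler name plus arguments, nothing executable."""
    name: str
    args: Dict[str, Any]


# Hypothetical registry: only functions registered here can cause side effects.
HANDLERS: Dict[str, Callable[..., Any]] = {}


def handler(name: str):
    """Decorator that registers a deterministic code path under a name."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        HANDLERS[name] = fn
        return fn
    return register


@handler("add_numbers")
def add_numbers(a: int, b: int) -> int:
    return a + b


def execute(action: ProposedAction) -> Any:
    """Run a proposed action only if a registered handler exists for it."""
    fn = HANDLERS.get(action.name)
    if fn is None:
        raise PermissionError(f"no deterministic handler for {action.name!r}")
    return fn(**action.args)
```

An action the model invents, such as `ProposedAction("drop_database", {})`, is rejected before any code runs because no handler was registered for it.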


---


## High-Level Architecture


### Entry Layer

**Agent Gateway (FastAPI)**


```python
from fastapi import Depends, FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="Python Agent Control Plane")


# 1. DATA MODELS (Registry/Knowledge Management)
class WorkRequest(BaseModel):
    task_id: str
    intent: str
    payload: dict


# 2. POLICY ENGINE (Mocking the Policy & Permissions block)
def check_permissions(request: WorkRequest) -> bool:
    # In a real app, this queries the 'Policy & Permissions' block
    if "admin" in request.intent.lower():
        raise HTTPException(status_code=403, detail="Unauthorized Agent Action")
    return True


# 3. ORCHESTRATOR (The logic brain)
class Orchestrator:
    def route_to_agent(self, request: WorkRequest) -> dict:
        # Logic to pick an agent from the 'Registry'
        return {"status": "Executing", "agent": "Python_Agent_01", "task": request.intent}


orchestrator = Orchestrator()


# 4. AGENT GATEWAY (The Entry Point)
@app.post("/gateway/dispatch")
async def dispatch_work(request: WorkRequest, authorized: bool = Depends(check_permissions)):
    """
    This endpoint represents the 'Agent Gateway' in your diagram.
    It validates, checks policy, and hands off to the Orchestrator.
    """
    result = orchestrator.route_to_agent(request)
    return result
```

- Accepts work requests

- Issues `run_id`

- AuthN/AuthZ

- Rate limits and quotas
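For the rate-limit bullet, one common option is a token bucket. The sketch below is a minimal in-process version under stated assumptions: the class name `TokenBucket` and its parameters are illustrative, and a real gateway would likely keep the counters in a shared store rather than in memory. The clock is injectable so the behavior can be tested without sleeping.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request fits the budget."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

One bucket per caller or per agent gives a simple quota; the gateway would return HTTP 429 when `allow()` is false.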


### Control Plane

**Run Registry**

- Stores run metadata

- Tracks lifecycle state

- Links audit and observability data
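A minimal in-memory sketch of such a registry follows. The lifecycle states and the `RunRegistry` API are assumptions made for illustration; a production registry would persist runs in a database. The point is that every run gets an ID, every state change is recorded, and illegal transitions are rejected.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Dict, List, Set, Tuple


class RunState(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed lifecycle moves (hypothetical); anything else is rejected.
ALLOWED: Dict[RunState, Set[RunState]] = {
    RunState.PENDING: {RunState.RUNNING, RunState.FAILED},
    RunState.RUNNING: {RunState.COMPLETED, RunState.FAILED},
    RunState.COMPLETED: set(),
    RunState.FAILED: set(),
}


@dataclass
class Run:
    run_id: str
    intent: str
    state: RunState = RunState.PENDING
    # (UTC timestamp, state) pairs, oldest first, for audit and replay.
    history: List[Tuple[str, str]] = field(default_factory=list)


class RunRegistry:
    def __init__(self) -> None:
        self._runs: Dict[str, Run] = {}

    def create(self, intent: str) -> Run:
        run = Run(run_id=str(uuid.uuid4()), intent=intent)
        self._record(run)
        self._runs[run.run_id] = run
        return run

    def transition(self, run_id: str, new_state: RunState) -> Run:
        run = self._runs[run_id]
        if new_state not in ALLOWED[run.state]:
            raise ValueError(f"illegal transition {run.state.value} -> {new_state.value}")
        run.state = new_state
        self._record(run)
        return run

    def get(self, run_id: str) -> Run:
        return self._runs[run_id]

    @staticmethod
    def _record(run: Run) -> None:
        run.history.append((datetime.now(timezone.utc).isoformat(), run.state.value))
```

The gateway would call `create()` when it accepts a request and return the resulting `run_id` to the caller.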


**Policy & Permissions**

- Tool allowlists

- Environment boundaries

- Cost and token budgets

- Data classification rules
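The four policy dimensions above can be checked by one pure function. This is a sketch only: the `Policy` and `ToolRequest` fields and the numeric data-classification scale are assumptions, not a schema from the original design.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Policy:
    allowed_tools: FrozenSet[str]
    allowed_environments: FrozenSet[str]
    max_tokens: int
    max_cost_usd: float
    # Hypothetical scale: 0 = public, 1 = internal, 2 = confidential.
    max_data_class: int


@dataclass(frozen=True)
class ToolRequest:
    tool: str
    environment: str
    estimated_tokens: int
    estimated_cost_usd: float
    data_class: int


def evaluate(policy: Policy, req: ToolRequest) -> List[str]:
    """Return every violation found; an empty list means the request is allowed."""
    violations: List[str] = []
    if req.tool not in policy.allowed_tools:
        violations.append(f"tool {req.tool!r} is not on the allowlist")
    if req.environment not in policy.allowed_environments:
        violations.append(f"environment {req.environment!r} is out of bounds")
    if req.estimated_tokens > policy.max_tokens:
        violations.append("token budget exceeded")
    if req.estimated_cost_usd > policy.max_cost_usd:
        violations.append("cost budget exceeded")
    if req.data_class > policy.max_data_class:
        violations.append("data classification too sensitive")
    return violations
```

Returning all violations at once, rather than failing on the first, keeps the audit trail complete and makes denied requests easier to debug.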


---


## Orchestration Layer


**State Machine–Driven Orchestrator**

- Explicit step types:

  - Plan

  - ToolCall

  - Validate

  - Decide

  - HumanApproval

  - Retry / Fallback

  - Complete / Fail


- Deterministic transitions

- No hidden state


Recommended:

- LangGraph or custom FSM

- Celery / Arq / Kafka workers
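If the custom FSM route is taken, the core can be a plain transition table. The sketch below uses the step types listed above; the event names (`"planned"`, `"ok"`, `"error"`, and so on) and the specific edges are assumptions chosen for illustration, not a prescribed workflow.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Step(str, Enum):
    PLAN = "Plan"
    TOOL_CALL = "ToolCall"
    VALIDATE = "Validate"
    DECIDE = "Decide"
    HUMAN_APPROVAL = "HumanApproval"
    RETRY = "Retry"
    COMPLETE = "Complete"
    FAIL = "Fail"


# (current step, event) -> next step. The whole workflow is visible in this table.
TRANSITIONS: Dict[Tuple[Step, str], Step] = {
    (Step.PLAN, "planned"): Step.TOOL_CALL,
    (Step.TOOL_CALL, "ok"): Step.VALIDATE,
    (Step.TOOL_CALL, "error"): Step.RETRY,
    (Step.RETRY, "retry"): Step.TOOL_CALL,
    (Step.RETRY, "exhausted"): Step.FAIL,
    (Step.VALIDATE, "valid"): Step.DECIDE,
    (Step.VALIDATE, "invalid"): Step.RETRY,
    (Step.DECIDE, "needs_approval"): Step.HUMAN_APPROVAL,
    (Step.DECIDE, "done"): Step.COMPLETE,
    (Step.HUMAN_APPROVAL, "approved"): Step.COMPLETE,
    (Step.HUMAN_APPROVAL, "rejected"): Step.FAIL,
}

TERMINAL = {Step.COMPLETE, Step.FAIL}


def next_step(current: Step, event: str) -> Step:
    """Look up the next step; unknown (step, event) pairs are errors, never guesses."""
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(f"no transition from {current.value} on {event!r}") from None


def replay(events: List[str], start: Step = Step.PLAN) -> List[Step]:
    """Rebuild the path of a run from its recorded events."""
    path = [start]
    for event in events:
        if path[-1] in TERMINAL:
            raise ValueError(f"run already finished in {path[-1].value}")
        path.append(next_step(path[-1], event))
    return path
```

Because the table lives in code, a change to the workflow shows up as a diff in Git, and `replay` lets an engineer reconstruct any run from its event log.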


---


## Agent Execution


**Python Agents**

- Stateless by default

- Receive:

  - Context

  - Constraints

  - Tool contracts

- Return structured outputs only


Agents never:

- Self-assign permissions

- Bypass policy

- Persist hidden memory


---


## Tooling Layer


**Tool Adapters**

- Wrap all external systems

- Enforce:

  - Schemas

  - Idempotency

  - Timeouts

  - Retries with backoff

  - Circuit breakers


Tools must be independently testable without LLMs.
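As a sketch of what an adapter might look like, the class below wraps any callable with retries, exponential backoff, and a deliberately simplified circuit breaker. `ToolAdapter` and its parameters are hypothetical names. Schema validation, idempotency keys, and timeouts are omitted to keep it short, and this breaker stays open until `reset()` is called instead of half-opening on a timer.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when the adapter refuses to call a tool that keeps failing."""


class ToolAdapter:
    def __init__(
        self,
        call: Callable[..., Any],
        max_attempts: int = 3,
        base_delay: float = 0.1,
        failure_threshold: int = 5,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_attempts < 1:
            raise ValueError("max_attempts must be at least 1")
        self._call = call
        self._max_attempts = max_attempts
        self._base_delay = base_delay
        self._failure_threshold = failure_threshold
        self._sleep = sleep  # injectable so tests never really sleep
        self._consecutive_failures = 0

    def reset(self) -> None:
        """Manually close the circuit, e.g. after an operator confirms recovery."""
        self._consecutive_failures = 0

    def invoke(self, *args: Any, **kwargs: Any) -> Any:
        if self._consecutive_failures >= self._failure_threshold:
            raise CircuitOpenError("circuit open; refusing to call tool")
        last_error: Optional[Exception] = None
        for attempt in range(self._max_attempts):
            try:
                result = self._call(*args, **kwargs)
            except Exception as exc:
                last_error = exc
                self._consecutive_failures += 1
                if self._consecutive_failures >= self._failure_threshold:
                    break  # stop retrying once the breaker trips
                if attempt < self._max_attempts - 1:
                    # Exponential backoff: base, 2x base, 4x base, ...
                    self._sleep(self._base_delay * (2 ** attempt))
            else:
                self._consecutive_failures = 0
                return result
        assert last_error is not None
        raise last_error
```

Nothing here involves an LLM, so the adapter can be unit-tested with a fake callable and a fake `sleep`.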


---


## Memory & Knowledge


### Short-Term Memory

- Per-run state

- Stored in Postgres or Redis


### Long-Term Knowledge

- Explicit facts only

- Vector search is optional, not default

- No hallucinated persistence


---


## Observability & Governance


**Required Signals**

- Traces: every LLM + tool call

- Logs: structured JSON with `run_id`

- Metrics:

  - latency

  - failure rate

  - retries

  - cost


**Audit**

- Who initiated

- What changed

- When and why


OpenTelemetry from day one.
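The structured-log requirement can be met with the standard library alone. The sketch below covers only the logging signal and does not use OpenTelemetry; the field names other than `run_id`, and the helper `build_logger`, are assumptions for illustration.

```python
import io
import json
import logging
from typing import TextIO


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object carrying run_id and step."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Fields passed via `extra=` become attributes on the record.
            "run_id": getattr(record, "run_id", None),
            "step": getattr(record, "step", None),
        }
        return json.dumps(payload)


def build_logger(stream: TextIO, name: str = "agent.run") -> logging.Logger:
    """Create a logger that writes JSON lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    stream_handler = logging.StreamHandler(stream)
    stream_handler.setFormatter(JsonFormatter())
    logger.addHandler(stream_handler)
    logger.propagate = False  # avoid duplicate output through the root logger
    return logger


if __name__ == "__main__":
    buffer = io.StringIO()
    log = build_logger(buffer)
    log.info("tool call finished", extra={"run_id": "run-123", "step": "ToolCall"})
    print(buffer.getvalue(), end="")
```

With `run_id` on every line, logs, traces, and the Run Registry entry for a run can be joined on a single key.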


---


## Agent Contract


### Input Schema

- task

- context (structured)

- constraints (time, cost, scope)

- output_spec (JSON schema)


### Output Schema

- result

- evidence

- actions_taken

- warnings
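The two schemas above map naturally onto typed records. The sketch below uses dataclasses; the `Constraints` field names and the toy `echo_agent` are assumptions for illustration. `missing_required` is a small stand-in for real JSON Schema validation that checks only the top-level `required` keys of `output_spec`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class Constraints:
    max_seconds: float
    max_cost_usd: float
    scope: List[str]


@dataclass(frozen=True)
class AgentInput:
    task: str
    context: Dict[str, Any]
    constraints: Constraints
    output_spec: Dict[str, Any]  # JSON Schema describing `result`


@dataclass
class AgentOutput:
    result: Dict[str, Any]
    evidence: List[str] = field(default_factory=list)
    actions_taken: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)


def missing_required(output: AgentOutput, spec: Dict[str, Any]) -> List[str]:
    """Return the required result keys that the agent failed to produce."""
    return [key for key in spec.get("required", []) if key not in output.result]


def echo_agent(inp: AgentInput) -> AgentOutput:
    """Toy stateless agent: everything it knows arrives in `inp`."""
    return AgentOutput(
        result={"summary": inp.task.upper()},
        evidence=[f"context keys: {sorted(inp.context)}"],
    )
```

Because the agent is a function from `AgentInput` to `AgentOutput`, the orchestrator can validate the output against `output_spec` before any downstream step sees it.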


---


## Anti-BizTalk Guardrails


- No visual-only logic

- No implicit state

- No “platform magic”

- Manual fallback path always exists

- Engineers can debug with logs + code alone


---


## Recommended Initial Build Order


1. Agent Gateway + Run Registry

2. Core Orchestrator FSM

3. Three Production Tool Adapters

4. Full Observability

5. Offline Evaluation Harness


---


## Final Reminder


Agents don’t remove engineering.

They **concentrate it**.


Judgment, ownership, and clarity remain the system’s load-bearing walls.

 
 
 
