
Beyond Chatbots: Hard-Won Realities from OpenAI's Agent Guide


Last week, I waded through OpenAI's 80-page guide on building agents. It's incredibly bloated, but there's a lot of useful advice hidden inside if you know where to look.


The Core Mental Model: A While-Loop with Judgment

The biggest point of confusion is defining what an agent actually is. Most people build a slightly complex chatbot or an LLM-based classifier and call it an agent. It is not one.

An agent is fundamentally a while-loop equipped with judgment. Unlike a standard single-turn LLM call or a hardcoded workflow, an agent dynamically directs its own control flow. It assesses the current state, selects a tool, evaluates the outcome, self-corrects if things go wrong, and decides when the task is officially done.

| System Type    | Control of Flow           | Tool Interaction  | State Management          |
|----------------|---------------------------|-------------------|---------------------------|
| Classifier     | Static (single-turn)      | None              | None (stateless)          |
| Chatbot        | Conversational (user-led) | Optional (ad-hoc) | Linear history            |
| Workflow (DAG) | Deterministic (code-led)  | Hardcoded steps   | Structured transition     |
| Agent          | Dynamic (model-led)       | Dynamic / Loop    | Dynamic memory & feedback |

If you do not need the system to dynamically select its next action based on unpredictable inputs, do not build an agent. Use standard deterministic code or a structured Directed Acyclic Graph (DAG) instead. Non-determinism introduces a massive testing and maintenance tax, so pay it only when you genuinely need judgment.


When to Build (and When to Run)

In practice, we have found that agents shine in environments that display three distinct characteristics:

  1. Brittle rule systems: Your traditional if-else ruleset has become a massive, unmaintainable spiderweb of edge cases.
  2. Messy inputs: The incoming data is highly unstructured (natural language emails, audio transcripts, or raw documents).
  3. Nuanced exceptions: The decision-making requires context-aware trade-offs that cannot be cleanly encoded as rules.

If you are processing structured CSVs with predefined business rules, keep it boring. Stick to Python. If you are handling complex customer billing exceptions involving three external systems and messy support threads, build an agent.


The Holy Trinity: Model, Tools, and Instructions

An operational agent relies on three balanced components. If any single one is weak, the entire agent behaves erratically.

1. Models: Pick Capability First, Then Optimise

The trap most engineers fall into is trying to optimise for latency and cost on day one. They spend weeks trying to cram complex reasoning routines into cheap, fast models and get frustrated by tool-selection failures.

Do this instead:

  1. Prototype the entire workflow using the absolute smartest model available to establish an accuracy baseline.
  2. Deconstruct the run logs to isolate which specific turns required deep logic versus simple parsing.
  3. Swap in smaller, specialised models only for the simple tasks (like routing or data extraction) while keeping the smart model in charge of high-stakes execution.
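
Step 3 can be sketched as a small routing function. This is only an illustration: the model identifiers, the `Step` type, and the set of "simple" step kinds are placeholders that would come from your own run logs, not from the guide.

```python
from dataclasses import dataclass

# Placeholder model identifiers; substitute whatever your provider offers.
SMART_MODEL = "large-reasoning-model"
CHEAP_MODEL = "small-fast-model"

# Step kinds the run logs showed to be simple enough to downgrade (assumed).
SIMPLE_STEP_KINDS = {"routing", "data_extraction"}


@dataclass(frozen=True)
class Step:
    kind: str          # e.g. "routing", "data_extraction", "refund_decision"
    high_stakes: bool  # True when a mistake has real-world cost


def pick_model(step: Step) -> str:
    """Default to the smart model; downgrade only known-simple, low-stakes steps."""
    if step.kind in SIMPLE_STEP_KINDS and not step.high_stakes:
        return CHEAP_MODEL
    return SMART_MODEL
```

The default is the capable model, so a step kind nobody has classified yet never silently lands on the cheap one.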

2. Tools: Production-Grade Contracts

Tools are your agent's hands. If the interfaces are poorly designed, your agent will constantly make mistakes. In production systems, we treat tools exactly like external API integrations:

  • Strict Typing: Use clear schemas (Pydantic, JSON Schema) rather than open text fields. Explicit types prevent the model from guessing parameters.
  • Idempotency Keys: Ensure any action tool (e.g., initiating refunds, sending emails) supports idempotency. Since agents retry on errors, a lack of idempotency keys will eventually result in double-charges or duplicate tickets.
  • Explicit Side Effects: Document exactly what the tool does in its tool description. The agent needs to know that cancel_order triggers a real email to the customer.

3. Instructions: Standard Operating Procedures (SOPs)

Writing agent instructions is not about prompt engineering; it is about writing clear Standard Operating Procedures. If your prompt includes vague lines like "Handle billing issues politely and escalate if needed," the agent will fail.

Great instructions look like military manuals:

Routine: Customer Billing Dispute

Steps:
1. Verify identity by calling check_user_status(user_id).
2. Fetch the billing ledger using get_billing_history(user_id).
3. If the disputed charge is older than 60 days:
   - Explain the policy constraint clearly.
   - Offer a 10% goodwill discount, then stop.
4. Otherwise, if the disputed amount is $50 or less, call initiate_refund(dispute_id, amount).
5. Otherwise, call request_human_escalation(context).

Explicit steps, zero ambiguity, and defined branches for edge cases. If you cannot explain the workflow cleanly to a junior engineer, you cannot explain it to an agent.
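
One way to check that a routine really has no ambiguity is to write its branches down as a reference function and grade agent runs against it. The sketch below covers only the decision branches of the billing routine; the `Action` enum and `expected_action` are hypothetical names, and treating a dispute of exactly $50 as auto-refundable is an assumption.

```python
from enum import Enum


class Action(Enum):
    EXPLAIN_POLICY_AND_OFFER_DISCOUNT = "explain_policy_and_offer_discount"
    INITIATE_REFUND = "initiate_refund"
    ESCALATE_TO_HUMAN = "request_human_escalation"


def expected_action(charge_age_days: int, amount_usd: float) -> Action:
    """Reference decision for the billing-dispute routine, used to grade agent runs."""
    if charge_age_days > 60:
        return Action.EXPLAIN_POLICY_AND_OFFER_DISCOUNT
    if amount_usd <= 50:  # assumption: exactly $50 is still auto-refundable
        return Action.INITIATE_REFUND
    return Action.ESCALATE_TO_HUMAN
```

If a branch is hard to express here, it is probably underspecified in the instructions too.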


Orchestration: Start Simple, Stay Simple

The orchestrator is the control loop that runs the agent. Keep this logic in code, not in prompts. A clean, minimal control loop looks like this:

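# A minimal, robust single-agent loop.
# `model`, `tools`, `is_high_risk`, and `request_human_approval` are assumed to exist elsewhere.
import copy

def run_agent_loop(initial_state, max_turns=10):
    # Deep-copy so appended messages never leak back into the caller's state
    state = copy.deepcopy(initial_state)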
    
    for turn in range(max_turns):
        # 1. Ask model for the next step based on state
        step = model.decide_next_step(state)
        
        # 2. Check exit conditions
        if step.type == "final_response":
            return step.content
            
        if step.type == "tool_call":
            # 3. Gate high-risk actions behind human approval.
            #    Keep the approval separate so `step` still holds the tool call.
            if is_high_risk(step.tool_name):
                approval = request_human_approval(step)
                if not approval.approved:
                    return f"Action blocked: {approval.rejection_reason}"
            
            # 4. Append assistant's intent to call (required by API contracts)
            state.append_message(
                role="assistant", 
                tool_calls=[{
                    "id": step.tool_call_id, 
                    "type": "function", 
                    "function": {"name": step.tool_name, "arguments": step.arguments}
                }]
            )
            
            # 5. Execute tool and append the corresponding tool output
            result = tools.execute(step.tool_name, step.arguments)
            state.append_message(
                role="tool", 
                tool_call_id=step.tool_call_id, 
                content=result
            )
            
    return "Reached maximum turn limit without resolving."

The Multi-Agent Trap

Do not split your system into multiple coordinating agents just because it sounds advanced. Multi-agent systems introduce handoff errors and prompt duplication, and they make tracing and debugging a nightmare.

Stay single-agent until the single agent is bursting at the seams. The only valid triggers to split into a multi-agent architecture are:

  1. Tool Selection Overload: You have 20+ tools, and the model is starting to confuse similar-looking tools.
  2. Conflicting Instructions: The SOP has so many complex conditional branches that the system prompt starts diluting the model's focus.

When you do split, use either the Manager Pattern (one orchestrating agent delegates tasks to sub-agents via tool calls) or the Triage/Handoff Pattern (a simple router agent assesses the user's intent and hands the entire state over to a dedicated specialist agent).
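
A minimal sketch of the Triage/Handoff Pattern follows. The specialist functions and the keyword-based `classify_intent` are stand-ins for real model calls and are assumptions for illustration only; the point is that the router picks a specialist and hands over the whole state.

```python
from typing import Callable

State = dict  # shared conversation state, handed over whole


def billing_agent(state: State) -> str:
    return f"billing specialist handling: {state['user_message']}"


def tech_support_agent(state: State) -> str:
    return f"tech support specialist handling: {state['user_message']}"


SPECIALISTS: dict[str, Callable[[State], str]] = {
    "billing": billing_agent,
    "tech_support": tech_support_agent,
}


def classify_intent(state: State) -> str:
    """Stand-in for a router model call; keyword matching only for illustration."""
    text = state["user_message"].lower()
    if any(word in text for word in ("refund", "invoice", "charge")):
        return "billing"
    return "tech_support"


def triage(state: State) -> str:
    intent = classify_intent(state)
    state["handled_by"] = intent  # record the handoff for tracing
    return SPECIALISTS[intent](state)
```

Recording `handled_by` in the state keeps the handoff visible in traces, which is where multi-agent debugging usually gets hard.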


Guardrails: Layered Defense is Non-Negotiable

Guardrails are the programmatic tripwires that keep your agent from causing damage. A single guardrail layer is never enough; you must build defense-in-depth.

| Layer            | What It Protects Against                       | Implementation Mechanism                        |
|------------------|------------------------------------------------|-------------------------------------------------|
| Input Filter     | Jailbreaks, prompt injection, off-topic spam   | Deterministic regex, fast classification models |
| Risk Classifier  | Uncontrolled side effects, high-cost actions   | Tool risk rating system (Low / Medium / High)   |
| PII Masker       | Leaking customer data or secrets               | Named Entity Recognition (NER), regex scrubbers |
| Output Validator | Hallucinations, brand violations, policy slips | Self-correction checks, strict JSON parsing     |
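
The deterministic parts of the first three layers can be sketched with the standard library alone. The patterns and the `TOOL_RISK` table below are deliberately simplistic assumptions; a production filter would pair them with classifier models and a proper NER step.

```python
import re

# Illustrative patterns only; real filters need far broader coverage.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
]
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")  # 13-16 digits, optional separators

# Hypothetical risk ratings per tool.
TOOL_RISK = {"get_billing_history": "low", "initiate_refund": "high"}


def input_allowed(text: str) -> bool:
    """Input filter: reject text matching a known injection pattern."""
    return not any(p.search(text) for p in INJECTION_PATTERNS)


def mask_pii(text: str) -> str:
    """PII masker: scrub emails and card-like digit runs before logging or replying."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return CARD_RE.sub("[CARD]", text)


def needs_approval(tool_name: str) -> bool:
    """Risk classifier: unknown tools default to high risk."""
    return TOOL_RISK.get(tool_name, "high") == "high"
```

Defaulting unrated tools to high risk means a newly added tool cannot run unattended until someone has rated it.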

The Tripwire Pattern (Optimistic Parallel Execution)

To prevent guardrails from ruining your latency, use the Tripwire Pattern. Instead of running every check sequentially before the agent starts, run the primary agent and the guardrails in parallel. If a guardrail detects a violation (e.g., the agent generated an output containing a live access token or violated a safety policy), it raises a "tripwire" exception, cancels the in-flight agent run, and returns a safe fallback response or hands off to a human.

flowchart TD
    U["User input"] --> A["Agent Core"]
    U --> G["Guardrail Classifiers (Parallel)"]
    A --> T{"Tripwire triggered?"}
    G --> T
    T -- "Yes" --> H["Kill execution & trigger handoff"]
    T -- "No" --> C{"Need tool?"}
    C -- "Yes" --> RISK{"Is tool high-risk?"}
    RISK -- "Yes" --> HR["Pause for human approval"]
    RISK -- "No" --> TC["Execute tool"]
    TC --> A
    HR --> TC
    C -- "No" --> O["Return safe response to user"]
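
The pattern can be sketched with `asyncio`. Here `run_agent` and `check_guardrails` are stand-ins with artificial delays, and `TripwireTriggered` is a hypothetical exception name; only the concurrency structure is the point.

```python
import asyncio


class TripwireTriggered(Exception):
    """Raised when a guardrail flags the request."""


async def run_agent(user_input: str) -> str:
    # Stand-in for the slow agent loop (assumption).
    await asyncio.sleep(0.2)
    return f"answer to: {user_input}"


async def check_guardrails(user_input: str) -> None:
    # Stand-in for a fast classifier (assumption); raises on a violation.
    await asyncio.sleep(0.01)
    if "ignore previous instructions" in user_input.lower():
        raise TripwireTriggered("possible prompt injection")


async def guarded_run(user_input: str) -> str:
    # Start both at once so the guardrail adds no latency on the happy path.
    agent_task = asyncio.create_task(run_agent(user_input))
    guard_task = asyncio.create_task(check_guardrails(user_input))
    try:
        # Wait for the guardrail verdict; the agent keeps working meanwhile.
        await guard_task
    except TripwireTriggered as exc:
        agent_task.cancel()  # stop the in-flight agent work
        try:
            await agent_task
        except asyncio.CancelledError:
            pass
        return f"Request blocked: {exc}"
    # Guardrail passed, so the agent's answer may be released.
    return await agent_task
```

Because the agent's result is only returned after the guardrail task completes, a slow guardrail delays the response but never lets an unchecked answer through.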

Human-in-the-Loop is a Feature

Designing for human intervention is not a failure mode; it is a core operational requirement. Your agent should programmatically pause and yield execution to a human operator when:

  • The model hits its maximum turn limit (a sign it may be stuck in a loop).
  • The model requests a high-risk tool (like sending wire payments or deleting tables).
  • The self-correction logic fails to resolve a tool error after two retries.

When this happens, write a structured handoff payload (the user's original goal, what tools were called, and the exact error encountered) so the human operator can take over without losing context.
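
A handoff payload along those lines might look like the sketch below. The field names and the `reason` values are assumptions, not a fixed schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class ToolCallRecord:
    name: str
    arguments: dict
    outcome: str  # "ok" or a short error summary


@dataclass
class HandoffPayload:
    user_goal: str
    reason: str  # e.g. "max_turns", "high_risk_tool", "tool_error"
    last_error: Optional[str] = None
    tool_calls: list[ToolCallRecord] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialise for a ticketing system or operator console."""
        return json.dumps(asdict(self), indent=2)
```

A usage example: `HandoffPayload(user_goal="Refund duplicate charge", reason="tool_error", last_error="timeout").to_json()` produces a JSON document the operator can read without replaying the conversation.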


The Blueprint Pipeline

If you are building an agent from scratch, follow this exact sequence to avoid getting stuck in prototype hell:

  1. Define the Boundary: Write down exactly what "done" means. Identify 10-20 edge-case test queries.
  2. Build the Tools: Create explicit, structured API wrappers with robust error returns.
  3. Draft the SOP: Write down the step-by-step routine. Run it through a highly capable model.
  4. Establish the Loop: Build a clean, single-agent run loop in Python with strict max-turn limits.
  5. Layer the Guardrails: Add input filters and tool-risk checks.
  6. Implement Human Handoffs: Create the deterministic pause states for high-risk actions.

Once this pipeline is solid, you have a production-grade agent. Only then should you look into advanced optimisations like multi-agent handoffs or fine-tuning.

Building reliable agents is not about magic prompts. It is about treating LLMs as volatile reasoning engines that must be constrained by strict software engineering practices, stable API boundaries, and layered safety tripwires.

