Build Your First AI Agent from Scratch - Part 4: Tooling and API Integrations

Our agent now remembers context, but it still only talks. In the Lumenly pilot, that meant a human had to execute every suggestion manually: "Run this SQL," "Send this Slack message," etc. Part 4 closes the loop by connecting the agent to external systems safely.
Thesis: structured tool definitions + guarded execution paths turn language into action. If you rely on free-form instructions ("call the billing API somehow"), you invite bugs and compliance nightmares. We'll design a tool registry, planner middleware, and execution harness that we can trust in production.
Architectural Overview
We'll introduce three components:
- Tool Registry: JSON/YAML definitions describing function signatures, auth, and cost.
- Planner Middleware: intercepts LLM responses, extracts tool intents, validates against the registry.
- Executor: performs the API call or local action, logs metrics, and feeds the result back into the conversation.
Sequence diagram:
User -> Agent -> LLM Planner -> Tool Registry -> Executor -> External API
          ^                                          |
          +<---------------- memory -----------------+
Takeaway: Never let the model call APIs directly; always route through structured registries.
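To make the planner middleware's job concrete, here is a minimal sketch of the validation gate that sits between a proposed tool call and execution. The PlannerMiddleware class and the shape of the proposed call are illustrative assumptions; the registry and executor interfaces match the ones we build below.

# Illustrative sketch only: reject anything the registry does not know about
# before it can reach the executor.
class PlannerMiddleware:
    def __init__(self, registry, executor):
        self.registry = registry
        self.executor = executor

    def handle(self, proposed_call: dict) -> str:
        known = {tool.name for tool in self.registry.list()}
        if proposed_call["name"] not in known:
            raise ValueError(f"Unknown tool: {proposed_call['name']}")
        # Only validated calls are dispatched; the executor applies its own checks too.
        return self.executor.execute(proposed_call)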
Define Tool Schemas
Create tools/registry.yaml:
- name: search_docs
  description: "Search the internal documentation index"
  request:
    method: GET
    url: "https://docs.lumenly.ai/search"
    query:
      q: string
  response:
    format: json
  auth: bearer
  cost_estimate:
    tokens: 0
    latency_ms: 400

- name: create_support_ticket
  description: "Create a Zendesk ticket for follow-up"
  request:
    method: POST
    url: "https://api.zendesk.com/v2/tickets"
    body:
      subject: string
      description: string
      priority: ["low", "normal", "high"]
  auth: oauth
  cost_estimate:
    tokens: 0
    latency_ms: 1200
Parse it:
# src/agent_lab/tools/registry.py
import yaml
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Tool:
    name: str
    description: str
    spec: dict


class ToolRegistry:
    def __init__(self, path: Path):
        raw = yaml.safe_load(path.read_text())
        # Index every registry entry by name; the full YAML entry is kept as the spec.
        self.tools = {
            item["name"]: Tool(item["name"], item["description"], item)
            for item in raw
        }

    def list(self):
        return list(self.tools.values())

    def get(self, name: str) -> Tool:
        return self.tools[name]
Takeaway: Schemas live outside code so ops can review them like any other API contract.
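As a quick sanity check, you can load the registry and print what the agent will be allowed to call. The registry path matches the file created above.

from pathlib import Path

from agent_lab.tools.registry import ToolRegistry

registry = ToolRegistry(Path("tools/registry.yaml"))
for tool in registry.list():
    print(f"{tool.name}: {tool.description}")
# search_docs: Search the internal documentation index
# create_support_ticket: Create a Zendesk ticket for follow-up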
Teach the Planner About Tools
Modify LLMClient.complete to include tool metadata. Many providers (OpenAI, Anthropic) support function-calling natively. Example using OpenAI tool specs:
def complete(self, messages: list[Message], tools: list[Tool]) -> dict:
    tool_defs = [
        {
            "type": "function",
            "function": {
                "name": tool.name,
                "description": tool.description,
                # GET-style tools have no body section, so fall back to an empty schema.
                "parameters": tool.spec["request"].get("body") or {},
            },
        }
        for tool in tools
    ]
    return self._client.responses.create(
        model=self._model,
        input=[m.__dict__ for m in messages],
        tools=tool_defs,
    )
The response may include tool_calls. Parse them:
result = self.llm.complete(history, registry.list())
tool_calls = getattr(result.output[0].content[0], "tool_calls", [])
Takeaway: Let the model propose actions, but only execute them after validation.
Validate and Execute Tool Calls
Create executor.py:
import json
import os

import httpx

from agent_lab.tools.registry import Tool, ToolRegistry
from agent_lab.telemetry import emit_metric, logger


class ToolExecutor:
    def __init__(self, registry: ToolRegistry):
        self.registry = registry

    def execute(self, call: dict) -> str:
        tool = self.registry.get(call["name"])
        args = json.loads(call["arguments"])
        self._validate(tool, args)
        response = self._dispatch(tool, args)
        emit_metric("tool.call", name=tool.name, status=response.status_code)
        return response.text

    def _validate(self, tool: Tool, args: dict):
        required = tool.spec["request"].get("body", {})
        for field in required:
            if field not in args:
                raise ValueError(f"Missing {field} for {tool.name}")

    def _dispatch(self, tool: Tool, args: dict):
        request = tool.spec["request"]
        method = request["method"].lower()
        url = request["url"]
        headers = {}
        if tool.spec.get("auth") not in (None, "none"):
            headers["Authorization"] = f"Bearer {self._token(tool)}"
        # GET tools send arguments as query params; everything else as a JSON body.
        if method == "get":
            return httpx.request(method, url, timeout=10, headers=headers, params=args)
        return httpx.request(method, url, timeout=10, headers=headers, json=args)

    def _token(self, tool: Tool) -> str:
        # Placeholder: resolve the tool's credential from the environment;
        # swap in your secret manager of choice here.
        return os.environ.get(f"{tool.name.upper()}_TOKEN", "")
Wrap in try/except and surface failures back to the agent as system messages ("Tool call failed: ..."). Later we can add retries or circuit breakers per tool.
Takeaway: Executors enforce policy: rate limiting, auth scopes, logging.
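As a starting point for retries, the sketch below wraps execute and retries only on transport-level failures. The attempt count and backoff values are arbitrary placeholders.

import time

import httpx

def execute_with_retry(executor, call: dict, attempts: int = 3) -> str:
    # Retry transport errors (timeouts, connection resets); let validation
    # errors and HTTP error responses surface immediately.
    for attempt in range(1, attempts + 1):
        try:
            return executor.execute(call)
        except httpx.TransportError:
            if attempt == attempts:
                raise
            time.sleep(0.5 * attempt)  # crude linear backoff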
Close the Loop in CoreAgent
Update send:
result = self.llm.complete(history, self.registry.list())
tool_calls = getattr(result.output[0].content[0], "tool_calls", [])
if tool_calls:
    for call in tool_calls:
        logger.info("tool requested", extra={"tool": call.name})
        try:
            outcome = self.executor.execute(call.to_dict())
            self.store.append(Message("system", f"Tool {call.name} result: {outcome}"))
        except Exception as exc:
            self.store.append(Message("system", f"Tool {call.name} failed: {exc}"))
else:
    reply = Message("assistant", result.output_text)
    self.store.append(reply)
    return reply
This feedback loop (tool result -> new system message -> next LLM call) lets the agent chain actions.
Takeaway: Treat tool responses as first-class context; the LLM should reason about outcomes explicitly.
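In practice that means send should loop: append the tool results, call the LLM again with the updated history, and only return once the model produces a plain assistant reply. Here is a hedged sketch of that loop with a hard iteration cap; the cap of 5 and the store.history() accessor are assumptions carried over from the memory layer in Part 3.

def send(self, user_text: str) -> Message:
    self.store.append(Message("user", user_text))
    for _ in range(5):  # hard cap so a misbehaving model cannot chain tools forever
        history = self.store.history()  # assumed accessor from the Part 3 memory store
        result = self.llm.complete(history, self.registry.list())
        tool_calls = getattr(result.output[0].content[0], "tool_calls", [])
        if not tool_calls:
            reply = Message("assistant", result.output_text)
            self.store.append(reply)
            return reply
        for call in tool_calls:
            try:
                outcome = self.executor.execute(call.to_dict())
                self.store.append(Message("system", f"Tool {call.name} result: {outcome}"))
            except Exception as exc:
                self.store.append(Message("system", f"Tool {call.name} failed: {exc}"))
    return Message("assistant", "I couldn't complete that request within the tool budget.")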
Example: Weather Lookup Tool
Add a simple tool to illustrate:
- name: weather_lookup
  description: "Get current weather for a city"
  request:
    method: GET
    url: "https://api.weatherapi.com/v1/current.json"
    query:
      key: env:WEATHER_API_KEY
      q: string
  response:
    format: json
  auth: none
Implement _dispatch to substitute env:... tokens automatically. This pattern keeps secrets out of prompt text.
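One way to handle that substitution is a small helper that resolves any env:NAME value from the process environment just before the request is built. This is a sketch, not part of the executor above:

import os

def resolve_env_tokens(params: dict) -> dict:
    # Replace values like "env:WEATHER_API_KEY" with the actual secret at call time,
    # so the key never appears in prompts, logs, or the model's context.
    resolved = {}
    for key, value in params.items():
        if isinstance(value, str) and value.startswith("env:"):
            resolved[key] = os.environ[value.removeprefix("env:")]
        else:
            resolved[key] = value
    return resolved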
CLI demo:
You: Should I bring an umbrella to Seattle tonight?
Agent: Checking the latest weather data for Seattle...
Agent: Forecast retrieved: light rain, 45°F. Carrying an umbrella is recommended.
Takeaway: Use real APIs sparingly during development; add mock servers or VCR-style fixtures for tests.
Testing Strategy
- Schema validation: run python scripts/validate_tools.py to ensure every registry entry has required fields (a sketch of this script follows the list below).
- Mocked executor tests: use pytest + respx to simulate HTTP calls:
import httpx, json, respx
from pathlib import Path
from agent_lab.tools.executor import ToolExecutor  # adjust to wherever executor.py lives
from agent_lab.tools.registry import ToolRegistry

@respx.mock
def test_tool_execution_success():
    respx.get("https://docs.lumenly.ai/search").mock(return_value=httpx.Response(200, json={"hits": []}))
    registry = ToolRegistry(Path("tools/registry.yaml"))
    executor = ToolExecutor(registry)
    call = {"name": "search_docs", "arguments": json.dumps({"q": "billing"})}
    out = executor.execute(call)
    assert "hits" in out
- End-to-end dry runs: set TOOL_MODE=mock in .env to bypass real APIs and return canned responses. Your CLI should expose a flag (--mock-tools) for testers.
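A minimal version of scripts/validate_tools.py might look like the sketch below; the set of required fields is an assumption based on the schemas defined earlier.

# scripts/validate_tools.py (sketch)
import sys
from pathlib import Path

import yaml

REQUIRED_FIELDS = {"name", "description", "request", "auth", "cost_estimate"}

def main() -> int:
    entries = yaml.safe_load(Path("tools/registry.yaml").read_text())
    errors = []
    for entry in entries:
        missing = REQUIRED_FIELDS - entry.keys()
        if missing:
            errors.append(f"{entry.get('name', '<unnamed>')}: missing {sorted(missing)}")
    for error in errors:
        print(error, file=sys.stderr)
    return 1 if errors else 0

if __name__ == "__main__":
    sys.exit(main())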
Takeaway: Tooling without tests is a production outage waiting to happen.
Operational Safeguards
- Rate Limiting: track tool call frequency and throttle within the executor (a sketch of a simple limiter and audit record follows this list).
- Audit Logs: log tool_name, arguments_hash, status, and latency_ms for every call.
- Permissions: map tools to roles; some workflows may restrict which agent persona can call which tool.
- Rollbacks: for mutating actions (e.g., create_support_ticket), emit reversible events (store response IDs so Part 5 can roll them back).
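For the first two items, here is a sketch of a per-tool rate limiter and an audit record builder. The one-minute window and 30-call ceiling are placeholder values, and the field names follow the list above.

import hashlib
import json
import time
from collections import defaultdict

class RateLimiter:
    def __init__(self, max_calls: int = 30, window_s: float = 60.0):
        self.max_calls = max_calls
        self.window_s = window_s
        self.calls = defaultdict(list)  # tool name -> recent call timestamps

    def check(self, tool_name: str) -> None:
        now = time.monotonic()
        recent = [t for t in self.calls[tool_name] if now - t < self.window_s]
        if len(recent) >= self.max_calls:
            raise RuntimeError(f"Rate limit exceeded for {tool_name}")
        recent.append(now)
        self.calls[tool_name] = recent

def audit_record(tool_name: str, arguments: dict, status: int, latency_ms: float) -> dict:
    # Hash the arguments so the audit trail never stores raw payloads or secrets.
    args_hash = hashlib.sha256(json.dumps(arguments, sort_keys=True).encode()).hexdigest()
    return {
        "tool_name": tool_name,
        "arguments_hash": args_hash,
        "status": status,
        "latency_ms": latency_ms,
    }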
Document these in docs/tooling.md. SRE and security teams should review the registry in the same way they review Terraform changes.
What You Can Do Now
Your agent can:
- Inspect the tool catalog.
- Ask the model to select tools.
- Validate and execute API calls.
- Feed tool responses back into context.
- Log every action for audits.
Run a quick test:
python -m agent_lab.cli chat --mock-tools
python -m agent_lab.cli tools list
python -m agent_lab.cli tools describe weather_lookup
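If your CLI from the earlier parts doesn't have a tools subcommand yet, here is a minimal argparse-based sketch; the command layout is an assumption, so adapt it to whatever CLI framework you used in Parts 1 through 3.

# Hypothetical sketch of the `tools list` / `tools describe` subcommands.
import argparse
from pathlib import Path

from agent_lab.tools.registry import ToolRegistry

def main() -> None:
    parser = argparse.ArgumentParser(prog="agent_lab.cli")
    sub = parser.add_subparsers(dest="command", required=True)
    tools = sub.add_parser("tools").add_subparsers(dest="action", required=True)
    tools.add_parser("list")
    describe = tools.add_parser("describe")
    describe.add_argument("name")

    args = parser.parse_args()
    registry = ToolRegistry(Path("tools/registry.yaml"))
    if args.action == "list":
        for tool in registry.list():
            print(f"{tool.name}: {tool.description}")
    elif args.action == "describe":
        print(registry.get(args.name).spec)

if __name__ == "__main__":
    main()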
If everything looks good, move to Part 5: Testing, Debugging, and Deployment. We'll wire simulations, CI pipelines, and deployment playbooks so these tools run safely in staging and prod.