Build Your First AI Agent from Scratch - Part 4: Tooling and API Integrations

Our agent now remembers context, but it still only talks. In the Lumenly pilot, that meant a human had to execute every suggestion manually: "Run this SQL," "Send this Slack message," etc. Part 4 closes the loop by connecting the agent to external systems safely.
Thesis: structured tool definitions + guarded execution paths turn language into action. If you rely on free-form instructions ("call the billing API somehow"), you invite bugs and compliance nightmares. We'll design a tool registry, planner middleware, and execution harness that we can trust in production.
Architectural Overview
We'll introduce three components:
- Tool Registry: JSON/YAML definitions describing function signatures, auth, and cost.
- Planner Middleware: intercepts LLM responses, extracts tool intents, validates against the registry.
- Executor: performs the API call or local action, logs metrics, and feeds the result back into the conversation.
Sequence diagram:
User -> Agent -> LLM Planner -> Tool Registry -> Executor -> External API
          ^                                          |
          +<---------------- memory -----------------+
Takeaway: Never let the model call APIs directly; always route through structured registries.
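To make the planner middleware's job concrete, here is a minimal sketch of the validation gate that sits between a proposed tool call and execution. The PlannerMiddleware class and the shape of the proposed call are illustrative assumptions; the registry and executor interfaces match the ones we build below.

# Illustrative sketch only: reject anything the registry does not know about
# before it can reach the executor.
class PlannerMiddleware:
    def __init__(self, registry, executor):
        self.registry = registry
        self.executor = executor

    def handle(self, proposed_call: dict) -> str:
        known = {tool.name for tool in self.registry.list()}
        if proposed_call["name"] not in known:
            raise ValueError(f"Unknown tool: {proposed_call['name']}")
        # Only validated calls are dispatched; the executor applies its own checks too.
        return self.executor.execute(proposed_call)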
Define Tool Schemas
Create tools/registry.yaml:
- name: search_docs
  description: "Search the internal documentation index"
  request:
    method: GET
    url: "https://docs.lumenly.ai/search"
    query:
      q: string
  response:
    format: json
  auth: bearer
  cost_estimate:
    tokens: 0
    latency_ms: 400

- name: create_support_ticket
  description: "Create a Zendesk ticket for follow-up"
  request:
    method: POST
    url: "https://api.zendesk.com/v2/tickets"
    body:
      subject: string
      description: string
      priority: ["low", "normal", "high"]
  auth: oauth
  cost_estimate:
    tokens: 0
    latency_ms: 1200
Parse it:
# src/agent_lab/tools/registry.py
import yaml
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Tool:
    name: str
    description: str
    spec: dict


class ToolRegistry:
    def __init__(self, path: Path):
        raw = yaml.safe_load(path.read_text())
        # Index every registry entry by name; the full YAML entry is kept as the spec.
        self.tools = {
            item["name"]: Tool(item["name"], item["description"], item)
            for item in raw
        }

    def list(self):
        return list(self.tools.values())

    def get(self, name: str) -> Tool:
        return self.tools[name]
Takeaway: Schemas live outside code so ops can review them like any other API contract.
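As a quick sanity check, you can load the registry and print what the agent will be allowed to call. The registry path matches the file created above.

from pathlib import Path

from agent_lab.tools.registry import ToolRegistry

registry = ToolRegistry(Path("tools/registry.yaml"))
for tool in registry.list():
    print(f"{tool.name}: {tool.description}")
# search_docs: Search the internal documentation index
# create_support_ticket: Create a Zendesk ticket for follow-up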
Teach the Planner About Tools
Modify LLMClient.complete to include tool metadata. Many providers (OpenAI, Anthropic) support function-calling natively. Example using OpenAI tool specs:
def complete(self, messages: list[Message], tools: list[Tool]) -> dict:
    tool_defs = [
        {
            "type": "function",
            "function": {
                "name": tool.name,
                "description": tool.description,
                # GET-style tools have no body section, so fall back to an empty schema.
                "parameters": tool.spec["request"].get("body") or {},
            },
        }
        for tool in tools
    ]
    return self._client.responses.create(
        model=self._model,
        input=[m.__dict__ for m in messages],
        tools=tool_defs,
    )
The response may include tool_calls. Parse them:
result = self.llm.complete(history, registry.list())
tool_calls = getattr(result.output[0].content[0], "tool_calls", [])
Takeaway: Let the model propose actions, but only execute them after validation.
Validate and Execute Tool Calls
Create executor.py:
import json
import os

import httpx

from agent_lab.tools.registry import Tool, ToolRegistry
from agent_lab.telemetry import emit_metric, logger


class ToolExecutor:
    def __init__(self, registry: ToolRegistry):
        self.registry = registry

    def execute(self, call: dict) -> str:
        tool = self.registry.get(call["name"])
        args = json.loads(call["arguments"])
        self._validate(tool, args)
        response = self._dispatch(tool, args)
        emit_metric("tool.call", name=tool.name, status=response.status_code)
        return response.text

    def _validate(self, tool: Tool, args: dict):
        required = tool.spec["request"].get("body", {})
        for field in required:
            if field not in args:
                raise ValueError(f"Missing {field} for {tool.name}")

    def _dispatch(self, tool: Tool, args: dict):
        request = tool.spec["request"]
        method = request["method"].lower()
        url = request["url"]
        headers = {}
        if tool.spec.get("auth") not in (None, "none"):
            headers["Authorization"] = f"Bearer {self._token(tool)}"
        # GET tools send arguments as query params; everything else as a JSON body.
        if method == "get":
            return httpx.request(method, url, timeout=10, headers=headers, params=args)
        return httpx.request(method, url, timeout=10, headers=headers, json=args)

    def _token(self, tool: Tool) -> str:
        # Placeholder: resolve the tool's credential from the environment;
        # swap in your secret manager of choice here.
        return os.environ.get(f"{tool.name.upper()}_TOKEN", "")
Wrap in try/except and surface failures back to the agent as system messages ("Tool call failed: ..."). Later we can add retries or circuit breakers per tool.
Takeaway: Executors enforce policy: rate limiting, auth scopes, logging.
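As a starting point for retries, the sketch below wraps execute and retries only on transport-level failures. The attempt count and backoff values are arbitrary placeholders.

import time

import httpx

def execute_with_retry(executor, call: dict, attempts: int = 3) -> str:
    # Retry transport errors (timeouts, connection resets); let validation
    # errors and HTTP error responses surface immediately.
    for attempt in range(1, attempts + 1):
        try:
            return executor.execute(call)
        except httpx.TransportError:
            if attempt == attempts:
                raise
            time.sleep(0.5 * attempt)  # crude linear backoff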
Close the Loop in CoreAgent
Update send:
result = self.llm.complete(history, self.registry.list())
tool_calls = getattr(result.output[0].content[0], "tool_calls", [])
if tool_calls:
    for call in tool_calls:
        logger.info("tool requested", extra={"tool": call.name})
        try:
            outcome = self.executor.execute(call.to_dict())
            self.store.append(Message("system", f"Tool {call.name} result: {outcome}"))
        except Exception as exc:
            self.store.append(Message("system", f"Tool {call.name} failed: {exc}"))
else:
    reply = Message("assistant", result.output_text)
    self.store.append(reply)
    return reply
This feedback loop (tool result -> new system message -> next LLM call) lets the agent chain actions.
Takeaway: Treat tool responses as first-class context; the LLM should reason about outcomes explicitly.
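In practice that means send should loop: append the tool results, call the LLM again with the updated history, and only return once the model produces a plain assistant reply. Here is a hedged sketch of that loop with a hard iteration cap; the cap of 5 and the store.history() accessor are assumptions carried over from the memory layer in Part 3.

def send(self, user_text: str) -> Message:
    self.store.append(Message("user", user_text))
    for _ in range(5):  # hard cap so a misbehaving model cannot chain tools forever
        history = self.store.history()  # assumed accessor from the Part 3 memory store
        result = self.llm.complete(history, self.registry.list())
        tool_calls = getattr(result.output[0].content[0], "tool_calls", [])
        if not tool_calls:
            reply = Message("assistant", result.output_text)
            self.store.append(reply)
            return reply
        for call in tool_calls:
            try:
                outcome = self.executor.execute(call.to_dict())
                self.store.append(Message("system", f"Tool {call.name} result: {outcome}"))
            except Exception as exc:
                self.store.append(Message("system", f"Tool {call.name} failed: {exc}"))
    return Message("assistant", "I couldn't complete that request within the tool budget.")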
Example: Weather Lookup Tool
Add a simple tool to illustrate:
- name: weather_lookup
  description: "Get current weather for a city"
  request:
    method: GET
    url: "https://api.weatherapi.com/v1/current.json"
    query:
      key: env:WEATHER_API_KEY
      q: string
  response:
    format: json
  auth: none
Implement _dispatch to substitute env:... tokens automatically. This pattern keeps secrets out of prompt text.
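One way to handle that substitution is a small helper that resolves any env:NAME value from the process environment just before the request is built. This is a sketch, not part of the executor above:

import os

def resolve_env_tokens(params: dict) -> dict:
    # Replace values like "env:WEATHER_API_KEY" with the actual secret at call time,
    # so the key never appears in prompts, logs, or the model's context.
    resolved = {}
    for key, value in params.items():
        if isinstance(value, str) and value.startswith("env:"):
            resolved[key] = os.environ[value.removeprefix("env:")]
        else:
            resolved[key] = value
    return resolved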
CLI demo:
You: Should I bring an umbrella to Seattle tonight?
Agent: Checking the latest weather data for Seattle...
Agent: Forecast retrieved: light rain, 45°F. Carrying an umbrella is recommended.
Takeaway: Use real APIs sparingly during development; add mock servers or VCR-style fixtures for tests.
Testing Strategy
- Schema validation: run python scripts/validate_tools.py to ensure every registry entry has required fields (a sketch of this script follows the list below).
- Mocked executor tests: use pytest + respx to simulate HTTP calls:
import httpx, json, respx
from pathlib import Path
from agent_lab.tools.executor import ToolExecutor  # adjust to wherever executor.py lives
from agent_lab.tools.registry import ToolRegistry

@respx.mock
def test_tool_execution_success():
    respx.get("https://docs.lumenly.ai/search").mock(return_value=httpx.Response(200, json={"hits": []}))
    registry = ToolRegistry(Path("tools/registry.yaml"))
    executor = ToolExecutor(registry)
    call = {"name": "search_docs", "arguments": json.dumps({"q": "billing"})}
    out = executor.execute(call)
    assert "hits" in out
- End-to-end dry runs: set TOOL_MODE=mock in .env to bypass real APIs and return canned responses. Your CLI should expose a flag (--mock-tools) for testers.
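A minimal version of scripts/validate_tools.py might look like the sketch below; the set of required fields is an assumption based on the schemas defined earlier.

# scripts/validate_tools.py (sketch)
import sys
from pathlib import Path

import yaml

REQUIRED_FIELDS = {"name", "description", "request", "auth", "cost_estimate"}

def main() -> int:
    entries = yaml.safe_load(Path("tools/registry.yaml").read_text())
    errors = []
    for entry in entries:
        missing = REQUIRED_FIELDS - entry.keys()
        if missing:
            errors.append(f"{entry.get('name', '<unnamed>')}: missing {sorted(missing)}")
    for error in errors:
        print(error, file=sys.stderr)
    return 1 if errors else 0

if __name__ == "__main__":
    sys.exit(main())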
Takeaway: Tooling without tests is a production outage waiting to happen.
Operational Safeguards
- Rate Limiting: track tool call frequency and throttle within the executor (a sketch of a simple limiter and audit record follows this list).
- Audit Logs: log tool_name, arguments_hash, status, and latency_ms for every call.
- Permissions: map tools to roles; some workflows may restrict which agent persona can call which tool.
- Rollbacks: for mutating actions (e.g., create_support_ticket), emit reversible events (store response IDs so Part 5 can roll them back).
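For the first two items, here is a sketch of a per-tool rate limiter and an audit record builder. The one-minute window and 30-call ceiling are placeholder values, and the field names follow the list above.

import hashlib
import json
import time
from collections import defaultdict

class RateLimiter:
    def __init__(self, max_calls: int = 30, window_s: float = 60.0):
        self.max_calls = max_calls
        self.window_s = window_s
        self.calls = defaultdict(list)  # tool name -> recent call timestamps

    def check(self, tool_name: str) -> None:
        now = time.monotonic()
        recent = [t for t in self.calls[tool_name] if now - t < self.window_s]
        if len(recent) >= self.max_calls:
            raise RuntimeError(f"Rate limit exceeded for {tool_name}")
        recent.append(now)
        self.calls[tool_name] = recent

def audit_record(tool_name: str, arguments: dict, status: int, latency_ms: float) -> dict:
    # Hash the arguments so the audit trail never stores raw payloads or secrets.
    args_hash = hashlib.sha256(json.dumps(arguments, sort_keys=True).encode()).hexdigest()
    return {
        "tool_name": tool_name,
        "arguments_hash": args_hash,
        "status": status,
        "latency_ms": latency_ms,
    }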
Document these in docs/tooling.md. SRE and security teams should review the registry in the same way they review Terraform changes.
What You Can Do Now
Your agent can:
- Inspect the tool catalog.
- Ask the model to select tools.
- Validate and execute API calls.
- Feed tool responses back into context.
- Log every action for audits.
Run a quick test:
python -m agent_lab.cli chat --mock-tools
python -m agent_lab.cli tools list
python -m agent_lab.cli tools describe weather_lookup
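If your CLI from the earlier parts doesn't have a tools subcommand yet, here is a minimal argparse-based sketch; the command layout is an assumption, so adapt it to whatever CLI framework you used in Parts 1 through 3.

# Hypothetical sketch of the `tools list` / `tools describe` subcommands.
import argparse
from pathlib import Path

from agent_lab.tools.registry import ToolRegistry

def main() -> None:
    parser = argparse.ArgumentParser(prog="agent_lab.cli")
    sub = parser.add_subparsers(dest="command", required=True)
    tools = sub.add_parser("tools").add_subparsers(dest="action", required=True)
    tools.add_parser("list")
    describe = tools.add_parser("describe")
    describe.add_argument("name")

    args = parser.parse_args()
    registry = ToolRegistry(Path("tools/registry.yaml"))
    if args.action == "list":
        for tool in registry.list():
            print(f"{tool.name}: {tool.description}")
    elif args.action == "describe":
        print(registry.get(args.name).spec)

if __name__ == "__main__":
    main()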
If everything looks good, move to Part 5: Testing, Debugging, and Deployment. We'll wire simulations, CI pipelines, and deployment playbooks so these tools run safely in staging and prod.