
Building an AI Operating System for a Public Company

How I built 10 specialized AI agents to run operations across 6 departments at a NASDAQ-listed company. The architecture, code patterns, and what actually worked.


The Problem

As COO of a NASDAQ-listed company, I oversee six departments: Customer Support, Sales, Marketing, Design, Supply Chain, and App Development. Every Monday morning, I need to know: What happened last week? What's on fire? What needs my attention?

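The information lives across multiple systems, including NetSuite (revenue, inventory), Shopify (orders, customers), and Airtable (projects, Rx orders).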

Getting a complete picture meant logging into 5+ systems, running reports, copying data into spreadsheets, and synthesizing it into something actionable. This took hours every week. And by the time I had the data, it was already stale.

I needed a system that could:

  1. Pull data from multiple sources automatically
  2. Synthesize it into executive-ready summaries
  3. Alert me to anomalies in real time
  4. Run without constant babysitting

The Architecture

The solution is a collection of 10 specialized AI agents, each with a single responsibility. They run on Claude Code via Model Context Protocol (MCP) servers that connect directly to enterprise systems.

┌─────────────────────────────────────────────────────────────────────────────┐
│                          AGENTIC OPS ARCHITECTURE                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐      │
│  │  NETSUITE   │   │   SHOPIFY   │   │  AIRTABLE   │   │  SUPABASE   │      │
│  │  (Revenue,  │   │  (Orders,   │   │ (Projects,  │   │  (State,    │      │
│  │  Inventory) │   │  Customers) │   │  Rx Orders) │   │   Logs)     │      │
│  └──────┬──────┘   └──────┬──────┘   └──────┬──────┘   └──────┬──────┘      │
│         │                 │                 │                 │             │
│         └────────────┬────┴────────────┬────┴─────────────────┘             │
│                      │                 │                                    │
│                      ▼                 ▼                                    │
│                   ┌─────────────────────────────────────┐                   │
│                   │             MCP SERVERS             │                   │
│                   │    (NetSuite, Zapier, Airtable)     │                   │
│                   └─────────────────┬───────────────────┘                   │
│                                     │                                       │
│                                     ▼                                       │
│  ┌─────────────────────────────────────────────────────────────────────┐    │
│  │                             CLAUDE CODE                             │    │
│  │   ┌───────────┐   ┌───────────┐   ┌───────────┐   ┌───────────┐     │    │
│  │   │   DAILY   │   │ PERIODIC  │   │ REACTIVE  │   │  UTILITY  │     │    │
│  │   │  AGENTS   │   │  AGENTS   │   │  AGENTS   │   │  AGENTS   │     │    │
│  │   └───────────┘   └───────────┘   └───────────┘   └───────────┘     │    │
│  └─────────────────────────────────────────────────────────────────────┘    │
│                                     │                                       │
│                                     ▼                                       │
│                              ┌─────────────┐                                │
│                              │    SLACK    │                                │
│                              │  (Output)   │                                │
│                              └─────────────┘                                │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

The Four Agent Categories

Agents are organized by how and when they run:

| Category | Agent | Trigger | Purpose |
|----------|-------|---------|---------|
| Daily | daily-ops-pulse | 6am EST | Morning briefing with key metrics |
| Daily | cs-ticket-triager | On ticket | Categorize and route support tickets |
| Daily | inventory-watchdog | 4-hourly | Low stock alerts |
| Periodic | netsuite-shopify-reconciler | Nightly 11pm | Revenue reconciliation |
| Periodic | weekly-department-digest | Friday 4pm | Week-end department summary |
| Periodic | board-metrics-compiler | Monthly 1st | Board report metrics |
| Reactive | rx-order-tracker | On event | Prescription order lifecycle |
| Reactive | order-anomaly-detector | Hourly | Fraud and high-value alerts |
| Reactive | project-status-aggregator | Mon/Thu | Cross-functional project health |
| Utility | vendor-comm-drafter | On-demand | China factory communications |
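
To make the grouping concrete, here is a minimal sketch of the same roster as data. The rows are copied from the table above; the `Agent` class and the `by_category` helper are hypothetical names for illustration, not part of the actual system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    category: str  # Daily, Periodic, Reactive, or Utility
    trigger: str   # human-readable schedule or event, as listed in the table
    purpose: str


# Rows copied from the table; the Agent class itself is illustrative.
AGENTS = [
    Agent("daily-ops-pulse", "Daily", "6am EST", "Morning briefing with key metrics"),
    Agent("cs-ticket-triager", "Daily", "On ticket", "Categorize and route support tickets"),
    Agent("inventory-watchdog", "Daily", "4-hourly", "Low stock alerts"),
    Agent("netsuite-shopify-reconciler", "Periodic", "Nightly 11pm", "Revenue reconciliation"),
    Agent("weekly-department-digest", "Periodic", "Friday 4pm", "Week-end department summary"),
    Agent("board-metrics-compiler", "Periodic", "Monthly 1st", "Board report metrics"),
    Agent("rx-order-tracker", "Reactive", "On event", "Prescription order lifecycle"),
    Agent("order-anomaly-detector", "Reactive", "Hourly", "Fraud and high-value alerts"),
    Agent("project-status-aggregator", "Reactive", "Mon/Thu", "Cross-functional project health"),
    Agent("vendor-comm-drafter", "Utility", "On-demand", "China factory communications"),
]


def by_category(agents):
    """Group agent names by category, preserving the table's order."""
    grouped = defaultdict(list)
    for agent in agents:
        grouped[agent.category].append(agent.name)
    return dict(grouped)


for category, names in by_category(AGENTS).items():
    print(f"{category}: {', '.join(names)}")
```

Keeping the roster in one place like this makes it easy to spot an agent whose trigger no longer matches its category.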

Anatomy of an Agent

Each agent is defined by a single SKILL.md file that specifies everything Claude needs to execute the task. Here's a simplified version of the daily-ops-pulse agent:

---
name: daily-ops-pulse
description: Morning operations briefing with key metrics
model: haiku
trigger: Daily 6am EST
token_budget: 1500
mcp_tools:
  - netsuite (mcp__netsuite__)
  - shopify (mcp__zapier-mcp__shopify_*)
  - slack (mcp__zapier-mcp__slack_*)
---

# Daily Ops Pulse Agent

Generate a concise morning operations briefing for the COO.

## Data Collection Steps

### 1. Yesterday's Revenue (NetSuite)

SELECT
  SUM(tl.netamount) * -1 as revenue,
  COUNT(DISTINCT t.id) as order_count
FROM transaction t
JOIN transactionline tl ON t.id = tl.transaction
WHERE t.trandate = CURRENT_DATE - 1
  AND t.type IN ('CashSale', 'CustInvc')
  ...

### 2. Order Status (Shopify)

Use mcp__zapier-mcp__shopify_find_order to get:
- Orders created yesterday
- Orders with fulfillment_status = null

## Output Format

Post to Slack:

:sunrise: *Daily Ops Pulse - {DATE}*

*Revenue*
- Yesterday: ${REVENUE} ({CHANGE}% vs prior day)
- MTD: ${MTD_REVENUE}

*Orders*
- New: {COUNT} | Pending: {PENDING}

*Inventory Alerts*
{LOW_STOCK_ITEMS or "All stock healthy"}

Key Design Principle

Each SKILL.md is self-contained. It specifies the data sources, queries, output format, and error handling. Claude doesn't need to figure out what to do — it just executes the playbook.
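
Because each file is self-contained, its front matter can be checked mechanically before a run. The sketch below is one way to do that, assuming only flat `key: value` pairs and simple `- item` lists like the ones in the example. `parse_front_matter` is a hypothetical helper, not something Claude Code provides, and it is not a general YAML parser.

```python
def parse_front_matter(text):
    """Split a SKILL.md-style document into (metadata, body).

    Handles only flat `key: value` pairs and simple `- item` lists,
    which is all the example front matter uses. Values stay as strings.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta, current_key = {}, None
    for index, line in enumerate(lines[1:], start=1):
        stripped = line.strip()
        if stripped == "---":
            # Closing delimiter: everything after it is the playbook body.
            return meta, "\n".join(lines[index + 1:]).strip()
        if stripped.startswith("- ") and isinstance(meta.get(current_key), list):
            meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            # A key with no inline value starts a list.
            meta[current_key] = value.strip() if value.strip() else []
    raise ValueError("front matter is missing its closing '---'")


SKILL = """---
name: daily-ops-pulse
description: Morning operations briefing with key metrics
model: haiku
trigger: Daily 6am EST
token_budget: 1500
mcp_tools:
  - netsuite (mcp__netsuite__)
  - shopify (mcp__zapier-mcp__shopify_*)
  - slack (mcp__zapier-mcp__slack_*)
---

# Daily Ops Pulse Agent
"""

meta, body = parse_front_matter(SKILL)
print(meta["name"], meta["model"], meta["mcp_tools"])
```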

The Token Efficiency Problem

Running 10 agents with full context would burn through tokens fast. If each agent loaded the full company context, system documentation, and historical data, we'd hit 55,000+ tokens per session.

The solution is context layering:

  1. Core context (~800 tokens) — Loaded once per session. Contains company overview, KPI definitions, and output templates.
  2. Agent-specific context (~1,500-2,500 tokens) — Only what that agent needs.
  3. Progressive loading — Start with summaries, drill into details only when needed.

# Context hierarchy
~/.claude/agents/
├── core/                          # Shared (load once)
│   ├── company-context.md         # ~500 tokens
│   ├── metrics-definitions.md     # ~300 tokens
│   └── templates/
│       ├── slack-formats.md
│       └── report-structures.md
│
├── daily/                         # Lightweight (<2k each)
│   ├── daily-ops-pulse/
│   ├── cs-ticket-triager/
│   └── inventory-watchdog/
│
└── periodic/                      # Medium weight
    ├── board-metrics-compiler/
    └── weekly-department-digest/

Results

Token usage dropped from ~55K to ~20K per full agent suite run. That's a 63% reduction with no loss of capability.
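
The layering idea can be sketched in a few lines. This is an illustration under stated assumptions, not the actual loader: `estimate_tokens` uses a rough four-characters-per-token heuristic rather than a real tokenizer, and `build_context` and the sample document sizes are hypothetical. The last function just checks the arithmetic on the quoted figures, which comes to about 63.6%.

```python
def estimate_tokens(text):
    """Very rough estimate (about 4 characters per token); an assumption,
    not a real tokenizer."""
    return max(1, len(text) // 4)


def build_context(core_docs, agent_docs, budget):
    """Add shared core docs first, then agent docs, skipping any doc that
    would push the total over the budget.

    Returns (context_text, tokens_used, skipped_doc_names).
    """
    parts, used, skipped = [], 0, []
    for name, text in list(core_docs.items()) + list(agent_docs.items()):
        cost = estimate_tokens(text)
        if used + cost > budget:
            skipped.append(name)  # candidate for on-demand (progressive) loading
            continue
        parts.append(text)
        used += cost
    return "\n\n".join(parts), used, skipped


def reduction_percent(before, after):
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# Placeholder documents sized to match the approximate token counts in the text.
core = {"company-context.md": "x" * 2000, "metrics-definitions.md": "y" * 1200}
agent = {"SKILL.md": "z" * 6000, "history.md": "h" * 8000}

context, used, skipped = build_context(core, agent, budget=2500)
print(used, skipped)
print(round(reduction_percent(55_000, 20_000), 1))
```

Anything that lands in `skipped` is what progressive loading would fetch later, and only if the summaries turn out not to be enough.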

MCP: The API of AI

The magic happens through Model Context Protocol (MCP) servers. These create a standardized way for Claude to interact with external systems.

I run three MCP servers (NetSuite, Zapier, and Airtable), which together expose four integrations:

| Server | Tools Exposed | Agents Using It |
|--------|---------------|-----------------|
| NetSuite | SuiteQL queries, record CRUD | ops-pulse, reconciler, inventory, board |
| Shopify (via Zapier) | Orders, customers, products | ops-pulse, reconciler, anomaly, rx-tracker |
| Airtable | Records, comments, tables | cs-triage, rx-tracker, project-status |
| Slack (via Zapier) | Messages, channels | All agents |

When the daily-ops-pulse agent runs, it calls:

// NetSuite revenue query
mcp__netsuite__netsuite_search({
  query: "SELECT SUM(tl.netamount) * -1 as revenue..."
})

// Shopify pending orders
mcp__zapier-mcp__shopify_find_order({
  instructions: "Find orders with null fulfillment_status",
  output_hint: "order count and total value"
})

// Post to Slack
mcp__zapier-mcp__slack_send_channel_message({
  channel: "#ops-pulse",
  text: "..."
})
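
The tool names in these calls follow the `mcp__<server>__<tool>` pattern. As a small sketch, the hypothetical helper below splits a name into its parts, which shows that the three calls above route through only two servers, since Shopify and Slack both go via Zapier.

```python
from collections import defaultdict


def split_tool_name(full_name):
    """Split 'mcp__<server>__<tool>' into (server, tool)."""
    parts = full_name.split("__", 2)
    if len(parts) != 3 or parts[0] != "mcp" or not parts[1] or not parts[2]:
        raise ValueError(f"not an MCP tool name: {full_name!r}")
    return parts[1], parts[2]


# The three calls made by daily-ops-pulse, as named in the text.
CALLS = [
    "mcp__netsuite__netsuite_search",
    "mcp__zapier-mcp__shopify_find_order",
    "mcp__zapier-mcp__slack_send_channel_message",
]

tools_by_server = defaultdict(list)
for call in CALLS:
    server, tool = split_tool_name(call)
    tools_by_server[server].append(tool)

for server, tools in tools_by_server.items():
    print(server, tools)
```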

State Management

Agents need to remember things between runs. The order-anomaly-detector needs to know which orders it's already flagged. The reconciler needs yesterday's numbers for comparison.

I use Supabase tables for state:

This enables trend analysis. The board-metrics-compiler can pull from historical logs to show month-over-month changes without re-running expensive queries.
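
The table definitions aren't shown here, so the following is only a sketch of what such state might look like. The table and column names are hypothetical, and it uses Python's built-in `sqlite3` as a local stand-in for Supabase (which is Postgres) so the example runs anywhere. It covers the two needs described above: not flagging the same order twice, and looking up an earlier value for comparison.

```python
import sqlite3


def open_state(path=":memory:"):
    """Create the (hypothetical) state tables if they do not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS flagged_orders (
               order_id   TEXT PRIMARY KEY,
               reason     TEXT NOT NULL,
               flagged_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    conn.execute(
        """CREATE TABLE IF NOT EXISTS daily_metrics (
               metric_date TEXT NOT NULL,
               metric      TEXT NOT NULL,
               value       REAL NOT NULL,
               PRIMARY KEY (metric_date, metric)
           )"""
    )
    return conn


def flag_once(conn, order_id, reason):
    """Record a flag; return True only the first time an order is seen."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO flagged_orders (order_id, reason) VALUES (?, ?)",
        (order_id, reason),
    )
    conn.commit()
    return cursor.rowcount == 1


def record_metric(conn, metric_date, metric, value):
    """Store one metric value per day, overwriting a re-run for the same day."""
    conn.execute(
        "INSERT OR REPLACE INTO daily_metrics (metric_date, metric, value) VALUES (?, ?, ?)",
        (metric_date, metric, value),
    )
    conn.commit()


def previous_value(conn, metric, before_date):
    """Most recent stored value strictly before `before_date` (ISO dates), or None."""
    row = conn.execute(
        "SELECT value FROM daily_metrics WHERE metric = ? AND metric_date < ? "
        "ORDER BY metric_date DESC LIMIT 1",
        (metric, before_date),
    ).fetchone()
    return row[0] if row else None


conn = open_state()
print(flag_once(conn, "1001", "high value"))  # first sighting
print(flag_once(conn, "1001", "high value"))  # already flagged
```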

What Actually Worked

After months of iteration, here's what moved the needle:

1. Single-responsibility agents beat multi-purpose ones

Early versions tried to do too much. A "morning briefing" agent that also handled alerts and anomaly detection became unwieldy. Breaking it into focused agents made each one more reliable and easier to debug.

2. Explicit output formats eliminate ambiguity

Telling Claude "summarize the data" produces inconsistent results. Specifying exactly what the Slack message should look like, with placeholders for each value, produces consistent output every time.
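
To show what "placeholders for each value" buys you, here is a sketch that fills the Daily Ops Pulse layout from plain numbers. In the real system Claude fills the template itself. The `render_pulse` function, its parameter names, and the number formatting (two decimals, signed percentage) are assumptions for illustration.

```python
PULSE_TEMPLATE = (
    ":sunrise: *Daily Ops Pulse - {date}*\n"
    "\n"
    "*Revenue*\n"
    "- Yesterday: ${revenue:,.2f} ({change:+.1f}% vs prior day)\n"
    "- MTD: ${mtd_revenue:,.2f}\n"
    "\n"
    "*Orders*\n"
    "- New: {count} | Pending: {pending}\n"
    "\n"
    "*Inventory Alerts*\n"
    "{inventory}"
)


def render_pulse(date, revenue, prior_revenue, mtd_revenue, count, pending, low_stock_items):
    """Fill the fixed Slack layout; every field has exactly one slot."""
    # Guard against division by zero when there is no prior-day revenue.
    change = (revenue - prior_revenue) / prior_revenue * 100 if prior_revenue else 0.0
    inventory = "\n".join(f"- {item}" for item in low_stock_items) or "All stock healthy"
    return PULSE_TEMPLATE.format(
        date=date,
        revenue=revenue,
        change=change,
        mtd_revenue=mtd_revenue,
        count=count,
        pending=pending,
        inventory=inventory,
    )


# Sample values are made up for the demonstration.
message = render_pulse("2024-01-02", 1200.0, 1000.0, 2200.0, 14, 3, [])
print(message)
```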

3. Error handling needs to be explicit

When a NetSuite query fails, the agent shouldn't just crash. Each SKILL.md specifies what to do: note the failure in the output, continue with available data, flag for manual review.
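
The same "note it, continue, flag it" policy can be sketched in code. This is an illustration of the pattern, not how the agents are implemented, since the real instructions live in each SKILL.md. The function names, the source names, and the simulated timeout are all hypothetical.

```python
def collect(sources):
    """Run each named fetch function; keep results and note failures instead of raising."""
    data, failures = {}, []
    for name, fetch in sources.items():
        try:
            data[name] = fetch()
        except Exception as exc:  # broad on purpose: one bad source must not stop the run
            failures.append(f"{name}: {exc}")
    return data, failures


def failure_footer(failures):
    """Text to append to the report so gaps are visible and flagged for review."""
    if not failures:
        return ""
    lines = [":warning: *Data gaps (needs manual review)*"]
    lines += [f"- {failure}" for failure in failures]
    return "\n".join(lines)


def _failing_query():
    # Stand-in for a NetSuite query that times out.
    raise RuntimeError("timeout")


data, failures = collect({
    "netsuite_revenue": _failing_query,
    "shopify_orders": lambda: 42,
})
print(data)
print(failure_footer(failures))
```

The report still goes out with the Shopify numbers, and the footer tells the reader exactly which figure is missing and why.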

4. Model tiering saves money without sacrificing quality

Daily agents use Haiku — it's fast, cheap, and good enough for structured queries. Periodic agents that need synthesis (like board reports) use Sonnet. Strategic planning (rarely needed) gets Opus.

The Results

Six months in, this system runs reliably:

The biggest win isn't time saved — it's decision latency. I make better decisions faster because I have current data instead of stale reports.

What's Next

This is v1. The roadmap includes:


If you're building something similar, the key insight is this: AI agents aren't magic. They're automation with better interfaces. The work is in defining exactly what you want, connecting the data sources, and iterating on the output format until it's useful.

The tools are finally good enough. The question is: what operations problem is eating your time?


I'm the COO at Innovative Eyewear (NASDAQ: LUCY). Follow me on Twitter/X for more on AI operations.