The Problem
As COO of a NASDAQ-listed company, I oversee six departments: Customer Support, Sales, Marketing, Design, Supply Chain, and App Development. Every Monday morning, I need to know: What happened last week? What's on fire? What needs my attention?
The information lives across multiple systems:
- NetSuite — Revenue, inventory, financials
- Shopify — Orders, customer data, products
- Airtable — Project tracking, content calendars, Rx orders
- Intercom — Customer support tickets
- ShipStation — Fulfillment status
Getting a complete picture meant logging into 5+ systems, running reports, copying data into spreadsheets, and synthesizing it into something actionable. This took hours every week. And by the time I had the data, it was already stale.
I needed a system that could:
- Pull data from multiple sources automatically
- Synthesize it into executive-ready summaries
- Alert me to anomalies in real-time
- Run without constant babysitting
The Architecture
The solution is a collection of 10 specialized AI agents, each with a single responsibility. They run in Claude Code and connect directly to enterprise systems through Model Context Protocol (MCP) servers.
The Four Agent Categories
Agents are organized by how and when they run:
| Category | Agent | Trigger | Purpose |
|---|---|---|---|
| Daily | daily-ops-pulse | 6am EST | Morning briefing with key metrics |
| | cs-ticket-triager | On ticket | Categorize and route support tickets |
| | inventory-watchdog | 4-hourly | Low stock alerts |
| Periodic | netsuite-shopify-reconciler | Nightly 11pm | Revenue reconciliation |
| | weekly-department-digest | Friday 4pm | Week-end department summary |
| | board-metrics-compiler | Monthly 1st | Board report metrics |
| Reactive | rx-order-tracker | On event | Prescription order lifecycle |
| | order-anomaly-detector | Hourly | Fraud and high-value alerts |
| | project-status-aggregator | Mon/Thu | Cross-functional project health |
| Utility | vendor-comm-drafter | On-demand | China factory communications |
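For the time-triggered agents, any cron-style scheduler will do. Here's a minimal sketch of how the triggers above could be wired up, assuming node-cron and a hypothetical `runAgent` helper that launches a Claude Code session with the named agent's playbook (the production scheduler isn't shown here):

```typescript
import cron from "node-cron";
// runAgent is a hypothetical helper that starts a Claude Code
// session loaded with the named agent's SKILL.md.
import { runAgent } from "./run-agent";

const TZ = { timezone: "America/New_York" };

// Time-based triggers from the table above
cron.schedule("0 6 * * *", () => runAgent("daily-ops-pulse"), TZ);              // daily 6am EST
cron.schedule("0 */4 * * *", () => runAgent("inventory-watchdog"), TZ);         // every 4 hours
cron.schedule("0 23 * * *", () => runAgent("netsuite-shopify-reconciler"), TZ); // nightly 11pm
cron.schedule("0 16 * * 5", () => runAgent("weekly-department-digest"), TZ);    // Friday 4pm
cron.schedule("0 9 1 * *", () => runAgent("board-metrics-compiler"), TZ);       // 1st of month
cron.schedule("0 9 * * 1,4", () => runAgent("project-status-aggregator"), TZ);  // Mon/Thu
cron.schedule("0 * * * *", () => runAgent("order-anomaly-detector"), TZ);       // hourly

// cs-ticket-triager and rx-order-tracker fire on webhooks instead;
// vendor-comm-drafter is invoked on demand.
```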
Anatomy of an Agent
Each agent is defined by a single SKILL.md file that specifies everything Claude needs to execute the task. Here's a simplified version of the daily-ops-pulse agent:
```markdown
---
name: daily-ops-pulse
description: Morning operations briefing with key metrics
model: haiku
trigger: Daily 6am EST
token_budget: 1500
mcp_tools:
  - netsuite (mcp__netsuite__)
  - shopify (mcp__zapier-mcp__shopify_*)
  - slack (mcp__zapier-mcp__slack_*)
---
# Daily Ops Pulse Agent
Generate a concise morning operations briefing for the COO.
## Data Collection Steps
### 1. Yesterday's Revenue (NetSuite)
SELECT
SUM(tl.netamount) * -1 as revenue,
COUNT(DISTINCT t.id) as order_count
FROM transaction t
JOIN transactionline tl ON t.id = tl.transaction
WHERE t.trandate = CURRENT_DATE - 1
AND t.type IN ('CashSale', 'CustInvc')
...
### 2. Order Status (Shopify)
Use mcp__zapier-mcp__shopify_find_order to get:
- Orders created yesterday
- Orders with fulfillment_status = null
## Output Format
Post to Slack:
:sunrise: *Daily Ops Pulse - {DATE}*
*Revenue*
- Yesterday: ${REVENUE} ({CHANGE}% vs prior day)
- MTD: ${MTD_REVENUE}
*Orders*
- New: {COUNT} | Pending: {PENDING}
*Inventory Alerts*
{LOW_STOCK_ITEMS or "All stock healthy"}
```
Each SKILL.md is self-contained. It specifies the data sources, queries, output format, and error handling. Claude doesn't need to figure out what to do — it just executes the playbook.
The Token Efficiency Problem
Running 10 agents with full context would burn through tokens fast. If each agent loaded the full company context, system documentation, and historical data, we'd hit 55,000+ tokens per session.
The solution is context layering:
- Core context (~800 tokens) — Loaded once per session. Contains company overview, KPI definitions, and output templates.
- Agent-specific context (~1,500-2,500 tokens) — Only what that agent needs.
- Progressive loading — Start with summaries, drill into details only when needed.
```
# Context hierarchy
~/.claude/agents/
├── core/                       # Shared (load once)
│   ├── company-context.md      # ~500 tokens
│   ├── metrics-definitions.md  # ~300 tokens
│   └── templates/
│       ├── slack-formats.md
│       └── report-structures.md
│
├── daily/                      # Lightweight (<2k each)
│   ├── daily-ops-pulse/
│   ├── cs-ticket-triager/
│   └── inventory-watchdog/
│
└── periodic/                   # Medium weight
    ├── board-metrics-compiler/
    └── weekly-department-digest/
```
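To make the layering concrete, here's a minimal sketch of how a prompt could be assembled from this hierarchy; the `buildPrompt` helper and exact file list are my assumptions, not the production code:

```typescript
// Minimal sketch of context layering: core files load once per
// session, agent-specific files only for the agent being run.
import { readFileSync } from "node:fs";
import { join } from "node:path";
import { homedir } from "node:os";

const AGENTS_DIR = join(homedir(), ".claude", "agents");

// ~800 tokens of shared context, loaded once per session
const coreContext = [
  "core/company-context.md",
  "core/metrics-definitions.md",
  "core/templates/slack-formats.md",
].map((f) => readFileSync(join(AGENTS_DIR, f), "utf8")).join("\n\n");

// Only the running agent's playbook goes on top (~1.5-2.5k tokens)
function buildPrompt(category: string, agent: string): string {
  const skill = readFileSync(
    join(AGENTS_DIR, category, agent, "SKILL.md"),
    "utf8",
  );
  return `${coreContext}\n\n${skill}`;
}

// e.g. buildPrompt("daily", "daily-ops-pulse")
```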
Token usage dropped from ~55K to ~20K per full agent-suite run. That's roughly a 64% reduction with no loss of capability.
MCP: The API of AI
The magic happens through Model Context Protocol (MCP) servers. These create a standardized way for Claude to interact with external systems.
I run four MCP servers:
| Server | Tools Exposed | Agents Using It |
|---|---|---|
| NetSuite | SuiteQL queries, record CRUD | ops-pulse, reconciler, inventory, board |
| Shopify (via Zapier) | Orders, customers, products | ops-pulse, reconciler, anomaly, rx-tracker |
| Airtable | Records, comments, tables | cs-triage, rx-tracker, project-status |
| Slack (via Zapier) | Messages, channels | All agents |
When the daily-ops-pulse agent runs, it calls:
```javascript
// NetSuite revenue query
mcp__netsuite__netsuite_search({
  query: "SELECT SUM(tl.netamount) * -1 as revenue..."
})

// Shopify pending orders
mcp__zapier-mcp__shopify_find_order({
  instructions: "Find orders with null fulfillment_status",
  output_hint: "order count and total value"
})

// Post to Slack
mcp__zapier-mcp__slack_send_channel_message({
  channel: "#ops-pulse",
  text: "..."
})
```
State Management
Agents need to remember things between runs. The order-anomaly-detector needs to know which orders it's already flagged. The reconciler needs yesterday's numbers for comparison.
I use Supabase tables for state:
- `agent_reconciliation_log` — Nightly reconciliation results
- `inventory_watchdog_log` — Historical low-stock alerts
- `order_anomaly_log` — Flagged orders with disposition
- `project_status_history` — Project snapshots over time
This enables trend analysis. The board-metrics-compiler can pull from historical logs to show month-over-month changes without re-running expensive queries.
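As an illustration of the pattern (the supabase-js calls are real, but the column names and the `flagIfNew` helper are my assumptions), the anomaly detector's dedupe check might look like this:

```typescript
// Sketch: order-anomaly-detector checking its own memory before
// re-flagging an order. Column names are illustrative.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_KEY!,
);

async function flagIfNew(orderId: string, reason: string) {
  // Has this order already been flagged on a previous run?
  const { data: existing } = await supabase
    .from("order_anomaly_log")
    .select("order_id")
    .eq("order_id", orderId)
    .maybeSingle();

  if (existing) return false; // already handled, stay quiet

  // Record the flag with an open disposition for later review
  await supabase.from("order_anomaly_log").insert({
    order_id: orderId,
    reason,
    disposition: "pending_review",
    flagged_at: new Date().toISOString(),
  });
  return true;
}
```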
What Actually Worked
After months of iteration, here's what moved the needle:
1. Single-responsibility agents beat multi-purpose ones
Early versions tried to do too much. A "morning briefing" agent that also handled alerts and anomaly detection became unwieldy. Breaking it into focused agents made each one more reliable and easier to debug.
2. Explicit output formats eliminate ambiguity
Telling Claude "summarize the data" produces inconsistent results. Specifying exactly what the Slack message should look like, with placeholders for each value, produces consistent output every time.
3. Error handling needs to be explicit
When a NetSuite query fails, the agent shouldn't just crash. Each SKILL.md specifies what to do: note the failure in the output, continue with available data, flag for manual review.
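In code terms, that's a degrade-don't-crash pattern. A sketch, with illustrative names:

```typescript
// Sketch of the pattern each SKILL.md describes: on failure,
// note it in the output and keep going with what you have.
interface SectionResult {
  name: string;
  body: string;
  failed: boolean;
}

async function collectSection(
  name: string,
  fetcher: () => Promise<string>,
): Promise<SectionResult> {
  try {
    return { name, body: await fetcher(), failed: false };
  } catch (err) {
    // Don't abort the whole briefing; surface the gap instead
    return {
      name,
      body: `:warning: ${name} unavailable (${String(err)}), flagged for manual review`,
      failed: true,
    };
  }
}
```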
4. Model tiering saves money without sacrificing quality
Daily agents use Haiku — it's fast, cheap, and good enough for structured queries. Periodic agents that need synthesis (like board reports) use Sonnet. Strategic planning (rarely needed) gets Opus.
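A minimal sketch of that tiering as a lookup table (the assignments mirror the description above; the model names are shorthand, and "strategic-planning" is an illustrative entry):

```typescript
// Illustrative model tiering: cheap models for structured daily
// pulls, stronger models where synthesis matters.
type Model = "haiku" | "sonnet" | "opus";

const MODEL_TIER: Record<string, Model> = {
  // Daily, structured: fast and cheap is good enough
  "daily-ops-pulse": "haiku",
  "cs-ticket-triager": "haiku",
  "inventory-watchdog": "haiku",
  // Periodic synthesis: needs stronger reasoning
  "board-metrics-compiler": "sonnet",
  "weekly-department-digest": "sonnet",
  // Rarely-run strategic work
  "strategic-planning": "opus",
};
```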
The Results
Six months in, this system runs reliably:
- Monday mornings: I wake up to a Slack message with the week's performance, flagged anomalies, and priority items
- Inventory alerts: Get notified before stockouts happen, not after
- Board reporting: What used to take a day now takes an hour of review
- Vendor communications: Consistent, professional factory emails in English and Chinese
The biggest win isn't time saved — it's reduced decision latency. I make better decisions faster because I have current data instead of stale reports.
What's Next
This is v1. The roadmap includes:
- Dashboard integration: Agents feeding live dashboards on Cloudflare Pages
- Cross-agent communication: The anomaly detector triggering the customer support agent
- Natural language queries: Ask questions in Slack, get answers from agents
- Predictive alerts: Using historical patterns to predict problems before they happen
If you're building something similar, the key insight is this: AI agents aren't magic. They're automation with better interfaces. The work is in defining exactly what you want, connecting the data sources, and iterating on the output format until it's useful.
The tools are finally good enough. The question is: what operations problem is eating your time?
I'm the COO at Innovative Eyewear (NASDAQ: LUCY). Follow me on Twitter/X for more on AI operations.