The Problem
As COO of a NASDAQ-listed company, I oversee six departments: Customer Support, Sales, Marketing, Design, Supply Chain, and App Development. Every Monday morning, I need to know: What happened last week? What's on fire? What needs my attention?
The information lives across multiple systems:
- NetSuite — Revenue, inventory, financials
- Shopify — Orders, customer data, products
- Airtable — Project tracking, content calendars, Rx orders
- Intercom — Customer support tickets
- ShipStation — Fulfillment status
Getting a complete picture meant logging into 5+ systems, running reports, copying data into spreadsheets, and synthesizing it into something actionable. This took hours every week. And by the time I had the data, it was already stale.
I needed a system that could:
- Pull data from multiple sources automatically
- Synthesize it into executive-ready summaries
- Alert me to anomalies in real-time
- Run without constant babysitting
The Architecture
The solution is a collection of 10 specialized AI agents, each with a single responsibility. They run in Claude Code and connect directly to enterprise systems through Model Context Protocol (MCP) servers.
The Four Agent Categories
Agents are organized by how and when they run:
| Category | Agent | Trigger | Purpose |
|---|---|---|---|
| Daily | daily-ops-pulse | 6am EST | Morning briefing with key metrics |
| | cs-ticket-triager | On ticket | Categorize and route support tickets |
| | inventory-watchdog | 4-hourly | Low stock alerts |
| Periodic | netsuite-shopify-reconciler | Nightly 11pm | Revenue reconciliation |
| | weekly-department-digest | Friday 4pm | Week-end department summary |
| | board-metrics-compiler | Monthly 1st | Board report metrics |
| Reactive | rx-order-tracker | On event | Prescription order lifecycle |
| | order-anomaly-detector | Hourly | Fraud and high-value alerts |
| | project-status-aggregator | Mon/Thu | Cross-functional project health |
| Utility | vendor-comm-drafter | On-demand | China factory communications |
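For the time-triggered agents, any cron-style scheduler will do. Here's a minimal sketch of how the triggers above could be wired up, assuming node-cron and a hypothetical `runAgent` helper that launches a Claude Code session with the named agent's playbook (the production scheduler isn't shown here):

```typescript
import cron from "node-cron";
// runAgent is a hypothetical helper that starts a Claude Code
// session loaded with the named agent's SKILL.md.
import { runAgent } from "./run-agent";

const TZ = { timezone: "America/New_York" };

// Time-based triggers from the table above
cron.schedule("0 6 * * *", () => runAgent("daily-ops-pulse"), TZ);              // daily 6am EST
cron.schedule("0 */4 * * *", () => runAgent("inventory-watchdog"), TZ);         // every 4 hours
cron.schedule("0 23 * * *", () => runAgent("netsuite-shopify-reconciler"), TZ); // nightly 11pm
cron.schedule("0 16 * * 5", () => runAgent("weekly-department-digest"), TZ);    // Friday 4pm
cron.schedule("0 9 1 * *", () => runAgent("board-metrics-compiler"), TZ);       // 1st of month
cron.schedule("0 9 * * 1,4", () => runAgent("project-status-aggregator"), TZ);  // Mon/Thu
cron.schedule("0 * * * *", () => runAgent("order-anomaly-detector"), TZ);       // hourly

// cs-ticket-triager and rx-order-tracker fire on webhooks instead;
// vendor-comm-drafter is invoked on demand.
```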
Anatomy of an Agent
Each agent is defined by a single SKILL.md file that specifies everything Claude needs to execute the task. Here's a simplified version of the daily-ops-pulse agent:
```markdown
---
name: daily-ops-pulse
description: Morning operations briefing with key metrics
model: haiku
trigger: Daily 6am EST
token_budget: 1500
mcp_tools:
  - netsuite (mcp__netsuite__)
  - shopify (mcp__zapier-mcp__shopify_*)
  - slack (mcp__zapier-mcp__slack_*)
---
# Daily Ops Pulse Agent
Generate a concise morning operations briefing for the COO.
## Data Collection Steps
### 1. Yesterday's Revenue (NetSuite)
SELECT
SUM(tl.netamount) * -1 as revenue,
COUNT(DISTINCT t.id) as order_count
FROM transaction t
JOIN transactionline tl ON t.id = tl.transaction
WHERE t.trandate = CURRENT_DATE - 1
AND t.type IN ('CashSale', 'CustInvc')
...
### 2. Order Status (Shopify)
Use mcp__zapier-mcp__shopify_find_order to get:
- Orders created yesterday
- Orders with fulfillment_status = null
## Output Format
Post to Slack:
:sunrise: *Daily Ops Pulse - {DATE}*
*Revenue*
- Yesterday: ${REVENUE} ({CHANGE}% vs prior day)
- MTD: ${MTD_REVENUE}
*Orders*
- New: {COUNT} | Pending: {PENDING}
*Inventory Alerts*
{LOW_STOCK_ITEMS or "All stock healthy"}
```
Each SKILL.md is self-contained. It specifies the data sources, queries, output format, and error handling. Claude doesn't need to figure out what to do — it just executes the playbook.
The Token Efficiency Problem
Running 10 agents with full context would burn through tokens fast. If each agent loaded the full company context, system documentation, and historical data, we'd hit 55,000+ tokens per session.
The solution is context layering:
- Core context (~800 tokens) — Loaded once per session. Contains company overview, KPI definitions, and output templates.
- Agent-specific context (~1,500-2,500 tokens) — Only what that agent needs.
- Progressive loading — Start with summaries, drill into details only when needed.
```
# Context hierarchy
~/.claude/agents/
├── core/                       # Shared (load once)
│   ├── company-context.md      # ~500 tokens
│   ├── metrics-definitions.md  # ~300 tokens
│   └── templates/
│       ├── slack-formats.md
│       └── report-structures.md
│
├── daily/                      # Lightweight (<2k each)
│   ├── daily-ops-pulse/
│   ├── cs-ticket-triager/
│   └── inventory-watchdog/
│
└── periodic/                   # Medium weight
    ├── board-metrics-compiler/
    └── weekly-department-digest/
```
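To make the layering concrete, here's a minimal sketch of how a prompt could be assembled from this hierarchy; the `buildPrompt` helper and exact file list are my assumptions, not the production code:

```typescript
// Minimal sketch of context layering: core files load once per
// session, agent-specific files only for the agent being run.
import { readFileSync } from "node:fs";
import { join } from "node:path";
import { homedir } from "node:os";

const AGENTS_DIR = join(homedir(), ".claude", "agents");

// ~800 tokens of shared context, loaded once per session
const coreContext = [
  "core/company-context.md",
  "core/metrics-definitions.md",
  "core/templates/slack-formats.md",
].map((f) => readFileSync(join(AGENTS_DIR, f), "utf8")).join("\n\n");

// Only the running agent's playbook goes on top (~1.5-2.5k tokens)
function buildPrompt(category: string, agent: string): string {
  const skill = readFileSync(
    join(AGENTS_DIR, category, agent, "SKILL.md"),
    "utf8",
  );
  return `${coreContext}\n\n${skill}`;
}

// e.g. buildPrompt("daily", "daily-ops-pulse")
```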
Token usage dropped from ~55K to ~20K per full agent-suite run. That's roughly a 64% reduction with no loss of capability.
MCP: The API of AI
The magic happens through Model Context Protocol (MCP) servers. These create a standardized way for Claude to interact with external systems.
I run four MCP servers:
| Server | Tools Exposed | Agents Using It |
|---|---|---|
| NetSuite | SuiteQL queries, record CRUD | ops-pulse, reconciler, inventory, board |
| Shopify (via Zapier) | Orders, customers, products | ops-pulse, reconciler, anomaly, rx-tracker |
| Airtable | Records, comments, tables | cs-triage, rx-tracker, project-status |
| Slack (via Zapier) | Messages, channels | All agents |
When the daily-ops-pulse agent runs, it calls:
```javascript
// NetSuite revenue query
mcp__netsuite__netsuite_search({
  query: "SELECT SUM(tl.netamount) * -1 as revenue..."
})

// Shopify pending orders
mcp__zapier-mcp__shopify_find_order({
  instructions: "Find orders with null fulfillment_status",
  output_hint: "order count and total value"
})

// Post to Slack
mcp__zapier-mcp__slack_send_channel_message({
  channel: "#ops-pulse",
  text: "..."
})
```
State Management
Agents need to remember things between runs. The order-anomaly-detector needs to know which orders it's already flagged. The reconciler needs yesterday's numbers for comparison.
I use Supabase tables for state:
- `agent_reconciliation_log` — Nightly reconciliation results
- `inventory_watchdog_log` — Historical low-stock alerts
- `order_anomaly_log` — Flagged orders with disposition
- `project_status_history` — Project snapshots over time
This enables trend analysis. The board-metrics-compiler can pull from historical logs to show month-over-month changes without re-running expensive queries.
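As an illustration of the pattern (the supabase-js calls are real, but the column names and the `flagIfNew` helper are my assumptions), the anomaly detector's dedupe check might look like this:

```typescript
// Sketch: order-anomaly-detector checking its own memory before
// re-flagging an order. Column names are illustrative.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_KEY!,
);

async function flagIfNew(orderId: string, reason: string) {
  // Has this order already been flagged on a previous run?
  const { data: existing } = await supabase
    .from("order_anomaly_log")
    .select("order_id")
    .eq("order_id", orderId)
    .maybeSingle();

  if (existing) return false; // already handled, stay quiet

  // Record the flag with an open disposition for later review
  await supabase.from("order_anomaly_log").insert({
    order_id: orderId,
    reason,
    disposition: "pending_review",
    flagged_at: new Date().toISOString(),
  });
  return true;
}
```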
What Actually Worked
After months of iteration, here's what moved the needle:
1. Single-responsibility agents beat multi-purpose ones
Early versions tried to do too much. A "morning briefing" agent that also handled alerts and anomaly detection became unwieldy. Breaking it into focused agents made each one more reliable and easier to debug.
2. Explicit output formats eliminate ambiguity
Telling Claude "summarize the data" produces inconsistent results. Specifying exactly what the Slack message should look like, with placeholders for each value, produces consistent output every time.
3. Error handling needs to be explicit
When a NetSuite query fails, the agent shouldn't just crash. Each SKILL.md specifies what to do: note the failure in the output, continue with available data, flag for manual review.
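In code terms, that's a degrade-don't-crash pattern. A sketch, with illustrative names:

```typescript
// Sketch of the pattern each SKILL.md describes: on failure,
// note it in the output and keep going with what you have.
interface SectionResult {
  name: string;
  body: string;
  failed: boolean;
}

async function collectSection(
  name: string,
  fetcher: () => Promise<string>,
): Promise<SectionResult> {
  try {
    return { name, body: await fetcher(), failed: false };
  } catch (err) {
    // Don't abort the whole briefing; surface the gap instead
    return {
      name,
      body: `:warning: ${name} unavailable (${String(err)}), flagged for manual review`,
      failed: true,
    };
  }
}
```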
4. Model tiering saves money without sacrificing quality
Daily agents use Haiku — it's fast, cheap, and good enough for structured queries. Periodic agents that need synthesis (like board reports) use Sonnet. Strategic planning (rarely needed) gets Opus.
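A minimal sketch of that tiering as a lookup table (the assignments mirror the description above; the model names are shorthand, and "strategic-planning" is an illustrative entry):

```typescript
// Illustrative model tiering: cheap models for structured daily
// pulls, stronger models where synthesis matters.
type Model = "haiku" | "sonnet" | "opus";

const MODEL_TIER: Record<string, Model> = {
  // Daily, structured: fast and cheap is good enough
  "daily-ops-pulse": "haiku",
  "cs-ticket-triager": "haiku",
  "inventory-watchdog": "haiku",
  // Periodic synthesis: needs stronger reasoning
  "board-metrics-compiler": "sonnet",
  "weekly-department-digest": "sonnet",
  // Rarely-run strategic work
  "strategic-planning": "opus",
};
```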
The Results
Six months in, this system runs reliably:
- Monday mornings: I wake up to a Slack message with the week's performance, flagged anomalies, and priority items
- Inventory alerts: Get notified before stockouts happen, not after
- Board reporting: What used to take a day now takes an hour of review
- Vendor communications: Consistent, professional factory emails in English and Chinese
The biggest win isn't time saved — it's reduced decision latency. I make better decisions faster because I have current data instead of stale reports.
What's Next
This is v1. The roadmap includes:
- Dashboard integration: Agents feeding live dashboards on Cloudflare Pages
- Cross-agent communication: The anomaly detector triggering the customer support agent
- Natural language queries: Ask questions in Slack, get answers from agents
- Predictive alerts: Using historical patterns to predict problems before they happen
If you're building something similar, the key insight is this: AI agents aren't magic. They're automation with better interfaces. The work is in defining exactly what you want, connecting the data sources, and iterating on the output format until it's useful.
The tools are finally good enough. The question is: what operations problem is eating your time?
I'm the COO at Innovative Eyewear (NASDAQ: LUCY). Follow me on Twitter/X for more on AI operations.