Key Takeaways
- CLI-first, not chat-first. Building on Claude Code instead of ChatGPT's GUI enabled programmatic workflows, persistent context, and live API integrations that a chat window cannot support.
- 36 custom skills cover the full campaign lifecycle — from keyword research to weekly client reviews — each with structured inputs, deliberation gates, and logged outputs.
- API-Primary Architecture pulls live data from Google Ads, Meta, and BigQuery directly, eliminating the copy-paste-into-AI bottleneck that makes most agency AI implementations useless.
- Institutional memory compounds. Every client interaction, campaign change, and performance insight is captured in structured context files that make the system smarter over time.
How a Singapore performance marketing agency built a Claude-native AI operations system with 36 custom skills, live API integrations, and automated workflows to manage 20+ APAC clients.
The Problem Nobody Talks About
Most agencies treat AI like a better intern. Hand it a brief, get back a draft, fix the draft, repeat. That's not an AI strategy — it's a typing shortcut with extra steps.
The real bottleneck in performance marketing isn't content generation. It's the operational grind: pulling data across platforms, cross-referencing campaign performance against client KPIs, writing the same weekly review structure for 20 clients, remembering what you changed three weeks ago and why. These are the tasks that eat 60% of a strategist's week and produce zero strategic value.
We tried the obvious approach first. We used ChatGPT. We pasted screenshots. We wrote elaborate prompts. It worked — until it didn't. The moment we needed the system to remember a client's budget allocation, pull live Google Ads data, or execute a multi-step campaign build without hallucinating the account structure, the chat-window model collapsed.
So we built something different — from our Singapore headquarters, serving APAC campaigns across Singapore, Indonesia, and Australia.
Why CLI-First Changes Everything
The decision that shaped everything downstream was choosing Claude Code — Anthropic's command-line interface — over any GUI-based tool. This is not an aesthetic preference. It's an architectural one.
A GUI-based AI tool is a conversation. A CLI-based AI tool is a programmable system. When your AI operates in a terminal, it can read files, write files, execute scripts, call APIs, and maintain persistent context between sessions. It becomes infrastructure, not just an assistant.
We named the system Kali, a Claude-native operations system: an operational architecture purpose-built around Claude Code's capabilities (file system access, API integration, persistent context, and programmatic skill execution) rather than a chat-based AI retrofitted into existing workflows. The delivery arm — the part that handles performance marketing — runs as Claudette, a persona that loads client context, follows operational playbooks, and applies deliberation gates before any action touches live campaigns.
Here is the key insight that makes this work: Claude Code treats your codebase as context. Every file in the project directory is accessible. That means client configurations, strategy documents, historical performance logs, and operational playbooks all exist as readable, structured files that the system references automatically. No copy-pasting. No prompt engineering gymnastics. The context is just there.
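To make "codebase as context" concrete, a project directory for a system like this might look as follows. This is a hypothetical sketch: the folder and file names are illustrative, not Kaliber's actual layout.

```text
clients/
  acme-sg/
    client-context.md    # business model, goals, constraints
    ads-strategy.md      # campaign architecture, targeting logic
    media-plan.md        # budget allocation, timeline
    memory.md            # running log of decisions and learnings
skills/
  weekly-review.md       # skill definitions as structured markdown
  daily-pacing.md
scripts/
  pull_bigquery.py       # API helpers the skills can invoke
```

Because everything lives inside the working directory, the system can read any of these files on demand, with no pasting required.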
What We Actually Built
Kali is not one monolithic prompt. It's 36 discrete skills — each a self-contained capability with defined inputs, execution steps, and outputs. Skills range from simple (log an action to the pod action log) to complex (build a full Google Ads campaign structure from a media plan). The system currently manages over 20 clients across our Singapore and Indonesia delivery pods.
The system rests on three architectural pillars.
Pillar 1: API-Primary Architecture
Every data point the system uses comes from a live API call, not a pasted screenshot or uploaded CSV. Kali connects directly to Google Ads (via GAQL), Meta Marketing API, and BigQuery for historical reporting. When Claudette runs a weekly review, she queries actual campaign data, compares it against the client's configured KPIs, and produces analysis grounded in real numbers.
This sounds obvious. It is not. The vast majority of agencies using AI for reporting still paste data into a chat window. That approach breaks the moment you need to compare this week against last week, or cross-reference spend pacing against monthly budget targets — especially when you're managing Southeast Asian e-commerce accounts alongside B2B lead generation in Australia.
Pillar 2: Deliberation Gates
AI should not autonomously spend client money. Every Kali workflow that could affect live campaigns passes through a deliberation gate — a checkpoint where the system presents its analysis and recommendation, then waits for human approval before executing. Research produces recommendations. Builds create paused structures. Execution requires named sign-off.
Pillar 3: Institutional Memory
Each client has four context files that persist across sessions: client-context.md (business model, goals, constraints), ads-strategy.md (campaign architecture, targeting logic), media-plan.md (budget allocation, timeline), and memory.md (running log of decisions, learnings, and changes). Every session reads from these files. Important learnings get written back. The system compounds knowledge instead of resetting to zero.
System Architecture
A Day in the Life of Claudette
At 10:00 AM SGT, a cron job triggers the daily pacing script. Claudette pulls spend data from BigQuery for every active client, compares it against each client's daily budget target, and flags any account that is over-pacing by more than 15% or under-pacing by more than 20%. The results are posted to ClickUp, the team's project management tool, so strategists see alerts before they even open the ad platforms.
When a strategist needs a weekly review for a client, they invoke the skill. Claudette loads the client's context files, queries seven days of performance data from BigQuery, compares it against the previous period, identifies the top movers (campaigns with the largest absolute changes in spend, CPA, or ROAS), and writes a narrative analysis — not a data dump, but an interpretation of what changed and why it matters.
When a team member joins a client call and records the transcript, the transcript processing skill extracts action items, updates the client's memory file with relevant decisions, and logs follow-ups to the pod action log. Meeting notes that used to take 30 minutes to write and were often skipped entirely now happen automatically.
What Changed — Operationally
Building this system didn't eliminate anyone's job. It eliminated the parts of everyone's job that produced the least strategic value.
Before & After — Operational Impact
| Task | Before Kali | After Kali |
|---|---|---|
| Weekly client review | 2-3 hours per client (data pull, analysis, write-up) | 15-20 minutes (review & edit AI-generated analysis) |
| Daily spend monitoring | Manual check across 20+ accounts each morning | Automated alerts at 10 AM SGT, only exceptions surfaced |
| Client onboarding | Tribal knowledge, inconsistent setup across strategists | Config-First Onboarding: structured files, same quality every time |
| Meeting follow-ups | Often skipped or delayed by days | Transcripts processed same-day, actions logged automatically |
| Campaign builds | Manual setup in platform, easy to miss settings | AI-prepared structure with deliberation gate, created as PAUSED |
| Knowledge retention | In people's heads, lost when they leave | Captured in memory files and knowledge hub, compounds over time |
The strategists now spend their time on the work that actually moves the needle: creative strategy, client relationships, and campaign experimentation. The operational infrastructure runs underneath, reliably and consistently.
What Most People Get Wrong
When agencies ask us about our AI setup, they almost always ask the wrong question. They ask "what model do you use?" or "what prompts did you write?" Those questions reveal a fundamental misunderstanding of what makes an AI operations system work.
The model is table stakes. The prompts are the least important part. What matters is the architecture around the model: how data flows in, how context loads, how outputs are validated, how decisions are logged, and how the system learns over time. We could switch from Claude to another model tomorrow and the system would still work, because the value lives in the operational architecture — the skills, the context structure, the deliberation gates, the feedback loops — not in any single model's capabilities.
If your AI strategy is "give everyone ChatGPT licenses and see what happens," you will get exactly what that approach deserves: sporadic wins, inconsistent quality, and nothing that scales.
The alternative is to treat AI as infrastructure. Build it into the workflows. Give it real data access. Put guardrails around its actions. Capture what it learns. That's what we did across our APAC operations, and that's what actually works.
Frequently Asked Questions
How do marketing agencies use Claude AI for campaign management?
Agencies using Claude for campaign management typically build custom skill sets — reusable workflows that handle specific tasks like weekly performance reviews, campaign builds, and spend monitoring. At Kaliber, we built 36 skills on Claude Code that connect to Google Ads, Meta, and BigQuery APIs, allowing the system to pull live data, analyze performance against client-specific KPIs, and generate actionable reports. The key difference from casual ChatGPT use is persistent context: the system remembers each client's business model, targets, and history.
What is the difference between using ChatGPT and Claude Code for marketing?
ChatGPT is a conversation interface — you paste data in, get analysis out, and context resets between sessions. Claude Code is a CLI tool that operates within your file system, meaning it can read client configurations, execute API calls, write structured outputs, and maintain persistent memory across sessions. For one-off content tasks, ChatGPT works fine. For operational workflows that need to run consistently across 20 clients with real data, Claude Code's programmatic access is the differentiator.
Can Claude AI manage Google Ads and Meta campaigns?
Yes, but with critical guardrails. Our system uses the Google Ads API (via GAQL queries) and Meta Marketing API to both read campaign data and execute changes — including budget adjustments, status changes, bid strategy modifications, and campaign creation. However, every mutation passes through a deliberation gate: the AI prepares the action, presents its reasoning, and waits for human approval before executing. All campaigns are created in a PAUSED state by default.
How much does it cost to build an AI operations system for a marketing agency?
The primary costs are Claude API usage (which varies by volume but typically runs a few hundred dollars per month for a 20-client agency), API access fees for ad platforms (Google Ads and Meta APIs are free), and BigQuery storage and queries (minimal for reporting workloads). The real investment is time: building the skill library, structuring client context files, and iterating on workflows took several months of incremental development. There is no large upfront software license — the system is built on open tooling.
What are Claude Code skills and how do they work?
A Claude Code skill is a structured markdown file that defines a repeatable workflow — including what inputs it needs, what steps to follow, what APIs to call, and what output to produce. When invoked, Claude Code reads the skill file and executes it as a multi-step process. For example, Kaliber's "weekly review" skill loads a client's context files, queries BigQuery for seven days of performance data, compares metrics against configured KPIs, and generates a narrative analysis. Skills are reusable across clients because the client-specific details come from config files, not the skill itself.
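To illustrate the shape of such a file, here is a hypothetical skill definition in this style. It is a sketch assembled from the workflow described in this article, not one of Kaliber's actual 36 skills.

```markdown
# Skill: weekly-review

## Inputs
- client slug (e.g. `acme-sg`), used to locate the context files
- review period (default: last 7 days)

## Steps
1. Load `client-context.md`, `ads-strategy.md`, `media-plan.md`, `memory.md`.
2. Query BigQuery for the review period and the preceding period.
3. Identify top movers by absolute change in spend, CPA, and ROAS.
4. Write a narrative analysis against the client's configured KPIs.

## Output
- Markdown review delivered to the team's project management tool
- New learnings appended to `memory.md`
```

Because the client-specific details live in config files, the same skill file serves every client unchanged.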
Is Claude better than ChatGPT for marketing agencies?
For operational workflows that require file system access, API integration, and persistent context, Claude Code offers capabilities that ChatGPT's interface does not support. For ad-hoc brainstorming, content ideation, and one-off analysis, both models perform well. The choice depends less on model quality and more on how you plan to use AI — as a conversation partner (ChatGPT is fine) or as operational infrastructure (Claude Code's CLI architecture is better suited).