Every week, a new MCP server shows up on GitHub that wraps a CLI tool that already existed. MCP server for GitHub? gh has been doing that since 2020. MCP server for AWS? The aws CLI covers 200+ services with --output json. MCP server for Datadog? They shipped a CLI instead — and it works better.
I get the excitement around MCP (Model Context Protocol). It’s a clean standard for connecting AI to external systems. Anthropic designed it well. But somewhere along the way, the community started treating MCP as the default way to give AI agents access to tools — even when a perfectly good CLI already exists.
That’s a mistake.
The problem with MCP-by-default
MCP requires you to build and maintain a server. That server needs to:
- Implement the MCP protocol correctly
- Handle authentication and session state
- Stay compatible with evolving MCP client versions
- Be discovered, installed, and configured by the user
- Be maintained when the underlying API changes
That’s a lot of work for wrapping something that already has a CLI.
Compare that with what happens when Claude Code uses the GitHub CLI. Here’s what my actual workflow looks like — no MCP server involved:
# List open PRs with specific fields
gh pr list --state open --json number,title,author
# Check PR review status and CI checks
gh pr checks 42
gh api repos/owner/repo/pulls/42/reviews --jq '.[].state'
# Create a PR with a structured body ($'...' quoting makes bash expand \n into real newlines)
gh pr create --title "Fix auth bug" --body $'## Summary\n- Fixed token refresh\n\n## Test plan\n- [x] Unit tests pass'
# Merge Dependabot PRs after checking changelogs
gh pr merge 15 --squash --auto
# Search issues across repos
gh search issues "memory leak" --repo owner/repo --state open --json number,title,url
No server to run. No configuration beyond gh auth login. No protocol negotiation. Just text in, text out. And Claude already knows how to use it, because it’s been trained on millions of shell sessions, man pages, and Stack Overflow answers that reference these tools.
In fact, the previous post on this blog was researched entirely through gh: pulling PR timelines, commit histories, issue data, and Dependabot PR counts across three repositories. All via CLI. No MCP server needed.
LLMs are already fluent in CLI
This is the part that MCP advocates consistently underestimate. Large language models don’t need a structured protocol to interact with well-built CLI tools. They already understand them natively.
Think about it:
- gh: Claude knows every subcommand, every flag, every --jq filter pattern. It can compose multi-step GitHub workflows by chaining gh calls without ever reading documentation.
- aws: 200+ services, all with --output json. Claude can query CloudWatch, manage S3 buckets, and describe EC2 instances, all through the CLI it’s been trained on.
- pup: Datadog’s new CLI. 200+ commands, structured JSON output, automatic agent detection. A well-built CLI doesn’t need training data; --help and structured output are enough.
- jq, curl, git: The foundational UNIX tools. Claude is fluent in these to a degree that no MCP server will ever match, because the training data is decades deep.
As Sumner Evans put it: CLIs are text-based, so there are no wasted tokens from mode transformations. The input and output are plain text streams, exactly what LLMs natively consume and produce. No serialization overhead, no protocol negotiation, no schema translation. Just text.
The market already figured this out
Peter Steinberger — founder of PSPDFKit, 36k GitHub followers — came back from retirement to build AI agent tooling. He didn’t build MCP servers. He built CLIs.
Look at what he shipped: gogcli (5k stars) for Google Suite, imsg for Apple Messages, wacli for WhatsApp, Peekaboo (2.4k stars) for screenshots, summarize (4.4k stars) for URL/YouTube/Podcast summarization, remindctl for Apple Reminders, spogo for Spotify. All CLIs. All designed for AI agents.
These CLIs became the backbone of OpenClaw, the open-source personal AI assistant with 226k stars. Its skills ecosystem is literally a directory of 60+ standalone CLIs: imsg, wacli, gogcli, peekaboo, summarize, oracle, sag, ordercli. Each one is a CLI that OpenClaw shells out to. Not MCP servers. CLIs.
And then Steinberger built MCPorter (2k+ stars), a tool whose generate-cli command takes any MCP server definition and mints a standalone CLI from it, with typed arguments, --help, and structured output. He’s not the only one who saw this: Kan Yilmaz built CLIHub, an open-source directory and converter that transforms MCP servers into CLI tools with a single command.
The person who built the most successful open-source AI assistant — one so successful that he joined OpenAI and OpenClaw moved to a foundation — chose CLIs as the foundation for the entire skills architecture. And then built a tool to convert MCP servers back into CLIs for everyone else. That multiple people independently built MCP-to-CLI converters tells you something about where the market is heading.
The token tax
Here’s a concrete cost most people don’t think about: MCP tool descriptions consume tokens in every conversation.
When you configure an MCP server, its tool schemas get loaded into the context window. Every tool, every parameter, every description — all of it eats tokens before you’ve even asked a question. Configure a few MCP servers and you’re burning thousands of tokens per message just on tool definitions.
Kan Yilmaz ran the numbers and the gap is staggering. A typical MCP setup — 6 servers, 14 tools each, ~185 tokens per schema — costs ~15,540 tokens at session start before you’ve asked a single question. The CLI equivalent? ~300 tokens for a lightweight skill listing. That’s a 98% reduction.
It doesn’t get better at invocation time either. A single MCP tool call costs ~15,570 tokens (because the full schema stays in context). The same operation via CLI — discovery through --help plus execution — costs ~910 tokens. 94% savings. And the pattern holds at scale: 10 tools saves 94%, 100 tools saves 92%.
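The percentages above follow directly from the quoted token counts. As a quick sanity check, here is a minimal sketch that redoes the arithmetic in plain shell; the variable and function names are illustrative, and the inputs are simply the figures cited from Yilmaz's analysis.

```shell
# Figures quoted in the text; the names are made up for this sketch.
servers=6
tools_per_server=14
tokens_per_schema=185
cli_listing_tokens=300   # lightweight skill listing at session start
mcp_call_tokens=15570    # one MCP tool call, schemas still in context
cli_call_tokens=910      # --help discovery plus execution

# Upfront cost of loading every tool schema into the context window.
mcp_upfront=$((servers * tools_per_server * tokens_per_schema))

# Percentage saved, rounded to the nearest whole percent (integer math only).
# $1 = baseline token count, $2 = alternative token count
savings_pct() {
  echo $(( (($1 - $2) * 100 + $1 / 2) / $1 ))
}

echo "MCP upfront: ${mcp_upfront} tokens"
echo "Session-start savings: $(savings_pct "$mcp_upfront" "$cli_listing_tokens")%"
echo "Per-call savings: $(savings_pct "$mcp_call_tokens" "$cli_call_tokens")%"
```

Running it prints 15540 tokens, 98%, and 94%, matching the numbers in the text.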
Even Anthropic’s own Tool Search — which defers tool loading to reduce upfront costs — still pulls the full JSON Schema when a tool is fetched. Yilmaz’s analysis shows CLI outperforms Tool Search by 74-88% across various tool counts, and works across any AI model, not just Anthropic’s.
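CLI tools avoid most of this overhead. They exist in the shell, and the LLM knows the common ones from training. It calls them when it needs them, reads the output, and moves on. Nothing is loaded into context up front beyond a short skill listing, and discovery through --help is paid for only when a tool is actually used.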
Mario Zechner’s benchmarks confirmed this from a different angle — CLI-based approaches were significantly more token-efficient than MCP equivalents for the same tasks. His memorable framing: “Just like a lot of meetings could have been emails, a lot of MCPs could have been CLI invocations.”
When MCP actually makes sense
I’m not saying MCP is useless. It genuinely shines in specific scenarios:
When no CLI exists. Some services simply don’t have a CLI. Internal tools, proprietary platforms, niche SaaS products — if there’s no command-line interface, MCP is a reasonable way to bridge the gap.
When your users aren’t developers. This is Zuplo’s argument, and it’s a good one. A product manager checking deployment status, a finance team querying API usage, a legal team running compliance reports: these people will never open a terminal. MCP lets them interact with your platform through natural language in a chat interface.
When you need persistent state. MCP servers are stateful by default. For things like browser sessions, database connections, or long-running operations where you need to maintain context between calls, MCP’s architecture makes more sense than stateless CLI invocations.
When the client has no shell. Some LLM interfaces — web chat, mobile apps, Claude.ai — don’t provide terminal access. MCP is the only way to extend these clients with external tool access.
Companies that got this right
Datadog shipped a CLI while their MCP is still in beta
Datadog’s MCP server is still in closed beta. But instead of making everyone wait, they shipped pup — an open-source CLI that gives AI agents (and humans) access to 200+ commands across 33+ Datadog products. Written in Rust. Apache 2.0 licensed. Available today.
# Query metrics
pup metrics query --query="avg:system.cpu.user{*}" --from="1h"
# List monitors in alert state
pup monitors list --output json
# Check incidents
pup incidents list
# Browse dashboards
pup dashboards list
The clever part: pup automatically detects when it’s being invoked by an AI agent. It checks for environment variables like CLAUDE_CODE, CURSOR_AGENT, CODEX, and others. When agent mode is active, responses return structured JSON optimized for machine parsing, confirmation prompts auto-approve, and error messages include actionable hints.
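To make the detection idea concrete, here is a minimal sketch of how a CLI could check for those environment variables. It is not pup's actual implementation (pup is written in Rust, and its real logic may differ); the function names is_agent_invocation and output_mode are hypothetical, and only the three variable names come from the text above.

```shell
# Hypothetical helper: succeeds when any of the agent-related environment
# variables named in the text is set to a non-empty value.
is_agent_invocation() {
  for var in CLAUDE_CODE CURSOR_AGENT CODEX; do
    # Indirect lookup of the variable whose name is stored in $var (POSIX sh).
    eval "value=\${$var:-}"
    if [ -n "$value" ]; then
      return 0
    fi
  done
  return 1
}

# Hypothetical: pick an output mode the way an agent-aware CLI might.
output_mode() {
  if is_agent_invocation; then
    echo "json"
  else
    echo "human"
  fi
}
```

With none of the variables set, output_mode prints human; after export CODEX=1 it prints json. A real tool would likely also honor an explicit flag so users can override the guess.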
No MCP server to configure. No protocol to negotiate. Claude Code runs pup monitors list, gets clean JSON back, and keeps going. The MCP server is still gated behind access requests. The CLI shipped and works today.
We built rootly-cli after shipping an MCP server first
At Rootly, we actually built the MCP server first. It worked, but once we saw how developers actually used AI agents in terminals, we realized a CLI would reach them faster with less friction. So we built rootly-cli, designed to be AI-agent-native from the ground up.
# List critical incidents
rootly incidents list --status=started --severity=critical
# Check who's on-call right now
rootly oncall who
# Create an incident
rootly incidents create --title "API latency spike" --severity critical
# Send a deployment pulse
rootly pulse run -- make deploy
We made specific design choices that make it work seamlessly with AI agents:
- TTY-aware output: When piped (as it would be from Claude Code), it automatically switches to JSON, with no --format flag needed.
- Markdown output mode: --format=markdown outputs data in markdown, and agents love markdown. It’s the format LLMs are most fluent in. Tables, headers, lists: all parseable without any JSON wrangling, and compact enough to keep token usage low. We added it specifically because we noticed Claude produced better responses when given markdown input vs. raw JSON.
- Clean stdout/stderr separation: Pagination info goes to stderr, data goes to stdout, so agents get clean output.
- Server-side filtering: --status, --severity, --source, --limit. The agent queries exactly what it needs instead of pulling everything and filtering locally.
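The first and third choices in the list above are easy to illustrate in a few lines of shell. This is a sketch of the general technique only, not rootly-cli's real code: list_incidents and its sample record are hypothetical, and the check relies on the standard test -t 1, which is true when stdout is a terminal.

```shell
# Hypothetical command illustrating TTY-aware output and stream separation.
list_incidents() {
  # Pagination info goes to stderr so it never pollutes the data stream.
  echo "page 1 of 1" >&2

  if [ -t 1 ]; then
    # stdout is a terminal: print a human-readable table.
    printf '%-6s %-10s %s\n' "ID" "SEVERITY" "TITLE"
    printf '%-6s %-10s %s\n' "101" "critical" "API latency spike"
  else
    # stdout is a pipe or file (as when an agent captures it): emit JSON.
    printf '%s\n' '[{"id":101,"severity":"critical","title":"API latency spike"}]'
  fi
}
```

Run interactively, it prints the table. Captured with out="$(list_incidents 2>/dev/null)", the variable holds only the JSON array, because the pagination line went to stderr.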
The result: Claude Code can manage incidents, check on-call schedules, query alerts, and track deployments — all through shell commands it already understands. No MCP configuration, no token overhead from tool schemas, no servers to maintain.
The decision is simpler than you think
Before building an MCP server, ask one question: Does this tool already have a CLI?
If yes, you probably don’t need an MCP server. Especially if your users are developers working in environments that have shell access (terminals, IDEs, CI/CD). The AI agent already knows how to use the CLI. You’re building infrastructure for a problem that’s already solved.
If no, or if your audience includes non-developers who need access through chat interfaces, then MCP is the right call.
Here’s the decision table:
| Scenario | Use CLI | Use MCP |
|---|---|---|
| Developer tool with existing CLI (gh, aws, pup) | Yes | No |
| Internal service with no CLI | No | Yes |
| Non-developer users needing platform access | No | Yes |
| LLM client without shell access (web chat, mobile) | No | Yes |
| Stateful operations (browser sessions, DB connections) | No | Yes |
| Composable UNIX-style data pipelines | Yes | No |
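
One way to read the table is as a single rule: reach for the CLI only when a CLI exists, the client has a shell, the users are developers, and no state has to persist between calls. The sketch below encodes that reading. It is an interpretation of the table rather than a formal rule from it, and choose_interface and its arguments are hypothetical names.

```shell
# Hypothetical helper encoding one reading of the table above.
# Each argument is "yes" or "no":
#   $1 = a CLI already exists
#   $2 = the client has shell access
#   $3 = the users are developers
#   $4 = calls need persistent state between invocations
choose_interface() {
  has_cli=$1
  has_shell=$2
  developer_users=$3
  needs_state=$4
  if [ "$has_cli" = "yes" ] && [ "$has_shell" = "yes" ] &&
     [ "$developer_users" = "yes" ] && [ "$needs_state" = "no" ]; then
    echo "CLI"
  else
    echo "MCP"
  fi
}

choose_interface yes yes yes no   # developer tool with an existing CLI
choose_interface no  yes yes no   # internal service with no CLI
choose_interface yes no  no  no   # web chat used by non-developers
```

The three calls print CLI, MCP, and MCP, in line with the corresponding rows of the table.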
The UNIX philosophy was right all along
There’s a delicious irony in all of this. The UNIX philosophy — write programs that handle text streams, expect your output to become input to another program, design for composability — was articulated in the 1970s. Fifty years later, it turns out to be the perfect interface for AI agents.
CLI tools built on UNIX principles didn’t just survive the AI revolution. They became the most natural interface for it. Text in, text out. Pipes and filters. Small tools that do one thing well. It’s what LLMs were born to work with.
So before you write another MCP server that wraps gh or aws — ask yourself: is this MCP solving a real problem, or is it just a meeting that could have been an email?
Written with Claude Opus 4.6 via Claude Code.
