MCP — What Actually Travels Between Client and Server

MCP looks like a platform until you watch the bytes move. Here's the same tool call over stdio and Streamable HTTP — same JSON-RPC, different wire.

Posted May 9, 2026 Updated May 9, 2026

By Tsotne Okrostsvaridze

10 min read

MCP — short for Model Context Protocol — is how AI assistants like Claude talk to outside tools: GitHub, your filesystem, a database, a calendar, anything. If you’ve used Claude Code with a tool plugged in, you’ve already used MCP, even if you didn’t notice.

For a protocol that’s becoming the standard plumbing of AI tooling, the actual MCP “thing” is surprisingly small. Most explanations make it sound bigger and more architectural than it is.

This post does the opposite. We’re going to look at the actual bytes on the wire — twice — and watch the whole protocol collapse into a single idea.

The Punchline First

MCP isn’t a platform. It isn’t an architecture. It’s just JSON-RPC 2.0 — a convention for JSON messages shaped like requests and matching responses — sent over a pipe between two programs. The whole protocol surface fits on a napkin.

Everything that feels confusing about MCP — local servers, remote servers, stdio, Streamable HTTP, SSE, “hosted MCP” — collapses into one decision:

Which pipe are the JSON-RPC messages travelling through?

I’m going to prove it by watching the same tool call cross the wire twice. Same JSON. Different pipe.

For the whole post, our actors are:

Client: Claude Code
Server: the GitHub MCP server
Tool: create_issue

Same two characters the whole way through. One mental model.

What MCP Actually Says

First, the language. MCP messages are written in JSON-RPC 2.0 — a small, decades-old convention for “request/response, both as JSON.” Each request carries an id. Each response carries the same id so the client can tell which answer goes with which question. That’s the whole trick.

So: Claude Code wants to file a GitHub issue called “MCP post draft” in cotneok/blog. Here’s the entire tool-call exchange it has with the GitHub MCP server to make that happen:

JSON-RPC exchange — the invariant layer

Claude Code → GitHub MCP server

{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "create_issue",
    "arguments": {
      "owner": "cotneok",
      "repo": "blog",
      "title": "MCP post draft"
    }
  }
}

→id: 7

GitHub MCP server → Claude Code

{
  "jsonrpc": "2.0",
  "id": 7,
  "result": {
    "content": [
      { "type": "text",
        "text": "Created issue #42" }
    ]
  }
}

Three things to notice in those two objects:

id: 7 — the matching number. Client picks it, server echoes it back. That’s how the client knows which response belongs to which request when several are in flight at once.
method: "tools/call" — what the client wants the server to do. MCP defines other methods too — for setup, tool discovery, prompts, resources, notifications, and more — but for this post we only need this one: tools/call.
params — the arguments. Same shape as calling a function.

That’s it. For this exchange, that’s MCP, end to end. A request with an id, a response with the matching id. Two JSON objects.

The rest of this post is about how those two objects physically move from one process to another. Because that — and only that — is where stdio and Streamable HTTP differ.

Wire 1: stdio

When you install an MCP server locally — an executable, a Python script, a Docker container Claude Code launches itself — the transport is usually stdio.

Quick definition, in case “stdio” is jargon: every program on your computer gets three built-in text channels — stdin for input, stdout for output, stderr for errors. “stdio” is just shorthand for “send the messages over those built-in channels.” Nothing more exotic than that.

Here’s what physically happens when Claude Code uses stdio:

Claude Code spawns the GitHub MCP server as a child process.
The OS wires the child’s stdin and stdout to Claude Code.
Claude Code writes the request JSON to the child’s stdin, terminated by \n.
The server reads one line from its own stdin, parses the JSON, runs create_issue, writes the response JSON to stdout, terminated by \n.
Claude Code reads one line from the child’s stdout and matches id: 7.

And here’s what one of those stdio frames actually looks like on the wire:

{"jsonrpc":"2.0","id":7,"method":"tools/call","params":{...}}\n

On stdio, this is literally one UTF-8 line. The \n is the frame delimiter — read until newline, parse the buffer as JSON, and you have a JSON-RPC message. No length-prefix, no fancy framing protocol. Once MCP’s brief setup exchange at startup is out of the way, the transport really is this boring. That’s it.

That’s the whole transport. Two unidirectional byte streams between a parent and a child process. Newline-delimited JSON-RPC.

This is fast. There’s no networking. No ports. No load balancer. Authentication is usually whatever environment variables the parent passed into the child — the GitHub MCP server, for example, just reads GITHUB_TOKEN out of its own environment when Claude Code spawns it.

It’s also only local. There’s no way for Claude Code on your laptop to “stdio into” a server running on someone else’s machine. stdio requires a parent-child process relationship, and processes don’t span hosts.

That’s not an MCP limitation. That’s how operating systems work.

So the moment you want a remote MCP server — one running in someone else’s data center, not on your laptop — you need a different wire. The internet’s default wire is HTTP. So MCP uses HTTP.

Wire 2: Streamable HTTP

When the GitHub MCP server runs somewhere else — a hosted deployment, a container behind a load balancer, a third-party service — the transport has to be the network.

MCP’s current answer is Streamable HTTP. The server exposes a single endpoint, usually /mcp. The client sends JSON-RPC messages by POST-ing them to that endpoint. The server can answer in one of two shapes — either a normal JSON body, or an SSE stream.

(SSE = Server-Sent Events, a long-standing web standard for keeping a single HTTP response open and trickling data through it event-by-event, like a live feed. We’ll see why MCP uses it in a moment.)

Here’s our same create_issue call over Streamable HTTP:

  
POST /mcp HTTP/1.1
Host: github-mcp.example.com
Content-Type: application/json
Accept: application/json, text/event-stream
Mcp-Session-Id: 9f3c...

{"jsonrpc":"2.0","id":7,"method":"tools/call","params":{...}}

Two headers on this request are doing real work:

Accept: application/json, text/event-stream — required by the spec. The client commits, up front, to handling either response shape. That’s what gives the server license to pick.
Mcp-Session-Id — if the server established a session during initialize, Claude Code echoes the id on every later request so the server can route it to the right session.

The server has two ways to respond.

Option A — a single JSON response:

  
HTTP/1.1 200 OK
Content-Type: application/json

{"jsonrpc":"2.0","id":7,"result":{...}}

Option B — an SSE stream:

  
HTTP/1.1 200 OK
Content-Type: text/event-stream

event: message
data: {"jsonrpc":"2.0","method":"notifications/progress",...}

event: message
data: {"jsonrpc":"2.0","id":7,"result":{...}}

Why two response shapes? Because some tool calls finish instantly and some don’t. Creating an issue is one quick answer — a single JSON body is fine. But running a long build, scraping a bunch of pages, or generating a big document might take half a minute, and the server might want to stream progress along the way. The notifications/progress line in the SSE example above is exactly that — the server saying “I’m partway through” mid-call. SSE lets one HTTP response carry that whole back-and-forth instead of forcing the client to poll.

Worth being precise on the part people most often tangle:

SSE is not the transport. Streamable HTTP is the transport. SSE is one response shape inside Streamable HTTP, used when the server has more than one message to send back.

Small bonus, when servers opt in: if the server attaches SSE event ids to its messages, Claude Code can reconnect after a drop and send Last-Event-ID, and the server may replay whatever was missed. It’s not automatic — it’s a hook MCP leaves open for servers that need resumable streams.

(Historical footnote, in case you read older docs: an earlier MCP version used a separate “HTTP+SSE” transport. The 2025-03-26 spec revision replaced it with Streamable HTTP, which is what’s described above. Modern MCP defines exactly two standard transports: stdio and Streamable HTTP. That’s the whole list.)

So What Actually Changed?

Look at what the two transports have in common, and what they don’t.

The JSON didn’t change. The protocol didn’t change. The tool didn’t change. The thing on top of the wire is exactly the same in both pictures.

What changed was the wire.

This is why people get confused about “local MCP” and “remote MCP” as if they were different products. They aren’t. The same MCP server idea can run under stdio for local Claude Code and under Streamable HTTP for a hosted deployment. Only the transport wiring differs.

The Reframe

Once you can picture the wire, MCP stops being a platform and becomes plumbing.

There’s no magic MCPServer class doing clever things. There’s a process — or a host — that reads JSON-RPC frames off some channel, runs a function, writes a JSON-RPC frame back. Every “MCP integration” you’ll ever read about is a variant of that.

If you followed this far, you now hold the whole protocol in your head:

JSON-RPC 2.0 is the language. Request, response, matched by id.
stdio is for local servers. Client launches the server as a child process; they talk over stdin/stdout, one JSON message per line.
Streamable HTTP is for remote servers. Client POSTs JSON-RPC to /mcp. Server answers with either a single JSON body or an SSE stream (when there’s more than one message to send back).

The choice between stdio and Streamable HTTP isn’t a protocol decision. It’s a deployment decision: where does the server live?

Local? stdio. Remote? HTTP. Same JSON either way.

That’s the whole map.

If you want to see real frames flying past instead of taking my word for it, point the MCP Inspector at any MCP server and watch the actual JSON-RPC traffic in your browser. It’s the fastest way to make this post stop being theory.

Tsotne · tsotne.blog · AI engineering series, post #2

AI Engineering, MCP

This post is licensed under CC BY 4.0 by the author.