Technical guide from CloudNSite engineering
CloudNSite designs, builds, and operates Model Context Protocol servers that expose your internal tools, data sources, and approval workflows to Claude, GPT-5.5, Cursor, and custom LLM hosts. Every build includes Streamable HTTP transport, OAuth 2.1, scoped tool surfaces, and an evaluation harness that runs before any production call. Most servers ship in 3 to 6 weeks, followed by ongoing tuning and monitoring.
System diagram

Direct answer
A Model Context Protocol (MCP) server is a process that exposes tools, resources, and prompts to LLM clients through a standardized JSON-RPC interface. It lets one server work with Claude, GPT, Cursor, and custom hosts without rewriting glue code. Production MCP servers add OAuth 2.1, scoped permissions, structured error envelopes, and evaluation before any tool reaches a model.
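To make the JSON-RPC interface concrete, here is a minimal sketch of how a `tools/call` request might be dispatched. The `tools/call` method name, the `name`/`arguments` params, and the `content`/`isError` result shape follow the MCP specification; the `lookup_order` tool and the `handle` function are hypothetical names used only for illustration, and a real server would normally rely on an official SDK rather than hand-rolled dispatch.

```python
import json

# Hypothetical tool registry: tool name -> callable taking the arguments dict.
TOOLS = {
    "lookup_order": lambda args: f"Order {args['order_id']} is shipped",
}


def _error(req_id, code, message):
    """Build a JSON-RPC 2.0 error response."""
    return json.dumps({"jsonrpc": "2.0", "id": req_id,
                       "error": {"code": code, "message": message}})


def handle(raw: str) -> str:
    """Dispatch one serialized JSON-RPC request and return the serialized response."""
    req = json.loads(raw)
    if req.get("method") != "tools/call":
        # -32601 is the JSON-RPC 2.0 "Method not found" code.
        return _error(req.get("id"), -32601, "Method not found")
    params = req.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        # -32602 is the JSON-RPC 2.0 "Invalid params" code.
        return _error(req.get("id"), -32602, f"Unknown tool: {params.get('name')}")
    text = tool(params.get("arguments", {}))
    return json.dumps({
        "jsonrpc": "2.0",
        "id": req["id"],
        "result": {"content": [{"type": "text", "text": text}], "isError": False},
    })
```

Because the request and response shapes are fixed by the protocol, the same dispatcher can answer any compliant client without per-client glue code.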
A production MCP server has six layers, each reviewed independently before shipping. CloudNSite uses the same skeleton across every engagement, with the tool and resource surface scoped to the actual workflow rather than a wide-open API mirror.
Transport: Streamable HTTP from a single /mcp endpoint with POST plus an optional SSE stream. Strict Origin validation and DNS rebinding protection are enforced for any server reachable from a local host.
Authorization: OAuth 2.1 with PKCE for remote servers, bearer tokens scoped per tool group, and mTLS or VPN-only deployment for regulated environments.
Capabilities: Initialize handshake declaring logging, prompts, resources, tools, and (in the 2025-11-25 spec) tasks support, with the protocol version pinned to the spec date the server was built against.
Tools: Each tool ships with a JSON Schema input, an idempotency contract, a side-effect classification, and a structured error envelope. Each server is scoped to one workflow, not a generic API mirror.
Resources: URI-templated read endpoints with pagination, change subscriptions where the underlying system supports them, and per-call authorization enforced server-side.
Observability: OpenTelemetry traces per JSON-RPC call, request and response logging with secrets redacted, plus an evaluation harness that exercises every tool against fixtures before any client connection.
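As one illustration of the transport layer described above, the following sketch shows what a strict Origin check might look like. The allowlist values and the `origin_allowed` and `status_for` names are hypothetical, and allowing requests that carry no Origin header is a policy assumption made here (non-browser clients usually omit it), not a requirement stated in this guide.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from configuration.
ALLOWED_ORIGINS = {"https://app.example.com", "http://localhost:3000"}


def origin_allowed(headers: dict) -> bool:
    """Return True if the request's Origin header is absent or exactly allowlisted."""
    lowered = {key.lower(): value for key, value in headers.items()}
    origin = lowered.get("origin")
    if origin is None:
        # Assumption: requests without an Origin header come from non-browser clients.
        return True
    parts = urlsplit(origin)
    # Compare scheme://host[:port] exactly; no prefix or suffix matching.
    return f"{parts.scheme}://{parts.netloc}".lower() in ALLOWED_ORIGINS


def status_for(headers: dict) -> int:
    """Map the Origin check to an HTTP status: 403 for a disallowed Origin."""
    return 200 if origin_allowed(headers) else 403
```

Exact matching matters here: a suffix or substring comparison would let a hostname such as `app.example.com.evil.test` through, which is the kind of gap DNS rebinding attacks exploit.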
Interview the workflow owners, list the model-callable actions, and prune to the smallest set that finishes the workflow. Generic API mirrors blow up context budgets and confuse models.
Choose Streamable HTTP for remote servers, stdio for local desktop integration. Pin OAuth 2.1 with PKCE, define scope groups per tool category, and lock down Origin and DNS rebinding posture before anything is exposed to a host.
Every tool gets a JSON Schema, a structured error envelope, and a fixture-driven test before it ever sees a real model. Resources get URI templates and pagination contracts written before implementation.
Connect OpenTelemetry traces, structured logs with secrets redacted, and an evaluation harness that scores tool calls on accuracy, refusals, and side-effect correctness. Regressions block deploys.
Ship to one host first, watch the JSON-RPC traffic, then expand to additional MCP clients once tool behavior is stable. CloudNSite continues to tune and operate the server after launch.
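The contract-first step above (schema, structured error envelope, fixture-driven test) can be sketched as follows. Everything named here is hypothetical: the `issue_refund` tool, the envelope fields (`ok`, `error.code`, `error.details`, `error.retryable`), and the fixtures. The `validate` function is a deliberately tiny stand-in that checks only required keys and primitive types; a production server would use a full JSON Schema validator.

```python
# Hypothetical tool definition; "inputSchema" mirrors the field MCP uses in tool listings.
REFUND_TOOL = {
    "name": "issue_refund",
    "inputSchema": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string"},
            "amount_cents": {"type": "integer"},
        },
        "required": ["order_id", "amount_cents"],
    },
}

_TYPES = {"string": str, "integer": int}


def validate(schema: dict, args: dict) -> list:
    """Stand-in validator: required keys, unknown keys, and primitive types only."""
    errors = [f"missing: {key}" for key in schema["required"] if key not in args]
    for key, value in args.items():
        spec = schema["properties"].get(key)
        if spec is None:
            errors.append(f"unexpected: {key}")
        elif isinstance(value, bool) or not isinstance(value, _TYPES[spec["type"]]):
            # bool is a subclass of int in Python, so reject it explicitly.
            errors.append(f"wrong type: {key}")
    return errors


def call_refund(args: dict) -> dict:
    """Validate input before the tool body runs and wrap the outcome in an envelope."""
    errors = validate(REFUND_TOOL["inputSchema"], args)
    if errors:
        return {"ok": False,
                "error": {"code": "INVALID_INPUT", "details": errors, "retryable": False}}
    return {"ok": True,
            "result": {"order_id": args["order_id"], "refunded_cents": args["amount_cents"]}}


# Fixtures pair an input with the expected success flag.
FIXTURES = [
    ({"order_id": "A1", "amount_cents": 500}, True),
    ({"order_id": "A1"}, False),
    ({"order_id": "A1", "amount_cents": "500"}, False),
]


def run_fixtures() -> list:
    """Return the fixture inputs whose outcome differs from the expectation."""
    return [args for args, expect_ok in FIXTURES if call_refund(args)["ok"] != expect_ok]
```

A deploy gate could then fail the build whenever `run_fixtures()` returns a non-empty list, which is one simple way to make regressions block deploys.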
MCP specification 2025-11-25: Current normative reference. Defines Streamable HTTP transport, the tasks capability, and the authorization profile we pin to.
JSON-RPC 2.0: Every MCP request, response, and notification rides this contract.
JSON Schema: Required for every tool input definition. Validated server-side before any tool body runs.
OAuth 2.1: Standard for remote MCP servers per the MCP authorization profile. We add per-tool scopes for regulated environments.
OpenTelemetry: Distributed tracing across host, server, and downstream systems on every engagement we operate.
MCP SDKs: Official SDKs we extend rather than fork, pinned to the latest spec version.
From the field
Our agentic RAG connectors case study describes a CloudNSite-built MCP server that exposes a vector store, a connector index, and approval-gated write tools to a Claude-based agent. The same server backs an internal tooling host and a customer-facing agent without changes to the tool layer.
MCP is an open protocol maintained at modelcontextprotocol.io. The current spec version is 2025-11-25. The protocol is implementation-agnostic, so a single MCP server works with any compliant client including Claude, GPT-5.5, Cursor, and custom agent hosts.
A REST API is consumed by application code. An MCP server is consumed by an LLM through a host process, with capability negotiation, tool and resource discovery, JSON Schema input validation, and structured error envelopes built into the protocol. You can wrap a REST API in an MCP server, but the framing, scope, and contracts are different.
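Capability negotiation is the part with no direct REST equivalent, so a short sketch may help. The `protocolVersion`, `capabilities`, and `serverInfo` fields follow the MCP initialize result; the server name, the version string, and the `initialize_result` function are hypothetical. The tasks capability is left out because its exact shape is not described in this guide.

```python
# The spec versions this hypothetical server was built and tested against.
SUPPORTED_VERSIONS = ["2025-11-25"]


def initialize_result(requested_version: str) -> dict:
    """Build the result of an initialize request.

    If the client asks for a version the server supports, echo it back;
    otherwise answer with the version the server is pinned to and let the
    client decide whether it can continue.
    """
    if requested_version in SUPPORTED_VERSIONS:
        version = requested_version
    else:
        version = SUPPORTED_VERSIONS[0]
    return {
        "protocolVersion": version,
        "capabilities": {
            "logging": {},
            "prompts": {"listChanged": True},
            "resources": {"subscribe": True, "listChanged": True},
            "tools": {"listChanged": True},
        },
        # Hypothetical identity values.
        "serverInfo": {"name": "example-workflow-server", "version": "0.1.0"},
    }
```

A client that reads this result knows which feature families the server offers before it lists or calls a single tool.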
Build custom when the workflow touches private systems, regulated data, multi-tenant authorization, or internal approvals that no off-the-shelf server covers. Use an official server when the integration targets a single SaaS product that already ships one and its scope matches what you need.
Streamable HTTP is the current remote transport in the 2025-11-25 spec. Use stdio for local desktop integrations where the host launches the server as a subprocess. The older HTTP+SSE transport pair is deprecated and we do not ship it in new builds.
Remote MCP servers use OAuth 2.1 with PKCE per the MCP authorization profile. Bearer tokens are sent in the Authorization header and validated server-side on every JSON-RPC call. For regulated environments we add per-tool scope checks, IP allowlists, and, where required, mTLS or VPN-only deployment.
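A per-tool scope check of the kind described above might look like the sketch below. The tool names, the scope strings, and both function names are hypothetical. Token validation itself (signature, audience, expiry) is out of scope here: the sketch assumes `granted_scopes` was extracted from a token that has already been verified.

```python
from typing import Optional

# Hypothetical mapping from tool name to the scope required to call it.
TOOL_SCOPES = {
    "search_docs": "docs:read",
    "issue_refund": "billing:write",
}


def bearer_token(headers: dict) -> Optional[str]:
    """Extract the token from an 'Authorization: Bearer <token>' header, if present."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    token = token.strip()
    if scheme.lower() == "bearer" and token:
        return token
    return None


def authorize(tool_name: str, granted_scopes: set) -> bool:
    """Allow the call only if the tool is known and its required scope was granted.

    Unknown tools are denied by default rather than treated as unscoped.
    """
    required = TOOL_SCOPES.get(tool_name)
    return required is not None and required in granted_scopes
```

Running `authorize` on every `tools/call`, rather than once per session, keeps the check aligned with the rule that tokens are validated on every JSON-RPC call.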
Cap the tool count per server to the workflow scope, paginate resource reads, return structured summaries instead of full payloads, and put detail behind a follow-up tool call. CloudNSite reviews token-budget behavior in evaluation before any production traffic.
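The pagination and summary advice above can be sketched with an opaque cursor. The `nextCursor` field name follows MCP's pagination convention; the record fields, the page size, and the function names are hypothetical, and the base64-encoded offset is just one simple way to keep the cursor opaque to clients.

```python
import base64
import json
from typing import Optional

PAGE_SIZE = 2  # hypothetical; tune against the model's context budget


def encode_cursor(offset: int) -> str:
    """Pack the next offset into an opaque, URL-safe cursor string."""
    return base64.urlsafe_b64encode(json.dumps({"o": offset}).encode()).decode()


def decode_cursor(cursor: str) -> int:
    """Recover the offset from a cursor produced by encode_cursor."""
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["o"]


def list_page(records: list, cursor: Optional[str] = None) -> dict:
    """Return one page of summaries, plus nextCursor when more records remain."""
    start = decode_cursor(cursor) if cursor else 0
    chunk = records[start:start + PAGE_SIZE]
    # Summaries only: the full body stays behind a follow-up detail call.
    page = {"items": [{"id": r["id"], "title": r["title"]} for r in chunk]}
    if start + PAGE_SIZE < len(records):
        page["nextCursor"] = encode_cursor(start + PAGE_SIZE)
    return page
```

The model sees short summaries first and asks for a specific record only when it needs the detail, which keeps large payloads out of the context window by default.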
CloudNSite. We build the server, operate it inside your infrastructure or ours per the engagement contract, monitor JSON-RPC traffic and tool accuracy, and ship updates as the spec evolves and your workflow changes.
Yes. That is the point of the protocol. Once capability negotiation and authorization scopes are correctly modeled, the same server can back a Claude Desktop host, a GPT-5.5 agent, an internal app, and a Cursor IDE integration without per-client code.