How to Handle Authentication and Tool Sharing in Multi-Agent MCP Systems (CrewAI, AutoGen, LangGraph)

Step-by-step guide to pulling real-time CRM data from Salesforce and HubSpot into LLM prompts via MCP servers, with OAuth scopes, tool definitions, rate-limit handling, and normalization code.

Uday Gajavalli · 21 min read

Orchestrating autonomous agents using frameworks like CrewAI, AutoGen, or LangGraph feels incredibly powerful during local development. You define a persona, assign a few Python functions, and watch the agent reason through complex tasks. But deploying that multi-agent system into a production B2B SaaS environment exposes a massive architectural gap.

The framework handles the agentic reasoning, but it does not solve the enterprise integration problem. When your agents need to act on behalf of your users inside external systems—reading Jira tickets, updating Salesforce opportunities, or pulling BambooHR employee records—you suddenly have to manage multi-tenant OAuth 2.0 lifecycles, handle vendor-specific rate limits, and ensure strict isolation between what different agents are allowed to access.

Hand-rolling that infrastructure is where multi-agent projects quietly die. Building point-to-point custom API connectors for every agent capability is an engineering dead end, and the industry has rapidly aligned on the Model Context Protocol (MCP) as the standard middleware layer for this exact problem.

This guide walks through the architectural patterns that actually work for multi-agent authentication and tool sharing over MCP—what CrewAI's MCPServerAdapter and AutoGen's McpWorkbench give you out of the box, where they leave the heavy lifting to you, and how to design a production setup that won't melt during an enterprise security review.

Pulling Real-Time CRM Context Into LLM Prompts

If you're looking for the easiest way to pull real-time CRM context into an LLM prompt, the answer is an MCP server backed by OAuth-authenticated proxy access to your CRM's live API. No batch exports, no stale CSVs, no vector database sync jobs. The agent issues a tools/call request, the MCP server hits the CRM with a valid access token, and fresh data lands in the LLM's context window within a few hundred milliseconds.

This guide covers the full stack for making that work in production with Salesforce and HubSpot: OAuth setup with least-privilege scopes, field normalization across CRMs, concrete MCP tool definitions with JSON Schema, rate-limit-aware retry code, PII redaction, and a sandbox project you can stand up quickly.

The key architectural insight: injecting real-time CRM data into LLM prompts requires solving three problems simultaneously - authentication (whose data?), normalization (what shape?), and tool routing (which agent sees what?). MCP gives you the protocol. The sections below give you the implementation.

sequenceDiagram
    participant Agent as LLM Agent<br>(CrewAI / AutoGen)
    participant MCP as MCP Server
    participant Auth as Token Store
    participant CRM as CRM API<br>(Salesforce / HubSpot)

    Agent->>MCP: tools/call list_all_contacts
    MCP->>Auth: Get valid access token
    Auth-->>MCP: Bearer token (refreshed if needed)
    MCP->>CRM: GET /contacts (with token)
    CRM-->>MCP: Raw CRM response
    MCP->>MCP: Normalize to canonical schema
    MCP-->>Agent: Unified JSON in context window

The Multi-Agent Integration Bottleneck

Most multi-agent demos use standard input/output (stdio) MCP servers with environment variables for credentials. The developer hardcodes an API key into a .env file, and the local agent reads it. That works on a laptop. It does not work when your CRM agent needs to act on behalf of 4,000 customers, each with their own Salesforce instance, refresh tokens, and scopes.

Before MCP, connecting AI models to external data sources required a custom integration for every combination. As we've seen with native LLM connectors falling short, if you supported five LLMs and needed to connect them to fifty enterprise SaaS applications, you were staring down the barrel of 250 custom API wrappers.

In a multi-agent framework, the pain compounds across three axes:

  • Per-tenant OAuth: Every customer connects their own Salesforce, HubSpot, Jira, and BambooHR. You need a token vault, refresh logic, and a way to map an agent run to the correct user's credentials.
  • Tool routing: A SalesAgent, SupportAgent, and HRAgent should not all see the same 400-tool flat list. The model wastes tokens reasoning over irrelevant capabilities and frequently picks the wrong one.
  • Rate limits and failure modes: When an agent loops, it can burn a customer's API quota in seconds. HTTP 429s need to flow back to the framework cleanly so the planner can back off instead of retrying blindly.

MCP is the protocol-level answer to the first two. The third is where most platforms get it wrong.

How MCP Solves the M x N Connector Problem

The Model Context Protocol (MCP) is an open standard from Anthropic (now governed under the Linux Foundation's AAIF) that lets any AI client talk to any tool server using JSON-RPC 2.0. With MCP, the M x N integration nightmare becomes an M + N problem: one client per model and one server per tool surface. For 10 models and 100 tool surfaces, that is 110 standardized implementations instead of 1,000 custom integrations.

For a deeper primer on the architecture, see our 2026 architecture guide for SaaS PMs.

What matters for multi-agent frameworks is that every major orchestrator now natively ships an MCP client:

  • CrewAI Tools supports the Model Context Protocol, giving access to tools from hundreds of MCP servers built by the community, exposed through the MCPServerAdapter.
  • AutoGen provides McpWorkbench that implements an MCP client, which you can use to create an agent that uses tools provided by MCP servers.
  • LangGraph nodes can be wrapped around MCP clients to invoke tools as part of a graph step.

graph TD
    subgraph Multi-Agent Framework
        A[CrewAI Agent]:::client
        B[AutoGen Agent]:::client
        C[LangGraph Executor]:::client
    end

    subgraph MCP Middleware Layer
        D[MCP Client Interface]:::middleware
    end

    subgraph Remote MCP Servers
        E[Salesforce MCP Server]:::server
        F[Zendesk MCP Server]:::server
        G[BambooHR MCP Server]:::server
    end

    A -->|JSON-RPC| D
    B -->|JSON-RPC| D
    C -->|JSON-RPC| D
    D -->|HTTP/SSE| E
    D -->|HTTP/SSE| F
    D -->|HTTP/SSE| G

    classDef client fill:#f9f9f9,stroke:#333,stroke-width:2px;
    classDef middleware fill:#e1f5fe,stroke:#0288d1,stroke-width:2px;
    classDef server fill:#e8f5e9,stroke:#388e3c,stroke-width:2px;

The protocol standardizes the communication, but it leaves the heavy lifting of authentication, token management, and security entirely up to the developer.

Handling Authentication: From Local Stdio to Production OAuth 2.0

Local MCP setups inject API keys via environment variables. Look at any AutoGen GitHub MCP example: the agent passes a GITHUB_PERSONAL_ACCESS_TOKEN through env to a Docker-launched MCP server. This approach is useless for B2B SaaS.

In a production multi-tenant system, agents operate on behalf of specific end-users. You must use remote MCP servers communicating over HTTP or Server-Sent Events (SSE). This requires a highly secure authentication architecture that handles the OAuth 2.0 authorization code flow for hundreds of different third-party providers.

The OAuth Token Lifecycle Problem

When an AI agent connects to a remote MCP server to execute a tool (e.g., update_hubspot_contact), the server must attach a valid OAuth access token to the outbound HTTP request. Access tokens typically expire in 30 to 60 minutes.

The naive implementation—check expiry before each call, refresh inline if stale—falls apart fast. If multiple agents attempt to call the API simultaneously right as the token expires, you will encounter race conditions. Two requests racing through token.expired() will both attempt to refresh, and most identity providers invalidate the old refresh token the moment a new one is issued. Now both calls fail with an invalid_grant error, the entire token chain is revoked due to reuse detection, the account flips to needs_reauth, and your customer's CSM is on the phone.

Managed platforms solve this by treating token refreshes as a distributed systems problem. The correct architecture has three properties:

  1. Proactive refresh: The platform schedules work to refresh credentials 60 to 180 seconds before they expire, complete with jitter to avoid thundering herds.
  2. Mutex-protected refresh: Mutex locks per integrated account ensure that concurrent agent requests queue cleanly behind a single in-flight refresh operation instead of duplicating it.
  3. Graceful reauth signaling: When a refresh token genuinely dies, the system fires a webhook so your app can prompt the user, rather than silently failing mid-agent-run.
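The first two properties can be sketched in a few lines of asyncio. This is a minimal, in-process illustration, not a description of any platform's implementation: `TokenStore`, `Token`, and `refresh_fn` are hypothetical names, and the store refreshes lazily once a token falls inside the refresh margin rather than on a scheduler with jitter. A multi-process deployment would need a distributed lock in place of `asyncio.Lock`.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, Optional


@dataclass
class Token:
    access_token: str
    expires_at: float  # Unix timestamp in seconds


class TokenStore:
    """Hypothetical per-account token cache with single-flight refresh."""

    def __init__(
        self,
        refresh_fn: Callable[[str], Awaitable[Token]],
        refresh_margin_s: float = 120.0,
    ) -> None:
        self._refresh_fn = refresh_fn      # assumed: async (account_id) -> Token
        self._margin = refresh_margin_s    # refresh this long before expiry
        self._tokens: Dict[str, Token] = {}
        self._locks: Dict[str, asyncio.Lock] = {}

    def _is_fresh(self, token: Optional[Token]) -> bool:
        return token is not None and token.expires_at - time.time() > self._margin

    async def get(self, account_id: str) -> str:
        token = self._tokens.get(account_id)
        if token is not None and self._is_fresh(token):
            return token.access_token
        # One lock per integrated account: concurrent callers queue here.
        lock = self._locks.setdefault(account_id, asyncio.Lock())
        async with lock:
            # Re-check after acquiring the lock: a caller ahead of us in the
            # queue may already have completed the refresh.
            token = self._tokens.get(account_id)
            if token is None or not self._is_fresh(token):
                token = await self._refresh_fn(account_id)
                self._tokens[account_id] = token
            return token.access_token
```

The re-check inside the lock is what prevents the double-refresh race: five agents asking for the same account's token at once trigger exactly one call to the identity provider.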

For a deeper architectural treatment, see OAuth at Scale: The Architecture of Reliable Token Refreshes.

Least-Privilege Scopes for CRM Integrations

OAuth scopes define the blast radius if a token is compromised. For AI agents pulling real-time CRM context, you almost never need full account access. Here's what minimal scope sets look like for the two most common CRMs.

Salesforce

For most apps, refresh_token, web, and api are sufficient. A read-only CRM context agent needs just two scopes during the authorization code flow:

api            # REST API access to CRM objects
refresh_token  # Offline renewal for background agents

Grant only the scopes the app truly needs - often just api, plus refresh_token for background agents. Avoid the full access scope at all costs. If your agent only reads contacts and opportunities, enforce that through Salesforce permission sets on the integration user rather than requesting broader scopes. Least privilege means giving connected apps only the permissions necessary to perform their intended business function. For example, a reporting tool may only require read access instead of broad write permissions.

HubSpot

HubSpot uses granular, object-level scopes. A CRM intelligence agent that reads contacts and deal stages to generate pipeline summaries only needs crm.objects.contacts.read and crm.objects.deals.read. A practical minimum for read-only CRM context:

crm.objects.contacts.read
crm.objects.companies.read
crm.objects.deals.read
crm.objects.owners.read

Requiring all scopes upfront causes connection drop-off from cautious enterprise admins. Optional scopes let agents request only what they need now, while leaving the door open for broader permissions later - without blocking the initial connection. Split your scopes into required (the read scopes above) and optional (write scopes your agent might need later). This keeps your OAuth consent screen lean and your enterprise close rate high.
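As a sketch of the required/optional split, the helper below builds a HubSpot authorization URL with the read scopes as `scope` and write scopes as `optional_scope`. The function name, the choice of write scopes, and the placeholder client values are assumptions for illustration; confirm the exact scope names your app has enabled in HubSpot's developer settings.

```python
from urllib.parse import quote, urlencode

# Read scopes from the list above: required for the connection to succeed.
REQUIRED_SCOPES = [
    "crm.objects.contacts.read",
    "crm.objects.companies.read",
    "crm.objects.deals.read",
    "crm.objects.owners.read",
]

# Illustrative write scopes an agent might need later (assumed selection).
OPTIONAL_SCOPES = [
    "crm.objects.contacts.write",
    "crm.objects.deals.write",
]


def build_hubspot_authorize_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Build the consent URL with space-separated required and optional scopes."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(REQUIRED_SCOPES),
        "optional_scope": " ".join(OPTIONAL_SCOPES),
        "state": state,  # opaque anti-CSRF value, checked on the callback
    }
    # quote_via=quote encodes spaces as %20 rather than "+".
    return "https://app.hubspot.com/oauth/authorize?" + urlencode(params, quote_via=quote)
```

An admin who declines the optional scopes can still complete the connection, and the agent roles that need writes can prompt for an upgrade later.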

Tip

Scope hygiene rule. Start with read-only scopes. Add write scopes only when a specific agent role demands it, and route those agents through a separate MCP server with methods=["write"] filtering.

Remote MCP with HTTP Transport

For multi-tenant agents, you want remote MCP servers, not stdio. AutoGen supports this directly via SseServerParams:

import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_ext.tools.mcp import McpWorkbench, SseServerParams

server_params = SseServerParams(
    url="https://api.truto.one/mcp/<hashed_token>",
    headers={"Authorization": "Bearer <platform_api_token>"},
)

async def main():
    # model_client is any AutoGen chat completion client, configured elsewhere.
    async with McpWorkbench(server_params) as mcp:
        agent = AssistantAgent(
            "crm_agent",
            model_client=model_client,
            workbench=mcp,
            reflect_on_tool_use=True,
        )
        # Run the agent here, while the MCP session is still open.

asyncio.run(main())

CrewAI follows the exact same shape:

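from crewai import Agent
from crewai_tools import MCPServerAdapter

server_params = {
    "url": "https://api.truto.one/mcp/<hashed_token>",
    "headers": {"Authorization": "Bearer <platform_api_token>"}
}
# The adapter exposes the server's MCP tools as CrewAI tools for the
# lifetime of the with-block; the remaining Agent arguments are elided.
with MCPServerAdapter(server_params) as tools:
    agent = Agent(role="CRM Analyst", tools=tools, ...)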

Securing the MCP Endpoint

Exposing an MCP server over HTTP introduces a security risk: anyone with the URL could theoretically execute tools against your customer's SaaS account. To lock this down, your architecture should implement a layered authentication model.

In enterprise setups, the MCP URL itself acts as a per-tenant capability token, cryptographically hashed to encode which connected account the server is bound to. Combined with a second-factor flag (like require_api_token_auth), possession of the URL alone is not enough. The agent framework must also pass a valid platform API token in the Authorization header. This ensures that only your authenticated backend services can invoke the MCP server, which is critical when MCP URLs end up in logs, dotfiles, or LangSmith traces.
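A minimal sketch of that two-layer check follows. Everything here is an assumption made for illustration: the server stores only a SHA-256 hash of the URL token, `servers` is a hypothetical lookup table keyed by that hash, and `valid_api_tokens` stands in for whatever the platform uses to validate API tokens.

```python
import hashlib
import hmac
from typing import Any, Iterable, Mapping, Optional


def hash_url_token(raw_token: str) -> str:
    """Hash the token from the MCP URL so the raw value is never stored."""
    return hashlib.sha256(raw_token.encode("utf-8")).hexdigest()


def authorize_mcp_request(
    url_token: str,
    headers: Mapping[str, str],
    servers: Mapping[str, Mapping[str, Any]],
    valid_api_tokens: Iterable[str],
) -> Optional[str]:
    """Return the bound account id if the request is allowed, else None."""
    # Layer 1: the URL token must resolve to a known MCP server.
    server = servers.get(hash_url_token(url_token))
    if server is None:
        return None
    # Layer 2: optionally require a platform API token as well.
    if server.get("require_api_token_auth"):
        scheme, _, presented = headers.get("Authorization", "").partition(" ")
        if scheme.lower() != "bearer" or not presented:
            return None
        # Constant-time comparison avoids leaking token contents via timing.
        if not any(hmac.compare_digest(presented, t) for t in valid_api_tokens):
            return None
    return server["account_id"]
```

With the flag enabled, a URL that leaks into a trace or a log file is not enough on its own to call tools against the customer's account.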

Warning

Trust boundary reality check. Always make sure that you trust the MCP server before using it. A stdio server executes code on your machine. SSE is not a silver bullet either: a malicious MCP server still has many ways to inject content into your application. Pin remote MCP URLs to known origins and authenticate the channel.

Tool Sharing and Routing in Multi-Agent Frameworks

A common mistake when building multi-agent systems is dumping every available API endpoint into the context window of every agent. If your SalesAgent sees BambooHR's leave-balance tool, two things happen: token cost balloons, and the LLM occasionally hallucinates an argument to call it anyway.

You must explicitly route tools to the specialized agents that need them. There are three architectural patterns for tool scoping:

1. One MCP Server Per Agent Role

Generate a separate MCP URL per role, each scoped to a different connected account or filtered toolset. The SalesAgent gets a Salesforce-only URL; the HRAgent gets a BambooHR-only URL. Frameworks like AutoGen support multiple workbenches on a single agent, so you can also compose them: pass a list of workbenches to one agent to give it both web browsing and filesystem tools.

2. Tag-Based Filtering on a Single Server

When one agent legitimately needs cross-system tools (e.g., an OnboardingAgent that touches BambooHR, Okta, and Slack), filter the toolset by functional tag. On a tag-aware MCP server, the URL configuration includes tags=["directory", "messaging"] and the server dynamically generates highly scoped tools matching those tags. This keeps the tool list tight without spinning up multiple connections.

3. Method-Level Filtering for Write-Safety

For read-only analytics agents, scope the server to methods=["read"]. The agent can list and get but cannot create, update, or delete. This is one of the cleanest ways to enforce the principle of least privilege for AI agents—far simpler than attempting to validate writes after the fact.
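The tag and method filters in patterns 2 and 3 amount to a simple predicate over the tool catalog. The sketch below is a hypothetical server-side filter, not any platform's actual code: `ToolSpec`, `METHOD_GROUPS`, and `filter_tools` are illustrative names, and the grouping of `list`/`get` under "read" follows the description above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional

# "read" and "write" expand to concrete methods, as described above.
METHOD_GROUPS = {
    "read": {"list", "get"},
    "write": {"create", "update", "delete"},
}


@dataclass(frozen=True)
class ToolSpec:
    name: str
    method: str                       # e.g. "list", "get", "create"
    tags: FrozenSet[str] = frozenset()


def filter_tools(
    tools: Iterable[ToolSpec],
    methods: Optional[Iterable[str]] = None,
    tags: Optional[Iterable[str]] = None,
) -> List[ToolSpec]:
    """Keep tools whose method is allowed and that carry at least one wanted tag."""
    allowed_methods = None
    if methods is not None:
        allowed_methods = set()
        for m in methods:
            allowed_methods |= METHOD_GROUPS.get(m, {m})
    wanted_tags = set(tags) if tags is not None else None

    selected = []
    for tool in tools:
        if allowed_methods is not None and tool.method not in allowed_methods:
            continue
        if wanted_tags is not None and not (tool.tags & wanted_tags):
            continue
        selected.append(tool)
    return selected
```

Because the filter runs before `tools/list` is answered, a read-only agent never sees a write tool at all, so there is nothing for the model to call by mistake.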

Dynamic Tool Generation and Documentation Gates

Maintaining tool definitions manually is a massive technical debt trap. Third-party APIs change constantly. If you hardcode the JSON Schema for a Jira issue creation tool, it will eventually break when Jira adds a required custom field.

A practical lesson from running this in production: tool descriptions are not optional decoration. The LLM picks tools by reading their descriptions. If your MCP server auto-emits 200 tools with empty schemas, the planner will hallucinate arguments and waste retries.

Auto-Generated MCP Tools: Documentation-Driven Tool Creation for AI Agents (2026) outlines why tool generation must be dynamic. Advanced platforms derive MCP tools directly from the integration's resource definitions and documentation records on every tools/list request. If an endpoint lacks documentation, the tool is skipped. This acts as a strict quality gate, ensuring your agents are only exposed to well-documented tools with accurate query and body schemas. When the framework requests the tool list, schemas are dynamically enhanced—injecting pagination parameters like limit and next_cursor complete with LLM instructions to pass the cursor back exactly as received.
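A rough sketch of such a documentation gate is shown here. The resource-definition shape (`name`, `method`, `documentation.description`, `documentation.input_schema`) is an assumption made for the example, not a real platform format; the pagination descriptions mirror the wording used in this guide.

```python
import copy
from typing import Any, Dict, List

# Pagination properties injected into every list tool.
PAGINATION_PROPS = {
    "limit": {"type": "string", "description": "The number of records to fetch"},
    "next_cursor": {
        "type": "string",
        "description": (
            "The cursor to fetch the next set of records. Always send back "
            "exactly the cursor value you received without modifying it."
        ),
    },
}


def build_tools(resources: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Turn documented resource definitions into MCP tool definitions."""
    tools = []
    for res in resources:
        doc = res.get("documentation") or {}
        if not doc.get("description"):
            continue  # Quality gate: undocumented endpoints are never exposed.
        schema = copy.deepcopy(
            doc.get("input_schema") or {"type": "object", "properties": {}}
        )
        props = schema.setdefault("properties", {})
        if res["method"] == "list":
            for key, value in PAGINATION_PROPS.items():
                props.setdefault(key, dict(value))
        tools.append(
            {"name": res["name"], "description": doc["description"], "inputSchema": schema}
        )
    return tools
```

Running this on every `tools/list` request means a documentation update upstream changes the tool surface immediately, with no redeploy of the agent.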

Concrete MCP Tool Definitions for CRM Resources

To make the dynamic tool generation tangible, here is what actual MCP tool definitions look like for CRM resources. In MCP, each tool is defined using a structured schema that describes its identity, purpose, and how it should be invoked. This definition allows clients (such as LLMs) to discover what the tool does, what inputs it accepts, and how to use it safely.

A list tool for HubSpot contacts, with auto-injected pagination parameters:

{
  "name": "list_all_hub_spot_contacts",
  "description": "List all contacts from HubSpot CRM. Returns paginated results including name, email, phone numbers, and custom properties.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "limit": {
        "type": "string",
        "description": "The number of records to fetch"
      },
      "next_cursor": {
        "type": "string",
        "description": "The cursor to fetch the next set of records. Always send back exactly the cursor value you received (nextCursor) without decoding, modifying, or parsing it."
      }
    }
  }
}

A get tool for Salesforce contacts, with the required id parameter auto-injected:

{
  "name": "get_single_salesforce_contact_by_id",
  "description": "Retrieve a single Salesforce contact by ID. Returns all standard and custom fields for the contact record.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "id": {
        "type": "string",
        "description": "The id of the contact to get. Required."
      }
    },
    "required": ["id"]
  }
}

And a create tool with a body schema derived from documentation:

{
  "name": "create_a_hub_spot_contact",
  "description": "Create a new contact in HubSpot CRM.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "first_name": { "type": "string", "description": "Contact first name" },
      "last_name": { "type": "string", "description": "Contact last name" },
      "email": { "type": "string", "description": "Primary email address" },
      "phone": { "type": "string", "description": "Phone number" },
      "company": { "type": "string", "description": "Company name" }
    },
    "required": ["email"]
  }
}

While JSON Schema supports deep nesting and complex validation logic (like oneOf or allOf), it is advisable to keep tool schemas as flat as possible. Deeply nested structures increase the token count and cognitive load for the LLM, which can lead to higher latency or parsing errors. This is exactly why well-designed MCP servers flatten the input namespace - query parameters and body parameters share a single inputSchema, and the server splits them internally using each schema's property keys.
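The server-side split can be sketched as follows, assuming the server keeps separate query and body schemas per endpoint. The function name and the choice to reject unknown keys (rather than drop them silently) are assumptions for this example.

```python
from typing import Any, Dict, Tuple


def split_arguments(
    arguments: Dict[str, Any],
    query_schema: Dict[str, Any],
    body_schema: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Route flat tool arguments to query or body using each schema's property keys."""
    query_keys = set(query_schema.get("properties", {}))
    body_keys = set(body_schema.get("properties", {}))

    query: Dict[str, Any] = {}
    body: Dict[str, Any] = {}
    unknown = []
    for key, value in arguments.items():
        if key in query_keys:
            query[key] = value      # A key present in both schemas goes to the query.
        elif key in body_keys:
            body[key] = value
        else:
            unknown.append(key)
    if unknown:
        # Surface hallucinated arguments so the planner can correct itself.
        raise ValueError(f"Unknown arguments: {sorted(unknown)}")
    return query, body
```

Rejecting unknown keys gives the model a clear error to reflect on instead of a request that quietly ignores part of its intent.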

Managing Rate Limits and Context Windows

AI agents are inherently aggressive, and multi-agent systems are rate-limit machines. An AutoGen loop tasked with analyzing 5,000 customer records will try to execute 5,000 parallel tool calls as fast as the LLM can generate them. Enterprise APIs will immediately reject this traffic with HTTP 429 Too Many Requests errors.

How your integration middleware handles these 429 errors dictates whether your multi-agent system succeeds or fails.

The Danger of Middleware Retries

A common instinct is to have the integration middleware automatically absorb rate limit errors, apply exponential backoff, and retry the request silently. For standard application code, this is a good pattern. For AI agents, it is fatal.

If the middleware blocks the HTTP connection for 45 seconds while waiting for a rate limit window to reset, the LLM client will likely time out. Even worse, it hides backpressure from the agent's planner, which is the only component that knows whether to abandon a step, take a different path, or escalate. The LLM does not know why the tool is taking so long, so it loses agency.

Passing Rate Limits to the Agent

The correct architectural approach is radical transparency. The integration layer should not retry, throttle, or apply backoff on rate limit errors. When an upstream API returns an HTTP 429, it should immediately pass that error back to the caller.

To make this actionable for the agent framework, the chaotic, vendor-specific rate limit headers (like X-RateLimit-Remaining-Day or Sforce-Limit-Info) must be normalized into standardized IETF headers. The IETF draft draft-ietf-httpapi-ratelimit-headers defines the standard:

  • RateLimit-Limit: The total request quota in the current window.
  • RateLimit-Remaining: The number of requests left.
  • RateLimit-Reset: The seconds until the quota replenishes.
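As a sketch of that normalization step, the function below maps two vendor header styles onto the three standardized names. The Salesforce branch parses the `api-usage=used/limit` value carried in Sforce-Limit-Info. The HubSpot header names used here (`X-HubSpot-RateLimit-Max`, `-Remaining`, `-Interval-Milliseconds`) are an assumption to verify against HubSpot's current documentation, and treating the interval length as the reset value is a conservative upper bound, not the vendor's own figure.

```python
import math
import re
from typing import Dict, Mapping


def normalize_rate_limit_headers(vendor: str, headers: Mapping[str, str]) -> Dict[str, str]:
    """Map vendor-specific rate limit headers onto RateLimit-* names."""
    h = {k.lower(): v for k, v in headers.items()}
    out: Dict[str, str] = {}

    if vendor == "salesforce":
        # Example value: "api-usage=18/5000" (used/limit in the rolling window).
        match = re.search(r"api-usage=(\d+)/(\d+)", h.get("sforce-limit-info", ""))
        if match:
            used, limit = int(match.group(1)), int(match.group(2))
            out["RateLimit-Limit"] = str(limit)
            out["RateLimit-Remaining"] = str(max(limit - used, 0))
        # No reset information in this header, so RateLimit-Reset is omitted.
    elif vendor == "hubspot":
        # Assumed header names; check the vendor documentation.
        if "x-hubspot-ratelimit-max" in h:
            out["RateLimit-Limit"] = h["x-hubspot-ratelimit-max"]
        if "x-hubspot-ratelimit-remaining" in h:
            out["RateLimit-Remaining"] = h["x-hubspot-ratelimit-remaining"]
        if "x-hubspot-ratelimit-interval-milliseconds" in h:
            interval_ms = int(h["x-hubspot-ratelimit-interval-milliseconds"])
            out["RateLimit-Reset"] = str(math.ceil(interval_ms / 1000))
    return out
```

Omitting a header the vendor does not supply is deliberate: a missing RateLimit-Reset is more honest than an invented one, and the caller can fall back to a default wait.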

By passing the 429 error and these standardized headers back to the agent framework, the caller (your CrewAI tool wrapper, your LangGraph node, your AutoGen tool adapter) can handle backoff gracefully. Here is a reusable retry helper for tool calls:

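import asyncio
import random

async def call_with_backoff(fn, *args, max_attempts=4, **kwargs):
    """Call fn, sleeping and retrying on HTTP 429 up to max_attempts times."""
    for attempt in range(max_attempts):
        resp = await fn(*args, **kwargs)
        if resp.status_code != 429:
            return resp
        if attempt == max_attempts - 1:
            break  # No point sleeping after the final attempt.
        try:
            # RateLimit-Reset: seconds until the quota replenishes.
            reset = int(resp.headers.get("ratelimit-reset", "2"))
        except ValueError:
            reset = 2  # Malformed header: fall back to a short wait.
        jitter = random.uniform(0, 1)  # Spread out concurrent retries.
        await asyncio.sleep(min(reset, 30) + jitter)  # Cap each wait at 30s.
    raise RuntimeError("Rate limit budget exhausted")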

This pattern lets the agent framework's planner observe the failure—useful for circuit-breaking a runaway loop—while still recovering automatically when the window resets.

Tip

Rule of thumb. Treat the unified API as a thin, honest pass-through. Retry policy belongs in the agent runtime, not the integration layer. Frameworks already support tool-level retries, error reflection (reflect_on_tool_use=True in AutoGen), and bounded iteration caps (max_iter in CrewAI).

CRM Rate Limit Budgets You Need to Know

The retry code above only works if you understand the actual quotas your agents are operating under. Salesforce and HubSpot impose very different rate limit structures, and agents can burn through them fast.

Salesforce uses a daily rolling window. Salesforce enforces a 100,000 daily API request limit for Enterprise Edition orgs plus 1,000 additional requests per user license. Limits are calculated on a 24-hour rolling basis rather than a fixed calendar day. A maximum of 25 concurrent long-running requests (20 seconds or longer) is allowed in production. When an agent loops through paginated contact lists, each page counts as one request against this daily budget.

HubSpot uses burst-based throttling. The limits are measured per 10-second window and depend on your subscription and app type: for private apps it's 100 requests per 10 seconds on Free/Starter, 190 per 10 seconds on Professional or Enterprise. For apps using OAuth authentication distributed via the HubSpot marketplace, each HubSpot account that installs your app is limited to 110 requests every 10 seconds. Search endpoints have a separate, stricter limit: 5 requests per second.

| CRM | Burst Limit | Daily Limit | Search API |
|---|---|---|---|
| Salesforce | 25 concurrent long-running | 100k + 1k/user license per 24h | 500 sync report runs/hr |
| HubSpot | 100-190 per 10s (tier-dependent) | 250k-1M per day | 5 req/s |

With these numbers, an unconstrained agent doing paginated reads of 10,000 contacts (at 100 per page) would consume 100 API calls - about 0.1% of a Salesforce Enterprise daily budget, but it would need careful pacing against HubSpot's burst cap.
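The arithmetic is worth making explicit so it can be rerun for other record counts and tiers. The helper below is an illustrative calculation only; the function name is hypothetical and the inputs are the figures quoted in this section.

```python
import math
from typing import Tuple


def pagination_budget(
    total_records: int,
    page_size: int,
    daily_limit: int,
    burst_limit: int,
    burst_window_s: float,
) -> Tuple[int, float, float]:
    """Return (API calls needed, share of the daily quota, minimum seconds between calls)."""
    calls = math.ceil(total_records / page_size)   # one request per page
    daily_share = calls / daily_limit              # fraction of the daily budget
    min_spacing_s = burst_window_s / burst_limit   # sustained pace that stays under the burst cap
    return calls, daily_share, min_spacing_s


# 10,000 contacts at 100 per page, against a 100,000-request daily budget
# and a burst cap of 100 requests per 10 seconds.
calls, share, spacing = pagination_budget(10_000, 100, 100_000, 100, 10.0)
print(f"{calls} calls, {share:.1%} of daily quota, {spacing:.2f}s between calls")
```

The spacing figure is the useful one for agent design: it is the pace a sequential pagination loop should hold if other agents share the same connected account.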

Latency SLAs: What to Expect From CRM APIs

When injecting real-time CRM data into LLM prompts, end-to-end latency determines whether the user experience feels interactive or sluggish. Here's what to expect and how to measure it.

Typical response times for common operations against major CRM APIs:

| Operation | Salesforce (p50) | HubSpot (p50) | Notes |
|---|---|---|---|
| Get single contact | 150-300ms | 100-200ms | Varies by field count |
| List contacts (100 records) | 300-600ms | 200-400ms | Pagination cursor adds overhead |
| Search contacts | 400-800ms | 300-500ms | Full-text search is slower |
| Create/Update contact | 200-500ms | 150-350ms | Write latency is higher |

Add 50-100ms for MCP protocol overhead (JSON-RPC serialization, token validation, tool routing) and 20-50ms for OAuth token lookup from a managed store. Using the p50 ranges above, that puts a single tool call at roughly 170-950ms (100 + 50 + 20 at the fast end, 800 + 100 + 50 at the slow end) before the LLM sees the result.

To test this in your environment, wrap tool calls with timing instrumentation:

import time
 
async def timed_tool_call(mcp_client, tool_name, arguments):
    start = time.monotonic()
    result = await mcp_client.call_tool(tool_name, arguments)
    elapsed_ms = (time.monotonic() - start) * 1000
    print(f"{tool_name}: {elapsed_ms:.0f}ms")
    if elapsed_ms > 1000:
        print(f"  WARNING: exceeds 1s SLA target")
    return result

Set a hard timeout of 10 seconds per tool call in your agent framework. If a CRM API is consistently exceeding that, the account likely has a network or org-level performance issue - not something retries will fix.
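If your framework does not expose a per-tool timeout, one way to enforce it is to wrap the call yourself. This is a minimal sketch using `asyncio.wait_for`; `ToolTimeout` and `call_tool_with_timeout` are hypothetical names, and `call` is any zero-argument coroutine function that performs the tool call.

```python
import asyncio
from typing import Any, Awaitable, Callable


class ToolTimeout(Exception):
    """Raised when a tool call exceeds its hard timeout."""


async def call_tool_with_timeout(
    call: Callable[[], Awaitable[Any]],
    timeout_s: float = 10.0,
) -> Any:
    """Run one tool call, cancelling it if it exceeds timeout_s seconds."""
    try:
        return await asyncio.wait_for(call(), timeout=timeout_s)
    except asyncio.TimeoutError as exc:
        # Raise a distinct error type so the planner can tell a slow
        # upstream apart from a rate limit or a validation failure.
        raise ToolTimeout(f"tool call exceeded {timeout_s}s") from exc
```

A typical call site would look like `await call_tool_with_timeout(lambda: mcp_client.call_tool(name, args))`.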

Normalization: Building a Canonical CRM Response Envelope

The hardest part of injecting real-time CRM context into LLM prompts is not fetching the data - it's normalizing it. HubSpot returns contacts as nested properties objects with lowercase snake_case keys. Salesforce returns flat PascalCase fields. Your LLM agent shouldn't need to know which CRM the data came from.

A canonical response envelope solves this. Every CRM response gets mapped to a unified shape before it reaches the LLM's context window:

{
  "result": [{
    "id": "501",
    "first_name": "Jane",
    "last_name": "Doe",
    "email_addresses": [{"email": "jane@acme.com", "is_primary": true}],
    "phone_numbers": [{"number": "+1-555-0100", "type": "phone"}],
    "addresses": [{"city": "San Francisco", "state": "CA"}],
    "account": {"id": "200"},
    "custom_fields": {"lead_source__c": "Webinar"},
    "created_at": "2024-03-15T10:00:00Z",
    "updated_at": "2025-06-01T14:30:00Z"
  }],
  "next_cursor": "eyJsYXN0SWQiOiI1MDEifQ==",
  "result_count": 1
}

The normalization happens at the proxy layer, not in the LLM prompt. This is important: if you push raw CRM JSON into the context window, you waste tokens on field names the model doesn't understand (hs_additional_emails, MailingPostalCode) and force the model to learn each vendor's data shape. A canonical envelope means your prompt template stays constant regardless of which CRM the customer uses.

What the Mapping Looks Like Under the Hood

The field mapping between raw CRM responses and the canonical schema is data, not code. For each integration, a declarative mapping (typically using expression languages like JSONata) transforms the response. Consider just the email field:

  • HubSpot: Email lives in properties.email, with additional emails semicolon-delimited in properties.hs_additional_emails
  • Salesforce: Email is a single top-level Email field

Both map to the same email_addresses array. The mapping handles edge cases like splitting semicolon-separated values, filtering nulls, and marking the primary address. Phone numbers are even more divergent - Salesforce exposes six separate phone fields (Phone, MobilePhone, HomePhone, OtherPhone, Fax, AssistantPhone) while HubSpot has phone and mobilephone. The canonical schema normalizes both into a typed array.

This data-driven approach means adding a new CRM is a configuration task, not a code change. The same generic execution pipeline handles every integration.
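To make the email example concrete, here is the same transformation written imperatively in Python. This is only an illustration of what a declarative mapping would produce, not how the mapping is stored; the function name is hypothetical, and treating the first surviving address as primary is an assumption.

```python
from typing import Any, Dict, List


def normalize_emails(vendor: str, record: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Map a raw CRM contact to the canonical email_addresses array."""
    if vendor == "hubspot":
        props = record.get("properties") or {}
        candidates = [props.get("email")]
        # Additional emails arrive as one semicolon-delimited string.
        candidates += (props.get("hs_additional_emails") or "").split(";")
    elif vendor == "salesforce":
        candidates = [record.get("Email")]
    else:
        raise ValueError(f"unsupported vendor: {vendor}")

    emails: List[Dict[str, Any]] = []
    seen = set()
    for raw in candidates:
        email = (raw or "").strip()
        if not email or email.lower() in seen:
            continue  # Drop nulls, blanks, and duplicates.
        seen.add(email.lower())
        # The first address kept is marked primary (assumed rule).
        emails.append({"email": email, "is_primary": not emails})
    return emails
```

Both vendors end up in the same shape, so the prompt template only ever references `email_addresses`.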

Cursor-Based Pagination for Complete Context

When your agent needs more than one page of results, cursor-based pagination keeps the process stateless and reliable:

import json

async def fetch_all_contacts(mcp_client, max_pages=10):
    all_contacts = []
    cursor = None
    for _ in range(max_pages):  # Hard cap on pages to prevent runaway loops.
        args = {"limit": "100"}
        if cursor:
            # Pass the cursor back exactly as received.
            args["next_cursor"] = cursor
        result = await mcp_client.call_tool(
            "list_all_hub_spot_contacts", args
        )
        data = json.loads(result.content[0].text)
        all_contacts.extend(data["result"])
        cursor = data.get("next_cursor")
        if not cursor:
            break  # No cursor means this was the last page.
    return all_contacts

The max_pages guard is non-negotiable for agent workloads. Without it, a list call against a CRM with 500,000 contacts becomes a runaway loop that exhausts the daily API quota.

Security: PII Redaction and Audit Logging

CRM data is inherently sensitive. Every contact record contains personally identifiable information (PII) - names, emails, phone numbers, addresses. When this data flows through an MCP server into an LLM's context window, you need two things: redaction controls and audit trails.

Redaction Patterns

Not every agent needs full PII. A pipeline analytics agent that counts deals by stage doesn't need to see contact email addresses. Apply redaction at the MCP tool level:

  • Field-level stripping: Remove sensitive fields from the response before returning it to the LLM. The tool's output schema can exclude email_addresses and phone_numbers for read-only analytics roles.
  • Masking: Replace PII with hashed or partial values (jane@***.com) when the agent needs to correlate records without seeing the raw data.
  • Tag-based routing: Use MCP server tags to expose only non-PII resources (deals, stages, pipelines) to analytics agents, while restricting contact-level tools to agents with explicit PII access.
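The first two patterns can be sketched as a small post-processing step on the canonical contact record. The role names (`analytics`, `correlation`) and function names are hypothetical; the masking format follows the jane@***.com example above.

```python
import copy
from typing import Any, Dict


def mask_email(email: str) -> str:
    """Mask the domain name but keep the local part and top-level domain."""
    local, sep, domain = email.partition("@")
    if not sep:
        return "***"  # Not a recognizable address: hide it entirely.
    tld = domain.rsplit(".", 1)[-1] if "." in domain else ""
    return f"{local}@***.{tld}" if tld else f"{local}@***"


def redact_contact(contact: Dict[str, Any], role: str) -> Dict[str, Any]:
    """Return a copy of a canonical contact with PII reduced for the given role."""
    out = copy.deepcopy(contact)  # Never mutate the original record.
    if role == "analytics":
        # Field-level stripping: analytics agents get no contact PII.
        out.pop("email_addresses", None)
        out.pop("phone_numbers", None)
    elif role == "correlation":
        # Masking: enough to match records, not enough to contact anyone.
        for entry in out.get("email_addresses", []):
            entry["email"] = mask_email(entry["email"])
        out.pop("phone_numbers", None)
    return out
```

Applying this inside the MCP server, before the response is serialized, keeps the unredacted values out of the context window and out of any framework-level traces.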

Structured Logging Without PII Leakage

Every MCP tool call should produce an audit log entry. Log the tool name, the integrated account ID, the timestamp, and the request ID - but never log the raw arguments or response body. A safe logging pattern:

import logging
 
def log_tool_call(tool_name, account_id, request_id, latency_ms, success):
    logging.info(
        "mcp_tool_call",
        extra={
            "tool": tool_name,
            "account_id": account_id,
            "request_id": request_id,
            "latency_ms": latency_ms,
            "success": success,
            # Never log: arguments, response body, tokens
        }
    )

If you need to debug a failed tool call, use the request_id to trace it through the integration layer's internal logs, which should sit behind access controls. Don't pipe CRM data into your application's general log stream - that turns every log aggregation tool into a PII liability.

Quickstart: Real-Time CRM Context in Under 30 Minutes

Here's a concrete project structure for standing up an MCP-backed CRM context pipeline. This example uses Truto as the integration layer, but the architecture applies to any managed MCP platform.

Project Structure

crm-context-agent/
  ├── agent.py            # CrewAI or AutoGen agent definition
  ├── tools.py            # MCP client setup and tool wrappers
  ├── retry.py            # Rate-limit-aware backoff (from this guide)
  ├── test_latency.py     # SLA validation script
  ├── .env.example        # Required environment variables
  └── requirements.txt    # Dependencies

Step-by-Step Setup

  1. Connect your CRM. In the Truto dashboard, connect a Salesforce or HubSpot account using OAuth. The platform handles the authorization code flow and stores tokens securely.

  2. Create a scoped MCP server. Call the API to create a read-only MCP server filtered to CRM resources:

curl -X POST https://api.truto.one/integrated-account/<account_id>/mcp \
  -H "Authorization: Bearer <your_api_token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "CRM Read-Only Agent",
    "config": {
      "methods": ["read"],
      "tags": ["crm"],
      "require_api_token_auth": true
    }
  }'

  3. Wire the MCP URL into your agent. Use the URL from the response to connect your framework:

# tools.py
from autogen_ext.tools.mcp import McpWorkbench, SseServerParams
import os
 
def get_crm_workbench():
    return McpWorkbench(SseServerParams(
        url=os.environ["TRUTO_MCP_URL"],
        headers={"Authorization": f"Bearer {os.environ['TRUTO_API_TOKEN']}"}
    ))
  4. Run the latency test. Before building agent logic, validate that your CRM API responds within SLA:
# test_latency.py
import asyncio, time
 
from tools import get_crm_workbench
 
async def test_sla():
    async with get_crm_workbench() as mcp:
        # list_tools() returns ToolSchema dicts, so read the name by key
        tools = await mcp.list_tools()
        print(f"Available tools: {len(tools)}")
        for tool in tools:
            name = tool["name"]
            if name.startswith("list_all"):
                # Time a single small list call per tool
                start = time.monotonic()
                await mcp.call_tool(name, {"limit": "10"})
                ms = (time.monotonic() - start) * 1000
                print(f"  {name}: {ms:.0f}ms")
 
asyncio.run(test_sla())
  5. Build your agent. With the MCP connection validated, define the agent that pulls real-time CRM context into its prompt:
# agent.py
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient
 
from tools import get_crm_workbench
 
# Assumption: an OpenAI-backed client; any AutoGen chat completion client works here
model_client = OpenAIChatCompletionClient(model="gpt-4o")
 
async def run_crm_agent(query: str):
    async with get_crm_workbench() as mcp:
        agent = AssistantAgent(
            "crm_analyst",
            model_client=model_client,
            workbench=mcp,
            system_message=(
                "You are a CRM analyst. Use the available tools to pull "
                "real-time contact and deal data. Always cite the record "
                "IDs in your analysis. Respect pagination cursors."
            ),
            reflect_on_tool_use=True,
        )
        response = await agent.run(task=query)
        return response

From here, you have a working pipeline: the agent receives a natural language query, discovers available CRM tools via MCP, pulls live data with valid OAuth tokens, and gets normalized responses in a consistent schema. Auth, token refresh, normalization, and rate-limit handling are all delegated to the infrastructure layer.
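Since tools.py reads its connection details from the environment, a small startup check can fail fast when a variable is missing instead of surfacing a KeyError mid-run. The sketch below is one possible shape for that check; the helper names `missing_env` and `require_env` are hypothetical, and only the two variable names come from the quickstart code.

```python
import os

# Variable names taken from tools.py in this quickstart
REQUIRED_VARS = ("TRUTO_MCP_URL", "TRUTO_API_TOKEN")


def missing_env(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def require_env(env=None):
    """Raise a readable error listing every missing variable at once."""
    missing = missing_env(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_env()` at the top of agent.py or test_latency.py would report all missing settings in one message before any MCP connection is attempted.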

Why Managed MCP Infrastructure Wins for B2B SaaS

If you are evaluating how to connect your multi-agent architecture to your customers' enterprise systems, you have to decide where to spend your engineering cycles. As explored in our buyer's guide to MCP server platforms, the build-vs-buy math for multi-agent infrastructure is not subtle.

To ship a production CrewAI or AutoGen system against enterprise SaaS, you need:

  • A token vault with encryption at rest and concurrency-safe refresh per account.
  • Per-customer OAuth app management.
  • A tool generation layer that produces JSON Schemas the LLM can actually use, with descriptions curated per resource and method.
  • Tag and method filtering so each agent role gets a scoped toolset.
  • Standardized error and rate-limit handling across 100+ APIs that each behave differently.
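To make the first item in that list concrete, here is a minimal sketch of concurrency-safe refresh per account using one asyncio lock per account ID. It is an illustration under stated assumptions, not a production vault: `TokenVault` and `refresh_fn` are hypothetical names, tokens are held in memory only, and encryption at rest is out of scope.

```python
import asyncio
import time
from collections import defaultdict


class TokenVault:
    """In-memory sketch: at most one refresh in flight per account."""

    def __init__(self, refresh_fn, skew_seconds=60):
        # refresh_fn is hypothetical: async (account_id) -> (token, expires_at_epoch)
        self._refresh_fn = refresh_fn
        self._skew = skew_seconds
        self._tokens = {}
        self._locks = defaultdict(asyncio.Lock)

    async def get_token(self, account_id):
        # Concurrent callers for the same account queue on the same lock,
        # so only the first one actually calls the provider.
        async with self._locks[account_id]:
            cached = self._tokens.get(account_id)
            if cached and cached[1] - self._skew > time.time():
                return cached[0]
            token, expires_at = await self._refresh_fn(account_id)
            self._tokens[account_id] = (token, expires_at)
            return token
```

The skew window refreshes slightly ahead of expiry, and locking per account rather than globally means a slow refresh for one customer does not block token reads for another.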

Building custom MCP servers requires implementing JSON-RPC 2.0 protocol handlers, normalizing API schemas into LLM-friendly formats, building a stateful OAuth token refresh system, and writing custom logic to map flat LLM arguments into complex nested request bodies.
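The last of those tasks, mapping flat LLM arguments into nested request bodies, can be sketched as a lookup from argument name to a dotted path. This is only an illustration: `nest_arguments` and the example field paths are hypothetical, and real providers often need per-endpoint rules beyond simple nesting.

```python
def nest_arguments(flat_args, field_paths):
    """Build a nested request body from flat tool-call arguments.

    field_paths maps an argument name to a dotted path in the body;
    arguments without an entry stay at the top level.
    """
    body = {}
    for name, value in flat_args.items():
        path = field_paths.get(name, name).split(".")
        cursor = body
        # Walk or create intermediate objects for every path segment but the last
        for key in path[:-1]:
            cursor = cursor.setdefault(key, {})
        cursor[path[-1]] = value
    return body


# Hypothetical mapping for a contact-update tool
CONTACT_PATHS = {
    "email": "properties.email",
    "city": "properties.address.city",
}
```

With this mapping, the flat arguments `{"email": ..., "city": ..., "note": ...}` become a body with `properties.email`, `properties.address.city`, and a top-level `note`, which keeps the tool schema shown to the LLM flat and easy to fill.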

Managed unified API platforms (like those compared in our guide to the best MCP server platforms) abstract the infrastructure entirely. With a platform like Truto, adding a new integration to your multi-agent system is a data operation, not a code operation. Every connected account can be turned into an MCP server URL via a single API call. You configure the integration, define the tag-based routing for your specific agents, and the platform dynamically generates the MCP server. It handles the proactive token refreshes ahead of expiry, normalizes the pagination, standardizes the rate limit headers, and enforces multi-tenant isolation.

Your engineering team can focus entirely on prompt engineering, agent orchestration, and business logic—leaving the brutal realities of third-party API maintenance to the infrastructure layer.

Where to Go From Here

If you're standing up a multi-agent system this quarter, here are three concrete next steps:

  1. Define agent-to-toolset boundaries before you write any code. Sketch which agents read, which write, and which cross domains. This becomes your tag and method filter map.
  2. Pick the auth boundary early. Decide whether MCP URLs travel inside your trust boundary (URL-only auth) or whether you need the second-factor API token check. Retrofitting this is painful.
  3. Wire rate-limit-aware retries at the framework layer. Standard IETF headers make this a 30-line utility, not a custom integration project per provider.
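For the first step, the tag and method filter map can start as a plain dictionary kept next to the agent definitions. The sketch below turns that map into request bodies shaped like the MCP server creation call in the quickstart. The role names, the "write" method value, and the `mcp_server_payload` helper are assumptions for illustration; only the `methods`, `tags`, and `require_api_token_auth` keys come from that example.

```python
# Hypothetical roles; adjust tags and methods to your own boundaries
AGENT_TOOLSETS = {
    "crm_analyst": {"methods": ["read"], "tags": ["crm"]},
    "deal_updater": {"methods": ["read", "write"], "tags": ["crm"]},  # "write" is assumed
}


def mcp_server_payload(role, require_api_token_auth=True):
    """Build the JSON body for creating a scoped MCP server for one agent role."""
    scope = AGENT_TOOLSETS[role]
    return {
        "name": f"{role} toolset",
        "config": {
            "methods": list(scope["methods"]),
            "tags": list(scope["tags"]),
            "require_api_token_auth": require_api_token_auth,
        },
    }
```

Keeping the map in one place makes the security review conversation simpler: each agent role corresponds to exactly one scoped server definition, and an unknown role fails with a KeyError rather than silently receiving a broad toolset.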

FAQ

What is the easiest way to pull real-time CRM context into an LLM prompt?
Point your LLM agent framework (CrewAI, AutoGen, LangGraph) at a remote MCP server backed by OAuth-authenticated proxy access to your CRM's live API. The agent issues a tools/call request, the MCP server hits the CRM with a valid access token, and normalized contact or deal data lands in the context window within 200-800ms. No batch exports or vector database sync required.
What OAuth scopes do I need for Salesforce and HubSpot CRM integrations with AI agents?
For Salesforce, request 'api' and 'refresh_token' scopes at minimum. For HubSpot, use granular object-level scopes like crm.objects.contacts.read, crm.objects.deals.read, crm.objects.companies.read, and crm.objects.owners.read. Start read-only and add write scopes only when specific agent roles require them.
How do I handle CRM API rate limits in a multi-agent system?
Don't absorb rate limits in middleware - pass HTTP 429 errors and standardized IETF RateLimit headers directly back to the agent framework. Salesforce allows ~100,000 daily requests (Enterprise); HubSpot caps at 100-190 requests per 10 seconds. Implement retry with jitter at the framework layer so the agent's planner can observe failures and adjust.
How do I normalize CRM data across Salesforce and HubSpot for LLM consumption?
Map both CRM responses to a canonical envelope with consistent field names (id, first_name, email_addresses, phone_numbers, etc.) at the proxy layer before data reaches the LLM. This keeps prompt templates CRM-agnostic and avoids wasting tokens on vendor-specific field names like 'hs_additional_emails' or 'MailingPostalCode'.
