MCP Integration

Connect AI agents like Claude to your crawl data using the Model Context Protocol. Create crawls, retrieve results, and search content directly from your AI workflows.

What is MCP?

The Model Context Protocol is an open standard for connecting AI models to external tools and data sources. Instead of building custom integrations, AI agents can use MCP to discover and invoke tools from any compatible server.

Universal protocol - works with any MCP-compatible client
Automatic tool discovery - agents learn available capabilities
Secure authentication - uses your existing API keys
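Under the hood, tool discovery is a JSON-RPC 2.0 exchange: the client sends a tools/list request and the server replies with each tool's name and input schema. A minimal sketch of the request payload (the shape follows the MCP specification; the endpoint URL and transport are handled by your client):

```python
import json

def tools_list_request(request_id=1):
    """JSON-RPC 2.0 payload an MCP client sends to discover available tools."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/list",
    }

# Serialized form of what the client POSTs to the server's /mcp endpoint.
payload = json.dumps(tools_list_request())
```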

Configuring MCP Clients

Add Mulberry to your MCP client configuration. The exact format depends on your client:

Claude Desktop

Add to your claude_desktop_config.json:

claude_desktop_config.json
{
  "mcpServers": {
    "mulberry": {
      "url": "https://your-server/mcp",
      "headers": {
        "Authorization": "Bearer sk_your_api_key"
      }
    }
  }
}

Claude Code

Add to your project's .mcp.json:

.mcp.json
{
  "servers": {
    "mulberry": {
      "url": "https://your-server/mcp",
      "headers": {
        "Authorization": "Bearer sk_your_api_key"
      }
    }
  }
}

Available MCP Tools

The Mulberry MCP server exposes these tools:

crawl_list Read

List all crawls for your account.

Parameters

{
  "status": "completed",  // Optional: filter by status
  "limit": 10             // Optional: max results
}

Returns

{
  "crawls": [
    {
      "id": "crawl_abc123",
      "url": "https://docs.example.com",
      "status": "completed",
      "pages_crawled": 120
    }
  ]
}
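If you post-process crawl_list results yourself (outside an agent), filtering the response shape above is straightforward. A sketch, with a hypothetical second crawl added to illustrate the status filter:

```python
# Illustrative sample matching the crawl_list response shape; the second
# entry is hypothetical, added to show filtering by status.
sample_response = {
    "crawls": [
        {"id": "crawl_abc123", "url": "https://docs.example.com",
         "status": "completed", "pages_crawled": 120},
        {"id": "crawl_def456", "url": "https://example.org",
         "status": "running", "pages_crawled": 12},
    ]
}

def completed_crawl_ids(response):
    """IDs of crawls whose status is 'completed'."""
    return [c["id"] for c in response["crawls"] if c["status"] == "completed"]
```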

crawl_get Read

Get details and content from a specific crawl.

Parameters

{
  "crawl_id": "crawl_abc123",
  "include_pages": true,  // Optional: include page content
  "page_limit": 50        // Optional: max pages to return
}

Returns

{
  "crawl": {
    "id": "crawl_abc123",
    "url": "https://docs.example.com",
    "status": "completed"
  },
  "pages": [
    {
      "url": "https://docs.example.com/intro",
      "title": "Introduction",
      "content": "# Introduction\n\n..."
    }
  ]
}
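A common follow-up is to stitch the returned pages into a single document an agent can read in one pass. A sketch over the response shape above (the HTML-comment separators are just one convention, not part of the API):

```python
def pages_to_markdown(crawl_get_response):
    """Join the pages from a crawl_get response into one markdown string,
    prefixing each page with its source URL as an HTML comment."""
    sections = [
        f"<!-- {page['url']} -->\n{page['content']}"
        for page in crawl_get_response.get("pages", [])
    ]
    return "\n\n".join(sections)
```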

crawl_create Write

Create a new crawl job.

Parameters

{
  "url": "https://example.com",
  "depth": 2,
  "format": "markdown",
  "include_patterns": ["/docs/"],
  "wait": true  // Optional: wait for completion
}

Returns

{
  "crawl": {
    "id": "crawl_xyz789",
    "url": "https://example.com",
    "status": "completed"
  },
  "pages_crawled": 45
}

Pro tip: Use "wait": true to have the agent wait for the crawl to complete before continuing. This is useful when the agent needs the results immediately.
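If you prefer not to hold the connection open, the alternative is to create the crawl without wait and poll until it reaches a terminal state. A transport-agnostic sketch: get_status is a stand-in for whatever wrapper you use around crawl_get, and the terminal statuses assumed here are "completed" and "failed".

```python
import time

def wait_for_crawl(get_status, crawl_id, poll_interval=2.0, timeout=120.0):
    """Poll until a crawl finishes. `get_status` is any callable that
    returns the crawl's current status string (e.g. via crawl_get)."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status(crawl_id)
        if status in ("completed", "failed"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError(f"crawl {crawl_id} did not finish within {timeout}s")
```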

crawl_url_sync Write

Crawl a single URL and get immediate results without creating a background job.

Parameters

{
  "url": "https://example.com",
  "result_format": "markdown",  // Optional: html, text, markdown, json
  "timeout": 30000              // Optional: timeout in milliseconds
}

Returns

{
  "data": [
    {
      "url": "https://example.com",
      "title": "Example Domain",
      "content": "# Example Domain\n\n...",
      "status_code": 200,
      "response_time_ms": 150
    }
  ],
  "errors": []
}

Use case: Perfect for quick lookups, API integrations, or when you need immediate results without polling. See Sync Crawls for full documentation.
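Because crawl_url_sync returns immediately, it pays to validate arguments up front rather than discover a bad format from an error response. A sketch of an argument builder using the formats documented above:

```python
# Allowed values for result_format, per the parameter list above.
ALLOWED_FORMATS = {"html", "text", "markdown", "json"}

def sync_crawl_args(url, result_format="markdown", timeout_ms=30000):
    """Build the arguments dict for crawl_url_sync, rejecting bad formats early."""
    if result_format not in ALLOWED_FORMATS:
        raise ValueError(f"result_format must be one of {sorted(ALLOWED_FORMATS)}")
    return {"url": url, "result_format": result_format, "timeout": timeout_ms}
```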

crawl_sitemap Read

Generate an XML sitemap from a completed crawl.

Parameters

{
  "id": "crawl_abc123",
  "page": 1  // Optional: page number for large sitemaps
}

Returns

{
  "data": {
    "sitemap": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>..."
  }
}

Use case: Perfect for SEO analysis, site audits, and content inventory. See Sitemap Generation for full documentation.
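The sitemap comes back as a raw XML string, which is easy to post-process with the standard library. A sketch that extracts page URLs, assuming the standard sitemaps.org urlset namespace:

```python
import xml.etree.ElementTree as ET

# Standard sitemap namespace, in ElementTree's Clark notation.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(sitemap_xml):
    """Extract every <loc> URL from a sitemap string returned by crawl_sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text for loc in root.iter(f"{SITEMAP_NS}loc")]

sample = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    '<url><loc>https://docs.example.com/intro</loc></url>'
    '</urlset>'
)
```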

Authentication

MCP requests use the same API keys as REST API calls. Include your key in the Authorization header:

Authorization Header
Authorization: Bearer sk_your_api_key

Scopes and Permissions

sk_ Private Keys

Full access: create crawls, read results, manage settings

pk_ Public Keys

Read-only: list and read crawls, but cannot create new ones
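When an agent might run with either key type, it helps to check the prefix up front and fail fast before attempting a write tool such as crawl_create. A trivial sketch:

```python
def key_allows_write(api_key):
    """True for private keys (sk_ prefix), which can create crawls;
    public keys (pk_ prefix) are read-only."""
    return api_key.startswith("sk_")
```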

Example Agent Workflows

Research Assistant

An agent that crawls documentation before answering questions:

"Before answering questions about React Router, I'll crawl their documentation to get the latest information."

Agent Workflow
// Agent uses crawl_create
{
  "url": "https://reactrouter.com/docs",
  "depth": 2,
  "format": "markdown",
  "wait": true
}

// Then uses crawl_get to access the content

Content Monitor

An agent that checks for updates on specific pages:

"I'll check if the pricing page has changed since last week."

Agent Workflow
// Agent uses crawl_create with URL list mode
{
  "urls": ["https://example.com/pricing"],
  "mode": "url_list",
  "wait": true
}
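For change detection, the monitor only needs to compare content across runs, not store full pages. A sketch using a whitespace-normalized hash so trivial reflows don't register as edits:

```python
import hashlib

def content_fingerprint(markdown):
    """Stable fingerprint of a page's content. Whitespace is normalized
    so reformatting alone doesn't count as a change."""
    normalized = " ".join(markdown.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def has_changed(old_content, new_content):
    """Compare content from two crawls of the same URL."""
    return content_fingerprint(old_content) != content_fingerprint(new_content)
```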

SEO Analysis

An agent that analyzes site structure and SEO health:

"I'll generate a sitemap from the crawl to analyze your site structure, identify orphan pages, and recommend SEO improvements."

Agent Workflow
// Agent uses crawl_create
{
  "url": "https://example.com",
  "depth": 3,
  "format": "markdown",
  "wait": true
}

// Then uses crawl_sitemap to get the sitemap
{
  "id": "crawl_xyz"
}

// Agent analyzes structure and provides recommendations

Error Handling

MCP tool calls may return these errors:

unauthorized Invalid or missing API key
forbidden Key lacks required permissions
not_found Crawl ID doesn't exist
rate_limited Too many requests

Agents should handle these gracefully and inform the user when issues occur.
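"Gracefully" usually means two decisions per error: what to tell the user and whether a retry makes sense. A sketch mapping the codes above, assuming only rate_limited is worth retrying:

```python
# Of the four documented errors, only rate limiting is transient.
RETRYABLE = {"rate_limited"}

USER_FACING = {
    "unauthorized": "Invalid or missing API key.",
    "forbidden": "This key lacks the required permissions.",
    "not_found": "That crawl ID doesn't exist.",
    "rate_limited": "Too many requests; backing off before retrying.",
}

def handle_tool_error(code):
    """Return (user_message, should_retry) for an MCP tool error code."""
    message = USER_FACING.get(code, f"Unexpected error: {code}")
    return message, code in RETRYABLE
```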

Rate Limits

MCP requests count toward your standard API rate limits (100 requests per minute by default). The wait option on crawl_create holds the connection open but only counts as one request.

Note: Long-running crawls with wait: true do not consume additional rate-limit quota while waiting for completion.

Next Steps