MCP Integration

Mulberry includes a built-in MCP (Model Context Protocol) server that allows AI agents like Claude to interact with your crawl data directly. This enables agents to create crawls, retrieve results, and search content as part of their workflows.

What is MCP?

The Model Context Protocol is an open standard for connecting AI models to external tools and data sources. Instead of building custom integrations, AI agents can use MCP to discover and invoke tools from any compatible server.

Configuring MCP Clients

Add Mulberry to your MCP client configuration. The exact format depends on your client; the two examples below are typical:

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "mulberry": {
      "url": "https://your-server/mcp",
      "headers": {
        "Authorization": "Bearer sk_your_api_key"
      }
    }
  }
}

Claude Code

Add to your project's .mcp.json:

{
  "mcpServers": {
    "mulberry": {
      "url": "https://your-server/mcp",
      "headers": {
        "Authorization": "Bearer sk_your_api_key"
      }
    }
  }
}
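
Both examples share the same per-server object. As a minimal sketch, that object could be generated from an environment variable so the key never lands in version control. The `mulberry_server_entry` helper and the `MULBERRY_API_KEY` variable name are assumptions made for illustration, not part of Mulberry:

```python
import json
import os


def mulberry_server_entry(base_url: str, api_key: str) -> dict:
    """Return the object that sits under the "mulberry" key in a client config."""
    return {
        "url": f"{base_url.rstrip('/')}/mcp",
        "headers": {"Authorization": f"Bearer {api_key}"},
    }


# MULBERRY_API_KEY is an assumed variable name; the fallback is the placeholder key.
api_key = os.environ.get("MULBERRY_API_KEY", "sk_your_api_key")
entry = mulberry_server_entry("https://your-server", api_key)
print(json.dumps({"mulberry": entry}, indent=2))
```

Paste the printed object under the top-level key your client expects.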

Available MCP Tools

The Mulberry MCP server exposes these tools:

crawl_list

List all crawls for the account.

// Parameters
{
  "status": "completed",  // Optional: filter by status
  "limit": 10             // Optional: max results
}

// Returns
{
  "crawls": [
    {
      "id": "crawl_abc123",
      "url": "https://docs.example.com",
      "status": "completed",
      "pages_crawled": 120
    }
  ]
}
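
Under the hood, MCP clients invoke tools with JSON-RPC 2.0 `tools/call` requests. Your client normally builds these for you, but a rough sketch of the request body for `crawl_list` may help when debugging. The `tools_call_request` helper is hypothetical:

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def tools_call_request(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request body for an MCP tools/call invocation."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = tools_call_request("crawl_list", {"status": "completed", "limit": 10})
print(json.dumps(request, indent=2))
```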

crawl_get

Get details and content from a specific crawl.

// Parameters
{
  "crawl_id": "crawl_abc123",
  "include_pages": true,  // Optional: include page content
  "page_limit": 50        // Optional: max pages to return
}

// Returns
{
  "crawl": {
    "id": "crawl_abc123",
    "url": "https://docs.example.com",
    "status": "completed"
  },
  "pages": [
    {
      "url": "https://docs.example.com/intro",
      "title": "Introduction",
      "content": "# Introduction\n\n..."
    }
  ]
}
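
An agent will often want the returned pages as a single block of context. The sketch below assumes the result has already been parsed into a dictionary with the shape shown above. `pages_to_context` and its `max_chars` cap are illustrative choices:

```python
def pages_to_context(result: dict, max_chars: int = 20000) -> str:
    """Join page content from a crawl_get result into one prompt-ready string."""
    sections = []
    for page in result.get("pages", []):
        header = f"Source: {page['url']} ({page.get('title', 'untitled')})"
        sections.append(f"{header}\n{page['content']}")
    # Truncate so a large crawl cannot overflow the model's context window.
    return "\n\n---\n\n".join(sections)[:max_chars]


example = {
    "crawl": {"id": "crawl_abc123", "status": "completed"},
    "pages": [
        {
            "url": "https://docs.example.com/intro",
            "title": "Introduction",
            "content": "# Introduction\n\n...",
        }
    ],
}
print(pages_to_context(example))
```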

crawl_create

Create a new crawl job.

// Parameters
{
  "url": "https://example.com",
  "depth": 2,
  "format": "markdown",
  "include_patterns": ["/docs/"],
  "wait": true  // Optional: wait for completion
}

// Returns
{
  "crawl": {
    "id": "crawl_xyz789",
    "url": "https://example.com",
    "status": "completed"
  },
  "pages_crawled": 45
}

Pro tip: Use "wait": true to have the agent wait for the crawl to complete before continuing. This is useful when the agent needs the results immediately.
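
If you would rather not hold a connection open, one alternative is to create the crawl without wait and poll crawl_get until the status is "completed". This is a sketch only: `call_tool` stands in for whatever function your MCP client uses to invoke a tool, and the polling interval and timeout are arbitrary defaults.

```python
import time
from typing import Callable

ToolCaller = Callable[[str, dict], dict]  # (tool name, arguments) -> parsed result


def wait_for_crawl(
    call_tool: ToolCaller,
    crawl_id: str,
    poll_seconds: float = 5.0,
    timeout_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll crawl_get until the crawl reports status "completed" or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        result = call_tool("crawl_get", {"crawl_id": crawl_id})
        if result["crawl"]["status"] == "completed":
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"crawl {crawl_id} did not complete in time")
        sleep(poll_seconds)
```

Each poll is a separate request, so a longer interval is kinder to your rate limit.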

Authentication

MCP requests use the same API keys as REST API calls. Include your key in the Authorization header:

Authorization: Bearer sk_your_api_key

Scopes and Permissions

  • Private keys (sk_) - Full access: can create crawls, read results, manage settings
  • Public keys (pk_) - Read-only: can list and read crawls, but cannot create new ones
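
A client can use the key prefix to skip calls that are bound to be rejected. The check below is a client-side sketch based only on the two rules above; the server remains the authority, and `key_allows` is a hypothetical helper:

```python
READ_TOOLS = {"crawl_list", "crawl_get"}
WRITE_TOOLS = {"crawl_create"}


def key_allows(api_key: str, tool_name: str) -> bool:
    """Guess from the key prefix whether a tool call is likely to be permitted."""
    if api_key.startswith("sk_"):
        return tool_name in READ_TOOLS | WRITE_TOOLS  # private keys: full access
    if api_key.startswith("pk_"):
        return tool_name in READ_TOOLS  # public keys: read-only
    return False  # unknown prefix: assume nothing is allowed


print(key_allows("pk_example", "crawl_create"))
```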

Example Agent Workflows

Research Assistant

An agent that crawls documentation before answering questions:

"Before answering questions about React Router, I'll crawl their documentation to get the latest information."

// Agent uses crawl_create
{
  "url": "https://reactrouter.com/docs",
  "depth": 2,
  "format": "markdown",
  "wait": true
}

// Then uses crawl_get to access the content
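
The two steps can be chained in a few lines. In this sketch, `call_tool` is a stand-in for your MCP client's tool-invocation function, and `crawl_and_fetch` is a hypothetical name:

```python
from typing import Callable

ToolCaller = Callable[[str, dict], dict]  # (tool name, arguments) -> parsed result


def crawl_and_fetch(
    call_tool: ToolCaller, url: str, depth: int = 2, page_limit: int = 50
) -> list:
    """Create a crawl, wait for it to finish, then return its pages."""
    created = call_tool(
        "crawl_create",
        {"url": url, "depth": depth, "format": "markdown", "wait": True},
    )
    fetched = call_tool(
        "crawl_get",
        {
            "crawl_id": created["crawl"]["id"],
            "include_pages": True,
            "page_limit": page_limit,
        },
    )
    return fetched.get("pages", [])
```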

Content Monitor

An agent that checks for updates on specific pages:

"I'll check if the pricing page has changed since last week."

// Agent uses crawl_create with URL list mode
{
  "urls": ["https://example.com/pricing"],
  "mode": "url_list",
  "wait": true
}
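
Detecting a change means comparing the new content with something saved from the previous run. One simple approach, sketched here, is to store a hash per URL. It assumes the pages come from a follow-up crawl_get call and that you persist the `previous` mapping yourself:

```python
import hashlib


def fingerprint(content: str) -> str:
    """Return a stable SHA-256 hex digest of a page's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def changed_urls(previous: dict, pages: list) -> list:
    """List URLs whose content is new or differs from the stored fingerprint.

    `previous` maps URL -> fingerprint recorded on an earlier run.
    """
    return [
        page["url"]
        for page in pages
        if previous.get(page["url"]) != fingerprint(page["content"])
    ]
```

Hashing flags any difference at all, including whitespace, so an agent may want to diff the stored and new content before reporting a change to the user.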

Error Handling

MCP tool calls may return the following errors:

  • unauthorized - Invalid or missing API key
  • forbidden - Key lacks required permissions
  • not_found - Crawl ID doesn't exist
  • rate_limited - Too many requests

Agents should handle these gracefully and inform the user when issues occur.
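
One way to handle these codes is to retry only rate_limited and turn everything else into a message for the user. How an error reaches your code depends on your MCP client, so this sketch assumes a hypothetical `ToolError` exception that carries the code. The retry count and backoff delays are arbitrary:

```python
import time
from typing import Callable


class ToolError(Exception):
    """Hypothetical exception raised by a client wrapper, carrying the error code."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


USER_MESSAGES = {
    "unauthorized": "The API key is missing or invalid.",
    "forbidden": "This API key lacks permission for that action.",
    "not_found": "That crawl ID does not exist.",
    "rate_limited": "Too many requests; try again shortly.",
}


def call_with_retry(
    call_tool: Callable[[str, dict], dict],
    name: str,
    arguments: dict,
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call a tool, backing off on rate_limited and surfacing other errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call_tool(name, arguments)
        except ToolError as err:
            retryable = err.code == "rate_limited" and attempt < max_attempts
            if not retryable:
                message = USER_MESSAGES.get(err.code, f"Unexpected error: {err.code}")
                raise RuntimeError(message) from err
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
    raise AssertionError("unreachable")  # every loop iteration returns or raises
```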

Rate Limits

MCP requests count toward your standard API rate limits (100 requests per minute by default). The wait option on crawl_create holds the connection open but counts as only one request.
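
An agent that makes many calls can avoid rate_limited errors by throttling itself. Below is a minimal sliding-window limiter sized to the default limit. It is a client-side sketch, and the server's own accounting may differ:

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_requests calls in any window_seconds span."""

    def __init__(
        self,
        max_requests: int = 100,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def try_acquire(self) -> bool:
        """Record a request and return True if one may be sent now."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            self._sent.append(now)
            return True
        return False


limiter = SlidingWindowLimiter()
if limiter.try_acquire():
    print("ok to send the next MCP request")
```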

Next Steps