Mulberry Documentation

Welcome to Mulberry, a self-hosted web crawling platform that integrates with AI agents via MCP (Model Context Protocol). This documentation will help you get started and make the most of the platform.

Getting Started

Deploy Mulberry, create your account, and make your first API call.

Your First Crawl

Learn how to create and manage crawl jobs via the API.

Webhooks

Set up real-time notifications for crawl events.

MCP Integration

Connect AI agents to Mulberry using the Model Context Protocol.

Core Concepts

Mulberry is built around a few key concepts:


Crawls

Jobs that fetch and process web pages. Each crawl targets a URL or list of URLs and extracts content in your chosen format.
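As a rough illustration of what a crawl job carries, the sketch below builds a request body from one or more URLs and an output format. The field names `urls` and `format` and the helper `build_crawl_request` are assumptions made for this example, not the documented Mulberry API; check the API reference for the real request schema.

```python
import json

# Output formats named in this documentation.
SUPPORTED_FORMATS = {"html", "text", "markdown", "json"}


def build_crawl_request(urls, output_format="markdown"):
    """Build a crawl job body.

    The keys 'urls' and 'format' are hypothetical; the real API may differ.
    """
    if isinstance(urls, str):
        urls = [urls]  # accept a single URL as well as a list of URLs
    if not urls:
        raise ValueError("at least one URL is required")
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {output_format!r}")
    return {"urls": list(urls), "format": output_format}


if __name__ == "__main__":
    print(json.dumps(build_crawl_request("https://example.com"), indent=2))
```

Validating the format on the client side keeps obvious mistakes from reaching the server, whatever the final request shape turns out to be.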


API Keys

Authentication tokens for accessing the API. Private keys (prefixed sk_) have full access; public keys (prefixed pk_) are read-only.
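The prefix convention can be checked on the client before a key is used. This is a minimal sketch that relies only on the `sk_` and `pk_` prefixes described above; the function name `key_access_level` and the returned labels are made up for illustration.

```python
def key_access_level(api_key: str) -> str:
    """Classify a key by its prefix (hypothetical helper, not part of Mulberry)."""
    if api_key.startswith("sk_"):
        return "full"       # private key: full access
    if api_key.startswith("pk_"):
        return "read-only"  # public key: read-only access
    raise ValueError("unrecognized API key prefix")


if __name__ == "__main__":
    print(key_access_level("pk_example"))
```

A check like this is useful for failing early, for example refusing to start a job that needs write access when only a public key is configured.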


Webhooks

HTTP callbacks that notify your systems when crawl events occur (started, completed, failed).
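A receiver for these callbacks might dispatch on the event type, as sketched below. The payload shape is an assumption: the keys `event` and `crawl_id`, and the exact event strings, are hypothetical stand-ins for whatever the webhook reference specifies.

```python
import json

# Event types named in this documentation; the exact strings are assumed.
KNOWN_EVENTS = {"started", "completed", "failed"}


def handle_webhook(raw_body: bytes) -> str:
    """Parse a webhook body and return a short summary of what happened."""
    payload = json.loads(raw_body)
    event = payload.get("event")                  # hypothetical field name
    crawl_id = payload.get("crawl_id", "unknown")  # hypothetical field name
    if event not in KNOWN_EVENTS:
        return f"ignored unknown event {event!r}"
    if event == "failed":
        return f"crawl {crawl_id} failed; consider retrying"
    return f"crawl {crawl_id} {event}"


if __name__ == "__main__":
    print(handle_webhook(b'{"event": "completed", "crawl_id": "c1"}'))
```

Ignoring unknown events rather than raising keeps the receiver working if new event types are introduced later.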

MCP Server

A Model Context Protocol server that allows AI agents to interact with Mulberry directly.

Output Formats

Mulberry can extract content in multiple formats to suit your needs:

html: Raw HTML content as crawled

text: Plain text with HTML tags stripped

markdown: Structured markdown, ideal for AI consumption (recommended)

json: Structured data with metadata
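To make the difference between the html and text formats concrete, the sketch below strips tags from a small HTML fragment using Python's standard library. It only illustrates the idea of tag stripping and is not Mulberry's extraction logic; for instance, it does not skip script or style content.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text nodes of a document and ignore the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html: str) -> str:
    """Return the text content of an HTML string with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


if __name__ == "__main__":
    print(strip_tags("<h1>Title</h1><p>Hello <b>world</b></p>"))
```

The text output loses the heading and emphasis structure, which is one reason the markdown format tends to suit AI consumption better.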

Additional Resources

Sync Crawls: Instant single-page crawling

Sitemap Generation: Generate XML sitemaps from crawls

API Keys: Authentication and permissions

Docker Hub: Latest releases and versions