Mulberry Documentation
Welcome to the Mulberry documentation. Mulberry is a self-hosted web crawling platform that integrates with AI agents via MCP (Model Context Protocol). This documentation will help you get started and make the most of the platform.
Quick Links
Getting Started
Deploy Mulberry, create your account, and make your first API call.
Your First Crawl
Learn how to create and manage crawl jobs via the API.
Webhooks
Set up real-time notifications for crawl events.
MCP Integration
Connect AI agents to Mulberry using the Model Context Protocol.
Core Concepts
Mulberry is built around a few key concepts:
- Crawls - Jobs that fetch and process web pages. Each crawl targets a URL or list of URLs and extracts content in your chosen format.
- API Keys - Authentication tokens for accessing the API. Private keys (sk_) have full access; public keys (pk_) are read-only.
- Webhooks - HTTP callbacks that notify your systems when crawl events occur (started, completed, failed).
- MCP Server - A Model Context Protocol server that allows AI agents to interact with Mulberry directly.
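As a rough illustration of the key prefixes and webhook events described above, the sketch below classifies an API key by its prefix and summarizes an incoming webhook event. It does not call Mulberry itself. The function names, the `Access` enum, and the payload fields `event` and `crawl_id` are hypothetical and chosen only for this example; only the `sk_`/`pk_` prefixes and the three event names come from this page.

```python
from enum import Enum


class Access(Enum):
    """Access levels implied by the key prefixes described above."""
    FULL = "full"
    READ_ONLY = "read-only"


def access_for_key(api_key: str) -> Access:
    """Map a key to its access level based on its prefix."""
    if api_key.startswith("sk_"):
        return Access.FULL        # private key: full access
    if api_key.startswith("pk_"):
        return Access.READ_ONLY   # public key: read-only
    raise ValueError("unrecognized API key prefix")


# Event names listed on this page.
CRAWL_EVENTS = {"started", "completed", "failed"}


def describe_event(payload: dict) -> str:
    """Summarize a webhook payload.

    The payload shape (keys "event" and "crawl_id") is an assumption
    made for this sketch, not a documented Mulberry format.
    """
    event = payload.get("event")
    if event not in CRAWL_EVENTS:
        return "ignored"
    return f"crawl {payload.get('crawl_id', 'unknown')} {event}"
```

A client could use a check like `access_for_key` to refuse write operations early when it has only been configured with a public key.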
Output Formats
Mulberry can extract content in multiple formats:
- HTML - Raw HTML content as crawled
- Text - Plain text with HTML tags stripped
- Markdown - Structured markdown, ideal for AI consumption
- JSON - Structured data with metadata
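To make the difference between the HTML and Text formats concrete, here is a minimal sketch of what "plain text with HTML tags stripped" can look like, using only Python's standard library. This is not Mulberry's extraction code; the `html_to_text` helper is hypothetical, and a real extractor would likely also skip script and style content and handle inline tags more carefully.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of a document and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html: str) -> str:
    """Return the text content of `html` with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    # Join text nodes with spaces, then collapse runs of whitespace.
    return " ".join(" ".join(parser.parts).split())
```

For example, `html_to_text("<h1>Hello</h1><p>crawled <b>page</b></p>")` yields the three text nodes joined into a single line of plain text.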
Need Help?
If you run into issues or have questions, check the GitHub Issues page or open a new issue.