Free Software

Web Data Platform for the AI Era

Self-hosted platform that crawls websites, monitors business listings, extracts structured data, and integrates with AI agents. Your infrastructure, your data, your control.


Built with battle-tested technologies

Elixir/Phoenix, PostgreSQL, Docker

mulberry-api
# Create a crawl job
curl -X POST \
  -H "Authorization: Bearer sk_..." \
  -d '{"url": "https://example.com", "depth": 2}' \
  https://your-server/api/crawls
# Response
{
  "id": "crawl_abc123",
  "status": "running",
  "pages_crawled": 47,
  "format": "markdown"
}
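
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function names are hypothetical, the `Content-Type` header is an assumption about what the API expects, and the response fields are simply those shown in the sample above.

```python
import json
import urllib.request


def build_crawl_request(base_url: str, api_key: str, url: str, depth: int) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a crawl job."""
    body = json.dumps({"url": url, "depth": depth}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/crawls",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API accepts a JSON content type.
            "Content-Type": "application/json",
        },
    )


def parse_crawl_response(raw: bytes) -> dict:
    """Decode the JSON response and keep the fields shown in the sample."""
    data = json.loads(raw)
    return {key: data.get(key) for key in ("id", "status", "pages_crawled", "format")}
```

Passing the built request to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the request easy to inspect first.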

MCP Ready

Everything You Need to Own Your Web Data

A complete platform for web data extraction, event processing, and AI integration—all self-hosted on your infrastructure.

Web Crawling Engine

Powerful, configurable crawling with multiple modes and real-time monitoring.

Website traversal & URL lists
Configurable depth & workers
Regex include/exclude patterns
HTML, Text, Markdown, JSON output
Real-time progress monitoring
Web UI for management
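
To illustrate how regex include/exclude patterns typically narrow a crawl, here is a small Python sketch. It is not Mulberry's implementation; the function name and the rule that exclude patterns win over include patterns are assumptions made for the example.

```python
import re
from typing import Iterable, List


def filter_urls(urls: Iterable[str], include: List[str], exclude: List[str]) -> List[str]:
    """Keep URLs that match any include pattern and no exclude pattern.

    An empty include list means "include everything" (an assumption).
    """
    include_res = [re.compile(p) for p in include]
    exclude_res = [re.compile(p) for p in exclude]
    kept = []
    for url in urls:
        if include_res and not any(r.search(url) for r in include_res):
            continue  # not matched by any include pattern
        if any(r.search(url) for r in exclude_res):
            continue  # explicitly excluded
        kept.append(url)
    return kept
```

For example, including `/docs/` and excluding `/old/` would keep current documentation pages and skip archived ones.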

MCP Server

Native AI agent integration.

Bearer token auth
crawl_list, crawl_get, crawl_create
Scope-based permissions
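
MCP clients invoke tools through JSON-RPC 2.0 `tools/call` messages. The sketch below builds such a message for the three tools listed above. The helper name and the argument keys (such as `url`) are assumptions; the real tool schemas are whatever the server advertises.

```python
import json

# Tool names taken from the feature list above.
KNOWN_TOOLS = {"crawl_list", "crawl_get", "crawl_create"}


def mcp_tool_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 "tools/call" request for an MCP server."""
    if name not in KNOWN_TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # Assumption: argument names are illustrative only.
        "params": {"name": name, "arguments": arguments},
    })
```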

Authentication

Secure, flexible auth options.

Magic link (passwordless)
Optional password auth
Sudo mode for sensitive ops

API Keys

Granular access control.

Private (sk_) & Public (pk_) keys
Hashed storage, expiration
Last-used tracking
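
One common way to combine prefixed keys, hashed storage, expiration, and last-used tracking is sketched below in Python. Everything here is illustrative: the hash algorithm (SHA-256), the record layout, and the function names are assumptions, not a description of how Mulberry stores keys.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

PREFIXES = {"private": "sk_", "public": "pk_"}


def generate_key(kind: str, ttl_days: Optional[int] = None,
                 now: Optional[datetime] = None) -> Tuple[str, dict]:
    """Return (plaintext key, storable record). Only the hash is stored."""
    now = now or datetime.now(timezone.utc)
    plaintext = PREFIXES[kind] + secrets.token_urlsafe(32)
    record = {
        "key_hash": hashlib.sha256(plaintext.encode()).hexdigest(),
        "expires_at": now + timedelta(days=ttl_days) if ttl_days is not None else None,
        "last_used_at": None,
    }
    return plaintext, record


def verify_key(plaintext: str, record: dict, now: Optional[datetime] = None) -> bool:
    """Check a presented key against the stored hash and expiry."""
    now = now or datetime.now(timezone.utc)
    candidate = hashlib.sha256(plaintext.encode()).hexdigest()
    if not hmac.compare_digest(candidate, record["key_hash"]):
        return False
    if record["expires_at"] is not None and now >= record["expires_at"]:
        return False
    record["last_used_at"] = now  # last-used tracking
    return True
```

Because only the hash is stored, the plaintext key can be shown to the user once at creation time and never recovered afterwards.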

Webhooks & Events

Real-time notifications.

Lifecycle events (started, completed)
Wildcard patterns (crawl.*)
Auto-retry with backoff
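
Wildcard subscriptions and retry with backoff can be sketched in a few lines of Python. The event names (`crawl.started`, `crawl.completed`), the doubling schedule, and the 60-second cap are assumptions chosen for illustration; the source only states that patterns like `crawl.*` and auto-retry with backoff are supported.

```python
from fnmatch import fnmatchcase
from typing import List


def event_matches(pattern: str, event: str) -> bool:
    """Shell-style wildcard match, e.g. "crawl.*" matches "crawl.completed"."""
    return fnmatchcase(event, pattern)


def backoff_delays(max_attempts: int, base_seconds: float = 1.0,
                   cap_seconds: float = 60.0) -> List[float]:
    """Exponential backoff: delay doubles per attempt, up to a cap."""
    return [min(cap_seconds, base_seconds * 2 ** attempt) for attempt in range(max_attempts)]
```

A delivery loop would wait `backoff_delays(n)[i]` seconds before retry `i`, giving a failing endpoint progressively more time to recover.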

Multi-Tenant

Team-ready organization.

Accounts & organizations
Role-based access (owner, admin)
Account-level settings

Live Databases

Monitor business listings and track changes across Google Maps in real time.

Google Maps business monitoring
Automated change detection
Review tracking & alerts
Configurable watched fields
Event notifications (listing.*, review.*)
Web UI for management
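
Change detection over configurable watched fields amounts to comparing two snapshots of a listing and reporting only the fields you care about. The Python sketch below shows the idea; the field names and snapshot shape are hypothetical, not Mulberry's data model.

```python
from typing import Any, Dict, Iterable, Tuple


def detect_changes(old: Dict[str, Any], new: Dict[str, Any],
                   watched_fields: Iterable[str]) -> Dict[str, Tuple[Any, Any]]:
    """Return {field: (before, after)} for watched fields whose value changed."""
    changes = {}
    for field in watched_fields:
        before, after = old.get(field), new.get(field)
        if before != after:
            changes[field] = (before, after)
    return changes
```

Fields outside the watched list are ignored, so a noisy attribute does not trigger notifications unless it is explicitly watched.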

Page Data Extraction

Structured data from any page.

Markdown, metadata, or both
Schema-based structured extraction
Custom LLM providers (OpenAI, Anthropic, Google)
REST API & MCP tool
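
With schema-based extraction it is usually worth validating what comes back before storing it. The sketch below checks an extracted record against a simple field-to-type mapping. The schema format here is an assumption made for illustration; Mulberry's actual schema format is not described on this page.

```python
from typing import Any, Dict, List


def validate_record(record: Dict[str, Any], schema: Dict[str, type]) -> List[str]:
    """Return a list of problems; an empty list means the record fits the schema."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
    return errors
```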

Data Retention

Configurable 1-7 day retention

Rate Limiting

100 req/60s with standard headers
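
A limit of 100 requests per 60 seconds can be enforced in several ways. The Python sketch below uses a sliding window, which is an assumption: the page does not say which algorithm Mulberry uses or name the exact headers. Timestamps are passed in explicitly so the behavior is easy to test.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any `window_seconds` span."""

    def __init__(self, limit: int = 100, window_seconds: float = 60.0) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits = deque()  # timestamps of accepted requests

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the window.
        while self._hits and now - self._hits[0] >= self.window:
            self._hits.popleft()
        if len(self._hits) >= self.limit:
            return False
        self._hits.append(now)
        return True

    def remaining(self) -> int:
        """Requests left in the current window (as of the last allow call)."""
        return self.limit - len(self._hits)
```

The `remaining` value is the kind of number a server would typically expose in a rate-limit response header.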

Web Dashboard

Manage crawls, listings & keys from the browser

Up and Running in Minutes

Deploy Mulberry on your VM and start crawling. No complex setup, no vendor dependencies.

1. Deploy to Your VM

Clone the repo and run a single Docker command. Mulberry comes with everything pre-configured—PostgreSQL, reverse proxy, and SSL.

2. Create Your Account

Sign up with magic link or password authentication. Set up your organization and invite team members with role-based access.

3. Generate API Keys

Create private keys for full access or public keys for read-only operations. Configure expiration and track usage automatically.

4. Start Working

Start crawling, monitor business listings with Live Databases, extract structured data from pages, or connect AI agents via MCP. Set up webhooks for real-time notifications.

Quick Start
# Pull and deploy
$ docker pull agoodway/mulberry_bot
$ docker run -d -p 4000:4000 agoodway/mulberry_bot
# Create your first crawl
$ curl -X POST \
    -H "Authorization: Bearer $API_KEY" \
    -d '{"url": "https://docs.example.com"}' \
    https://your-mulberry-server/api/crawls
# Connect AI agent via MCP
{
  "mcpServers": {
    "mulberry": {
      "url": "https://your-server/mcp",
      "token": "sk_..."
    }
  }
}

Built for Real Workflows

From AI agents to data pipelines, Mulberry powers production workloads at any scale.

AI Agent Integration

Give your AI agents the ability to crawl and understand any website. Native MCP support means Claude, GPT, and other agents can request crawls and access results directly.

MCP Tools, RAG Pipelines, Research Agents

Data Pipelines

Build automated data collection workflows. Webhooks notify your systems when crawls complete, and the REST API integrates with any ETL tool or data platform.

Webhooks, REST API, JSON Export

Content Aggregation

Monitor documentation sites, news sources, or competitor pages. Regex filtering lets you extract exactly what you need, and Markdown output is perfect for content systems.

Markdown, URL Filtering, Scheduling

Documentation Search

Index your own docs or crawl external documentation. Perfect for building searchable knowledge bases or feeding context to AI assistants.

Full Text, Site Traversal, Depth Control

Website Monitoring

Track changes across websites and business listings over time. Get notified when content changes, prices update, new reviews appear, or listing details are modified.

Change Detection, Listing Monitoring, Alerts

Reputation Monitoring

Track your business reviews and competitor listings across Google Maps. Get alerted to new reviews, rating changes, and listing modifications as they happen.

Review Tracking, Local SEO, Competitive Intel

Lead Generation

Build prospect lists from Google Maps listings and business directories. Extract contact details, hours, and ratings with Live Databases, then enrich with Page Extraction.

Live Databases, Data Enrichment, Prospect Lists

Market Research

Gather competitive intelligence and market data. URL list mode lets you crawl specific pages across multiple sites in a single job.

URL Lists, Batch Jobs, Multi-site

Your Data. Your Infrastructure.

In a world of SaaS sprawl and data concerns, Mulberry puts you back in control. Run it on your own servers, keep your data private, and never worry about vendor lock-in or surprise pricing.

Complete Data Privacy

Your crawled data stays on your infrastructure. No third-party access and no external data processing, which simplifies GDPR compliance.

Zero Recurring Costs

Free software forever. Pay only for your server costs. No per-crawl pricing, no per-call API fees, no surprise bills.

Full Customization

Modify the source code, add custom features, integrate with internal systems. It's your software to extend as needed.

No Vendor Lock-in

Standard APIs, portable data formats, open protocols. Switch, fork, or modify without losing your work.

Free: free core forever
Crawls/Month: ∞
Data Ownership: 100%

Runs on any VM with Docker. Recommended: 2 vCPU, 4 GB RAM.

Choose Your Plan

Start free with the full crawling platform. Add Pro features when you need business monitoring and structured extraction.

Free

Open-source core platform, free forever.

Web crawling engine
Webhooks & event system
MCP server for AI agents
REST API & Web Dashboard
Multi-tenant & API keys

Get Started

Pro

Business monitoring, extraction & higher limits.

Everything in Free
Live Databases & listing monitoring
Page Data Extraction (LLM-powered)
Review tracking & alerts
Higher rate limits

Contact Us

Mulberry Dashboard

Crawls, Live Databases, API Keys, Webhooks, Settings

Active Crawls: 3
Listings Tracked: 127
Pages Crawled: 4,291

docs.example.com, 247 pages, Complete
blog.example.com, 82 pages, Running
Joe's Coffee (Google Maps), 3 reviews, Monitoring

Ready to Take Control?

Deploy Mulberry on your VM today. Free software, ready for production in minutes.


Quick Install

$ docker pull agoodway/mulberry_bot
$ docker run -d -p 4000:4000 agoodway/mulberry_bot