What it does
Browserbase MCP Server provides cloud browser automation powered by Browserbase and Stagehand. It lets LLMs control headless browsers, navigate websites, interact with page elements (clicking, form filling), capture screenshots, and extract structured data from web pages. The server works with multiple LLMs, including OpenAI, Claude, and Gemini models, and offers vision support for analyzing complex DOM structures through annotated screenshots. It can manage multiple parallel browser sessions and has configuration options for stealth mode, proxies, session persistence, and custom cookies.
Who it's for
Backend and AI engineers building LLM-powered applications that need web automation. Developers creating AI-powered IDEs, chat interfaces, or custom workflows requiring programmatic browser control and web data extraction.
Common use cases
- Extract structured data from websites using LLM vision analysis
- Automate form filling and web interactions from Claude Code or similar LLM interfaces
- Capture and analyze page screenshots for complex UI understanding
- Gather research data across multiple websites in parallel sessions
- Test web applications by simulating user interactions
Setup pitfalls
- Requires BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID environment variables; obtain these from the Browserbase dashboard
- When using non-default LLM models (OpenAI, Gemini), corresponding API keys must be provided as environment variables
- Last commit was approximately 8 months ago; verify compatibility with current browser engine versions and Browserbase API changes
- Browser session persistence is enabled by default; disable with --persist false if you need fresh sessions for each request
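Because missing credentials are the first pitfall listed, a small preflight check can catch them before the server starts. The sketch below is illustrative only and not part of the Browserbase project: the helper names missing_env_vars and require_env are hypothetical, and only the two variable names come from the notes above. The variable names for model-provider keys are not given here, so pass them in explicitly if you use a non-default model.

```python
import os

# Variable names taken from the setup notes above.
REQUIRED_VARS = ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID")


def missing_env_vars(env, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]


def require_env(env, required=REQUIRED_VARS):
    """Raise RuntimeError listing every missing variable, else return None."""
    missing = missing_env_vars(env, required)
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
```

Calling require_env(os.environ) at the top of a launcher script fails fast with the full list of missing names, rather than surfacing an authentication error later in a browser session.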