Browser MCP Agent
By @gsgaurav0
A Streamlit application that allows you to browse and interact with websites using natural language commands through the Model Context Protocol (MCP).
Created 1/17/2026
Updated about 6 hours ago
🌐 Browser MCP Agent
A Streamlit application that allows you to browse and interact with websites using natural language commands through the Model Context Protocol (MCP) and MCP-Agent.
Features
- Natural Language Interface: Control a browser with simple English commands
- Full Browser Navigation: Visit websites and navigate through pages
- Interactive Elements: Click buttons, fill forms, and scroll through content
- Visual Feedback: Take screenshots of webpage elements
- Information Extraction: Extract and summarize content from webpages
- Multi-step Tasks: Complete complex browsing sequences through conversation
Setup
Requirements
- Python 3.8+
- Node.js and npm (for Playwright)
  - This is a critical requirement! The app uses Playwright to control a headless browser.
  - Download and install from nodejs.org
- OpenAI or Anthropic API Key
Installation
- Clone this repository:
  git clone https://github.com/gsgaurav0/Browser-MCP-Agent.git
  cd mcp_ai_agents/browser_mcp_agent
- Install the required Python packages:
  pip install -r requirements.txt
- Verify Node.js and npm are installed:
  node --version
  npm --version
  Both commands should return version numbers. If they don't, please install Node.js.
- Set up your API keys:
  - Set your OpenAI API key as an environment variable:
    export OPENAI_API_KEY=your-openai-api-key
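The checks in the steps above can also be scripted. The following is a minimal sketch, not part of the repository: the function name `check_prerequisites` is hypothetical, and `ANTHROPIC_API_KEY` is an assumed variable name, since this README only spells out `OPENAI_API_KEY`.

```python
import os
import shutil
import sys


def check_prerequisites(env=None, which=shutil.which, version=None):
    """Return a list of human-readable problems; an empty list means all checks passed.

    The parameters exist only so the checks can be exercised without touching
    the real environment. By default the live environment is inspected.
    """
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)
    problems = []

    # Requirement from this README: Python 3.8+
    if version < (3, 8):
        problems.append("Python 3.8+ is required")

    # Requirement from this README: Node.js and npm (used by Playwright)
    for tool in ("node", "npm"):
        if which(tool) is None:
            problems.append(f"{tool} was not found on PATH")

    # Requirement from this README: an OpenAI or Anthropic API key.
    # ANTHROPIC_API_KEY is an assumed name; adjust it to match your setup.
    if not (env.get("OPENAI_API_KEY") or env.get("ANTHROPIC_API_KEY")):
        problems.append("set OPENAI_API_KEY (or ANTHROPIC_API_KEY)")

    return problems


if __name__ == "__main__":
    found = check_prerequisites()
    for problem in found:
        print(f"missing: {problem}")
    if not found:
        print("all prerequisites found")
```

Running the script before `streamlit run main.py` should surface a missing Node.js install or API key early, rather than at the first browser command.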
Running the App
- Start the Streamlit app:
  streamlit run main.py
- In the app interface:
  - Enter your browsing command
  - Click "Run Command"
  - View the results and screenshots
Example Commands
Basic Navigation
- "Go to www.mcp-agent.com"
- "Go back to the previous page"
Interaction
- "Click on the login button"
- "Scroll down to see more content"
Content Extraction
- "Summarize the main content of this page"
- "Extract the navigation menu items"
- "Take a screenshot of the hero section"
Multi-step Tasks
- "Go to the blog, find the most recent article, and summarize its key points"
Architecture
The application uses:
- Streamlit for the user interface
- MCP (Model Context Protocol) to connect the LLM with tools
- Playwright for browser automation
- OpenAI's models to interpret commands and generate responses
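To make the "MCP connects the LLM with tools" step more concrete, here is a rough sketch of the kind of message that travels between the agent and a browser tool server. MCP is built on JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The tool name `browser_navigate` and its `url` argument are assumptions about the Playwright MCP server, and the helper `tool_call_request` is hypothetical; the app itself delegates this to its MCP client library rather than building messages by hand.

```python
import json
from itertools import count

# JSON-RPC requests carry an id so that responses can be matched to them.
_request_ids = count(1)


def tool_call_request(tool_name, arguments):
    """Build an MCP `tools/call` request as a plain dict (JSON-RPC 2.0)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# What the LLM might decide to do for the command "Go to www.mcp-agent.com".
# "browser_navigate" and "url" are assumed names, not confirmed by this README.
request = tool_call_request("browser_navigate", {"url": "https://www.mcp-agent.com"})
print(json.dumps(request, indent=2))
```

In this picture the LLM chooses the tool and its arguments, the MCP layer carries the request, and Playwright performs the action in the browser.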
Quick Setup
Installation guide for this server
Install the package (if needed)
uvx browser-mcp-agent
Cursor configuration (mcp.json)
{
  "mcpServers": {
    "gsgaurav0-browser-mcp-agent": {
      "command": "uvx",
      "args": [
        "browser-mcp-agent"
      ]
    }
  }
}
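If an mcp.json file already lists other servers, the entry above needs to be merged in rather than pasted over the file. The following is a small sketch of one way to do that; `merge_server` and `update_config_file` are hypothetical helpers, and the location of mcp.json is left to the caller because it depends on your Cursor setup.

```python
import json
from pathlib import Path

# Values copied from the configuration shown above.
SERVER_NAME = "gsgaurav0-browser-mcp-agent"
SERVER_ENTRY = {"command": "uvx", "args": ["browser-mcp-agent"]}


def merge_server(config, name, entry):
    """Return a copy of `config` with `entry` registered under mcpServers[name].

    Existing servers are preserved; the input dict is not modified.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path):
    """Read mcp.json at `path` (if it exists), add the server entry, and write it back."""
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    merged = merge_server(config, SERVER_NAME, SERVER_ENTRY)
    path.write_text(json.dumps(merged, indent=2) + "\n")
```

For example, `update_config_file("mcp.json")` run in the directory holding your configuration would add the entry while leaving any other servers untouched.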