🌐 Browser MCP Agent

A Streamlit application that lets you browse and interact with websites using natural language commands, built on the Model Context Protocol (MCP) and MCP-Agent.

Features

Natural Language Interface: Control a browser with simple English commands
Full Browser Navigation: Visit websites and navigate through pages
Interactive Elements: Click buttons, fill forms, and scroll through content
Visual Feedback: Take screenshots of webpage elements
Information Extraction: Extract and summarize content from webpages
Multi-step Tasks: Complete complex browsing sequences through conversation

Setup

Requirements

Python 3.8+
Node.js and npm (for Playwright)
- This is a critical requirement! The app uses Playwright to control a headless browser.
- Download and install it from nodejs.org.
OpenAI or Anthropic API key

Installation

Clone this repository:

```
git clone https://github.com/gsgaurav0/Browser-MCP-Agent.git
cd mcp_ai_agents/browser_mcp_agent
```

Install the required Python packages:
```
pip install -r requirements.txt
```
Verify Node.js and npm are installed:
```
node --version
npm --version
```
Both commands should return version numbers. If they don't, please install Node.js.
Set up your API keys:
- Set your OpenAI API key as an environment variable:
```
export OPENAI_API_KEY=your-openai-api-key
```
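
Before starting the app, it can help to confirm that the requirements above are in place. The following is a minimal sketch using only the Python standard library; `preflight` is a hypothetical helper written for this README and is not part of the repository. It only checks that `node` and `npm` are on your `PATH`, that `OPENAI_API_KEY` is set, and that the interpreter is Python 3.8 or newer.

```python
import os
import shutil
import sys


def preflight(env=None, which=shutil.which, python_version=None):
    """Return a list of human-readable problems; an empty list means ready.

    `env`, `which`, and `python_version` are injectable so the check can be
    exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    version = sys.version_info[:2] if python_version is None else python_version
    problems = []

    # The README lists Python 3.8+ as a requirement.
    if version < (3, 8):
        problems.append("Python 3.8+ is required")

    # Node.js and npm are needed for Playwright.
    for tool in ("node", "npm"):
        if which(tool) is None:
            problems.append(f"{tool} was not found on PATH; install Node.js from nodejs.org")

    # The setup steps export OPENAI_API_KEY.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")

    return problems


if __name__ == "__main__":
    for issue in preflight():
        print("MISSING:", issue)
```

If the script prints nothing, the basic prerequisites appear to be satisfied. It does not verify that the API key is valid or that Playwright's browsers are installed.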

Running the App

Start the Streamlit app:
```
streamlit run main.py
```
In the app interface:
- Enter your browsing command
- Click "Run Command"
- View the results and screenshots

Example Commands

Basic Navigation

"Go to www.mcp-agent.com"
"Go back to the previous page"

Interaction

"Click on the login button"
"Scroll down to see more content"

Content Extraction

"Summarize the main content of this page"
"Extract the navigation menu items"
"Take a screenshot of the hero section"

Multi-step Tasks

"Go to the blog, find the most recent article, and summarize its key points"

Architecture

The application uses:

Streamlit for the user interface
MCP (Model Context Protocol) to connect the LLM with tools
Playwright for browser automation
OpenAI's models to interpret commands and generate responses
