MCP Servers

A collection of Model Context Protocol servers, templates, tools and more.

D
Data Query Builder MCP

MCP Server that turns natural-language questions into SQL queries on CSV data. Built with Python, FastMCP, and SQLite for Gemini CLI integration.

Created 3/5/2026
Updated about 4 hours ago
Repository documentation and setup instructions

Data Query Builder — MCP Server

Overview

An MCP server that turns natural-language data questions into SQL queries. Load any CSV file into an in-memory SQLite database, explore schemas, run read-only queries, and compute column statistics — all through AI-powered tool use via Gemini CLI.

Setup

cd Project
python -m venv .venv

# Windows
.venv\Scripts\activate

# macOS / Linux
# source .venv/bin/activate

pip install "mcp[cli]"

No additional dependencies — sqlite3 and csv are part of the Python standard library.

Gemini CLI Configuration

Add the following to ~/.gemini/settings.json:

{
  "mcpServers": {
    "data-query-builder": {
      "command": "C:\\Users\\abdul\\Downloads\\Electives\\MCP\\Project\\.venv\\Scripts\\python.exe",
      "args": ["C:\\Users\\abdul\\Downloads\\Electives\\MCP\\Project\\server.py"]
    }
  }
}

After editing, relaunch Gemini CLI to pick up the new server.

Tools

load_csv

Description: Load a CSV file into a new SQLite table with auto-detected column types (INTEGER, REAL, TEXT).

| Parameter | Type | Required | Description | |-------------|-------|----------|------------------------------------------| | file_path | str | Yes | Absolute or relative path to the CSV file | | table_name | str | Yes | Name for the new SQLite table |

Example: "Load the file sample_data.csv as a table called sales" → calls load_csv


list_tables

Description: List all tables currently loaded in the database with their row counts and column info.

| Parameter | Type | Required | Description | |-----------|------|----------|-------------| | (none) | | | |

Example: "What data do I have loaded?" → calls list_tables


describe_schema

Description: Show the full database schema — all tables, columns, and their data types.

| Parameter | Type | Required | Description | |-----------|------|----------|-------------| | (none) | | | |

Example: "What columns does the sales table have?" → calls describe_schema


run_query

Description: Execute a read-only SQL SELECT query and return formatted results. Rejects any write operations (DROP, DELETE, ALTER, INSERT, UPDATE) for safety.

| Parameter | Type | Required | Description | |-----------|------|----------|--------------------------------------------| | sql | str | Yes | A SQL SELECT query | | limit | int | No | Max rows to return (default 50, max 500) |

Example: "Show me the average price per category" → calls run_query with SELECT category, AVG(price) FROM sales GROUP BY category


get_statistics

Description: Compute summary statistics for a column — count, min, max, mean, sum, and null count (numeric columns) or count, distinct, nulls (text columns).

| Parameter | Type | Required | Description | |-------------|------|----------|---------------------------------| | table_name | str | Yes | Name of the table to analyze | | column | str | Yes | Column to compute statistics for |

Example: "Give me stats on the price column" → calls get_statistics with table_name="sales", column="price"

Resources

db://schema

Current database schema as JSON — lists all tables, their columns, and data types.

db://query-history

JSON list of all SQL queries executed this session, including SQL text, row count, and column names.

Security

  • run_query rejects any SQL containing DROP, DELETE, ALTER, INSERT, UPDATE, CREATE, or other write keywords.
  • The database is in-memory only — no persistent changes to disk.
  • A default row limit of 50 prevents excessively large outputs.

Limitations

  • In-memory database: Data does not persist between server restarts. CSVs must be reloaded each session.
  • CSV only: Does not support Excel, JSON, or other file formats directly.
  • No joins at load time: Each CSV becomes a separate table; joins must be done via SQL queries.
  • Type detection is heuristic: Based on the first 100 rows — mixed-type columns may be misdetected.

Test Scenarios

| Scenario | Expected Tool Sequence | |----------|----------------------| | "Load sales data and find highest revenue region" | load_csvdescribe_schemarun_query | | "Average price per product category?" | load_csvrun_queryget_statistics | | "What tables are loaded and what columns do they have?" | list_tablesdescribe_schema |

Comparison Results

With vs. Without Tools

| Dimension | Without Tools | With Tools | |-------------|-------------------------------------------|---------------------------------------------| | Accuracy | Hallucinates numbers or uses only pasted data | Uses real computed values from SQL queries | | Specificity | Generic advice about data analysis | Specific answers from actual data | | Completeness | Limited to what's in the prompt | Can explore schema, run multiple queries | | Confidence | Hedges and qualifies heavily | Cites exact tool results and row counts | | Latency | One fast response | Multiple tool-call round-trips |

Prompting Strategy Comparison

| Aspect | Strategy 1 (Minimal) | Strategy 3 (Expert Workflow) | |----------------------|------------------------------------------|--------------------------------------------------| | Tool calls triggered | Fewer, less targeted | More, systematically sequenced | | Planning | Dives in immediately | States plan before acting | | Synthesis quality | Shallow summary | Structured analysis citing tool outputs | | Errors / dead ends | May skip schema exploration | Explores schema first, then queries purposefully |

Quick Setup
Installation guide for this server

Install Package (if required)

uvx data-query-builder-mcp

Cursor configuration (mcp.json)

{ "mcpServers": { "albekbashy-data-query-builder-mcp": { "command": "uvx", "args": [ "data-query-builder-mcp" ] } } }