🚀 Rename to mcp-pdf and prepare for PyPI publication
**Package Rebranding:**
- Renamed package from mcp-pdf-tools to mcp-pdf (cleaner name)
- Updated version to 1.0.0 (production ready with security hardening)
- Updated all import paths and references throughout codebase

**PyPI Preparation:**
- Enhanced package description and metadata
- Added proper project URLs and homepage
- Updated CLI command from mcp-pdf-tools to mcp-pdf
- Built distribution packages (wheel + source)

**Testing & Validation:**
- All 20 security tests pass with new package structure
- Local installation and import tests successful
- CLI command working correctly
- Package ready for PyPI publication

The secure, production-ready PDF processing platform is now ready for public distribution and installation via pip.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
parent 75f8548668
commit 8d01c44d4f
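The "local installation and import tests" mentioned in the commit message are not shown in this diff. A minimal sketch of such a post-rename check might look like the following; it assumes only the distribution name `mcp-pdf` and the `mcp-pdf = "mcp_pdf.server:main"` script mapping that this commit puts in `[project.scripts]`. The helper names `installed_version` and `describe` are hypothetical and not part of the project.

```python
from importlib.metadata import EntryPoint, PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# The console-script mapping from [project.scripts], reproduced here as data
# (an assumption copied from the diff, not read from an installed package).
entry = EntryPoint(
    name="mcp-pdf",
    value="mcp_pdf.server:main",
    group="console_scripts",
)


def describe(dist_name="mcp-pdf"):
    """Summarise install status and the module/function the CLI would call."""
    found = installed_version(dist_name)
    status = f"installed, version {found}" if found else "not installed"
    return f"{dist_name}: {status}; CLI '{entry.name}' -> {entry.module}.{entry.attr}()"


if __name__ == "__main__":
    print(describe())                 # the new distribution name
    print(describe("mcp-pdf-tools"))  # the old name, expected to be absent after the rename
```

Running this in the environment where the package was installed should report the new name as installed and the old one as missing; it does not exercise the server itself.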
@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 ## Project Overview
 
-MCP PDF Tools is a FastMCP server that provides comprehensive PDF processing capabilities including text extraction, table extraction, OCR, image extraction, and format conversion. The server is built on the FastMCP framework and provides intelligent method selection with automatic fallbacks.
+MCP PDF is a FastMCP server that provides comprehensive PDF processing capabilities including text extraction, table extraction, OCR, image extraction, and format conversion. The server is built on the FastMCP framework and provides intelligent method selection with automatic fallbacks.
 
 ## Development Commands
 
@@ -59,7 +59,7 @@ uv run safety check --json && uv run pip-audit --format=json
 ### Running the Server
 ```bash
 # Run MCP server directly
-uv run mcp-pdf-tools
+uv run mcp-pdf
 
 # Verify installation
 uv run python examples/verify_installation.py
46
README.md
@@ -1,8 +1,8 @@
 <div align="center">
 
-# 📄 MCP PDF Tools
+# 📄 MCP PDF
 
-<img src="https://img.shields.io/badge/MCP-PDF%20Tools-red?style=for-the-badge&logo=adobe-acrobat-reader" alt="MCP PDF Tools">
+<img src="https://img.shields.io/badge/MCP-PDF%20Tools-red?style=for-the-badge&logo=adobe-acrobat-reader" alt="MCP PDF">
 
 **🚀 The Ultimate PDF Processing Intelligence Platform for AI**
 
@@ -11,7 +11,7 @@
 [](https://www.python.org/downloads/)
 [](https://github.com/jlowin/fastmcp)
 [](https://opensource.org/licenses/MIT)
-[](https://github.com/rpm/mcp-pdf-tools)
+[](https://github.com/rpm/mcp-pdf)
 [](https://modelcontextprotocol.io)
 
 **🤝 Perfect Companion to [MCP Office Tools](https://git.supported.systems/MCP/mcp-office-tools)**
@@ -20,17 +20,17 @@
 
 ---
 
-## ✨ **What Makes MCP PDF Tools Revolutionary?**
+## ✨ **What Makes MCP PDF Revolutionary?**
 
 > 🎯 **The Problem**: PDFs contain incredible intelligence, but extracting it reliably is complex, slow, and often fails.
 >
-> ⚡ **The Solution**: MCP PDF Tools delivers **AI-powered document intelligence** with **23 specialized tools** that understand both content and structure.
+> ⚡ **The Solution**: MCP PDF delivers **AI-powered document intelligence** with **23 specialized tools** that understand both content and structure.
 
 <table>
 <tr>
 <td>
 
-### 🏆 **Why MCP PDF Tools Leads**
+### 🏆 **Why MCP PDF Leads**
 - **🚀 23 Specialized Tools** for every PDF scenario
 - **🧠 AI-Powered Intelligence** beyond basic extraction
 - **🔄 Multi-Library Fallbacks** for 99.9% reliability
@@ -59,8 +59,8 @@
 
 ```bash
 # 1️⃣ Clone and install
-git clone https://github.com/rpm/mcp-pdf-tools
-cd mcp-pdf-tools
+git clone https://github.com/rpm/mcp-pdf
+cd mcp-pdf
 uv sync
 
 # 2️⃣ Install system dependencies (Ubuntu/Debian)
@@ -70,7 +70,7 @@ sudo apt-get install tesseract-ocr tesseract-ocr-eng poppler-utils ghostscript
 uv run python examples/verify_installation.py
 
 # 4️⃣ Run the MCP server
-uv run mcp-pdf-tools
+uv run mcp-pdf
 ```
 
 <details>
@@ -82,8 +82,8 @@ Add to your `claude_desktop_config.json`:
 "mcpServers": {
 "pdf-tools": {
 "command": "uv",
-"args": ["run", "mcp-pdf-tools"],
-"cwd": "/path/to/mcp-pdf-tools"
+"args": ["run", "mcp-pdf"],
+"cwd": "/path/to/mcp-pdf"
 }
 }
 }
@@ -406,7 +406,7 @@ classification = await classify_content("mystery-document.pdf")
 
 | 🔧 **Processing Need** | 📄 **PDF Files** | 📊 **Office Files** | 🔗 **Integration** |
 |-----------------------|------------------|-------------------|-------------------|
-| **Text Extraction** | MCP PDF Tools ✅ | [MCP Office Tools](https://git.supported.systems/MCP/mcp-office-tools) ✅ | **Unified API** |
+| **Text Extraction** | MCP PDF ✅ | [MCP Office Tools](https://git.supported.systems/MCP/mcp-office-tools) ✅ | **Unified API** |
 | **Table Processing** | Advanced ✅ | Advanced ✅ | **Cross-Format** |
 | **Image Extraction** | Smart ✅ | Smart ✅ | **Consistent** |
 | **Format Detection** | AI-Powered ✅ | AI-Powered ✅ | **Intelligent** |
@@ -464,8 +464,8 @@ comparison = await compare_cross_format_documents([
 
 ```bash
 # Clone repository
-git clone https://github.com/rpm/mcp-pdf-tools
-cd mcp-pdf-tools
+git clone https://github.com/rpm/mcp-pdf
+cd mcp-pdf
 
 # Install with uv (fastest)
 uv sync
@@ -491,7 +491,7 @@ RUN apt-get update && apt-get install -y \
 COPY . /app
 WORKDIR /app
 RUN pip install -e .
-CMD ["mcp-pdf-tools"]
+CMD ["mcp-pdf"]
 ```
 
 </details>
@@ -504,8 +504,8 @@ CMD ["mcp-pdf-tools"]
 "mcpServers": {
 "pdf-tools": {
 "command": "uv",
-"args": ["run", "mcp-pdf-tools"],
-"cwd": "/path/to/mcp-pdf-tools"
+"args": ["run", "mcp-pdf"],
+"cwd": "/path/to/mcp-pdf"
 },
 "office-tools": {
 "command": "mcp-office-tools"
@@ -523,8 +523,8 @@ CMD ["mcp-pdf-tools"]
 
 ```bash
 # Clone and setup
-git clone https://github.com/rpm/mcp-pdf-tools
-cd mcp-pdf-tools
+git clone https://github.com/rpm/mcp-pdf
+cd mcp-pdf
 uv sync --dev
 
 # Quality checks
@@ -620,8 +620,8 @@ uv run python examples/verify_installation.py
 
 ### **🌟 Join the PDF Intelligence Revolution!**
 
-[](https://github.com/rpm/mcp-pdf-tools)
-[](https://github.com/rpm/mcp-pdf-tools/issues)
+[](https://github.com/rpm/mcp-pdf)
+[](https://github.com/rpm/mcp-pdf/issues)
 [](https://git.supported.systems/MCP/mcp-office-tools)
 
 **💬 Enterprise Support Available** • **🐛 Bug Bounty Program** • **💡 Feature Requests Welcome**
@@ -649,7 +649,7 @@ uv run python examples/verify_installation.py
 
 ### **🔗 Complete Document Processing Solution**
 
-**PDF Intelligence** ➜ **[MCP PDF Tools](https://github.com/rpm/mcp-pdf-tools)** (You are here!)
+**PDF Intelligence** ➜ **[MCP PDF](https://github.com/rpm/mcp-pdf)** (You are here!)
 **Office Intelligence** ➜ **[MCP Office Tools](https://git.supported.systems/MCP/mcp-office-tools)**
 **Unified Power** ➜ **Both Tools Together**
 
@@ -657,7 +657,7 @@ uv run python examples/verify_installation.py
 
 ### **⭐ Star both repositories for the complete solution! ⭐**
 
-**📄 [Star MCP PDF Tools](https://github.com/rpm/mcp-pdf-tools)** • **📊 [Star MCP Office Tools](https://git.supported.systems/MCP/mcp-office-tools)**
+**📄 [Star MCP PDF](https://github.com/rpm/mcp-pdf)** • **📊 [Star MCP Office Tools](https://git.supported.systems/MCP/mcp-office-tools)**
 
 *Building the future of intelligent document processing* 🚀
 
BIN examples/test_demo.avi (Normal file)
Binary file not shown.
BIN examples/test_demo.mp4 (Normal file)
Binary file not shown.
@@ -12,7 +12,7 @@ from pathlib import Path
 # Add the src directory to the path
 sys.path.insert(0, str(Path(__file__).parent.parent / "src"))
 
-from mcp_pdf_tools.server import create_server
+from mcp_pdf.server import create_server
 
 
 async def call_tool(mcp, tool_name: str, **kwargs):
@@ -10,7 +10,7 @@ import os
 # Add src to path for development
 sys.path.insert(0, '../src')
 
-from mcp_pdf_tools.server import (
+from mcp_pdf.server import (
 extract_text, extract_metadata, pdf_to_markdown,
 extract_tables, is_scanned_pdf
 )
@@ -12,7 +12,7 @@ sys.path.insert(0, str(Path(__file__).parent.parent / "src"))
 
 async def main():
 try:
-from mcp_pdf_tools import create_server, __version__
+from mcp_pdf import create_server, __version__
 
 print(f"✅ MCP PDF Tools v{__version__} imported successfully!")
 
@@ -1,8 +1,8 @@
 [project]
-name = "mcp-pdf-tools"
-version = "0.1.0"
-description = "FastMCP server for comprehensive PDF processing - text extraction, OCR, table extraction, and more"
-authors = [{name = "RPM", email = "rpm@example.com"}]
+name = "mcp-pdf"
+version = "1.0.0"
+description = "Secure FastMCP server for comprehensive PDF processing - text extraction, OCR, table extraction, forms, annotations, and more"
+authors = [{name = "MCP Team", email = "team@fastmcp.org"}]
 readme = "README.md"
 license = {text = "MIT"}
 requires-python = ">=3.10"
@@ -48,13 +48,14 @@ dependencies = [
 ]
 
 [project.urls]
-Homepage = "https://github.com/rpm/mcp-pdf-tools"
-Documentation = "https://github.com/rpm/mcp-pdf-tools#readme"
-Repository = "https://github.com/rpm/mcp-pdf-tools.git"
-Issues = "https://github.com/rpm/mcp-pdf-tools/issues"
+Homepage = "https://github.com/rsp2k/mcp-pdf"
+Documentation = "https://github.com/rsp2k/mcp-pdf#readme"
+Repository = "https://github.com/rsp2k/mcp-pdf.git"
+Issues = "https://github.com/rsp2k/mcp-pdf/issues"
+Changelog = "https://github.com/rsp2k/mcp-pdf/releases"
 
 [project.scripts]
-mcp-pdf-tools = "mcp_pdf_tools.server:main"
+mcp-pdf = "mcp_pdf.server:main"
 
 [project.optional-dependencies]
 dev = [
@@ -6,7 +6,7 @@ Integration test to verify basic functionality after security hardening
 import tempfile
 from pathlib import Path
 from reportlab.pdfgen import canvas
-from src.mcp_pdf_tools.server import create_server, validate_pdf_path, validate_page_count
+from src.mcp_pdf.server import create_server, validate_pdf_path, validate_page_count
 import fitz
 
 
@@ -10,7 +10,7 @@ import os
 # Add src to path
 sys.path.insert(0, 'src')
 
-from mcp_pdf_tools.server import parse_pages_parameter
+from mcp_pdf.server import parse_pages_parameter
 
 def test_page_parsing():
 """Test page parameter parsing (1-based user input -> 0-based internal)"""
@@ -7,7 +7,7 @@ Tests the security hardening we implemented
 import pytest
 import tempfile
 from pathlib import Path
-from src.mcp_pdf_tools.server import (
+from src.mcp_pdf.server import (
 validate_image_id,
 validate_output_path,
 safe_json_parse,
@@ -10,7 +10,7 @@ import os
 # Add src to path
 sys.path.insert(0, 'src')
 
-from mcp_pdf_tools.server import validate_pdf_path, download_pdf_from_url
+from mcp_pdf.server import validate_pdf_path, download_pdf_from_url
 
 async def test_url_validation():
 """Test URL validation and download"""
@@ -7,7 +7,7 @@ import base64
 import pandas as pd
 from pathlib import Path
 
-from mcp_pdf_tools.server import (
+from mcp_pdf.server import (
 create_server,
 validate_pdf_path,
 detect_scanned_pdf,
4
uv.lock
generated
@@ -1031,8 +1031,8 @@ wheels = [
 ]
 
 [[package]]
-name = "mcp-pdf-tools"
-version = "0.1.0"
+name = "mcp-pdf"
+version = "1.0.0"
 source = { editable = "." }
 dependencies = [
 { name = "camelot-py", extra = ["cv"] },