🚀 Release v1.0.1: Bug fixes and local development tools
- Fix variable scope bug in extract_text function
- Add local development setup with claude-mcp-manager
- Update author information
- Add comprehensive local development documentation
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
parent
8d01c44d4f
commit
ebf6bb8a43
186
LOCAL_DEVELOPMENT.md
Normal file
186
LOCAL_DEVELOPMENT.md
Normal file
@ -0,0 +1,186 @@
|
||||
# 🔧 Local Development Guide for MCP PDF
|
||||
|
||||
This guide shows how to test MCP PDF locally during development before publishing to PyPI.
|
||||
|
||||
## 📋 Prerequisites
|
||||
|
||||
- Python 3.10+
|
||||
- uv package manager
|
||||
- Claude Desktop app
|
||||
- Git repository cloned locally
|
||||
|
||||
## 🚀 Quick Start for Local Testing
|
||||
|
||||
### 1. Clone and Setup
|
||||
|
||||
```bash
|
||||
# Clone the repository
|
||||
git clone https://github.com/rsp2k/mcp-pdf.git
|
||||
cd mcp-pdf
|
||||
|
||||
# Install dependencies
|
||||
uv sync --dev
|
||||
|
||||
# Verify installation
|
||||
uv run python -c "from mcp_pdf.server import create_server; print('✅ MCP PDF loads successfully')"
|
||||
```
|
||||
|
||||
### 2. Test with Claude Code (Local Development)
|
||||
|
||||
Use the `-t local` flag to point Claude Code to your local development copy:
|
||||
|
||||
```bash
|
||||
# Start Claude Code with local MCP PDF server
|
||||
claude-code -t local /path/to/mcp-pdf
|
||||
```
|
||||
|
||||
Or if you're already in the mcp-pdf directory:
|
||||
|
||||
```bash
|
||||
claude-code -t local .
|
||||
```
|
||||
|
||||
### 3. Alternative: Manual Server Testing
|
||||
|
||||
You can also run the server manually for debugging:
|
||||
|
||||
```bash
|
||||
# Run the MCP server directly
|
||||
uv run mcp-pdf
|
||||
|
||||
# Or run with specific FastMCP options
|
||||
uv run python -m mcp_pdf.server
|
||||
```
|
||||
|
||||
### 4. Test Core Functionality
|
||||
|
||||
Once connected to Claude Code, test these key features:
|
||||
|
||||
#### Basic PDF Processing
|
||||
```
|
||||
"Extract text from this PDF file: /path/to/test.pdf"
|
||||
"Get metadata from this PDF: /path/to/document.pdf"
|
||||
"Check if this PDF is scanned: /path/to/scan.pdf"
|
||||
```
|
||||
|
||||
#### Security Features
|
||||
```
|
||||
"Try to extract text from a very large PDF"
|
||||
"Process a PDF with 2000 pages" (should be limited to 1000)
|
||||
```
|
||||
|
||||
#### Advanced Features
|
||||
```
|
||||
"Extract tables from this PDF: /path/to/tables.pdf"
|
||||
"Convert this PDF to markdown: /path/to/document.pdf"
|
||||
"Add annotations to this PDF: /path/to/target.pdf"
|
||||
```
|
||||
|
||||
## 🔒 Security Testing
|
||||
|
||||
Verify the security hardening works:
|
||||
|
||||
### File Size Limits
|
||||
- Try processing a PDF larger than 100MB
|
||||
- Should see: "PDF file too large: X bytes > 104857600"
|
||||
|
||||
### Page Count Limits
|
||||
- Try processing a PDF with >1000 pages
|
||||
- Should see: "PDF too large for processing: X pages > 1000"
|
||||
|
||||
### Path Traversal Protection
|
||||
- Test with malicious paths like `../../../etc/passwd`
|
||||
- Should be blocked with security error
|
||||
|
||||
### JSON Input Validation
|
||||
- Large JSON inputs (>10KB) should be rejected
|
||||
- Malformed JSON should return clean error messages
|
||||
|
||||
## 🐛 Debugging
|
||||
|
||||
### Enable Debug Logging
|
||||
```bash
|
||||
export DEBUG=true
|
||||
uv run mcp-pdf
|
||||
```
|
||||
|
||||
### Check Security Functions
|
||||
```bash
|
||||
# Test security validation functions
|
||||
uv run python test_security_features.py
|
||||
|
||||
# Run integration tests
|
||||
uv run python test_integration.py
|
||||
```
|
||||
|
||||
### Verify Package Structure
|
||||
```bash
|
||||
# Check package builds correctly
|
||||
uv build
|
||||
|
||||
# Verify package metadata
|
||||
uv run twine check dist/*
|
||||
```
|
||||
|
||||
## 📊 Testing Checklist
|
||||
|
||||
Before publishing, verify:
|
||||
|
||||
- [ ] All 23 PDF tools work correctly
|
||||
- [ ] Security limits are enforced (file size, page count)
|
||||
- [ ] Error messages are clean and helpful
|
||||
- [ ] No sensitive information leaked in errors
|
||||
- [ ] Path traversal protection works
|
||||
- [ ] JSON input validation works
|
||||
- [ ] Memory limits prevent crashes
|
||||
- [ ] CLI command `mcp-pdf` works
|
||||
- [ ] Package imports correctly: `from mcp_pdf.server import create_server`
|
||||
|
||||
## 🚀 Publishing Pipeline
|
||||
|
||||
Once local testing passes:
|
||||
|
||||
1. **Version Bump**: Update version in `pyproject.toml`
|
||||
2. **Build**: `uv build`
|
||||
3. **Test Upload**: `uv run twine upload --repository testpypi dist/*`
|
||||
4. **Test Install**: `pip install -i https://test.pypi.org/simple/ mcp-pdf`
|
||||
5. **Production Upload**: `uv run twine upload dist/*`
|
||||
|
||||
## 🔧 Development Commands
|
||||
|
||||
```bash
|
||||
# Format code
|
||||
uv run black src/ tests/
|
||||
|
||||
# Lint code
|
||||
uv run ruff check src/ tests/
|
||||
|
||||
# Run tests
|
||||
uv run pytest
|
||||
|
||||
# Security scan
|
||||
uv run pip-audit
|
||||
|
||||
# Build package
|
||||
uv build
|
||||
|
||||
# Install editable for development
|
||||
pip install -e . # (in a venv)
|
||||
```
|
||||
|
||||
## 🆘 Troubleshooting
|
||||
|
||||
### "Module not found" errors
|
||||
- Ensure you're in the right directory
|
||||
- Run `uv sync` to install dependencies
|
||||
- Check Python path with `uv run python -c "import sys; print(sys.path)"`
|
||||
|
||||
### MCP server won't start
|
||||
- Check that all system dependencies are installed (tesseract, java, ghostscript)
|
||||
- Verify with: `uv run python examples/verify_installation.py`
|
||||
|
||||
### Security tests fail
|
||||
- Run `uv run python test_security_features.py -v` for detailed output
|
||||
- Check that security constants are properly set
|
||||
|
||||
This setup allows for rapid development and testing without polluting your system Python or needing to publish to PyPI for every change.
|
||||
239
claude-mcp-manager
Normal file
239
claude-mcp-manager
Normal file
@ -0,0 +1,239 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Claude MCP Manager - Easy management of MCP servers in Claude Desktop
|
||||
Usage: claude mcp add <name> <command> [args...]
|
||||
"""
|
||||
|
||||
import json
|
||||
import sys
|
||||
import os
|
||||
from pathlib import Path
|
||||
import shutil
|
||||
import subprocess
|
||||
from typing import Dict, List, Any, Optional
|
||||
|
||||
|
||||
class ClaudeMCPManager:
|
||||
def __init__(self):
|
||||
self.config_path = Path.home() / ".config" / "Claude" / "claude_desktop_config.json"
|
||||
self.config_backup_dir = Path.home() / ".config" / "Claude" / "backups"
|
||||
self.config_backup_dir.mkdir(exist_ok=True)
|
||||
|
||||
def load_config(self) -> Dict[str, Any]:
|
||||
"""Load Claude Desktop configuration"""
|
||||
if not self.config_path.exists():
|
||||
return {"mcpServers": {}, "globalShortcut": ""}
|
||||
|
||||
try:
|
||||
with open(self.config_path) as f:
|
||||
return json.load(f)
|
||||
except json.JSONDecodeError as e:
|
||||
print(f"❌ Error parsing config: {e}")
|
||||
sys.exit(1)
|
||||
|
||||
def save_config(self, config: Dict[str, Any]):
|
||||
"""Save configuration with backup"""
|
||||
# Create backup
|
||||
if self.config_path.exists():
|
||||
backup_name = f"claude_desktop_config_backup_{int(__import__('time').time())}.json"
|
||||
backup_path = self.config_backup_dir / backup_name
|
||||
shutil.copy2(self.config_path, backup_path)
|
||||
print(f"📁 Config backed up to: {backup_path}")
|
||||
|
||||
# Save new config
|
||||
with open(self.config_path, 'w') as f:
|
||||
json.dump(config, f, indent=2)
|
||||
print(f"✅ Configuration saved to: {self.config_path}")
|
||||
|
||||
def add_server(self, name: str, command: str, args: List[str], env: Optional[Dict[str, str]] = None, directory: Optional[str] = None):
|
||||
"""Add a new MCP server"""
|
||||
config = self.load_config()
|
||||
|
||||
if name in config["mcpServers"]:
|
||||
print(f"⚠️ Server '{name}' already exists. Use 'claude mcp update' to modify.")
|
||||
return False
|
||||
|
||||
server_config = {
|
||||
"command": command,
|
||||
"args": args
|
||||
}
|
||||
|
||||
if env:
|
||||
server_config["env"] = env
|
||||
|
||||
if directory:
|
||||
server_config["cwd"] = directory
|
||||
|
||||
config["mcpServers"][name] = server_config
|
||||
self.save_config(config)
|
||||
print(f"🚀 Added MCP server: {name}")
|
||||
return True
|
||||
|
||||
def remove_server(self, name: str):
|
||||
"""Remove an MCP server"""
|
||||
config = self.load_config()
|
||||
|
||||
if name not in config["mcpServers"]:
|
||||
print(f"❌ Server '{name}' not found")
|
||||
return False
|
||||
|
||||
del config["mcpServers"][name]
|
||||
self.save_config(config)
|
||||
print(f"🗑️ Removed MCP server: {name}")
|
||||
return True
|
||||
|
||||
def list_servers(self):
|
||||
"""List all configured MCP servers"""
|
||||
config = self.load_config()
|
||||
servers = config.get("mcpServers", {})
|
||||
|
||||
if not servers:
|
||||
print("📭 No MCP servers configured")
|
||||
return
|
||||
|
||||
print("📋 Configured MCP servers:")
|
||||
print("=" * 50)
|
||||
|
||||
for name, server_config in servers.items():
|
||||
command = server_config.get("command", "")
|
||||
args = server_config.get("args", [])
|
||||
env = server_config.get("env", {})
|
||||
cwd = server_config.get("cwd", "")
|
||||
|
||||
print(f"🔧 {name}")
|
||||
print(f" Command: {command}")
|
||||
if args:
|
||||
print(f" Args: {' '.join(args)}")
|
||||
if env:
|
||||
print(f" Environment: {dict(list(env.items())[:3])}{'...' if len(env) > 3 else ''}")
|
||||
if cwd:
|
||||
print(f" Directory: {cwd}")
|
||||
print()
|
||||
|
||||
def add_mcp_pdf_local(self, directory: str):
|
||||
"""Add MCP PDF from local development directory"""
|
||||
abs_dir = os.path.abspath(directory)
|
||||
|
||||
if not os.path.exists(abs_dir):
|
||||
print(f"❌ Directory not found: {abs_dir}")
|
||||
return False
|
||||
|
||||
# Check if it's a valid MCP PDF directory
|
||||
required_files = ["pyproject.toml", "src/mcp_pdf/server.py"]
|
||||
for file in required_files:
|
||||
if not os.path.exists(os.path.join(abs_dir, file)):
|
||||
print(f"❌ Not a valid MCP PDF directory (missing: {file})")
|
||||
return False
|
||||
|
||||
return self.add_server(
|
||||
name="mcp-pdf-local",
|
||||
command="uv",
|
||||
args=[
|
||||
"--directory", abs_dir,
|
||||
"run", "mcp-pdf"
|
||||
],
|
||||
env={"PDF_TEMP_DIR": "/tmp/mcp-pdf-processing"},
|
||||
directory=abs_dir
|
||||
)
|
||||
|
||||
def add_mcp_pdf_pip(self):
|
||||
"""Add MCP PDF from pip installation"""
|
||||
return self.add_server(
|
||||
name="mcp-pdf",
|
||||
command="mcp-pdf",
|
||||
args=[],
|
||||
env={"PDF_TEMP_DIR": "/tmp/mcp-pdf-processing"}
|
||||
)
|
||||
|
||||
|
||||
def print_usage():
|
||||
"""Print usage information"""
|
||||
print("""
|
||||
🔧 Claude MCP Manager - Easy MCP server management
|
||||
|
||||
USAGE:
|
||||
claude mcp add <name> <command> [args...] # Add generic MCP server
|
||||
claude mcp add-local <directory> # Add MCP PDF from local dev
|
||||
claude mcp add-pip # Add MCP PDF from pip
|
||||
claude mcp remove <name> # Remove MCP server
|
||||
claude mcp list # List all servers
|
||||
claude mcp help # Show this help
|
||||
|
||||
EXAMPLES:
|
||||
# Add MCP PDF from local development
|
||||
claude mcp add-local /home/user/mcp-pdf
|
||||
|
||||
# Add MCP PDF from pip (after pip install mcp-pdf)
|
||||
claude mcp add-pip
|
||||
|
||||
# Add generic MCP server
|
||||
claude mcp add memory npx -y @modelcontextprotocol/server-memory
|
||||
|
||||
# Add server with environment variables
|
||||
claude mcp add github docker run -i --rm -e GITHUB_TOKEN ghcr.io/github/github-mcp-server
|
||||
|
||||
# Remove a server
|
||||
claude mcp remove mcp-pdf-local
|
||||
|
||||
# List all configured servers
|
||||
claude mcp list
|
||||
|
||||
NOTES:
|
||||
• Configuration saved to: ~/.config/Claude/claude_desktop_config.json
|
||||
• Automatic backups created before changes
|
||||
• Restart Claude Desktop after adding/removing servers
|
||||
""")
|
||||
|
||||
|
||||
def main():
|
||||
if len(sys.argv) < 2:
|
||||
print_usage()
|
||||
sys.exit(1)
|
||||
|
||||
manager = ClaudeMCPManager()
|
||||
command = sys.argv[1].lower()
|
||||
|
||||
if command == "add":
|
||||
if len(sys.argv) < 4:
|
||||
print("❌ Usage: claude mcp add <name> <command> [args...]")
|
||||
sys.exit(1)
|
||||
|
||||
name = sys.argv[2]
|
||||
command = sys.argv[3]
|
||||
args = sys.argv[4:] if len(sys.argv) > 4 else []
|
||||
|
||||
manager.add_server(name, command, args)
|
||||
|
||||
elif command == "add-local":
|
||||
if len(sys.argv) != 3:
|
||||
print("❌ Usage: claude mcp add-local <directory>")
|
||||
sys.exit(1)
|
||||
|
||||
directory = sys.argv[2]
|
||||
manager.add_mcp_pdf_local(directory)
|
||||
|
||||
elif command == "add-pip":
|
||||
manager.add_mcp_pdf_pip()
|
||||
|
||||
elif command == "remove":
|
||||
if len(sys.argv) != 3:
|
||||
print("❌ Usage: claude mcp remove <name>")
|
||||
sys.exit(1)
|
||||
|
||||
name = sys.argv[2]
|
||||
manager.remove_server(name)
|
||||
|
||||
elif command == "list":
|
||||
manager.list_servers()
|
||||
|
||||
elif command in ["help", "--help", "-h"]:
|
||||
print_usage()
|
||||
|
||||
else:
|
||||
print(f"❌ Unknown command: {command}")
|
||||
print_usage()
|
||||
sys.exit(1)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@ -1,8 +1,8 @@
|
||||
[project]
|
||||
name = "mcp-pdf"
|
||||
version = "1.0.0"
|
||||
version = "1.0.1"
|
||||
description = "Secure FastMCP server for comprehensive PDF processing - text extraction, OCR, table extraction, forms, annotations, and more"
|
||||
authors = [{name = "MCP Team", email = "team@fastmcp.org"}]
|
||||
authors = [{name = "Ryan Malloy", email = "ryan@malloys.us"}]
|
||||
readme = "README.md"
|
||||
license = {text = "MIT"}
|
||||
requires-python = ">=3.10"
|
||||
@ -98,4 +98,5 @@ dev = [
|
||||
"pytest-cov>=6.2.1",
|
||||
"reportlab>=4.4.3",
|
||||
"safety>=3.2.11",
|
||||
"twine>=6.1.0",
|
||||
]
|
||||
|
||||
@ -547,6 +547,9 @@ async def extract_text(
|
||||
}
|
||||
doc.close()
|
||||
|
||||
# Enforce MCP hard limit regardless of user max_tokens setting
|
||||
effective_max_tokens = min(max_tokens, 24000) # Stay safely under MCP's 25000 limit
|
||||
|
||||
# Early chunking decision based on size analysis
|
||||
should_chunk_early = (
|
||||
total_pages > 50 or # Large page count
|
||||
@ -592,9 +595,6 @@ async def extract_text(
|
||||
# Estimate token count (rough approximation: 1 token ≈ 4 characters)
|
||||
estimated_tokens = len(text) // 4
|
||||
|
||||
# Enforce MCP hard limit regardless of user max_tokens setting
|
||||
effective_max_tokens = min(max_tokens, 24000) # Stay safely under MCP's 25000 limit
|
||||
|
||||
# Handle large responses with intelligent chunking
|
||||
if estimated_tokens > effective_max_tokens:
|
||||
# Calculate chunk size based on effective token limit
|
||||
|
||||
4
uv.lock
generated
4
uv.lock
generated
@ -1032,7 +1032,7 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "mcp-pdf"
|
||||
version = "1.0.0"
|
||||
version = "1.0.1"
|
||||
source = { editable = "." }
|
||||
dependencies = [
|
||||
{ name = "camelot-py", extra = ["cv"] },
|
||||
@ -1073,6 +1073,7 @@ dev = [
|
||||
{ name = "pytest-cov" },
|
||||
{ name = "reportlab" },
|
||||
{ name = "safety" },
|
||||
{ name = "twine" },
|
||||
]
|
||||
|
||||
[package.metadata]
|
||||
@ -1112,6 +1113,7 @@ dev = [
|
||||
{ name = "pytest-cov", specifier = ">=6.2.1" },
|
||||
{ name = "reportlab", specifier = ">=4.4.3" },
|
||||
{ name = "safety", specifier = ">=3.2.11" },
|
||||
{ name = "twine", specifier = ">=6.1.0" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
|
||||
Loading…
x
Reference in New Issue
Block a user