🚀 Release v1.0.1: Bug fixes and local development tools

- Fix variable scope bug in extract_text function - Add local development setup with claude-mcp-manager - Update author information - Add comprehensive local development documentation 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-07 00:58:51 -06:00 · 2025-09-07 00:58:51 -06:00 · ebf6bb8a43
commit ebf6bb8a43
parent 8d01c44d4f
5 changed files with 434 additions and 6 deletions
--- a/LOCAL_DEVELOPMENT.md
+++ b/LOCAL_DEVELOPMENT.md
@ -0,0 +1,186 @@
 # 🔧 Local Development Guide for MCP PDF
 This guide shows how to test MCP PDF locally during development before publishing to PyPI.
 ## 📋 Prerequisites
 - Python 3.10+
 - uv package manager
 - Claude Desktop app
 - Git repository cloned locally
 ## 🚀 Quick Start for Local Testing
 ### 1. Clone and Setup
 ```bash
 # Clone the repository
 git clone https://github.com/rsp2k/mcp-pdf.git
 cd mcp-pdf
 # Install dependencies
 uv sync --dev
 # Verify installation
 uv run python -c "from mcp_pdf.server import create_server; print('✅ MCP PDF loads successfully')"
 ```
 ### 2. Test with Claude Code (Local Development)
 Use the `-t local` flag to point Claude Code to your local development copy:
 ```bash
 # Start Claude Code with local MCP PDF server
 claude-code -t local /path/to/mcp-pdf
 ```
 Or if you're already in the mcp-pdf directory:
 ```bash
 claude-code -t local .
 ```
 ### 3. Alternative: Manual Server Testing
 You can also run the server manually for debugging:
 ```bash
 # Run the MCP server directly
 uv run mcp-pdf
 # Or run with specific FastMCP options
 uv run python -m mcp_pdf.server
 ```
 ### 4. Test Core Functionality
 Once connected to Claude Code, test these key features:
 #### Basic PDF Processing
 ```
 "Extract text from this PDF file: /path/to/test.pdf"
 "Get metadata from this PDF: /path/to/document.pdf"
 "Check if this PDF is scanned: /path/to/scan.pdf"
 ```
 #### Security Features
 ```
 "Try to extract text from a very large PDF"
 "Process a PDF with 2000 pages" (should be limited to 1000)
 ```
 #### Advanced Features
 ```
 "Extract tables from this PDF: /path/to/tables.pdf"
 "Convert this PDF to markdown: /path/to/document.pdf"
 "Add annotations to this PDF: /path/to/target.pdf"
 ```
 ## 🔒 Security Testing
 Verify the security hardening works:
 ### File Size Limits
 - Try processing a PDF larger than 100MB
 - Should see: "PDF file too large: X bytes > 104857600"
 ### Page Count Limits  
 - Try processing a PDF with >1000 pages
 - Should see: "PDF too large for processing: X pages > 1000"
 ### Path Traversal Protection
 - Test with malicious paths like `../../../etc/passwd`
 - Should be blocked with security error
 ### JSON Input Validation
 - Large JSON inputs (>10KB) should be rejected
 - Malformed JSON should return clean error messages
 ## 🐛 Debugging
 ### Enable Debug Logging
 ```bash
 export DEBUG=true
 uv run mcp-pdf
 ```
 ### Check Security Functions
 ```bash
 # Test security validation functions
 uv run python test_security_features.py
 # Run integration tests
 uv run python test_integration.py
 ```
 ### Verify Package Structure
 ```bash
 # Check package builds correctly
 uv build
 # Verify package metadata
 uv run twine check dist/*
 ```
 ## 📊 Testing Checklist
 Before publishing, verify:
 - [ ] All 23 PDF tools work correctly
 - [ ] Security limits are enforced (file size, page count)
 - [ ] Error messages are clean and helpful  
 - [ ] No sensitive information leaked in errors
 - [ ] Path traversal protection works
 - [ ] JSON input validation works
 - [ ] Memory limits prevent crashes
 - [ ] CLI command `mcp-pdf` works
 - [ ] Package imports correctly: `from mcp_pdf.server import create_server`
 ## 🚀 Publishing Pipeline
 Once local testing passes:
 1. **Version Bump**: Update version in `pyproject.toml`
 2. **Build**: `uv build`  
 3. **Test Upload**: `uv run twine upload --repository testpypi dist/*`
 4. **Test Install**: `pip install -i https://test.pypi.org/simple/ mcp-pdf`
 5. **Production Upload**: `uv run twine upload dist/*`
 ## 🔧 Development Commands
 ```bash
 # Format code
 uv run black src/ tests/
 # Lint code  
 uv run ruff check src/ tests/
 # Run tests
 uv run pytest
 # Security scan
 uv run pip-audit
 # Build package
 uv build
 # Install editable for development
 pip install -e .  # (in a venv)
 ```
 ## 🆘 Troubleshooting
 ### "Module not found" errors
 - Ensure you're in the right directory
 - Run `uv sync` to install dependencies
 - Check Python path with `uv run python -c "import sys; print(sys.path)"`
 ### MCP server won't start
 - Check that all system dependencies are installed (tesseract, java, ghostscript)
 - Verify with: `uv run python examples/verify_installation.py`
 ### Security tests fail
 - Run `uv run python test_security_features.py -v` for detailed output
 - Check that security constants are properly set
 This setup allows for rapid development and testing without polluting your system Python or needing to publish to PyPI for every change.
--- a/239
+++ b/239
@ -0,0 +1,239 @@
 #!/usr/bin/env python3
 """
 Claude MCP Manager - Easy management of MCP servers in Claude Desktop
 Usage: claude mcp add <name> <command> [args...]
 """
 import json
 import sys
 import os
 from pathlib import Path
 import shutil
 import subprocess
 from typing import Dict, List, Any, Optional
 class ClaudeMCPManager:
    def __init__(self):
        self.config_path = Path.home() / ".config" / "Claude" / "claude_desktop_config.json"
        self.config_backup_dir = Path.home() / ".config" / "Claude" / "backups"
        self.config_backup_dir.mkdir(exist_ok=True)
    def load_config(self) -> Dict[str, Any]:
        """Load Claude Desktop configuration"""
        if not self.config_path.exists():
            return {"mcpServers": {}, "globalShortcut": ""}
        try:
            with open(self.config_path) as f:
                return json.load(f)
        except json.JSONDecodeError as e:
            print(f"❌ Error parsing config: {e}")
            sys.exit(1)
    def save_config(self, config: Dict[str, Any]):
        """Save configuration with backup"""
        # Create backup
        if self.config_path.exists():
            backup_name = f"claude_desktop_config_backup_{int(__import__('time').time())}.json"
            backup_path = self.config_backup_dir / backup_name
            shutil.copy2(self.config_path, backup_path)
            print(f"📁 Config backed up to: {backup_path}")
        # Save new config
        with open(self.config_path, 'w') as f:
            json.dump(config, f, indent=2)
        print(f"✅ Configuration saved to: {self.config_path}")
    def add_server(self, name: str, command: str, args: List[str], env: Optional[Dict[str, str]] = None, directory: Optional[str] = None):
        """Add a new MCP server"""
        config = self.load_config()
        if name in config["mcpServers"]:
            print(f"⚠️  Server '{name}' already exists. Use 'claude mcp update' to modify.")
            return False
        server_config = {
            "command": command,
            "args": args
        }
        if env:
            server_config["env"] = env
        if directory:
            server_config["cwd"] = directory
        config["mcpServers"][name] = server_config
        self.save_config(config)
        print(f"🚀 Added MCP server: {name}")
        return True
    def remove_server(self, name: str):
        """Remove an MCP server"""
        config = self.load_config()
        if name not in config["mcpServers"]:
            print(f"❌ Server '{name}' not found")
            return False
        del config["mcpServers"][name]
        self.save_config(config)
        print(f"🗑️  Removed MCP server: {name}")
        return True
    def list_servers(self):
        """List all configured MCP servers"""
        config = self.load_config()
        servers = config.get("mcpServers", {})
        if not servers:
            print("📭 No MCP servers configured")
            return
        print("📋 Configured MCP servers:")
        print("=" * 50)
        for name, server_config in servers.items():
            command = server_config.get("command", "")
            args = server_config.get("args", [])
            env = server_config.get("env", {})
            cwd = server_config.get("cwd", "")
            print(f"🔧 {name}")
            print(f"   Command: {command}")
            if args:
                print(f"   Args: {' '.join(args)}")
            if env:
                print(f"   Environment: {dict(list(env.items())[:3])}{'...' if len(env) > 3 else ''}")
            if cwd:
                print(f"   Directory: {cwd}")
            print()
    def add_mcp_pdf_local(self, directory: str):
        """Add MCP PDF from local development directory"""
        abs_dir = os.path.abspath(directory)
        if not os.path.exists(abs_dir):
            print(f"❌ Directory not found: {abs_dir}")
            return False
        # Check if it's a valid MCP PDF directory
        required_files = ["pyproject.toml", "src/mcp_pdf/server.py"]
        for file in required_files:
            if not os.path.exists(os.path.join(abs_dir, file)):
                print(f"❌ Not a valid MCP PDF directory (missing: {file})")
                return False
        return self.add_server(
            name="mcp-pdf-local",
            command="uv",
            args=[
                "--directory", abs_dir,
                "run", "mcp-pdf"
            ],
            env={"PDF_TEMP_DIR": "/tmp/mcp-pdf-processing"},
            directory=abs_dir
        )
    def add_mcp_pdf_pip(self):
        """Add MCP PDF from pip installation"""
        return self.add_server(
            name="mcp-pdf",
            command="mcp-pdf",
            args=[],
            env={"PDF_TEMP_DIR": "/tmp/mcp-pdf-processing"}
        )
 def print_usage():
    """Print usage information"""
    print("""
 🔧 Claude MCP Manager - Easy MCP server management
 USAGE:
    claude mcp add <name> <command> [args...]     # Add generic MCP server
    claude mcp add-local <directory>              # Add MCP PDF from local dev
    claude mcp add-pip                            # Add MCP PDF from pip
    claude mcp remove <name>                      # Remove MCP server
    claude mcp list                               # List all servers
    claude mcp help                               # Show this help
 EXAMPLES:
    # Add MCP PDF from local development
    claude mcp add-local /home/user/mcp-pdf
    # Add MCP PDF from pip (after pip install mcp-pdf)  
    claude mcp add-pip
    # Add generic MCP server
    claude mcp add memory npx -y @modelcontextprotocol/server-memory
    # Add server with environment variables
    claude mcp add github docker run -i --rm -e GITHUB_TOKEN ghcr.io/github/github-mcp-server
    # Remove a server
    claude mcp remove mcp-pdf-local
    # List all configured servers
    claude mcp list
 NOTES:
    • Configuration saved to: ~/.config/Claude/claude_desktop_config.json
    • Automatic backups created before changes
    • Restart Claude Desktop after adding/removing servers
    """)
 def main():
    if len(sys.argv) < 2:
        print_usage()
        sys.exit(1)
    manager = ClaudeMCPManager()
    command = sys.argv[1].lower()
    if command == "add":
        if len(sys.argv) < 4:
            print("❌ Usage: claude mcp add <name> <command> [args...]")
            sys.exit(1)
        name = sys.argv[2]
        command = sys.argv[3]
        args = sys.argv[4:] if len(sys.argv) > 4 else []
        manager.add_server(name, command, args)
    elif command == "add-local":
        if len(sys.argv) != 3:
            print("❌ Usage: claude mcp add-local <directory>")
            sys.exit(1)
        directory = sys.argv[2]
        manager.add_mcp_pdf_local(directory)
    elif command == "add-pip":
        manager.add_mcp_pdf_pip()
    elif command == "remove":
        if len(sys.argv) != 3:
            print("❌ Usage: claude mcp remove <name>")
            sys.exit(1)
        name = sys.argv[2]
        manager.remove_server(name)
    elif command == "list":
        manager.list_servers()
    elif command in ["help", "--help", "-h"]:
        print_usage()
    else:
        print(f"❌ Unknown command: {command}")
        print_usage()
        sys.exit(1)
 if __name__ == "__main__":
    main()
--- a/pyproject.toml
+++ b/pyproject.toml
@ -1,8 +1,8 @@
 [project]
 name = "mcp-pdf"
-version = "1.0.0"
+version = "1.0.1"
 description = "Secure FastMCP server for comprehensive PDF processing - text extraction, OCR, table extraction, forms, annotations, and more"
-authors = [{name = "MCP Team", email = "team@fastmcp.org"}]
+authors = [{name = "Ryan Malloy", email = "ryan@malloys.us"}]
 readme = "README.md"
 license = {text = "MIT"}
 requires-python = ">=3.10"
@ -98,4 +98,5 @@ dev = [
    "pytest-cov>=6.2.1",
    "reportlab>=4.4.3",
    "safety>=3.2.11",
    "twine>=6.1.0",
 ]
--- a/src/mcp_pdf/server.py
+++ b/src/mcp_pdf/server.py
@ -547,6 +547,9 @@ async def extract_text(
        }
        doc.close()
        # Enforce MCP hard limit regardless of user max_tokens setting
        effective_max_tokens = min(max_tokens, 24000)  # Stay safely under MCP's 25000 limit
        # Early chunking decision based on size analysis
        should_chunk_early = (
            total_pages > 50 or  # Large page count
@ -592,9 +595,6 @@ async def extract_text(
        # Estimate token count (rough approximation: 1 token ≈ 4 characters)
        estimated_tokens = len(text) // 4
        # Enforce MCP hard limit regardless of user max_tokens setting
        effective_max_tokens = min(max_tokens, 24000)  # Stay safely under MCP's 25000 limit
        # Handle large responses with intelligent chunking
        if estimated_tokens > effective_max_tokens:
            # Calculate chunk size based on effective token limit
--- a/uv.lock
+++ b/uv.lock
@ -1032,7 +1032,7 @@ wheels = [
 [[package]]
 name = "mcp-pdf"
-version = "1.0.0"
+version = "1.0.1"
 source = { editable = "." }
 dependencies = [
    { name = "camelot-py", extra = ["cv"] },
@ -1073,6 +1073,7 @@ dev = [
    { name = "pytest-cov" },
    { name = "reportlab" },
    { name = "safety" },
    { name = "twine" },
 ]
 [package.metadata]
@ -1112,6 +1113,7 @@ dev = [
    { name = "pytest-cov", specifier = ">=6.2.1" },
    { name = "reportlab", specifier = ">=4.4.3" },
    { name = "safety", specifier = ">=3.2.11" },
    { name = "twine", specifier = ">=6.1.0" },
 ]
 [[package]]