5 Commits

Author SHA1 Message Date
f0365a0d75 Implement comprehensive PDF processing suite with 15 additional advanced tools
Major expansion from 8 to 23 total tools covering:

**Document Analysis & Intelligence:**
- analyze_pdf_health: Comprehensive quality and health analysis
- analyze_pdf_security: Security features and vulnerability assessment
- classify_content: AI-powered document type classification
- summarize_content: Intelligent content summarization with key insights
- compare_pdfs: Advanced document comparison (text, structure, metadata)

**Layout & Visual Analysis:**
- analyze_layout: Page layout analysis with column detection
- extract_charts: Chart, diagram, and visual element extraction
- detect_watermarks: Watermark detection and analysis

**Content Manipulation:**
- extract_form_data: Interactive PDF form data extraction
- split_pdf: Split PDFs at specified pages
- merge_pdfs: Merge multiple PDFs into one
- rotate_pages: Rotate pages by 90°/180°/270°

**Optimization & Utilities:**
- convert_to_images: Convert PDF pages to image files
- optimize_pdf: File size optimization with quality levels
- repair_pdf: Corrupted PDF repair and recovery

**Technical Enhancements:**
- All tools support HTTPS URLs with intelligent caching
- Fixed MCP parameter validation for pages parameter
- Comprehensive error handling and validation
- Updated documentation with usage examples

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-11 04:27:04 -06:00
58d43851b9 Add HTTPS URL support and fix MCP parameter validation
Features:
- HTTPS URL support: Process PDFs directly from URLs with intelligent caching
- Smart caching: 1-hour cache to avoid repeated downloads
- Content validation: Verify downloads are actually PDF files
- Security: Proper User-Agent headers, HTTPS preferred over HTTP
- MCP parameter fixes: Handle pages parameter as string "[2,3]" format
- Backward compatibility: Still supports local file paths and list parameters

Technical changes:
- Added download_pdf_from_url() with caching and validation
- Updated validate_pdf_path() to handle URLs and local paths
- Added parse_pages_parameter() for flexible parameter parsing
- Updated all 8 tools to accept string pages parameters
- Enhanced error handling for network and validation issues

All tools now support:
- Local paths: "/path/to/file.pdf"
- HTTPS URLs: "https://example.com/document.pdf"
- Flexible pages: "[2,3]", "1,2,3", or [1,2,3]

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-11 02:25:53 -06:00
478ab41b1f Merge remote repository with local MCP PDF Tools implementation
Resolved README.md conflict by preserving comprehensive documentation
while maintaining repository structure from git.supported.systems

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 17:00:49 -06:00
dfc6fe1149 Initial commit 2025-08-10 22:59:46 +00:00
c902e81e4d Initial commit: Complete MCP PDF Tools server implementation
Features:
- 8 comprehensive PDF processing tools with intelligent fallbacks
- Text extraction (PyMuPDF, pdfplumber, pypdf with auto-selection)
- Table extraction (Camelot → pdfplumber → Tabula fallback chain)
- OCR processing with Tesseract and preprocessing options
- Document analysis (structure, metadata, scanned detection)
- Image extraction with filtering capabilities
- PDF to markdown conversion with metadata
- Built on FastMCP framework with full MCP protocol support
- Comprehensive error handling and user-friendly messages
- Docker support and cross-platform compatibility
- Complete test suite and examples

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 16:36:21 -06:00