Implement proper MCP resource protocol for image access, eliminating the need
for clients to handle local file paths and enabling seamless image integration.
Key Features:
• MCP Resource Endpoint: pdf-image://{image_id} for direct image access
• extract_images(): Returns resource_uri field with MCP resource links
• pdf_to_markdown(): Embeds resource URIs in markdown image references
• Automatic MIME type detection (image/png, image/jpeg)
• Seamless client integration without file path handling
Benefits:
✅ Direct image access via MCP resource protocol
✅ No local file path dependencies for MCP clients
✅ Proper MIME type handling for image display
✅ Clean markdown with working image links
✅ Standards-compliant MCP resource implementation
Response Format Enhancement:
+ "resource_uri": "pdf-image://page_1_image_0"
+ Works in markdown: \
+ MIME Type: image/png or image/jpeg
+ Direct client access without file system dependencies
This resolves the limitation where extracted images were only available
as local file paths, making them truly accessible to MCP clients
through the standardized resource protocol.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Features:
- 8 comprehensive PDF processing tools with intelligent fallbacks
- Text extraction (PyMuPDF, pdfplumber, pypdf with auto-selection)
- Table extraction (Camelot → pdfplumber → Tabula fallback chain)
- OCR processing with Tesseract and preprocessing options
- Document analysis (structure, metadata, scanned detection)
- Image extraction with filtering capabilities
- PDF to markdown conversion with metadata
- Built on FastMCP framework with full MCP protocol support
- Comprehensive error handling and user-friendly messages
- Docker support and cross-platform compatibility
- Complete test suite and examples
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>