5 Commits

Author SHA1 Message Date
3327137536 🚀 v2.0.5: Fix page range parsing across all PDF tools
Major architectural improvements and bug fixes in the v2.0.x series:

## v2.0.5 - Page Range Parsing (Current Release)
- Fix page range parsing bug affecting 6 mixins (e.g., "93-95" or "11-30")
- Create shared parse_pages_parameter() utility function
- Support mixed formats: "1,3-5,7,10-15"
- Update: pdf_utilities, content_analysis, image_processing, misc_tools, table_extraction, text_extraction

## v2.0.4 - Chunk Hint Fix
- Fix next_chunk_hint to show correct page ranges
- Dynamic calculation based on actual pages being extracted
- Example: "30-50" now correctly shows "40-49" for next chunk

## v2.0.3 - Initial Range Support
- Add page range support to text extraction ("11-30")
- Fix _parse_pages_parameter to handle ranges with Python's range()
- Convert 1-based user input to 0-based internal indexing

## v2.0.2 - Lazy Import Fix
- Fix ModuleNotFoundError for reportlab on startup
- Implement lazy imports for optional dependencies
- Graceful degradation with helpful error messages

## v2.0.1 - Dependency Restructuring
- Move reportlab to optional [forms] extra
- Document installation: uvx --with mcp-pdf[forms] mcp-pdf

## v2.0.0 - Official FastMCP Pattern Migration
- Migrate to official fastmcp.contrib.mcp_mixin pattern
- Create 12 specialized mixins with 42 tools total
- Architecture: mixins_official/ using MCPMixin base class
- Backwards compatibility: server_legacy.py preserved

Technical Improvements:
- Centralized utility functions (DRY principle)
- Consistent behavior across all PDF tools
- Better error messages with actionable instructions
- Library-specific adapters for table extraction

Files Changed:
- New: src/mcp_pdf/mixins_official/utils.py (shared utilities)
- Updated: 6 mixins with improved page parsing
- Version: pyproject.toml, server.py → 2.0.5

PyPI: https://pypi.org/project/mcp-pdf/2.0.5/
2025-11-03 17:12:37 -07:00
8cbf542df1 🔧 Fix output path security with MCP_PDF_ALLOWED_PATHS environment variable
BREAKING ISSUE FIXED:
- Users reported "Output path not allowed: images" error
- extract_images tool was rejecting relative paths due to overly restrictive security

NEW SECURITY MODEL:
- MCP_PDF_ALLOWED_PATHS environment variable controls allowed output directories
- If unset: Allows any directory with "security theater" warnings
- If set: Restricts outputs to specified colon-separated paths
- Cross-platform compatible (: on Unix, ; on Windows)

SECURITY PHILOSOPHY ENHANCED:
- "TRUST NO ONE" - honest about application-level security limitations
- Clear warnings that this is "security theater"
- Emphasis on OS-level permissions and process isolation
- Educational guidance on real security practices

TECHNICAL CHANGES:
- validate_output_path() rewritten with environment variable control
- Path validation uses relative_to() for proper containment checking
- Enhanced warning messages with security education
- Updated documentation with honest security assessment

DOCUMENTATION UPDATES:
- Added MCP_PDF_ALLOWED_PATHS to configuration section
- New "REAL Security" section with OS-level recommendations
- Clear explanation of security theater vs actual protection

Version: 1.1.1 (patch version for critical bugfix)
2025-09-23 23:40:05 -06:00
856dd41996 Add comprehensive link extraction tool (24th PDF tool)
New Features:
- extract_links: Extract all PDF hyperlinks with advanced filtering
- Page-specific filtering (e.g., "1,3,5" or "1-5,8,10-12")
- Link type categorization: external URLs, internal pages, emails, documents
- Coordinate tracking for precise link positioning
- FastMCP integration with proper tool registration
- Version banner display following CLAUDE.md guidelines

Technical Improvements:
- Enhanced startup banner with package version display
- Updated documentation to reflect 24 specialized tools
- Proper FastMCP @mcp.tool() decorator usage
- Comprehensive error handling and security validation

Documentation Updates:
- README.md: Updated tool count and installation guides
- CLAUDE.md: Added link extraction to implemented features
- LOCAL_DEVELOPMENT.md: Enhanced with scoped installation commands

Version: 1.1.0 (minor version bump for new feature)
2025-09-23 20:41:16 -06:00
ebf6bb8a43 🚀 Release v1.0.1: Bug fixes and local development tools
- Fix variable scope bug in extract_text function
- Add local development setup with claude-mcp-manager
- Update author information
- Add comprehensive local development documentation

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-07 00:58:51 -06:00
8d01c44d4f 🚀 Rename to mcp-pdf and prepare for PyPI publication
**Package Rebranding:**
- Renamed package from mcp-pdf-tools to mcp-pdf (cleaner name)
- Updated version to 1.0.0 (production ready with security hardening)
- Updated all import paths and references throughout codebase

**PyPI Preparation:**
- Enhanced package description and metadata
- Added proper project URLs and homepage
- Updated CLI command from mcp-pdf-tools to mcp-pdf
- Built distribution packages (wheel + source)

**Testing & Validation:**
- All 20 security tests pass with new package structure
- Local installation and import tests successful
- CLI command working correctly
- Package ready for PyPI publication

The secure, production-ready PDF processing platform is now ready
for public distribution and installation via pip.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-06 15:42:59 -06:00