Bug fixes:
- Remove 100-paragraph cap that prevented extracting content past ~page 4
Now calculates limit based on number of pages requested (300 paras/page)
- Add fallback page estimation when docs lack explicit page breaks
Uses ~25 paragraphs per page for navigation in non-paginated docs
- Fix _get_available_headings to scan full document (was only first 100 elements)
Headings like Chapter 10 at element 1524 were invisible
- Fix MCP connection by disabling FastMCP banner (show_banner=False)
ASCII art banner was corrupting stdout JSON-RPC protocol
Changes:
- Default image_mode changed from 'base64' to 'files' to avoid huge responses
- Add proper .mcp.json config with command/args format
- Add test document to .gitignore for privacy
- Comprehensive Microsoft Office document processing server
- Support for Word (.docx, .doc), Excel (.xlsx, .xls), PowerPoint (.pptx, .ppt), CSV
- 6 universal tools: extract_text, extract_images, extract_metadata, detect_office_format, analyze_document_health, get_supported_formats
- Multi-library fallback system for robust processing
- URL support with intelligent caching
- Legacy Office format support (97-2003)
- FastMCP integration with async architecture
- Production ready with comprehensive documentation
🤖 Generated with Claude Code (claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>