From 19bdeddcdf30ab46f18de044d8f78184f42e27da Mon Sep 17 00:00:00 2001
From: Ryan Malloy
Date: Sat, 8 Nov 2025 20:12:40 -0700
Subject: [PATCH] =?UTF-8?q?=F0=9F=93=9D=20Update=20README:=2040=20tools,?=
 =?UTF-8?q?=20v2.0.7=20table=20features,=20token=20management?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 README.md | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index f4685d5..8172d88 100644
--- a/README.md
+++ b/README.md
@@ -24,19 +24,19 @@
 > 🎯 **The Problem**: PDFs contain incredible intelligence, but extracting it reliably is complex, slow, and often fails.
 >
-> ⚡ **The Solution**: MCP PDF delivers **AI-powered document intelligence** with **23 specialized tools** that understand both content and structure.
+> ⚡ **The Solution**: MCP PDF delivers **AI-powered document intelligence** with **40 specialized tools** that understand both content and structure.
 ### 🏆 **Why MCP PDF Leads**
-- **🚀 24 Specialized Tools** for every PDF scenario
+- **🚀 40 Specialized Tools** for every PDF scenario
 - **🧠 AI-Powered Intelligence** beyond basic extraction
 - **🔄 Multi-Library Fallbacks** for 99.9% reliability
 - **⚡ 10x Faster** than traditional solutions
 - **🌐 URL Processing** with smart caching
-- **👥 User-Friendly** 1-based page numbering
+- **🎯 Smart Token Management** prevents MCP overflow errors
@@ -117,9 +117,14 @@ Add to your `claude_desktop_config.json`:
 ```python
 # Complete financial report analysis in seconds
 health = await analyze_pdf_health("quarterly-report.pdf")
-classification = await classify_content("quarterly-report.pdf")
+classification = await classify_content("quarterly-report.pdf")
 summary = await summarize_content("quarterly-report.pdf", summary_length="medium")
-tables = await extract_tables("quarterly-report.pdf", pages=[5,6,7])
+
+# Smart table extraction - prevents token overflow on large tables
+tables = await extract_tables("quarterly-report.pdf", pages="5-7", max_rows_per_table=100)
+# Or get just table structure without data
+table_summary = await extract_tables("quarterly-report.pdf", pages="5-7", summary_only=True)
+
 charts = await extract_charts("quarterly-report.pdf")

 # Get instant insights
@@ -177,7 +182,7 @@ citations = await extract_text("research-paper.pdf", pages=[15,16,17])
 ---
-## 🛠️ **Complete Arsenal: 23 Specialized Tools**
+## 🛠️ **Complete Arsenal: 40+ Specialized Tools**
@@ -195,8 +200,8 @@ citations = await extract_text("research-paper.pdf", pages=[15,16,17])
 | 🔧 **Tool** | 📋 **Purpose** | ⚡ **Speed** | 🎯 **Accuracy** |
 |-------------|---------------|-------------|----------------|
-| `extract_text` | Multi-method text extraction | **Ultra Fast** | 99.9% |
-| `extract_tables` | Intelligent table processing | **Fast** | 98% |
+| `extract_text` | Multi-method text extraction with auto-chunking | **Ultra Fast** | 99.9% |
+| `extract_tables` | Smart table extraction with token overflow protection | **Fast** | 98% |
 | `ocr_pdf` | Advanced OCR for scanned docs | **Moderate** | 95% |
 | `extract_images` | Media extraction & processing | **Fast** | 99% |
 | `pdf_to_markdown` | Structure-preserving conversion | **Fast** | 97% |