From f159efab2c49d5575ba70012a5641106e8eb8a35 Mon Sep 17 00:00:00 2001
From: Ryan Malloy <ryan@supported.systems>
Date: Sun, 11 Jan 2026 00:49:34 -0700
Subject: [PATCH] Improve README tone and clarity

- Replace generic opener with direct description
- Make feature bullets more conversational (less "feature list" mode)
- Add context before format support table
- Clarify pagination example with "three ways" structure
- Lead testing section with the dashboard hook
- Add architecture design rationale
- Remove "comprehensive" and "intelligent" buzzwords
---
 README.md | 56 +++++++++++++++++++++++++------------------------------
 1 file changed, 25 insertions(+), 31 deletions(-)

diff --git a/README.md b/README.md
index 2d199cc..d6dafd8 100644
--- a/README.md
+++ b/README.md
@@ -2,14 +2,14 @@
 
 # 📊 MCP Office Tools
 
-**Comprehensive Microsoft Office document processing for AI agents**
+**MCP server for extracting text, tables, images, and data from Microsoft Office files**
 
 [![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg?style=flat-square)](https://www.python.org/downloads/)
 [![FastMCP](https://img.shields.io/badge/FastMCP-0.5+-green.svg?style=flat-square)](https://gofastmcp.com)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=flat-square)](https://opensource.org/licenses/MIT)
 [![MCP Protocol](https://img.shields.io/badge/MCP-Protocol-purple?style=flat-square)](https://modelcontextprotocol.io)
 
-*Extract text, tables, images, formulas, and metadata from Word, Excel, PowerPoint, and CSV files*
+*Word, Excel, PowerPoint, CSV — all the formats your AI agent needs to read but can't*
 
 [Installation](#-installation) • [Tools](#-available-tools) • [Examples](#-usage-examples) • [Testing](#-testing)
 
@@ -19,12 +19,12 @@
 
 ## ✨ Features
 
-- **Universal extraction** - Text, images, and metadata from any Office format
-- **Format-specific tools** - Deep analysis for Word, Excel, and PowerPoint
-- **Intelligent pagination** - Large documents automatically chunked for AI context limits
-- **Multi-library fallbacks** - Never fails silently; tries multiple extraction methods
-- **URL support** - Process documents directly from HTTP/HTTPS URLs with caching
-- **Legacy format support** - Handles .doc, .xls, .ppt from Office 97-2003
+- **Universal extraction** — Pull text, images, and metadata from any Office format
+- **Format-specific tools** — Deep analysis for Word (tables, structure), Excel (formulas, charts), PowerPoint
+- **Automatic pagination** — Large documents get chunked so they don't blow up your context window
+- **Fallback processing** — When one library chokes on a weird file, we try another. No silent failures.
+- **URL support** — Pass a URL instead of a file path; we'll download and cache it
+- **Legacy formats** — Yes, even .doc, .xls, and .ppt files from Office 97-2003 still work
 
 ---
 
@@ -96,6 +96,8 @@ claude mcp add office-tools "uvx mcp-office-tools"
 
 ## 📋 Format Support
 
+Here's what works and what's "good enough" — legacy formats from Office 97-2003 have more limited extraction, but they still work:
+
 | Format | Extension | Text | Images | Metadata | Tables | Formulas |
 |--------|-----------|:----:|:------:|:--------:|:------:|:--------:|
 | **Word (Modern)** | `.docx` | ✅ | ✅ | ✅ | ✅ | - |
@@ -134,28 +136,22 @@ result = await extract_text(
 
 ### Convert Word to Markdown (with Pagination)
 
-```python
-# For large documents, results are automatically paginated
-result = await convert_to_markdown("big-manual.docx")
+Large documents get paginated automatically. Three ways to handle it:
 
-# Continue with cursor for next page
+```python
+# Option 1: Follow the cursor for each chunk
+result = await convert_to_markdown("big-manual.docx")
 if result.get("pagination", {}).get("has_more"):
     next_page = await convert_to_markdown(
         "big-manual.docx",
         cursor_id=result["pagination"]["cursor_id"]
     )
 
-# Or use page ranges to get specific sections
-result = await convert_to_markdown(
-    "big-manual.docx",
-    page_range="1-10"
-)
+# Option 2: Grab specific pages
+result = await convert_to_markdown("big-manual.docx", page_range="1-10")
 
-# Or extract by chapter name
-result = await convert_to_markdown(
-    "big-manual.docx",
-    chapter_name="Introduction"
-)
+# Option 3: Extract by chapter heading
+result = await convert_to_markdown("big-manual.docx", chapter_name="Introduction")
 ```
 
 ### Analyze Excel Data Quality
@@ -266,29 +262,27 @@ result = await extract_text("https://example.com/report.docx")
 
 ## 🧪 Testing
 
-The project includes a comprehensive test suite with an interactive HTML dashboard:
+We built a visual test dashboard because staring at pytest output gets old. Run `make test` and you get an HTML report with pass/fail stats, detailed I/O for each test, and expandable tracebacks when things break.
 
 ```bash
-# Run all tests with dashboard generation
+# Run tests and generate the dashboard
 make test
 
-# Run just pytest
+# Just pytest, no dashboard
 make test-pytest
 
-# View the test dashboard
+# Open existing dashboard
 make view-dashboard
 ```
 
-The test dashboard shows:
-- Pass/fail statistics with MS Office-themed styling
-- Detailed inputs and outputs for each test
-- Expandable error tracebacks for failures
-- Category breakdown (Word, Excel, PowerPoint)
+The dashboard has an MS Office-inspired theme (Word blue, Excel green, PowerPoint orange) and groups tests by category so you can see what's working at a glance.
 
 ---
 
 ## 🏗 Architecture
 
+The mixin pattern keeps things modular — universal tools work on everything, format-specific tools go deeper. When the primary library can't handle something (corrupted files, weird formatting), we fall back to alternatives.
+
 ```
 mcp-office-tools/
 ├── src/mcp_office_tools/