Large Document Analysis Test

Introduction

This is a test document designed to exercise the large-file analysis capabilities of our LLM MCP server. It contains multiple sections for evaluating different chunking strategies and provider selection.

Chapter 1: Technical Overview

Modern large language models have revolutionized how we process and analyze text. The key challenge when working with large documents is managing context windows effectively. Different providers offer different context window sizes:

  • Gemini 2.5 can handle up to 1 million tokens
  • GPT-4.1 also supports 1 million tokens
  • Claude 3.5 supports up to 200,000 tokens
  • Grok supports approximately 100,000 tokens

The optimal strategy depends on the document size and the analysis required.
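
As a rough illustration, the limits above can be captured in a small lookup table alongside a simple token estimate. The model identifiers and the characters-per-token heuristic here are assumptions for the sketch, not the server's actual configuration.

```python
# Illustrative context-window table; identifiers and limits are assumptions
# that mirror the figures quoted above, not a definitive configuration.
CONTEXT_WINDOWS = {
    "gemini-2.5-pro": 1_000_000,
    "gpt-4.1": 1_000_000,
    "claude-3-5-sonnet": 200_000,
    "grok": 100_000,
}

def estimate_tokens(text: str) -> int:
    """Rough heuristic: roughly 4 characters per token for English prose."""
    return len(text) // 4
```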

Chapter 2: Chunking Strategies

Fixed Chunking

Fixed chunking divides content into equal-sized chunks with overlap. This is simple but may break semantic units.
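
A minimal sketch of fixed chunking with overlap; the chunk and overlap sizes are arbitrary example values.

```python
def fixed_chunks(text: str, chunk_size: int = 2000, overlap: int = 200) -> list[str]:
    """Split text into equal-sized windows; each window repeats the last
    `overlap` characters of the previous one to soften boundary breaks."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```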

Semantic Chunking

Semantic chunking respects natural boundaries like paragraphs and sections. This preserves meaning but may create uneven chunks.
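
A sketch of paragraph-based semantic chunking, packing whole paragraphs until a size budget is reached (the budget value is illustrative):

```python
def semantic_chunks(text: str, max_chars: int = 4000) -> list[str]:
    """Group whole paragraphs into chunks, never splitting inside a paragraph."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```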

Hierarchical Chunking

Hierarchical chunking follows document structure, using headers to create logical divisions. This works well for structured documents.
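
A sketch of hierarchical chunking keyed on Markdown headers; the header pattern is an assumption about the input format:

```python
import re

def hierarchical_chunks(markdown: str) -> list[str]:
    """Split a Markdown document at top- and second-level headers so each
    chunk corresponds to one logical section."""
    # Split while keeping each header line attached to the section that follows it.
    parts = re.split(r"(?m)^(?=#{1,2} )", markdown)
    return [part for part in parts if part.strip()]
```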

Auto Chunking

Auto chunking analyzes the document structure and selects the best strategy automatically.
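
A sketch of how that selection might work, reusing the three chunkers sketched above and deciding from cheap structural signals (the thresholds are arbitrary):

```python
import re

def auto_chunks(text: str) -> list[str]:
    """Pick a chunking strategy from simple structural signals."""
    if re.search(r"(?m)^#{1,2} ", text):   # headers present -> structured document
        return hierarchical_chunks(text)
    if text.count("\n\n") > 10:            # clear paragraph breaks -> semantic
        return semantic_chunks(text)
    return fixed_chunks(text)              # otherwise fall back to fixed-size chunks
```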

Chapter 3: Provider Selection

The system automatically selects the optimal provider based on:

  1. Document size (estimated token count)
  2. Available API keys
  3. Provider capabilities
  4. Cost considerations

For large documents that exceed context windows, the system uses intelligent chunking with synthesis.
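
A simplified sketch of this selection logic, combining criteria 1–3 above and reusing the illustrative CONTEXT_WINDOWS table from Chapter 1. The preference order and environment-variable names are assumptions, and cost-aware selection is omitted.

```python
import os

# Illustrative preference order; the environment-variable names are assumptions.
PREFERENCE = [
    ("gemini-2.5-pro", "GOOGLE_API_KEY"),
    ("gpt-4.1", "OPENAI_API_KEY"),
    ("claude-3-5-sonnet", "ANTHROPIC_API_KEY"),
    ("grok", "XAI_API_KEY"),
]

def select_provider(token_count: int) -> tuple[str, bool]:
    """Return (model, needs_chunking): the first configured provider whose
    window fits the document, else the first configured provider plus chunking."""
    available = [model for model, env in PREFERENCE if os.environ.get(env)]
    if not available:
        raise RuntimeError("No provider API keys configured")
    for model in available:
        if token_count <= CONTEXT_WINDOWS[model]:
            return model, False
    return available[0], True
```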

Chapter 4: Implementation Details

The llm_analyze_large_file function performs several steps (see the sketch after this list):

  1. File Extraction: Supports multiple file formats (txt, md, py, json, csv, log)
  2. Token Estimation: Estimates token count to select appropriate provider
  3. Provider Selection: Chooses optimal provider/model combination
  4. Processing Strategy: Direct for small files, chunked for large files
  5. Result Synthesis: Combines chunk analyses for coherent final result
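
A condensed sketch of how those five steps might fit together, reusing the earlier sketches. llm_analyze_large_file is the tool named above; everything else here, including the call_llm helper, is a hypothetical stand-in rather than the actual implementation.

```python
from pathlib import Path

def analyze_large_file(path: str, prompt: str) -> str:
    """Illustrative end-to-end flow: extract, estimate, select, process, synthesize."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")  # 1. extraction
    tokens = estimate_tokens(text)                                   # 2. token estimation
    model, needs_chunking = select_provider(tokens)                  # 3. provider selection
    if not needs_chunking:                                           # 4. direct for small files
        return call_llm(model, f"{prompt}\n\n{text}")
    partials = [call_llm(model, f"{prompt}\n\n{chunk}")              #    chunked for large ones
                for chunk in auto_chunks(text)]
    joined = "\n\n".join(partials)
    return call_llm(model, f"Synthesize these partial analyses:\n\n{joined}")  # 5. synthesis

# call_llm(model, prompt) stands in for a single completion request to the chosen provider.
```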

Chapter 5: Supported File Types

Text Files (.txt)

Plain text files are read directly with UTF-8 encoding, with fallback to latin-1.
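
A sketch of that encoding fallback:

```python
def read_text_file(path: str) -> str:
    """Read a file as UTF-8 first, falling back to latin-1 for legacy encodings."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except UnicodeDecodeError:
        with open(path, encoding="latin-1") as f:
            return f.read()
```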

Markdown Files (.md)

Markdown files are cleaned to remove excessive formatting while preserving structure.

Code Files (.py)

Python and other code files are read as-is to preserve syntax and structure.

Data Files (.json, .csv)

JSON files are formatted with proper indentation. CSV files are processed with pandas when available.
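
A sketch of that handling, treating pandas as an optional dependency:

```python
import json

def extract_json(path: str) -> str:
    """Re-serialize JSON with indentation so its structure survives chunking."""
    with open(path, encoding="utf-8") as f:
        return json.dumps(json.load(f), indent=2)

def extract_csv(path: str) -> str:
    """Render a CSV with pandas when available, otherwise return the raw text."""
    try:
        import pandas as pd
        return pd.read_csv(path).to_string(max_rows=100)
    except ImportError:
        with open(path, encoding="utf-8") as f:
            return f.read()
```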

Log Files (.log)

Log files receive special handling to truncate extremely long lines that might waste tokens.
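
A sketch of that truncation; the 500-character cap is an arbitrary example value:

```python
def extract_log(path: str, max_line_chars: int = 500) -> str:
    """Trim lines longer than max_line_chars so giant stack traces or data
    dumps do not dominate the token budget."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return "\n".join(
            line if len(line) <= max_line_chars
            else line[:max_line_chars] + " ...[truncated]"
            for line in (raw.rstrip("\n") for raw in f)
        )
```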

Chapter 6: Streaming and Progress Tracking

The analysis provides real-time progress updates:

  • Analysis start notification
  • Chunking progress (if needed)
  • Individual chunk processing
  • Synthesis phase
  • Completion with metadata

This allows clients to track progress and understand what processing strategy was used.
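
One way these updates might be surfaced is as a generator of progress events. The event shape below is an assumption, not the server's actual protocol, and call_llm is again a hypothetical stand-in.

```python
from typing import Iterator

def analyze_with_progress(path: str, prompt: str, model: str) -> Iterator[dict]:
    """Yield a progress event for each phase; the final event carries the result."""
    yield {"phase": "start", "file": path}
    chunks = auto_chunks(read_text_file(path))
    yield {"phase": "chunking", "chunks": len(chunks)}
    partials = []
    for i, chunk in enumerate(chunks, start=1):
        partials.append(call_llm(model, f"{prompt}\n\n{chunk}"))
        yield {"phase": "chunk_done", "index": i, "total": len(chunks)}
    yield {"phase": "synthesis"}
    result = call_llm(model, "Synthesize:\n\n" + "\n\n".join(partials))
    yield {"phase": "complete", "result": result, "chunks": len(chunks)}
```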

Chapter 7: Error Handling and Resilience

The system includes comprehensive error handling (see the fallback sketch after this list):

  • File existence checks
  • Content extraction validation
  • Provider availability verification
  • Chunk processing error recovery
  • Graceful fallbacks
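
As one example of those fallbacks, a failed chunk might be retried a couple of times and then skipped with a note, rather than aborting the whole analysis. The retry count is arbitrary and call_llm is the same hypothetical stand-in used above.

```python
def analyze_chunk_safely(model: str, prompt: str, chunk: str, retries: int = 2) -> str:
    """Retry a failing chunk a few times, then record the failure instead of
    failing the whole analysis."""
    last_error: Exception | None = None
    for _ in range(retries + 1):
        try:
            return call_llm(model, f"{prompt}\n\n{chunk}")
        except Exception as exc:  # provider or network errors
            last_error = exc
    return f"[chunk skipped after {retries + 1} attempts: {last_error}]"
```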

Conclusion

The large file analysis tool represents a comprehensive solution for analyzing documents of any size across multiple LLM providers. By combining intelligent provider selection, adaptive chunking strategies, and robust error handling, it can handle everything from small configuration files to massive documentation sets.

The streaming architecture ensures a responsive user experience, while the synthesis step maintains a coherent analysis across document chunks. This makes it ideal for use cases ranging from code review to document analysis to research paper summarization.

Whether you're analyzing a small README file or a massive codebase, the system automatically adapts to provide the best possible analysis using the most appropriate provider and processing strategy.