diff --git a/README.md b/README.md index bd85de7..36ab585 100644 --- a/README.md +++ b/README.md @@ -4,6 +4,8 @@ > **Finally!** A Python library that handles React, Vue, Angular, and dynamic content without the headaches. When `requests` fails and Selenium feels like overkill, Crawailer delivers clean, AI-ready content extraction with bulletproof JavaScript execution. +> โšก **Perfect for Claude Code MCP servers** - Built specifically for AI agents and MCP integration + ```python pip install crawailer ``` @@ -16,7 +18,7 @@ pip install crawailer - **๐ŸŽฏ JavaScript-First**: Executes real JavaScript on React, Vue, Angular sites (unlike `requests`) - **โšก Lightning Fast**: 5-10x faster HTML processing with C-based selectolax - **๐Ÿค– AI-Optimized**: Clean markdown output perfect for LLM training and RAG -- **๐Ÿ”ง Three Ways to Use**: Library, CLI tool, or MCP server - your choice +- **๐Ÿ”ง Claude Code Ready**: Drop-in MCP server integration for AI agents - **๐Ÿ“ฆ Zero Config**: Works immediately with sensible defaults - **๐Ÿงช Battle-Tested**: 18 comprehensive test suites with 70+ real-world scenarios - **๐ŸŽจ Developer Joy**: Rich terminal output, helpful errors, progress tracking @@ -60,6 +62,26 @@ research = await web.discover( # Crawailer โ†’ Full content + dynamic data ``` +### ๐Ÿง  Claude Code MCP Integration + +```python +# Add to your Claude Code MCP server +from crawailer.mcp import create_mcp_server + +@mcp_tool("web_extract") +async def extract_content(url: str, script: str = ""): + """Extract content from any website with optional JavaScript execution""" + content = await web.get(url, script=script) + return { + "title": content.title, + "markdown": content.markdown, + "script_result": content.script_result, + "word_count": content.word_count + } + +# Now Claude can extract content from ANY website, including SPAs! +``` + ## ๐ŸŽฏ Design Philosophy ### For Robots, By Humans