diff --git a/README.md b/README.md index a86e4c4..ec79613 100644 --- a/README.md +++ b/README.md @@ -1,10 +1,10 @@ # ๐Ÿ•ท๏ธ Crawailer -**The JavaScript-first web scraper that actually works with modern websites** +**The web scraper that doesn't suck at JavaScript** โœจ -> **Finally!** A Python library that handles React, Vue, Angular, and dynamic content without the headaches. When `requests` fails and Selenium feels like overkill, Crawailer delivers clean, AI-ready content extraction with bulletproof JavaScript execution. +> **Stop fighting modern websites.** While `requests` gives you empty `
`, Crawailer actually executes JavaScript and extracts real content from React, Vue, and Angular apps. Finally, web scraping that works in 2025. -> โšก **Perfect for Claude Code MCP servers** - Built specifically for AI agents and MCP integration +> โšก **Claude Code's new best friend** - Your AI assistant can now access ANY website ```python pip install crawailer @@ -13,15 +13,25 @@ pip install crawailer [![PyPI version](https://badge.fury.io/py/crawailer.svg)](https://badge.fury.io/py/crawailer) [![Python Support](https://img.shields.io/pypi/pyversions/crawailer.svg)](https://pypi.org/project/crawailer/) -## โœจ Features +## โœจ Why Developers Choose Crawailer -- **๐ŸŽฏ JavaScript-First**: Executes real JavaScript on React, Vue, Angular sites (unlike `requests`) -- **โšก Lightning Fast**: 5-10x faster HTML processing with C-based selectolax -- **๐Ÿค– AI-Optimized**: Clean markdown output perfect for LLM training and RAG -- **๐Ÿ”ง Claude Code Ready**: Drop-in MCP server integration for AI agents -- **๐Ÿ“ฆ Zero Config**: Works immediately with sensible defaults -- **๐Ÿงช Battle-Tested**: 18 comprehensive test suites with 70+ real-world scenarios -- **๐ŸŽจ Developer Joy**: Rich terminal output, helpful errors, progress tracking +**๐Ÿ”ฅ JavaScript That Actually Works** +While other tools timeout or crash, Crawailer executes real JavaScript like a human browser + +**โšก Stupidly Fast** +5-10x faster than BeautifulSoup with C-based parsing that doesn't make you wait + +**๐Ÿค– AI Assistant Ready** +Perfect markdown output that your Claude/GPT/local model will love + +**๐ŸŽฏ Zero Learning Curve** +`pip install` โ†’ works immediately โ†’ no 47-page configuration guides + +**๐Ÿงช Production Battle-Tested** +18 comprehensive test suites covering every edge case we could think of + +**๐ŸŽจ Actually Enjoyable** +Rich terminal output, helpful errors, progress bars that don't lie ## ๐Ÿš€ Quick Start @@ -66,7 +76,7 @@ research = await web.discover( ### ๐Ÿง  Claude Code MCP Integration -> "Oh yeah, tell the model that writes your code for you, this is how it works:" +> *"Hey Claude, go grab that data from the React app"* โ† This actually works now ```python # Add to your Claude Code MCP server @@ -83,8 +93,9 @@ async def extract_content(url: str, script: str = ""): "word_count": content.word_count } -# Now Claude can extract content from ANY website, including SPAs! -# No more "I can't access that site" - Claude gets real, live web data +# ๐ŸŽ‰ No more "I can't access that site" +# ๐ŸŽ‰ No more copy-pasting content manually +# ๐ŸŽ‰ Your AI can now browse the web like a human ``` ## ๐ŸŽฏ Design Philosophy @@ -298,16 +309,18 @@ MIT License - see [LICENSE](LICENSE) for details. --- -## ๐Ÿš€ Ready to Stop Fighting JavaScript? +## ๐Ÿš€ Ready to Stop Losing Your Mind? ```bash pip install crawailer crawailer setup # Install browser engines ``` -**Join the revolution**: Stop losing data to `requests.get()` failures. Start extracting **real content** from **real websites** that actually use JavaScript. +**Life's too short** for empty `
` tags and "JavaScript required" messages. -โญ **Star us on GitHub** if Crawailer saves your scraping sanity! +Get content that actually exists. From websites that actually work. + +โญ **Star us if this saves your sanity** โ†’ [git.supported.systems/MCP/crawailer](https://git.supported.systems/MCP/crawailer) ---