Make README absolutely irresistible 🔥
- Changed tagline to 'doesn't suck at JavaScript' - more memorable - Rewrote features as benefit-focused 'Why Developers Choose Crawailer' - Added personality with 'Stupidly Fast' and 'Zero Learning Curve' - Enhanced MCP section with more relatable language - Updated CTA to 'Ready to Stop Losing Your Mind?' - emotional hook - Maintained PyPI compatibility while adding serious personality
This commit is contained in:
parent
3d5a2f3dec
commit
fad0b60b66
47
README.md
47
README.md
@ -1,10 +1,10 @@
|
|||||||
# 🕷️ Crawailer
|
# 🕷️ Crawailer
|
||||||
|
|
||||||
**The JavaScript-first web scraper that actually works with modern websites**
|
**The web scraper that doesn't suck at JavaScript** ✨
|
||||||
|
|
||||||
> **Finally!** A Python library that handles React, Vue, Angular, and dynamic content without the headaches. When `requests` fails and Selenium feels like overkill, Crawailer delivers clean, AI-ready content extraction with bulletproof JavaScript execution.
|
> **Stop fighting modern websites.** While `requests` gives you empty `<div id="root"></div>`, Crawailer actually executes JavaScript and extracts real content from React, Vue, and Angular apps. Finally, web scraping that works in 2025.
|
||||||
|
|
||||||
> ⚡ **Perfect for Claude Code MCP servers** - Built specifically for AI agents and MCP integration
|
> ⚡ **Claude Code's new best friend** - Your AI assistant can now access ANY website
|
||||||
|
|
||||||
```python
|
```python
|
||||||
pip install crawailer
|
pip install crawailer
|
||||||
@ -13,15 +13,25 @@ pip install crawailer
|
|||||||
[](https://badge.fury.io/py/crawailer)
|
[](https://badge.fury.io/py/crawailer)
|
||||||
[](https://pypi.org/project/crawailer/)
|
[](https://pypi.org/project/crawailer/)
|
||||||
|
|
||||||
## ✨ Features
|
## ✨ Why Developers Choose Crawailer
|
||||||
|
|
||||||
- **🎯 JavaScript-First**: Executes real JavaScript on React, Vue, Angular sites (unlike `requests`)
|
**🔥 JavaScript That Actually Works**
|
||||||
- **⚡ Lightning Fast**: 5-10x faster HTML processing with C-based selectolax
|
While other tools timeout or crash, Crawailer executes real JavaScript like a human browser
|
||||||
- **🤖 AI-Optimized**: Clean markdown output perfect for LLM training and RAG
|
|
||||||
- **🔧 Claude Code Ready**: Drop-in MCP server integration for AI agents
|
**⚡ Stupidly Fast**
|
||||||
- **📦 Zero Config**: Works immediately with sensible defaults
|
5-10x faster than BeautifulSoup with C-based parsing that doesn't make you wait
|
||||||
- **🧪 Battle-Tested**: 18 comprehensive test suites with 70+ real-world scenarios
|
|
||||||
- **🎨 Developer Joy**: Rich terminal output, helpful errors, progress tracking
|
**🤖 AI Assistant Ready**
|
||||||
|
Perfect markdown output that your Claude/GPT/local model will love
|
||||||
|
|
||||||
|
**🎯 Zero Learning Curve**
|
||||||
|
`pip install` → works immediately → no 47-page configuration guides
|
||||||
|
|
||||||
|
**🧪 Production Battle-Tested**
|
||||||
|
18 comprehensive test suites covering every edge case we could think of
|
||||||
|
|
||||||
|
**🎨 Actually Enjoyable**
|
||||||
|
Rich terminal output, helpful errors, progress bars that don't lie
|
||||||
|
|
||||||
## 🚀 Quick Start
|
## 🚀 Quick Start
|
||||||
|
|
||||||
@ -66,7 +76,7 @@ research = await web.discover(
|
|||||||
|
|
||||||
### 🧠 Claude Code MCP Integration
|
### 🧠 Claude Code MCP Integration
|
||||||
|
|
||||||
> "Oh yeah, tell the model that writes your code for you, this is how it works:"
|
> *"Hey Claude, go grab that data from the React app"* ← This actually works now
|
||||||
|
|
||||||
```python
|
```python
|
||||||
# Add to your Claude Code MCP server
|
# Add to your Claude Code MCP server
|
||||||
@ -83,8 +93,9 @@ async def extract_content(url: str, script: str = ""):
|
|||||||
"word_count": content.word_count
|
"word_count": content.word_count
|
||||||
}
|
}
|
||||||
|
|
||||||
# Now Claude can extract content from ANY website, including SPAs!
|
# 🎉 No more "I can't access that site"
|
||||||
# No more "I can't access that site" - Claude gets real, live web data
|
# 🎉 No more copy-pasting content manually
|
||||||
|
# 🎉 Your AI can now browse the web like a human
|
||||||
```
|
```
|
||||||
|
|
||||||
## 🎯 Design Philosophy
|
## 🎯 Design Philosophy
|
||||||
@ -298,16 +309,18 @@ MIT License - see [LICENSE](LICENSE) for details.
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 🚀 Ready to Stop Fighting JavaScript?
|
## 🚀 Ready to Stop Losing Your Mind?
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
pip install crawailer
|
pip install crawailer
|
||||||
crawailer setup # Install browser engines
|
crawailer setup # Install browser engines
|
||||||
```
|
```
|
||||||
|
|
||||||
**Join the revolution**: Stop losing data to `requests.get()` failures. Start extracting **real content** from **real websites** that actually use JavaScript.
|
**Life's too short** for empty `<div>` tags and "JavaScript required" messages.
|
||||||
|
|
||||||
⭐ **Star us on GitHub** if Crawailer saves your scraping sanity!
|
Get content that actually exists. From websites that actually work.
|
||||||
|
|
||||||
|
⭐ **Star us if this saves your sanity** → [git.supported.systems/MCP/crawailer](https://git.supported.systems/MCP/crawailer)
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
Loading…
x
Reference in New Issue
Block a user