Fetch-MCP is an MCP server built on Playwright, designed for efficient web page content fetching. It retrieves content from both static and dynamic websites by driving Playwright's headless browser. Key features include `fetch_url` for single page retrieval and `fetch_urls` for high-performance batch fetching of multiple URLs in parallel. Fetch-MCP extracts the main content of a page, supports Markdown conversion, and is easily configurable, making it a good fit for developers who need robust and scalable web scraping.
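To make the two tools a bit more concrete, here is a minimal sketch of what calling them over MCP might look like. MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request; the tool names `fetch_url` and `fetch_urls` come from the description above, but the argument names `url` and `urls` and the helper `tool_call` are assumptions made for illustration, so check the server's published tool schema before relying on them.

```python
import json
from itertools import count

# Monotonic JSON-RPC request ids, starting at 1.
_ids = count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# ASSUMPTION: the argument names `url` and `urls` are illustrative only;
# the real Fetch-MCP input schema may use different names or extra options.
single = tool_call("fetch_url", {"url": "https://example.com"})
batch = tool_call(
    "fetch_urls",
    {"urls": ["https://example.com/a", "https://example.com/b"]},
)

if __name__ == "__main__":
    print(json.dumps(single, indent=2))
    print(json.dumps(batch, indent=2))
```

In practice an MCP client library would build and send these messages for you; the sketch only shows the shape of the request that ends up on the wire.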
Increasingly I want to stop spending time on Twitter, but it’s also where the AI news drops first, and I can’t just scrape the data without being logged in.
If there were a way to have the AI go ahead and gather the data for me, that would be great.
This is something I am building. Herd[0] gives you a Puppeteer-like API over your own browser, in effect letting you use your existing session seamlessly for automation and data extraction (and avoid bot detection as a nice side effect).
Speaking of implementation, I don’t mind if a browser extension forwards cookies from my browser to the automation (privacy and security are an issue, of course, and I’d ideally want the cookies not to leave my device, but personally I’m okay with some trade-off).
Thanks for this. I'm not familiar with MCP, but having (briefly) read your link, it appears to enable a use case I've been expecting: a chat window that replaces the entire website experience (probably better suited to larger enterprise-style websites) and provides tailored information about a company or product.
Would you know if it's possible to use this approach to constrain an LLM to a specific context of information (for example, on the Microsoft site, any question related to CRMs would be answered with information about Dynamics but never Salesforce)?