Hacker News
Fetch-MCP: Playwright-Based MCP Server with Batch URL Fetching Support (github.com/jae-jae)
64 points by Sulfide6416 10 months ago | 14 comments


Fetch-MCP is an MCP server built on Playwright, designed for efficient web page content fetching. It retrieves content from both static and dynamic websites using Playwright's headless browser capabilities. Key features include `fetch_url` for single-page retrieval and `fetch_urls` for batch fetching of multiple URLs in parallel. Fetch-MCP extracts the main content of a page, supports Markdown conversion, and is easily configurable, which makes it a good fit for developers who need robust, scalable web scraping.
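
To make the "batch fetching in parallel" idea concrete, here is a minimal sketch of a concurrency-capped batch fetch. It is not Fetch-MCP's actual code: `fetchAll`, `Fetcher`, `FetchResult`, and the default cap of 3 are hypothetical names and values chosen for illustration, and the page-loading step is injected as a function so the sketch runs without a browser.

```typescript
// Hypothetical helper, NOT taken from Fetch-MCP: fetch many URLs with a concurrency cap.
type Fetcher = (url: string) => Promise<string>;

interface FetchResult {
  url: string;
  ok: boolean;
  body?: string;
  error?: string;
}

async function fetchAll(
  urls: string[],
  fetchOne: Fetcher,
  concurrency = 3, // assumed default, purely illustrative
): Promise<FetchResult[]> {
  const results: FetchResult[] = new Array(urls.length);
  let next = 0; // index of the next URL to claim

  // Each worker repeatedly claims the next unclaimed URL until none remain.
  async function worker(): Promise<void> {
    while (next < urls.length) {
      const i = next++;
      const url = urls[i];
      try {
        results[i] = { url, ok: true, body: await fetchOne(url) };
      } catch (err) {
        // One failing page should not abort the whole batch.
        results[i] = { url, ok: false, error: String(err) };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, urls.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // same order as the input URLs
}
```

In a real server, `fetchOne` would presumably open a Playwright page, navigate to the URL, and return the extracted content; the cap keeps a large batch from opening too many pages at once.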


Check out https://pure.md for a REST API version of this


Is there an example of how an agent can interact with MCP? I imagine it will replace or complement the Tools interface.


The transport can be either stdio or SSE.
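
For the stdio case, here is a rough sketch of what a client writes to the server's stdin to invoke a tool. MCP messages are JSON-RPC 2.0, and the stdio transport sends one JSON message per line. The tool name `fetch_urls` comes from the post; the `urls` argument name and the helper names `buildToolCall` and `toStdioFrame` are assumptions for illustration. A real client also performs the `initialize` handshake before calling tools, which is omitted here.

```typescript
// Shape of the JSON-RPC 2.0 request used to invoke an MCP tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: build a tools/call request for a named tool.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical helper: the stdio transport is newline-delimited JSON,
// so each message is serialized onto a single line.
function toStdioFrame(message: ToolCallRequest): string {
  return JSON.stringify(message) + "\n";
}

// `fetch_urls` is named in the post; the `urls` argument name is an assumption.
const frame = toStdioFrame(
  buildToolCall(1, "fetch_urls", { urls: ["https://example.com"] }),
);
```

In practice you would use an MCP client SDK rather than hand-writing frames; the point is only that a tool call is a small JSON message the agent's host sends on the model's behalf.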


Cool, but Playwright doesn’t use your cookies.

Increasingly I want to stop spending time on Twitter, but it’s also where the AI news drops first, and I can’t scrape the data without being logged in.

If there were a way to have the AI go ahead and gather the data for me, that would be great.


This is something I am building. Herd[0] gives you a Puppeteer-like API over your own browser, in effect allowing you to use your session seamlessly for automation and data extraction (and avoid bot detection as a nice side effect).

0: https://herd.garden


Playwright can actually use your existing browser cookies if you connect it through Chrome's debugging protocol. Launch Chrome with the flag:

--remote-debugging-port=9222

Then connect via CDP in Playwright like this:

const browser = await chromium.connectOverCDP('http://localhost:9222');
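
A slightly fuller sketch of the same idea, assuming Chrome is already running with the flag above. The function name `fetchWithExistingSession` and the small `Cdp*` interfaces are hypothetical; the interfaces only mirror the handful of Playwright methods used, so that the real `chromium` object can be passed in or a stub substituted.

```typescript
// Minimal structural types covering only the Playwright calls used below.
interface CdpPage {
  goto(url: string): Promise<unknown>;
  content(): Promise<string>;
  close(): Promise<void>;
}
interface CdpContext {
  newPage(): Promise<CdpPage>;
}
interface CdpBrowser {
  contexts(): CdpContext[];
  close(): Promise<void>;
}
interface CdpLauncher {
  connectOverCDP(endpointURL: string): Promise<CdpBrowser>;
}

// Hypothetical helper: load a URL inside the already-running, logged-in browser.
async function fetchWithExistingSession(
  launcher: CdpLauncher,
  url: string,
  endpoint = "http://localhost:9222", // matches the debugging port above
): Promise<string> {
  const browser = await launcher.connectOverCDP(endpoint);
  try {
    // For a CDP-attached browser, the first context is the default one,
    // i.e. the profile that already holds your cookies.
    const [context] = browser.contexts();
    if (!context) {
      throw new Error("No browser context found; is Chrome running with the debugging port?");
    }
    const page = await context.newPage();
    try {
      await page.goto(url);
      return await page.content(); // HTML as rendered in your own session
    } finally {
      await page.close();
    }
  } finally {
    // Playwright documents close() on a connected browser as a disconnect.
    await browser.close();
  }
}
```

With Playwright installed, usage would look like `fetchWithExistingSession(chromium, 'https://example.com')` after `import { chromium } from 'playwright'`.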


I agree with this point as well.

As for implementation, I don’t mind if a browser extension forwards cookies from my browser to the automation. Privacy and security are a concern, of course, and ideally the cookies would never leave my device, but personally I’m okay with some trade-off.


Can’t you just have it log in?


What's MCP?


A simple explanation can be seen here: https://www.youtube.com/watch?v=7j_NE6Pjv-E


Model Context Protocol


I shared some notes about it here. Well worth exploring right now: https://notes.dsebastien.net/30+Areas/33+Permanent+notes/33....


Thanks for this. I'm not familiar with MCP, but having (briefly) read your link, it appears to enable a use case I've been expecting: a chat window that replaces the entire website experience (probably better suited to larger, enterprise-style websites) and provides tailored information for a company or product.

Would you know if it's possible to use this approach to constrain an LLM to a specific context of information? For example, on the Microsoft site, any question related to CRMs would be answered with information about Dynamics but never Salesforce.



