
> FYI

The alternative format (used by the Internet Archive and the Wayback Machine) is WARC. It's also a single file, but it preserves the HTTP headers as well, so it's aimed specifically at archival use. [1] The "wget" tool, which is co-maintained by the Web Archive people, also supports it via CLI flags. [2]

Though when it comes to mobile browser support I'd recommend using MHTML, because WebKit and Chromium both support it upstream.

[1] http://iipc.github.io/warc-specifications/

[2] https://www.gnu.org/software/wget/wget.html
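To make the "preserves the HTTP headers" point concrete, here is a minimal sketch of what a single WARC/1.0 "response" record looks like when built by hand. The function name build_warc_response and the example.com payload are made up for illustration; real tools (for example wget with its --warc-file option) write more record types and more header fields than this.

```python
import uuid
from datetime import datetime, timezone


def build_warc_response(target_uri: str, http_bytes: bytes) -> bytes:
    """Wrap a raw HTTP response (status line, headers, body) in one WARC/1.0 record.

    Hypothetical helper for illustration only, not part of any library.
    """
    warc_headers = [
        ("WARC-Type", "response"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        # The record block is the HTTP message itself, headers included.
        ("Content-Type", "application/http; msgtype=response"),
        ("Content-Length", str(len(http_bytes))),
    ]
    head = "WARC/1.0\r\n"
    head += "".join(f"{name}: {value}\r\n" for name, value in warc_headers)
    head += "\r\n"
    # Every record ends with two CRLFs after the block.
    return head.encode("utf-8") + http_bytes + b"\r\n\r\n"


# A made-up HTTP response; its status line and headers are stored verbatim.
raw_http = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>hello</p>\n"
)

record = build_warc_response("http://example.com/", raw_http)
print(record.decode("utf-8"))
```

The original status line and headers end up inside the record block untouched, which is the part MHTML does not keep.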



WARC is also used by the Webrecorder project. They made an app called Wabac which does entirely client-side WARC or HAR replays using service workers, and it seems to have pretty good browser support, but I haven't really dug into the specifics.

https://github.com/webrecorder/wabac.js-1.0
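For a rough idea of what a HAR replay has to do, here is a small sketch that indexes the entries of a HAR file by method and URL so a recorded response can be looked up for an incoming request. This is not how wabac.js is implemented (I haven't checked); index_har and the sample data are made up, and only the log/entries/request/response layout comes from the HAR format itself.

```python
import json


def index_har(har_text: str) -> dict:
    """Map (method, url) to (status, headers, body text) for each HAR entry.

    Hypothetical helper for illustration only.
    """
    har = json.loads(har_text)
    index = {}
    for entry in har["log"]["entries"]:
        req, resp = entry["request"], entry["response"]
        # HAR stores headers as a list of {"name": ..., "value": ...} objects.
        headers = {h["name"]: h["value"] for h in resp.get("headers", [])}
        body = resp.get("content", {}).get("text", "")
        index[(req["method"], req["url"])] = (resp["status"], headers, body)
    return index


# A made-up, heavily trimmed HAR document with a single entry.
sample = json.dumps({
    "log": {
        "version": "1.2",
        "entries": [{
            "request": {"method": "GET", "url": "http://example.com/"},
            "response": {
                "status": 200,
                "headers": [{"name": "Content-Type", "value": "text/html"}],
                "content": {"mimeType": "text/html", "text": "<p>hello</p>"},
            },
        }],
    }
})

replay = index_har(sample)
status, headers, body = replay[("GET", "http://example.com/")]
print(status, headers["Content-Type"], body)
```

A service worker based replay would presumably do a lookup like this inside its fetch handler and answer from the archive instead of the network.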


There is a project that uses a headless browser to implement HAR.

https://github.com/wabarc/screenshot


Is there any objection to adding WARC support to webkit/chromium? Seems like a not-so-complex project...


I know that WebKit relies on either libsoup [1] (on Linux/Unices) or curl [2] (legacy Windows and maybe WPE(?)) as a network adapter, so the header handling and parsing mechanisms would have to be implemented in there.

Though on macOS, WebKit tries to migrate most APIs to the Core Foundation framework, which makes this kind of impossible to implement as a non-Apple employee, because it's basically a dump-it-and-never-care approach to open source. [3]

Don't know about Chromium (my knowledge of their architecture is ~2012ish, pre-Blink).

[1] https://github.com/WebKit/WebKit/tree/main/Source/WebKit/Net...

[2] https://github.com/WebKit/WebKit/tree/main/Source/WebKit/Net...

[3] https://github.com/opensource-apple/CF


GTK/WPE use libsoup. PlayStation/Windows use curl. And yes, Apple's networking is proprietary.


I wasn't sure about WPE with regard to libsoup, due to the glib dependencies and all the InjectedBundle hacks that I thought they wanted to avoid.

I mean, in principle curl would run on the other platforms, too... but as far as I can tell there's an initiative to move as much as possible to the CF framework (strings, memory allocation, HTTPS and TLS, sockets etc.) and away from the cross-platform implementations.



