
I'm not sure.

https://github.com/rockdaboot/libpsl/blob/master/src/psl.c is 2K lines, and most of it appears to deal with memory management, character twiddling, and processing/conversion.

Maybe the cookie management?



For one, it appears to handle punycode internally.

Punycode is a method for encoding internationalized Unicode labels into ASCII; in DNS the encoded label is prefixed with "xn--". So this will correctly associate cookies for either the Unicode form or its Punycode ASCII equivalent.

As examples, Wikipedia lists the internationalized domains with the most name registrations as Russia's "рф", Taiwan's "台灣", and China's "中国", which are represented in DNS as "xn--p1ai", "xn--kpry57d", and "xn--fiqs8s", respectively.
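To make the mapping concrete, here is a minimal sketch using Python's built-in "idna" codec. This is not how libpsl does it; the helper names are made up for illustration, and the codec follows the older IDNA 2003 rules, which I'm assuming is good enough for these three labels.

```python
# Hypothetical helpers: convert a single DNS label between its Unicode
# form and its "xn--" ASCII form using the standard-library "idna" codec.

def to_ascii(label: str) -> str:
    """Unicode label -> ASCII-compatible encoding (xn--...)."""
    return label.encode("idna").decode("ascii")


def to_unicode(label: str) -> str:
    """ASCII-compatible encoding (xn--...) -> Unicode label."""
    return label.encode("ascii").decode("idna")


# The three TLDs mentioned in the comment above.
EXAMPLES = {"рф": "xn--p1ai", "台灣": "xn--kpry57d", "中国": "xn--fiqs8s"}

if __name__ == "__main__":
    for uni, ace in EXAMPLES.items():
        print(uni, to_ascii(uni), to_unicode(ace))
```

A cookie matcher would normalize both sides to one form (usually the ASCII one) before comparing suffixes, so that either spelling hits the same rule.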

The data appears to indicate hosting sites where users can register their own names under a provider's domain (username.example.com), as well as exceptions to this where the host's own site then uses subdomains (www.example.com, admin.example.com, cdn.example.com) and the host's cookies should still be used.

It lists specific TLDs, wildcards where appropriate with *, and notes exceptions to wildcarding by prefixing with !.

It's certainly far from impossible to write your own parser for, but getting everything right on your first go would be harder than one might expect, and getting things wrong here would likely leak the user's information between various sites.
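For a feel of what the *, ! and plain entries mean, here is a rough sketch of a matcher. It follows my reading of the algorithm described on publicsuffix.org, not libpsl's code; the function names are hypothetical, the rule text is a tiny hand-picked sample rather than the real list, and it skips Punycode normalization and input validation entirely.

```python
# Tiny sample in the style of the list (NOT the real file).
SAMPLE_RULES = """
// comment lines start with two slashes
com
uk
co.uk
*.ck
!www.ck
github.io
"""


def parse_rules(text):
    """Split list text into (normal rules, exception rules)."""
    rules, exceptions = set(), set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("//"):
            continue
        rule = line.split()[0].lower()  # a rule ends at the first whitespace
        if rule.startswith("!"):
            exceptions.add(rule[1:])
        else:
            rules.add(rule)
    return rules, exceptions


def public_suffix(domain, rules, exceptions):
    """Return the public suffix of `domain` under the given rules."""
    labels = domain.lower().rstrip(".").split(".")
    # Exception rules win over everything else; the suffix is the
    # exception rule with its leftmost label removed.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in exceptions:
            return ".".join(labels[i + 1:])
    # Otherwise the longest matching rule wins, so try longest first.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        wildcard = ".".join(["*"] + labels[i + 1:])
        if candidate in rules or wildcard in rules:
            return candidate
    return labels[-1]  # implicit default rule "*"


def registrable_domain(domain, rules, exceptions):
    """Public suffix plus one label, or None if the domain is itself a suffix."""
    labels = domain.lower().rstrip(".").split(".")
    n = len(public_suffix(domain, rules, exceptions).split("."))
    if len(labels) <= n:
        return None
    return ".".join(labels[-(n + 1):])


if __name__ == "__main__":
    rules, exceptions = parse_rules(SAMPLE_RULES)
    for name in ["www.example.co.uk", "alice.github.io", "foo.bar.ck", "www.ck"]:
        print(name, "->", registrable_domain(name, rules, exceptions))
```

Two hosts should only share cookies if they have the same registrable domain, which is why a wrong answer here leaks data: treat "github.io" as registrable and every user site under it can read the others' cookies.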


https://publicsuffix.org/list/ (See the Algorithm)

The algorithm is actually hard to implement correctly and interoperably, even among browsers, and there are sharp edge cases along the way (such as holes within domain trees).

The author of the library referenced at least worked with the PSL maintainers and browsers to make sure they were faithfully and correctly implementing things :)



