
For many use cases I would imagine that an index that was a bit delayed might actually be preferred. I'm not entirely sure what you meant to imply by 'research purposes', but many of the use cases I imagine are scholarly ones, where something more stable would be preferable. That said, I seem to recall Henry Thompson telling a story about trying to study the statistics of the web using Common Crawl. By the time he was done, he ended up less certain of the results, the understanding, and the methodological validity of anything related to measuring the internet by looking at a single snapshot of a subset of the link structure. It's too hard to understand what you are actually counting.

edit: yep here it is https://doi.org/10.1145/3184558.3191636


