
Just to vouch, though not in Facebook's case, it is complicated.

Many sites use a CDN (Akamai, Limelight, Amazon's CloudFront, etc.). The whole idea of a CDN is that it distributes content. Even if the origin (your copy) goes away, the CDN may continue serving it for a long time. If someone has a specific item URL within that network, they can still access it. Working with CDN APIs to delete content (especially if, say, that content exists in multiple instances based on sizes, previews, etc.) can be ... interesting.
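To make the "multiple instances" problem concrete, here's a minimal sketch of what a purge might look like with CloudFront's invalidation API: one deleted item means invalidating every cached rendition of it. The rendition names, paths, and IDs below are assumptions for illustration, not any particular site's layout.

```python
# Hypothetical sketch: one logical item fans out into many cached
# variants, each of which must be invalidated separately at the CDN.
RENDITIONS = ["original", "large", "medium", "thumb", "preview"]

def rendition_paths(item_id: str) -> list[str]:
    """All cached variants of one item (assumed path scheme)."""
    return [f"/media/{item_id}/{r}.jpg" for r in RENDITIONS]

paths = rendition_paths("12345")

# With boto3 (the AWS SDK), the actual purge would look roughly like:
#
#   import boto3
#   cf = boto3.client("cloudfront")
#   cf.create_invalidation(
#       DistributionId="EDFDVBD6EXAMPLE",  # placeholder ID
#       InvalidationBatch={
#           "Paths": {"Quantity": len(paths), "Items": paths},
#           "CallerReference": "delete-item-12345",
#       },
#   )
#
# Even once this succeeds, edge nodes drop the objects asynchronously,
# and copies held by third parties (search caches, Archive.org) are
# untouched -- which is the point of the comment above.
```

Miss one rendition, or one sizing variant added later, and a direct URL to it keeps working long after the "delete".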

And if third parties present your content, they might also persist it, say in a Google preview or cache, on Archive.org, or in other tools.

Even within your own systems, data can be replicated in ways that are difficult to root out fully. Backups can exist which cannot be easily accessed for wiping. There are war stories of magically reappearing data resulting from data recovery operations.

So, while it's possible to flag content as "don't present" pretty easily, actually rooting all of it out thoroughly can be a much more involved task.

Un-seeing is difficult.



Facebook uses CDNs; how is it not complicated in the way you describe for them too?


I phrased that poorly: I'm vouching for the general case, not for Facebook specifically. I've not worked for them or on their systems.



