I don't trust the "delete" button to scrub it from FB's database.
I'd be slightly more confident (slightly) that editing the post might cause the core data in the db to be updated, however. In which case, I think the more effective script would be one that goes through all of your FB posts and scrambles them, or replaces the text with gibberish.
Lots of people in this thread are claiming that it doesn't based on gut feeling, but consider:
- Facebook publicly claims it does [1]
- Mark Zuckerberg testified in front of Congress and stated they do [2]
- Multiple government regulators have specifically checked that it does in their privacy audits. [3]
They might be lying and actively conspiring to not delete it, but they'd have to have a very good reason to take on that much legal risk.
Now, consider:
- An infinitesimal fraction of Facebook users try to delete anything.
- Facebook makes money from your data by showing you ads. If you stop using facebook, you stop generating any revenue and your data becomes a liability, not an asset.
> Mark Zuckerberg testified in-front of congress and stated they do
I just want to point out that "testified in-front of Congress" seems to have no relation to the truth of any statement. I don't recall the last time (if ever) there have been legal ramifications for lying in front of Congress.
Zuckerberg also testified that shadow profiles don't exist and that users can always remove their personal information from Facebook (they do exist, and you can't). James Clapper testified that the NSA doesn't "wittingly" surveil hundreds of millions of Americans (they do, very "wittingly").
I agree that it is technically illegal, but if the most recent example of a conviction (that you can think of) was almost 30 years ago that tells you that it's effectively unenforced.
Does it? One conviction in 30 years is also consistent with the conclusion that not that many people get invited to give evidence to Congress, and the people that do tell the truth (or at least don't tell provable lies).
He lied to the FBI after a plea bargain which involved him telling them the truth. From memory, this all happened before his Congress testimony but it's definitely not related.
It is illegal to lie to Congress (unrelated to whether you're under oath or not), but I can think of very few examples where there were real ramifications for it.
Cohen pleaded guilty to several charges [1], including some that alleged that he lied in a written statement sent to the Senate Select Committee on Intelligence and House of Representatives Permanent Select Committee on Intelligence on 28 August 2017, and then repeated those lies in his testimony to the Senate committee on 25 October 2017.
> He lied to the FBI after a plea bargain which involved him telling them the truth.
Sure you're not thinking of Manafort? I don't think Cohen had a plea deal, and I don't think he was charged with lying to the FBI. Manafort and Flynn did, and Manafort I think was the one that then broke the deal. There's a lot of crimes to keep track of.
I guess it depends on what you mean by real ramifications, as it's being served concurrently, but Cohen is serving two months for lying to Congress, a charge he pleaded guilty to.
Yep, that looks right from his Wikipedia article. One count of making false statements to a congressional committee. It’s not immediately obvious to me when he was charged/convicted of that though. https://en.m.wikipedia.org/wiki/Michael_Cohen_(lawyer)
Was the nature of deletion explicitly defined as "deleted records are removed from all databases, data warehouses included"?
They technically wouldn't be lying if they only deleted the record from application systems, but retained it in historical data by simply setting a bit flag (eg "IsDeleted") to 1.
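To make the distinction concrete, here is a minimal sketch of a flag-based "soft delete" next to a real row deletion. The `posts` table, the `is_deleted` column, and the helper names are made up for illustration; nothing here reflects Facebook's actual schema.

```python
import sqlite3


def make_db():
    # In-memory database with a single hypothetical "posts" table.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts ("
        "id INTEGER PRIMARY KEY, "
        "body TEXT, "
        "is_deleted INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("INSERT INTO posts (id, body) VALUES (1, 'hello world')")
    return conn


def soft_delete(conn, post_id):
    # The row stays in the table; only the flag changes.
    conn.execute("UPDATE posts SET is_deleted = 1 WHERE id = ?", (post_id,))


def hard_delete(conn, post_id):
    # The row is actually removed from the table.
    conn.execute("DELETE FROM posts WHERE id = ?", (post_id,))


def visible_posts(conn):
    # What the application layer would show to users.
    return conn.execute(
        "SELECT id, body FROM posts WHERE is_deleted = 0"
    ).fetchall()


def all_rows(conn):
    # What anyone with direct database access can still read.
    return conn.execute("SELECT id, body, is_deleted FROM posts").fetchall()
```

After `soft_delete`, `visible_posts` returns nothing while `all_rows` still returns the original text, which is the gap between "removed from the application" and "removed from the database" being described.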
The question about what motive they have comes down to whether or not the types of posts a person deletes indicate something meaningful about their personality. Ad analytics are ultimately seeking to understand the kind of person you are, after all.
I read the audit, and here's something I noticed (other than that it's 8 years old):
"In determining appropriate retention periods for personal information, data controllers can have due regard to any statutory obligations to retain data. However, if the purpose for which the information was obtained has ceased and the personal information is no longer required for that purpose, the data must be deleted or disposed of in a secure manner. Full and irrevocable anonymisation would achieve the same objective." (Page 69, under 3.4 Retention)
So basically, as long as it's anonymised, the data can be retained.
But “if the purpose for which the information was obtained” was to enable better ad targeting, then “data retention” is still “appropriate”, eh?
Under this reading, is there even an obligation to anonymize? And how anonymous does anonymous need to be? Does a simple base64 encoding count as “anonymized”?
I had that thought originally too - however the keywords here are "statutory obligation" (basically, legal reasons.) So if the FBI asked them to retain the data, they can do so as long as the FBI needs it.
And the matter of anonymisation is a great question - "irrevocable" anonymisation certainly has a different meaning back in 2011, when swapping a name for a guid would do the job. Nowadays, it would require at least deleting all relationships as well, since social network analysis is much more advanced these days (especially at FB, of all places.) It wouldn't be impossible to derive the identity of an anonymous account/record based on the undeleted data associated with it. And since FB's "ghost profiles" are something we know exist, I think it's safe to assume those relationships are being maintained somehow.
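As a toy illustration of why swapping a name for a GUID is not "irrevocable" when relationships are kept, here is a sketch that re-identifies pseudonymous accounts purely from their friend counts. The function names and the tiny graph are invented for the example, and matching on degree alone is about the crudest structural attack there is; real social network analysis would be far more capable.

```python
import uuid


def pseudonymize(friend_graph):
    # Replace every name with a random GUID but keep the friendship edges.
    # Assumes every friend also appears as a key in friend_graph.
    mapping = {name: str(uuid.uuid4()) for name in friend_graph}
    anon = {
        mapping[name]: {mapping[friend] for friend in friends}
        for name, friends in friend_graph.items()
    }
    return anon, mapping


def reidentify_by_degree(anon_graph, known_degrees):
    # known_degrees: name -> friend count, learned from some outside source.
    # A pseudonym is "re-identified" when exactly one name has its degree.
    guesses = {}
    for pseudo, friends in anon_graph.items():
        candidates = [n for n, d in known_degrees.items() if d == len(friends)]
        if len(candidates) == 1:
            guesses[pseudo] = candidates[0]
    return guesses
```

With a graph where only one person has three friends, that person is identified immediately even though no name is stored anywhere in the "anonymised" data.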
<cynical interpretation> The data can be retained so long as Facebook is in the business of targeted advertising, and that's "the purpose for which the information was obtained".
Also, I'm now looking forward to Zuck's next apology for "a breach of trust" and his explanation of how they "failed to live up to their own standards" when it becomes publicly undeniable that their "Full and irrevocable anonymisation" of data they've claimed is deleted is as flawed as all the other attempts we've seen of doing that.
Facebook don't need to keep your past posts, their algorithms have already run the data. The real question is whether you can reset the "personas" they assign to you....
#3 it has a submodel which detects you are feeding it false data.
Given that people change over time, #2 seems unlikely. The question to me is "how advanced is #3?". Another thing I wonder about is how these relate to each other. Are they at odds?
#1 is a well-established technique called "online learning", and I'd bet money that it's how many of these "industrial scale" ML algorithms are trained.
#2 makes no sense from a business perspective.
#3 is also well-established. This is how Google's Captchas work, for example.
> "Facebook makes money from your data by showing you ads. If you stop using facebook, you stop generating any revenue and your data becomes a liability, not an asset."
If my friends or acquaintances continue to use facebook, then information about me continues to be useful to facebook as that information can be used to complete their model of what sort of people my friends are, that they would associate with somebody like me.
I wouldn't trust Zuckerberg to pour a glass of water on a burning orphan, let alone delete data for real. Nor would I trust any politician to know anything about the subject.
Maybe not lie per se but they might have multiple data stores, one for the "social media" side of things and an archive for machine learning, one for targeted ads, etc. So they could legitimately say "we delete your data [from the perspective of social media]" while still having an archive for their other business units.
It wouldn't be the first time Facebook (nor any other business that profits from people's data) have been deceptive while treading a careful line between honesty and lies.
edit: I should add that if they were keeping archived records in a separate business unit, there is the possibility that they keep the original or even all edits. So scrambling your data might not do much aside from confusing those who have you in their feed.
The UI doesn't say "delete", it says "remove". Remove could mean "hide from view" or it could mean "delete the underlying database entry". It's a bit of a grey area, and it could be argued that users have consented to it by using Facebook.
Granted, GDPR is supposed to catch businesses that pull those kinds of stunts, but you have to remember that Facebook already breaks GDPR in a number of public ways too. So it's pretty clear they have a relatively open interpretation of the regulations (and an army of lawyers who are confident they can proceed in such a way). Or it might just be the case that even the worst fine issued under the GDPR is worth the risk given the financial benefits Facebook gets from retaining data.
>- Facebook makes money from your data by showing you ads. If you stop using facebook, you stop generating any revenue and your data becomes a liability, not an asset.
This part at least is incorrect. Your data will always be an asset, and FB makes money not just by showing you ads but by selling your data.
And they obviously give away some data for free, especially pertaining to users who click through OAuth consent screens as in the Cambridge Analytica case: https://developers.facebook.com/docs/graph-api
What are some examples of transactions where they exchanged data for money?
Do you know what the legal status is of models derived from someone else's data? EG, can I train an AI from every Marvel movie, then use that to produce other movies?
That's not a great analogy because I would be violating other people's IP, whereas FB generally has rights to the data they collect.
Facebook cares about “your” data, but that doesn’t really mean your painstakingly edited review of the new avengers movie. It’s more what you spend time looking at, what you click on, etc.
They did not sell information to Cambridge Analytica. As stated in the page you linked, the data was scraped using a 3rd party application requiring user opt-in. Now, there was a bug that allowed the app developer to also scrape data from the friends of the actual users but that has been patched and the developer's license has been revoked as the entire ordeal was against FB's licensing agreement.
I am not a fan of Facebook but this entire thread is just filled with misinformation.
I work at a company you've heard of, and you've probably (P > .5) used at least one of our products. We're international (including EU). I personally wrote our user data deletion logic for GDPR compliance.
We delete all of your PII. It doesn't matter if you're from the EU or not, because it's too expensive to figure it out and too risky because you're going to miss some weird edge case - Legal doesn't have much of a sense of humor when it comes to wiggle room.
My experience has been that larger organizations consistently spend more effort (relative to size) to genuinely comply with privacy regulations than smaller ones. The risk:reward ratio for deliberately ignoring or subverting privacy regulations is insanely bad. There are too many surfaces along which that would leak out, and the gains would be pretty marginal. Pretty much anyone who works at a large tech company can and will confirm that this is the case (see Jeff Kaufman's posts on this thread). There is no conspiracy between the six-digit number of engineers who work at these companies to keep quiet about it.
>There is no conspiracy between the six-digit number of engineers who work at these companies to keep quiet about it.
I wouldn't say "conspiracy" but they've been mum's-the-word about things like shadow profiles. Perhaps it comes down to too much kool-aid, but the idea that engineers would speak up in such cases has been proven wrong, time and again.
Take the Snowden revelations, as an example: How many years were the programs in service before Snowden went public? How long have we known about shadow profiles and no one from Facebook has come forward to say, "yes, this is what they're doing and it's wrong"?
Relying on people to do the good that should be done by the organisations doesn't take into consideration that those engineers face severe penalties for "going public" about such things, namely because whistle-blower laws do not supersede such things as NDAs.
Doesn't Facebook still have a zoo of MySQL schemas in production for each table? While I'd believe the product works as intended in some fraction of production, I think it's reasonable to suspect some or all of the system is broken for a non-trivial set of users and/or groups.
And then, what about all the data copied into Hive? Copied to 3rd parties? Does Facebook go and delete data that 3rd parties took?
You'll no doubt soon be scolded for mentioning that quote, but it will continue to be relevant for as long as Zuckerberg's actions continue to display that mentality. We'll soon have somebody telling us he said that years ago and people change, but the only thing that changed with Zuck is he became a bit more diplomatic and guarded with his language. Actions speak louder than words and Zuckerberg's actions speak clearly.
He's the same old dirt bag he ever was. Here are some fresh stories to back that up:
Yeah. If there was some evidence that Zuckerberg's overall approach had changed, I too would argue that continually resurfacing an old quote until the end of time is unfair and unproductive.
All of Zuckerberg's and Facebook's actions, however, continue to suggest that his approach hasn't changed.
Yeah, for a funny example, we seem to have (mostly) stopped reminding Google of their old "don't be evil" slogan, since it became obvious their overall approach had changed ;-)
I used to be one of the scoldiers saying things like "oh all young nerds say dumb stuff when bragging to their friends"
Honestly the vibe's still there if you pay him/fb enough attention. I'm working on extracting myself completely from them (Still got a few group messages on upcoming events to clear)
Not sure why you are downvoted - always good to remember. In many ways that's what he kept saying until last year, even to Congress - just with nicer, politically more appropriate words.
It's always good to remember this and we should keep repeating it.
This is the most profound insight we have into the mind of Mark Zuckerberg and what he thinks of his user base.
Stop listening to the PR and examine his actions since then. You can see how that comment was not merely a youthful indiscretion, but his entire business.
Zuckerberg continues to display this exact same mentality, only now he has billions of dollars in resources.
One of the first things I learned in the IT industry... do not delete any data, no matter what the circumstance is! And I believe internet companies whose life and soul is data would not want to delete it.
> - Facebook makes money from your data by showing you ads. If you stop using facebook, you stop generating any revenue and your data becomes a liability, not an asset.
> Facebook publicly claims it does
Hahahahaha YMMD!
> Mark Zuckerberg testified in-front of congress and stated they do
He would tell us everything that helps him as he is an opportunist
> Multiple government regulators have specifically checked that it does in their privacy audits
You're talking about the governments that have been secretly spying on us and lying about it for decades? Ouch.
There is no legal risk for those who observe if you abide the laws.
> An infinitesimal fraction of facebook users try to delete anything
What does that tell you about the majority of ppl?
> Facebook makes money from your data by showing you ads. If you stop using facebook, you stop generating any revenue and your data becomes a liability, not an asset.
That's just what you think you know. What about psychological profiling, law enforcement etc.?
Also did you hear about the "shadow profiles" about users who don't have an account? The moment you're surfing into their net (as we know even through embedded like buttons etc.) you'll be milked like a dairy cow.
> What reason would there be to lie about it?
What reason is there not to be honest about it? The recent events have shown very clearly that they don't have to fear much.
> They might be lying and actively conspiring to not delete it, but they'd have to have a very good reason to take on that much legal risk.
Or they can just claim, again, that they accidentally forgot to delete copies of these posts from their backups. It just completely slipped their mind.
Mark Zuckerberg is a liar who has demonstrated that he will do and say whatever is in his and Facebook's best interests. If you believe anything other than that, you're a fool.
Hmm ... I wonder if they have an edit limit where old edits get erased if you keep editing, sort of like that trick someone came up with years ago for pinging the credit bureau so many times so that eventually your credit looked better since old items were erased or not considered for your credit score ... or something.
It's called "bumpage" or "B*" on forums. People were leery of using the word back in the day because they thought the bureaus might catch on. It doesn't work anymore.
This is the case with many other online platforms (such as reddit) in that the original content of a post is still reachable if it was deleted, so a common method is to "scrub" edit your posts then delete them.
However at Facebook's size, and given they're known for 0 privacy, they likely track all changes anyway.
Reddit doesn't actually store the history of your edits, only that you edited it (which is why, when you scrub reddit history, people recommend editing it because then your posting is gone except for say a backup). Facebook, however, allows you to see the history of posts so if someone edits it you can see what they originally posted.
I don't think editing a Facebook post first would have the desired effect.
I wouldn't be surprised if it's not being scrubbed from FB database, but that's not the author's goal.
The README states the intent is to clean up publicly facing content. It's meant to tidy up internet presence to the general viewing public, not escape the grip of FB's data vacuum.
I'm a big fan of Selenium, and this is a great usage of it! Scripting a boring, repetitive browser task that would take a large amount of time & effort to do manually.
I'm pretty sure they use some sort of event store for their backend in which case all versions/changes/updates/deletes are stored as a separate revision alongside the original content.
All the big companies are doing immutable, append-only event logging and probably have no mechanism to expunge this data. All because storage is cheap and they need to hold on to everything for testing or whatever future need that might arise.
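For anyone unfamiliar with the pattern, here is a minimal sketch of an append-only event log in which edits and deletes are just more events. The `Event` and `EventLog` names are invented for illustration, and this is an assumption about how such a backend could look, not a description of any company's real system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    post_id: int
    kind: str             # "created", "edited", or "deleted"
    body: Optional[str]   # None for a delete event


@dataclass
class EventLog:
    events: List[Event] = field(default_factory=list)

    def append(self, event: Event) -> None:
        # The only write operation: nothing is ever updated or removed.
        self.events.append(event)

    def current_body(self, post_id: int) -> Optional[str]:
        # Replay the log; the last event for the post wins.
        body = None
        for event in self.events:
            if event.post_id == post_id:
                body = event.body
        return body

    def history(self, post_id: int) -> List[str]:
        # Every version ever written is still recoverable from the log.
        return [
            event.body
            for event in self.events
            if event.post_id == post_id and event.body is not None
        ]
```

In this design, scrambling a post and then deleting it leaves the user-facing view empty while `history` still returns both the original and the scrambled text, unless the system also has a separate mechanism for expunging old events.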
> All the big companies are doing immutable, append-only event logging and probably have no mechanism to expunge this data. All because storage is cheap and they need to hold on to everything for testing or whatever future need that might arise.
I'm very confident that the data is fully removed, because properly deleting data within NN days is treated as very serious internally. But I don't know the details of how it's done for cold storage.
(I would love to see someone subpoena something deleted, say, 1y ago and write up whether it was produced.)
I wonder how that works with Facebook's Blu-Ray cold storage [1]. Are optical disks treated like paper documents or is it still considered electronic storage?
Ah, that's interesting. If they store customer data on Blu-ray disks, I assume Facebook took the necessary steps to delete/destroy records according to GDPR requirements when customers request deletion....
It's currently possible to view previous versions of posts, just click "# edits" in the bottom-right by the "# comments" button (neither of which look like a link)
Tape backup is re-writable though isn't it? I had inherited one in the 90s and could re-write, granted it was painfully slow. Or are you saying that Blu-ray is re-writable too? The first iteration I used a few years ago could only write once.
Maybe disks are cheaper, even if you can't reuse them. As an added benefit, you can also restore info from the more distant past if you recognize errors too late.
I believe there is zero doubt that somewhere the text has been saved. Not only the original text, but also various results of analysis. Since they store edit history, too, replacing it with other text would simply be an indication that you’re trying to cover it up.
I would guess that this has been demonstrated in court.
> Since they store edit history, too, replacing it with other text would simply be an indication that you’re trying to cover it up.
It's text you wrote, why would there be any problem "covering up" your own text?
Note that, I'm not arguing you should have a right to use Selenium to scrape their site and replace the text, but if you went through every post in a non-automated fashion and changed the text, from a perspective of "covering up" the text, the purpose would be the same. Facebook almost certainly has an interest in keeping 100M users from scraping with Selenium, that's a separate thing entirely.
If you’re trying to hide the content from someone who has access to the edit history, editing it is ineffective and simply shows that you wish for it to be hidden. When would this be an issue? In court.
It’s true that editing could serve to hide it from other users, but privacy settings could do that also.
This is information that you originally put out, wrote down on Facebook, so it's not that you're hiding something you didn't want anyone to know originally. If it was libelous or something so you wanted to delete it after the fact, it's still likely possible to get it back -- even if Facebook deleted the actual message in the database, there's likely a number of methods -- server logs, db logs, cached resources, db backups, or even a simple screenshot -- that would likely be sufficient proof.
Plenty of people post things that they shouldn't have and regret. It could be plainly incriminating, like when people ill-advisedly make a post essentially admitting to a crime, threatening someone, or discussing events related to a lawsuit in public. It's possible deleting or obscuring such a post could even be construed to be destruction of evidence.
Point being, the ability to edit and view edit history changes very little with regard to the courts. Instead of needing to present Facebook with a subpoena for information from e.g. logs, you're able to simply view history. It doesn't change anything whether you can delete the data or "delete" the data, it just changes how accessible it is.
I think the benefit of this is that other people can't easily see what you posted. For example recruiters for jobs, journalists or activists trying to publicly doxx or embarrass you, stalkers, people with a grudge against you, etc.
If FB has this data squirrelled away somewhere in cold storage then in order for anyone to harm you with it, FB has to admit to keeping it (which would be a scandal) and it has to get out into the wild. That is a reduction in risk compared to someone just looking at your profile.
There was a post a long time ago on HN on how to taint your FB data over time so that it's difficult to tell where your real data stopped and your fake data began. I can't find it now, but part of the advice was don't replace your real data with gibberish, but rather with non-gibberish (public domain texts for example, or better yet AI generated text). And don't do it all at once but spread it out over a year or so.
With Facebook's distributed data stores, I'd be really surprised if any deletion was immediate.
It's more likely that they tombstone a post. That's where they store a "this post has been deleted" flag in their database. Using this style of deletion, the post and the tombstone would be eventually removed from the data store when it is periodically compacted.
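A rough sketch of tombstone-style deletion with periodic compaction, using a toy in-memory log. `TinyStore` and its methods are hypothetical names, and real distributed stores add replication and grace periods that are left out here.

```python
# Sentinel written in place of a value to mark a key as deleted.
TOMBSTONE = object()


class TinyStore:
    def __init__(self):
        # Writes are only ever appended, in order: (key, value) pairs.
        self.log = []

    def put(self, key, value):
        self.log.append((key, value))

    def delete(self, key):
        # Deleting appends a marker; the old value is still in the log.
        self.log.append((key, TOMBSTONE))

    def get(self, key):
        # The most recent entry for a key wins.
        for k, v in reversed(self.log):
            if k == key:
                return None if v is TOMBSTONE else v
        return None

    def compact(self):
        # Keep only the latest entry per key, and drop deleted keys
        # together with their tombstones.
        latest = {}
        for k, v in self.log:
            latest[k] = v
        self.log = [(k, v) for k, v in latest.items() if v is not TOMBSTONE]

    def raw_contains(self, value):
        # Is the value still physically present anywhere in the log?
        return any(v == value for _, v in self.log)
```

Between `delete` and the next `compact`, `get` reports the post as gone while `raw_contains` still finds it, so how long "deleted" data lingers depends on how often compaction runs.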
I had deleted all my posts and photos. Even my profile photos. Then I disabled my account. After a year I enabled it back. Facebook used to prompt me to add some photos - both before disabling and after reenabling.
The eerie part? It used to show faded thumbnails of my photos that I had deleted more than a year ago in Photos section when prompting me to add my pics.
Probably will want to scramble multiple times, each one randomly with realistic looking sentences (better yet - just take about 100 preprogrammed sentences and permute a subset of them), and making sure there is a random spread to the total number of edits for each post
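Something along these lines, as a sketch only: it just plans the replacement texts and a random number of edits per post, and does not touch Facebook at all. `plan_scramble` and its parameters are made-up names, and actually applying the edits (for example with Selenium) is left out.

```python
import random


def plan_scramble(post_ids, sentences, min_edits=2, max_edits=6,
                  sentences_per_edit=3, seed=None):
    # sentences must be a list with at least sentences_per_edit entries.
    rng = random.Random(seed)
    plan = {}
    for post_id in post_ids:
        # Random spread in the total number of edits for each post.
        n_edits = rng.randint(min_edits, max_edits)
        # Each edit is a random ordered subset of the preset sentences.
        plan[post_id] = [
            " ".join(rng.sample(sentences, sentences_per_edit))
            for _ in range(n_edits)
        ]
    return plan
```

Passing a `seed` makes the plan reproducible for testing; for real use you would leave it as `None` so the pattern differs on every run.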
If you use Reddit and want to delete your comment/post history I would recommend Shreddit[0] for the reason you mentioned[1]. It's important to do this frequently because there are quite a few sites out there that cache Reddit content periodically.
Something like this is probably against some terms of use that you implicitly agree to as a Facebook user, and I wouldn’t be surprised if they would auto detect it and ban you.
> I'd be slightly more confident (slightly) that editing the post might cause the core data in the db to be updated, however.
FWIW, I believe this is still true of reddit. Deleting a post doesn't actually delete it. It just no longer shows on the webpage. To really remove it from the database, you need to edit the post/comment, replace the contents with something simple (e.g., the character "a"), save it, then delete it.
Post-modern privacy models include those developed by University of Chicago law professor Richard Hasen, who argued the Obama administration could regulate Facebook under the guise of protecting privacy. (He was later fired from his posts.) Other authors have argued that allowing Facebook users to choose what information they share online would weaken privacy, and that if Facebook were able to take over the content management systems of websites and allow people to censor content, that was something the First Amendment should prohibit.
This is pretty good perhaps even for auto-generated absurdities:
Seed with: "Why does my dog speak poor English?"
> It's possible your dog has a rare genetic condition called dyslexia, which puts him or her at risk for learning difficulties. If the problem has progressed beyond a certain point—say, if your dog has been in the hospital for over a year—and when you go to bring him out of the hospital—not for exercise or exercise training—he may not speak English in the proper way. Your veterinarian will work on the problem while you are in the hospital.
What should I do with a dog with dyslexia?
It's important to be vigilant and familiar with what is going on with your dog in hospital. If you find that your dog may be learning, take him home and have him trained as soon as possible. It may take a while for your dog to learn to associate the letters of English with words.
I might have to use this to reply to unclear tasks from now on, great find.
There are times when we need to ask ourselves, are our interests really served by having this ability, when so many of us are now doing so much online anyway?
probably safe to say that the idea of a user on a platform like Facebook trying to obfuscate their history on the platform... is a short jog from an individual trying to avoid government backed hacking on them.
Facebook has many years of pretty smart work, and probably safe to say tens of millions of dollars just in meetings about specifically this (re costs associated with effort of user data retention).
What will be the consequences though? I'm sure they can easily discover that the user is trying to overwrite his history, batch edits are easy to spot - but what will they do about it? Ban him & delete his data? Stop him from editing his own posts? Just throw away all changes that don't pass the smell test?
I would speculate that they have built systems in a way that it would not matter what any user does. Any edits to an already existing post probably just append.
About two years ago my friends and I used to scroll back into each other’s posts from the early days and ‘like’ some of the most cringeworthy posts and photos we could find. It would cause them to bubble up into everyone’s newsfeeds again. Was a fun game.
This script would have saved me some manual labour back then.
(Edit: we also used to endorse each other on LinkedIn for the most bizarre skills we could find. Toilets, divorce and animal husbandry to recall a few)
A fun artifact of deleting your facebook account is that all of your comments get deleted (hidden, I guess, is more accurate) too, which means that in old threads people who interacted with you now appear to be interacting with themselves.
My friend had a profile picture in which he was doing something silly with his eyebrows. I commented "Nice eyebrows" and he replied "You're just jealous." I've since deleted my account and now there is only one comment on this profile picture, by the person the picture is of, reading "You're just jealous."
Aww, man. This just made me scroll way back to 13 years ago, and this is exactly why I love facebook. Obviously I said some stupid things when I was 18, but I was a child. But it was really cool looking back at the much more romantic version of myself when I was a kid.
What do you think the chances are that Facebook actually does any delete operation on their data? I'm sure this is simply marking it as hidden in their database. Maybe I'm just paranoid, but to me the only way to take back something you say on social media, is to never have posted it in the first place.
There are a million reasons to agree that the data is never actually deleted
* need to retain data to fulfill government requests
* internal auditing
* it's all backed up in some "data lake" somewhere to do internal ml or analytics on
* hundreds of copies in database backups from different times
* internal logs that contain the data
* it's already been analyzed and aggregated into learning products and models that aren't going to be recomputed
It's not being "paranoid". As someone who has worked on large scale SaaS, I say: there is zero, 0, ZERO, 0.00 chance of that data ever actually being deleted.
> As someone who has worked on large scale SaaS, I say: there is zero, 0, ZERO, 0.00 chance of that data ever actually being deleted
Unless that is built by design. I happen to also work on a large scale SaaS where we take this stuff very seriously and I can say it is possible to protect this data. However I will agree that this adds considerable complexity, but for some organizations, that is totally worth it.
> need to retain data to fulfill government requests
That's a choice, not a requirement. If you encrypt the data and purposely don't store the keys yourself but instead have the customer store them, then you don't have anything of value for the government.
> internal auditing
Personally Identifiable Info is not something we want to peruse. In fact we purposely don't want to see it because that eliminates a potential for mishandling.
> it's all backed up in some "data lake" somewhere to do internal ml or analytics on
That kind of application shouldn't give carte blanche to disregard retention policies. You can run those applications against replicated shards of the original data; and when the original gets reclaimed, so does the replica.
> hundreds of copies in database backups from different times
Storing useless data forever is not cheap, especially at scale. Better store what needs to be stored and free up what can be freed when retention policies kick in (or user requests it).
> internal logs that contain the data
That's ground for failing certain compliance audits. Logs should never contain PII in the first place, that's an operational failure.
> it's already been analyzed and aggregated into learning products and models that aren't going to be recomputed
That's a tricky one, but if those are actual models instead of giant lookup tables, one could assume the data is not reconstructible. However, that needs to be a design consideration of the models themselves, to prevent user data from persisting.
For similar reasons, I doubt very much that they actually deleted everything they said they did. I know that they were backing up to tape and storing it at Iron Mountain years prior.
But someone has to pay that bill. If Facebook ever fails, I doubt that someone would keep data centers full of information around without getting paid.
FWIW I’ve seen comments from a privacy engineer at Google who posted here and said they actually do work hard to delete your data.
I’d expect Facebook and Google to successfully delete the data (after giving 3-6 months for backups to age out) but wouldn’t trust most smaller operations to do so. And yeah, that doesn’t cover ML models or whatever, just the retrievable copies of your photos and text posts.
After learning that Google has been tracking the precise locations of a few hundred million people for the last 10+ years, I went through the exercise of deleting all activity history from Maps, YouTube and Search and disabling all activity tracking.
For the first few hours, I saw new default YouTube suggestions. After a few days, I saw a lot of videos from my old searches and views pop back onto the YouTube home page. YT still seems to suggest videos for me to watch based on my past viewing info.
The model is probably based off of your watch history before you deleted everything. There would be no way (theoretically) for anyone to see your watch history but YouTube still thinks those old videos are ones that you'd be interested in and fortunately you haven't watched them yet (as far as the algorithm is concerned) so it's recommending them to you again.
It's likely they have an ML "recommender" model that was built off of your history, but doesn't actually store the history nor is it reverse engineer-able. Once you hit a certain threshold of activity or time duration, the model will probably rebuild.
This is a big reason I view the new “we’ll automatically delete your data!” crap as disingenuous at best. They’ve already learned everything from the data in a few hours; it’s useless to them after that anyhow.
A big problem with data is not the company itself using it, but other ways it can come back to bite you.
E.g. a sexual photo retrieved from a private social media album is no threat to your reputation when it’s been absorbed into some machine learning model for detecting sexual photos, but it’s certainly a threat if a future data leak allows your enemies to get the actual .PNG or .JPG and send it to the news media or your loved ones. Knowing that the photo can actually be deleted is valuable in this case, and I’m sure there are many other similar cases that could be listed.
Users want to delete watch history, which they can. Classifier models predicting what you want to watch are not the video watch history, nor are the models capable of producing it.
> Classifier models predicting what you want to watch
But that probably is not the only model generated from your data, is it? They probably have many other models generated from your data, everything from ad-displaying models to profiling models for Hydra.
That's what I'm talking about. Think of the recommender model as your own personal neural net that is trained, over time, to show you videos it thinks you might like. It is not capable of telling you which videos you watched, because it is not a database or list of watched videos, but it is instead a classifier that predicts what you probably want to watch.
That's a far, far cry from the reality of "possibly prosecutable in EU in some scenarios." We aren't even using a website right now that's in GDPR jurisdiction. But there's also a weird fetishization of GDPR I've noticed where people invoke it like "heh, my dad works for Xbox. just you wait buddy, he'll have you banned."
I think it’s changed what is best practice. Unless you plan to never introduce your product in the EU, designing your product so that it cannot ever delete data is a bad idea.
In my experience I’ve definitely seen GDPR result in a large company having developers looking at how data can actually be deleted and not just set to deleted=true. I don’t think my company’s lawyers were alone in thinking this suddenly became more important than before.
When a feature (e.g. thoroughly scrubbed data deletion procedure) is introduced into a product, it is often far easier to apply it laterally to all customers than just a subset. For this reason, GDPR has knock-on effects that benefit all users of certain services.
> What do you think the chances are that Facebook actually does any delete operation on their data?
Close to 0%. There is no way they're actually deleting things on the backend, but this could help prevent your content from being indexed by search engines inside and outside of Facebook.
Ouch, almost all of my pictures of me have been uploaded by other people. I had this suspicion yet kept telling myself I'd once build it. Thanks for the heads up.
I hate the pic attrition on Facebook. As people unfriend me or delete albums, some of the best moments captured of my life are erased. They really should be included in the archive download and more strongly linked to your account.
Does anyone know if there is an equivalent to delete Facebook friends or unlike pages/leave groups? I really want to clean out my Facebook since I've had it for a very long time and much of it doesn't appeal to me anymore, but I don't necessarily want to lose the tiny social ecosystem I've created with my family.
However, I would like to remove the hundreds of friends and pages that I don't talk to/don't represent me/are mining my data even more than the platform is.
I did this a year ago, and it considerably improved my Fb feed’s signal/noise ratio.
The specific procedure I’d recommend is to make this a regular process: open the FB feed, read until you find a thing which does not improve your life, click on the owner (author / group) -> click “friend” -> unfollow. Iterate until sanity is restored.
And one moral of this story (hi Facebook, please understand this) is that people are multidimensional. Just because you knew someone once, in one context in real life, does not actually imply a positive relationship in informational ecologies for all time in the future. “Authentic self” as defined by Facebook as a single coherent identity is a lie.
My experience is that I have hundreds of people to snooze, and would rather just have a fresh start and let those people trickle back in if they want to. Maybe I should just make a new account.
I did that last year, created a new account. Added back only a few real friends.
But slowly started getting requests from many of the previous friends. Some thought I had unfriended them, so had to explain to them that it was a new account. Felt obligated to add back most of the people. Now my feed is useless again.
That process is what made me stop logging in (haven't actually logged back in to delete, it's been two years - stopped when I stopped needing it for university groups).
There was just nothing, or so little, that 'improves my life' as you put it, nor anything close. Just shit and drivel.
And it's not made for it. Quality doesn't equal traffic, nor even engagement in a monetisable sense. If it were, it would very quickly have learned that I didn't want to see memes, full stop. Nope, feed continued to be full of valueless memes.
For a brief period I considered I might be 'missing out', but it doesn't take long to realise that if you're missing out through not being on the right social media platform... who or what exactly are you missing out on?
Facebook's algorithm should mostly do this by default. If you don't interact with people's posts, you'll see less of them.
You can explicitly mute people as well or even just certain websites. My aunt has been posting a ton of weird alt-right news lately and I've had pretty good success by just blocking her news sources so I can still see posts from her that I care about (family stuff.)
I also have a few friends who I've muted entirely because I still want to be able to contact them, but I really don't care about any of the stuff they're posting to facebook.
This might be a little more manual than you want, but it's pretty simple to just casually block posts you don't like as you scroll through your feed.
It's not really a matter of scrolling and seeing their content so much as that there are people I'd just rather not be associated with. I've been contacted numerous times about people committing crimes or getting involved with radical political spectrums, and I'd rather just disassociate myself from a lot of the people I used to see or used to have relationships with but do no longer (these number in the hundreds, from a lot of other social platforms).
I also don't really scroll the feed anymore, so I'd rather just keep it to people I want to keep tabs on (close family, professional contacts)
I did it a while back (3 or so years ago) by scrolling through the dialog so that all of my "friends" were loaded, and then opened up the browser console and wrote a line of JS to toggle the checkbox to delete each of them and saved.
Not sure if that's still the interface today, but it worked for me back then.
* Edit: I checked, and you can still do this. It's under the main preferences dropdown -> News Feed Preferences -> Unfollow people and groups to hide their posts
It's no longer a checkbox, but you can simulate a click in JS. Haven't tested, so obviously YMMV.
It takes a while but what I ended up doing was deleting friends on their birthdays. It takes at least a year but mostly weeds people out who don't fit in your circle.
I don't have my birthday listed there (because I find the constant birthday posts from friends to be annoying), so I'm functionally immortal under this scheme!
So I have been putting together a way for people to more easily make their own blogging platform. It would kind of mimic a social media platform, but since everything is committed to a repository using the JAMstack it could easily be converted to a full website or in your case you could simply delete the repository or any number of your posts because they are just files in your repository. Any feedback would be wonderful. https://your-media.netlify.com/post/make-your-own-media/ Everything is owned by the end user. This is only providing a recipe for people to use.
I will also mention that https://www.stackbit.com/ is doing basically the same thing but more from a “Make life easier for Website designers” perspective.
Here's a good app idea: Take all your Facebook posts and replace them with copyrighted sections of books. E.g. my cringy posts from high school become chapters from Harry Potter.
No doubt Facebook would delete these posts permanently and DMCA them.
This may work against you, especially if Bloomsbury Publishing issues an infringement notice to Facebook. In that scenario, Facebook would need to place your account on 'legal hold', retaining the account and history indefinitely while the copyright case is resolved. It's possible this would include cold storage and all past edits.
Exactly the reason why I stopped using sources like Unsplash for free graphics when I was a young designer. Imagine the Enis' surprise when their shiny new website came under attack by a zealous photographer for copyright infringement. This woman is now a chief editor for a major publisher. This is an awesome idea though, I think I'm going to go with Hunter S. Thompson though. ]:\
Only if somebody complains, I expect, e.g., a DMCA takedown from the publisher. You could also have your account banned as a repeat copyright infringer, which I suppose would be a bonus in this case.
I'd advise against it though, at least in the USA, due to the commercial value of the Harry Potter novels, and the penalties for infringement.
OP: I'm close to getting this to run but I get this error:
  Traceback (most recent call last):
    File "./deletefb.py", line 3, in <module>
      from seleniumrequests import Chrome
  ImportError: No module named seleniumrequests
Is there a “Marie Kondo” for the digital world? Seems like something or someone that could be just as relevant if not more than organizing your physical world.
Back in the day there was a similar js bookmarklet that could be used to export all your friends' contacts (name + emails) so you can leave without worrying that you'll be missing out on all your "connections." It would be nice to extend this with a "backup my posts + contacts" info before deleting everything.
I've been through this before and some posts 'came back'. The only way to be sure is to delete your account. If you really want an account just make a new one.
This was my experience using a Greasemonkey script some time ago. Some posts would "come back", and there were pockets years back it couldn't even find to delete, but were still accessible on my profile. Wound up just deleting my account and making a new one that I never, ever post on.
Alternatively, my friend turned on the "on this day" feature that would send her a notification every day of what she did on facebook for that day of the year.
So every day she just took a minute to delete the cringiest of that 1/365th of her content, and in just one year she was free.
Some sure fire ways to get your account actually deleted, and quickly: post large amounts of hardcore porn, extreme gore, and anything that puts Jews in a negative light like Nazi symbols combined with the prior two methods. The moderation team will nuke your accounts very efficiently.
Love this script! I'm trying to use it to wipe 10 years of Facebook dumbness. It can handle about a month's worth of posts before it crashes though. Are there any try/catch/repeat loops you could add to resolve this?
Message: no such element: Unable to locate element: {"method":"class name","selector":"layerConfirm"}
Or... is there a Python equivalent of "nodemon" I could run it behind?
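A try/catch/repeat loop of the kind asked about could look roughly like the sketch below. It is a generic standard-library helper, not code from the script itself. The name `retry` and its parameters are made up for illustration, and which exceptions to retry on is an assumption: for the error above it would presumably be Selenium's `NoSuchElementException`.

```python
import time


def retry(action, attempts=5, delay=1.0, backoff=2.0,
          exceptions=(Exception,), sleep=time.sleep):
    """Call action() until it succeeds or the attempts run out.

    Waits `delay` seconds after the first failure and multiplies the wait
    by `backoff` after each further failure. The last failure is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            sleep(delay)
            delay *= backoff
```

Wrapping only the single flaky step (for example, the lookup of the confirmation dialog) rather than the whole run keeps a retry from repeating work that already succeeded.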
Thanks for the great work! I was thinking Jumbo.app would implement something similar by now but I realize the problem was hard to solve because Facebook doesn't let apps delete posts.
At least FB lets you remove them from public view.
I recently tried to script clearing out my Twitter likes. (Some of which date back to before <3s, when a star could mean "yikes, I don't want to RT this but I want to note it". Others were just from when my Twitter was more personal, and now that I use it for professional purposes I don't need random likes from the 2012 election lying around.)
But Twitter now says I have about 4k likes, but only displays the ones since I tried to run my script.
Occasionally, weeks or months later, older ones resurface, but if I try to unlike them in bulk they disappear again, only to return at a random interval.
Similar, but OT: does anyone know of a script that will delete your tweets from before X date? There are tools that bulk delete, but I want to delete from before a certain date.
IIRC Twitter's API doesn't allow you to bulk delete tweets from the outside. I've found that exporting all tweets, getting the IDs, and then deleting them in the browser using the twitter.com-only browser API will allow you to delete tens of thousands of tweets with no rate limits. I did something similar here: https://github.com/kylehotchkiss/fakeblock but it's not documented well and Twitter's export file format actually tends to change on a somewhat regular basis.
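The "get the IDs from before X date" step can be sketched in a few lines of Python. This assumes each exported tweet has already been loaded as a dict with `id` and `created_at` keys, and that timestamps look like `Wed Oct 10 20:19:24 +0000 2018`. Both are assumptions: as noted, the export format changes, so the field names and date layout may need adjusting. The sketch only selects IDs; it does not delete anything.

```python
from datetime import datetime, timezone

# Assumed timestamp layout, e.g. "Wed Oct 10 20:19:24 +0000 2018".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def ids_before(tweets, cutoff):
    """Return the IDs of tweets created strictly before `cutoff`.

    `tweets` is an iterable of dicts with "id" and "created_at" keys
    (hypothetical field names); `cutoff` is a timezone-aware datetime.
    """
    old_ids = []
    for tweet in tweets:
        created = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
        if created < cutoff:
            old_ids.append(tweet["id"])
    return old_ids


# Example: everything from before 2015.
sample = [
    {"id": "1", "created_at": "Tue Nov 06 12:00:00 +0000 2012"},
    {"id": "2", "created_at": "Wed Oct 10 20:19:24 +0000 2018"},
]
old = ids_before(sample, datetime(2015, 1, 1, tzinfo=timezone.utc))
```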
Sweet and simple. I have the habit of writing complex scripts, leaning toward super clean code that deviates from the actual purpose. This is so clear. I could understand the code in 5 min... That means it's super simple. And looking at the votes it has received, it works!
Hmm, I was catching this same issue while running this script today. Only in the chrome instance that Selenium opened though. Facebook was working at normal speed in my primary chrome instance.
There needs to also be an obfuscate profile method.
Any data is probably never deleted. What makes more sense is a long-running script (multiple years??) that updates / inserts / deletes your profile with random information.
Wait about a week, guessing this won't work if it builds up any significant traction.
I want the ability to delete everything after some period of time. Every message I send, every message I receive, I want it all to work like conversation -- not contracts.
I don't want someone digging up an email from the past where I said something stupid in a moment of anger, or frustration. If all tech worked like Snapchat, or Signal, I'd be really happy.
I have no trust that Facebook actually deletes anything. Guessing anyone who tries to delete things actually flags the user and then their team of Zucks go in and look at the juicy content you were trying to delete... seems like 5 years from now, Facebook will charge for the "not have anything dumb you said in your 20s show up in public search" feature. Cool. Cool, cool, cool.
In most of the tech companies, deleting only means flipping a flag in the column of the row so that those things won't show up to the user. "Hard delete" is what you want.
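To make the flag-flipping versus "hard delete" distinction concrete, here is a minimal sketch using an in-memory SQLite table. The `posts` table and its `deleted` column are invented for illustration; nothing here reflects how any particular company stores data.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with two example posts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts ("
        "id INTEGER PRIMARY KEY, body TEXT, deleted INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO posts (id, body) VALUES (?, ?)",
        [(1, "first post"), (2, "second post")],
    )
    return conn


def soft_delete(conn, post_id):
    """Flip the flag: the row is hidden from users but still stored."""
    conn.execute("UPDATE posts SET deleted = 1 WHERE id = ?", (post_id,))


def hard_delete(conn, post_id):
    """Remove the row from the table entirely."""
    conn.execute("DELETE FROM posts WHERE id = ?", (post_id,))


def visible_posts(conn):
    """What the user sees: only rows not flagged as deleted."""
    rows = conn.execute("SELECT body FROM posts WHERE deleted = 0 ORDER BY id")
    return [row[0] for row in rows]


def stored_rows(conn):
    """What the database actually holds, flagged or not."""
    return conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
```

After `soft_delete`, the post vanishes from `visible_posts` while `stored_rows` is unchanged; only `hard_delete` reduces the stored count. Even then, this says nothing about backups, replicas or logs, which is the concern raised elsewhere in the thread.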
Whenever I try to delete old Facebook posts, I just get an indefinite loading bar. Shrugs. Seems like a well tested feature. /s (or maybe my posts are just stored in ice).
It's considered a better practice to explicitly wait for an element on a page before acting. How to do it is documented for the Python bindings here: https://selenium-python.readthedocs.io/waits.html
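The idea behind an explicit wait is just "poll until the thing is there, or give up after a timeout". The sketch below shows that idea with the standard library only, as a stand-in for illustration; the name `wait_until` and its parameters are made up. With Selenium itself you would use its own `WebDriverWait` helper as described on the linked page rather than rolling your own.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it returns a truthy value, then return it.

    Raises TimeoutError if nothing truthy comes back within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        result = condition()
        if result:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        sleep(poll)
```

The condition would be something like "try to find the confirm button and return it, or return None if it is not on the page yet", so the script acts only once the element actually exists.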
I'm working on fixing that, however it's a bit tricky because Facebook does some very weird things with JavaScript that seem to interfere with Selenium. In particular, the only way I could get it to actually focus on the "Delete" button was by dropping down to JavaScript. I'm not sure why this was. I also ran into issues with the Gecko driver initially where I couldn't get it to open the delete menu at all.
My New Year's resolution for 2019 was to keep my computer Google free. It's less of a statement and more a "fun" geeky game. So I'd rather not install Chrome :-)
If you can get this to work with the Gecko driver that would be fantastic. I initially tried using Gecko and ran into some difficult to debug issues, but it might be possible now.
I'd be slightly more confident (slightly) that editing the post might cause the core data in the db to be updated, however. In which case, I think the more effective script would be one that goes through all of your FB posts and scrambles them, or replaces the text with gibberish.
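The "replace the text with gibberish" half of that idea is easy to sketch; the hard part, actually submitting the edit to each post, is not shown here and would depend entirely on driving the site's UI. The function name `gibberish_like` is made up for illustration. It keeps the word count and word lengths of the original so the replacement looks vaguely post-shaped.

```python
import random
import string


def gibberish_like(text, rng=None):
    """Return random lowercase words with the same lengths as those in `text`."""
    rng = rng or random.Random()
    words = []
    for word in text.split():
        # One random letter per character of the original word.
        words.append("".join(rng.choice(string.ascii_lowercase) for _ in word))
    return " ".join(words)


# Passing a seeded Random makes the output repeatable, which helps when testing.
original = "Had a great time at the lake today"
scrambled = gibberish_like(original, random.Random(0))
```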