[flagged] Apple Has Begun Scanning Your Local Image Files Without Consent (sneak.berlin)
185 points by sneak on Jan 15, 2023 | hide | past | favorite | 69 comments


Some very cursory googling suggests that `smoot.apple.com` is used for Spotlight. My guess is this is actually the "visual look up" feature recently added to iOS and macOS, where Quick Look and other system apps will perform OCR on images, and will also attempt to recognize animals/plants/landmarks/etc and offer to look them up for you.

Taking a look at System Settings, it seems the relevant part of the privacy information for Spotlight is here:

> When you use Siri Suggestions, Look Up, Visual Look Up, when you type in Search, Safari search, #images search in Messages, or when you invoke Spotlight, limited information will be sent to Apple to provide up-to-date suggestions. Any information sent to Apple does not identify you, and is associated with a 15-minute random, rotating device-generated identifier. This information may include location, topics of interest (for example, cooking or basketball), your search queries, including visual search queries, contextual information related to your search queries, suggestions you have selected, apps you use, and related device usage data to Apple. This information does not include search results that show files or content on your device. ...

> This information is used to process your request and provide more relevant suggestions and search results, and is not linked to your Apple ID, email address, or other data Apple may have from your use of other Apple services.

My understanding is that any actual image recognition is done on-device, so the image itself (or any compressed/hashed version of it) is not sent to Apple. At worst, I suppose Apple might receive a list of features identified by the local image recognizer (e.g., "siamese cat", "eiffel tower").

I'd be curious if disabling "Siri Suggestions" would disable this. It seems like it should.

I think it's reasonable for very privacy-conscious people to be concerned about behavior like this, but it comes off as paranoid when you wildly jump to conclusions without doing even 5 minutes of research.


Even if we trust Apple that this reading of our files is anonymous, we should not be forced to share any photo or other file details with them which they could use for their own purposes. I do not see why this article is flagged, other than that it can hurt Apple's PR.


The article is rightfully flagged because it's BS. The guy found something in Little Snitch and the story is suddenly that he's exposing some dark company secret? More like someone decided to write an article with seemingly zero context or research on how the OS works, past or present. mediaanalysisd has been around since forever, a process most associated with Photos' facial recognition but later expanded to other things like text and subject recognition. There are non-nefarious reasons for mediaanalysisd to connect to the internet, like updating its models, which are run locally.

> This information does not include search results that show files or content on your device

As parent said, all content scanning is performed on-device. The only time photo content and metadata are sent to Apple is when those photos are stored in iCloud Photos. With Advanced Data Protection turned on, they do not have access to any metadata other than a checksum, IIRC.


To give Apple the benefit of the doubt that this network connection is only updating the local SW models, is it possible to remove mediaanalysisd if someone does not trust it?


You can disable it, just like any launchd process (similar to systemd units). The default is for it to run automatically, and that default is part of the OS installation (which often means OS updates reset the defaults).
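For the curious, disabling a launchd job from the command line looks roughly like this. This is a sketch, not a recommendation: the label and domain are assumptions you should verify yourself (e.g. with `launchctl list`), and as noted above, OS updates may undo it.

```shell
# Disable the media analysis daemon in the per-user launchd domain (macOS only).
# Guarded so this is a harmless no-op on systems without launchctl.
MA_LABEL="com.apple.mediaanalysisd"
if command -v launchctl >/dev/null 2>&1; then
  # Stop the running instance in the per-user (gui) domain
  launchctl bootout gui/"$(id -u)"/"$MA_LABEL" 2>/dev/null || true
  # Persistently disable it for this user (survives reboot, not necessarily OS updates)
  launchctl disable gui/"$(id -u)"/"$MA_LABEL" 2>/dev/null || true
else
  echo "launchctl not available; $MA_LABEL left untouched"
fi
```

Note that SIP may prevent some of this from taking effect on system daemons, which is exactly why the MDM route below exists.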

To actually manage something like this on a device targeted at general consumers, you'd mostly be using MDM profiles. Those can be created, installed and managed for free, using both Apple software and third-party software. This can be done using up-to-date mobileconfig policies or, on older operating system releases, MCX profiles (similar, but a bit older; more like GPOs).

In practice, you might have a better experience doing either of the following:

- Don't use the features that depend on this, which essentially makes it moot

- Don't use the OS, since it is catered towards a market that wants those features and expects them to work consistently on every device (or so Apple assumes). This results in considerable engineering effort by Apple to make the feature(s) work rather than not work; any modification isn't treated as a user configuration but as a defect that needs to be remedied (hence the "use MDM profiles" note).

As for just the networking part, Apple publishes a bunch of guides on network usage, mainly aimed at network administrators. Any outbound firewall in your network can be configured to block the traffic as desired. If it were purely about trust, this would also be the only 'correct' way, as an 'untrusted' OS isn't suddenly trusted just because you turned off a daemon. Either you trust it enough to let it run, or you don't trust it and enforce the policies elsewhere (off-device). But that goes for any hardware/software.

Addendum: this is the first page you'll find when you ask a search engine for ports used by macOS: https://support.apple.com/en-us/HT202944 which right at the top contains a link to https://support.apple.com/en-gb/HT210060 which explicitly has an entry for *.smoot.apple.com. Since your image will need to be analysed to be able to do a lookup it will first have to extract features from it. If it finds something like an address or coordinate or text or anything that it can't resolve locally (i.e. because it is temporal data that couldn't be preloaded), it has to reach out to a service that is kept up to date. Those are the pages that are so easy to find, even Apple's chatbot on the support page redirects you there.
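As a rough illustration of the off-device/firewall idea, null-routing the observed host locally might look like this. A sketch only: /etc/hosts cannot express wildcards like *.smoot.apple.com (you'd have to list each concrete host you observe, or block at the network firewall/DNS level), and the actual write is commented out because it requires root.

```shell
# Null-route the observed Spotlight/Visual Look Up endpoint.
# Hostname taken from Apple's HT210060 listing and the article itself.
HOSTS_ENTRY="0.0.0.0 api.smoot.apple.com"
echo "$HOSTS_ENTRY"
# To apply for real (requires root):
#   echo "$HOSTS_ENTRY" | sudo tee -a /etc/hosts
```

A network-level block (firewall or DNS sinkhole) is the more robust version of the same idea, since it works for wildcard domains and can't be reset by an OS update.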


> it comes off as paranoid when you wildly jump to conclusions without doing even 5 minutes of research.

since your guess ("this is actually the 'visual look up' feature recently added to iOS and macOS, where Quick Look and other system apps will perform OCR on images, and will also attempt to recognize animals/plants/landmarks/etc and offer to look them up for you") is entirely compatible with scanning the computer for "CP/etc", it seems to me that you are also jumping to conclusions.


> I'd be curious if disabling "Siri Suggestions" would disable this. It seems like it should.

Siri Suggestions is disabled on the machine in question (and has been, though I just confirmed that it is off) and this still happened. I have manually disabled every Apple network service that I know of (in the OS UI), and have denied all network access (via LS) to a dozen+ OS processes that phone home.


Wouldn't it be more likely that mediaanalysisd (which has been present in macOS for many years) does something like object recognition [1] and connects to Apple servers to update the model, or for telemetry, or something similar?

[1] https://eclecticlight.co/2022/03/23/how-visual-look-up-works...


If indeed it does that, then the title is correct: Apple has begun scanning your local image files without consent. I don't think any reasonable person would expect the OS to do this if you don't use the Apple apps to manage or search images, and have opted out of all analytics and such.


This could also have something to do with the new feature to detect duplicate photos (added in macOS Ventura) [0]

0: https://support.apple.com/guide/photos/remove-duplicates-pht...


I’d be interested to see whether this is checking more than hashes.


Hashing can be done locally.
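To illustrate: computing a cryptographic hash of a file never requires a network connection. With standard tools (sha256sum on Linux, shasum on macOS):

```shell
# Hash a file entirely locally; no bytes leave the machine.
f="$(mktemp)"
printf 'hello' > "$f"
if command -v sha256sum >/dev/null 2>&1; then
  HASH="$(sha256sum "$f" | awk '{print $1}')"
else
  HASH="$(shasum -a 256 "$f" | awk '{print $1}')"
fi
echo "$HASH"   # 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
rm -f "$f"
```

So a hash appearing in network traffic is a choice, not a technical necessity of the hashing itself.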

I am increasingly wary of my employer providing me with a device that I need to have on me at all times at home and that is making all of these outbound connections.


Not sure what led to the conclusion that "mediaanalysisd running" => CSAM. The article is a bit on the trashy/alarmist side IMO, with no concrete or technical evidence whatsoever.


I guess if you really want to get technical, they aren’t explicitly drawing that conclusion. They’re simply saying that Apple announced they’d scan files for CSAM, and now Apple is scanning files for “some reason.” Draw your own conclusion.

I think sending a one-way encrypted hash of a file, a la PhotoDNA [0], is a fair compromise. But if Apple is outright uploading your entire photo library for “analysis” without an explicit opt-in, that’s a different beast altogether. And if Apple today isn’t using this for surveillance, Apple tomorrow may be.

[0] https://www.microsoft.com/en-us/photodna


Neither description is an accurate account of what Apple did. On a local match, they would send alternate (encrypted) hashes. If those matched as well, they'd send a downscaled version of the file for human review, if I recall correctly. Not quite the same as PhotoDNA, but also not as alarmist as "they're uploading your entire photo library for analysis". The vast majority of analysis is offline, and only the assets that matched both locally AND remotely would get uploaded.


mediaanalysisd has been around for far longer than three years.


that shit was crashing our machines a decade ago


I have no sympathy for those who might harm, abuse, and exploit children ... how long until this technology falsely identifies CSAM and results in more victimization (either through false accusations, a botched raid, or the death of a family pet)?


It falsely identifies CSAM all the time. That's one of the issues. You have to be careful when taking pictures of your kids in the bath on your iPhone for this reason.


Best to just not own any Apple products or use their services, IMHO.


What products have better privacy protection than Apple, according to you?


I think you'll find that's google's services, not Apple's....


I never trust any claims that a surveillance tool will only be used to stop child trafficking. It is used as a method for law enforcement to get their foot in the door.


Apple has confirmed to the media that, as of December 2022, they are not pursuing on-device CSAM scanning:

https://www.theverge.com/2022/12/9/23500838/apple-csam-plans...

macOS Ventura exposes new object and scene recognition features for images, including background removal in Preview (analogous to "Copy Subject" in Photos but without requiring the use of that app):

https://www.tomsguide.com/how-to/how-to-remove-image-backgro...

You can also use Spotlight search with keywords such as "flower" or "cow" to find images on your local storage that have those subjects; I don't recall if that is new to macOS Ventura or has been possible for several versions of the OS.

mediaanalysisd is a daemon which has for years been responsible for, well, media analysis and would perform any network tasks required to support features such as the above when previewing an image. It would not surprise me if not all image recognition tasks are done exclusively locally, although I haven't found documentation of how much exactly is or isn't. It has long been the case for speech recognition, for example, that not all of it could be done offline (although more of it can be on Apple Silicon machines).

The only way to opt out of OS-level features that send any data to Apple at all is to opt out of using macOS; even then, as I recall, the Asahi Linux installer has to pull certain bits from Apple servers for copyright reasons.


No, now they just appear to be sending the same hashes that they'd cryptographically match against a database to some server of theirs, to do with whatever they please (or whatever they're ordered at gunpoint to do).


This is not new. mediaanalysisd and photoanalysisd have been running on your machine forever, doing exactly what the names say: analyzing media and photos for organizing and creating your fun little montages. mediaanalysisd does things like perform visual searches when you search for a contact. It's the thing that made your MacBook crash while it was idling or sleeping.

You can unload the plist of both of them to disable them.

This kind of alarmist nonsense makes it hard for people to believe when big companies are actually doing something wrong :[


Little Snitch was a must have when I used a Mac. Open Snitch for Linux is still being maintained.

https://github.com/evilsocket/opensnitch


There are so many things wrong with this article, but the most relevant one: the EULA contains plenty of consent that you give upon agreeing, and so do the features, with their own data-exchange sheets that you have to act on before using the feature (at least the first time).

Besides that, same as posted elsewhere: mediaanalysisd has existed for a pretty long time and doesn't require iCloud, an AppleID or CSAM scanning to exist.

Fun fact: the binary in its many revisions still has an embedded plist with a copyright of 2015 in all currently supported OS releases.


Are you implying that any amount of surveillance is acceptable so long as it is added to the EULA?


A couple of notes:

1) I'm still on macOS 12.6.2 Monterey, with Little Snitch, and mediaanalysisd never makes any network requests (though it does exist and is running). So there may be something new on Ventura.

2) You can see the https request and response headers (but not the decrypted data) using this technique: https://lapcatsoftware.com/articles/logging-https.html


Side question for the author (@sneak): I've noticed some _pk cookie (Matomo?) being injected while visiting your page, and some attempted requests to t.sneak.cloud (and others), but I haven't (yet) found a consent notice on your site.

What's up?


My site sends cookie headers to your browser, which can optionally be returned to my server on subsequent requests if you so desire. Your browser will have settings to configure this behavior so that you can set it to whatever you wish. My site still works fine if you do not send any cookie headers whatsoever, so you will have no issues browsing it with cookies disabled or cleared (or not persisted) at your option.

I find the Cookie Autodelete browser extension to be extremely useful in this regard.


That’s ok. As you seem to be based in Germany, would you please point me to the relevant consent notice?


Nobody cares about your consent notices, stay on topic here, this is about apple.


It is on point: if you stopped for a second to think and consider the context, you might have gotten it.


It looks like you're also setting some Mailchimp cookies?


Basic research would have shown this person that mediaanalysisd has been a part of macOS since long before Apple's CSAM local-scanning plan.

This is essentially a conspiracy-theorist post:

1. There is a thing (A) happening that I do not know anything about.

2. There is a thing (B) I read about that sounds nefarious to me.

3. Therefore, A is B.


Does the basic research also reveal what information mediaanalysisd is sending to Apple, since when, and why? If so, could you link to it? I can't find much about it. (Also, just because it's existed for a long time doesn't mean its behavior hasn't changed.)


That’s all true, but I don’t see any evidence that it did change. There are any number of background processes that might have changed. There’s not a story unless it appears that one did.


The post provides one data point that it did: Little Snitch started reporting traffic from mediaanalysisd. It's not very extensive evidence, but not quite none, either.


It's as good as none.


You can wire up a proxy and watch the traffic if you were so interested.
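Even without a full MITM proxy, you can capture connection metadata (endpoints, SNI) with tcpdump; TLS payloads stay encrypted, but it would show what is actually being contacted and how much data moves. A sketch (the hostname comes from the article; the command is only printed here, since a real capture needs root on the machine in question):

```shell
# Metadata-only capture of the daemon's traffic, for later inspection in Wireshark.
CAPTURE_CMD="tcpdump -i any -n -w /tmp/mediaanalysisd.pcap host api.smoot.apple.com"
echo "would run: sudo $CAPTURE_CMD"
```

Even byte counts and timing alone would distinguish "downloading a model" from "uploading per-image data".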

The fact that the article's author didn't bother to do so and report back what they found means this person didn't even do the bare minimum of technical research before posting an alarmist article.

They saw a thing happening they didn't understand and decided to post some nonsense online.


I can't, because I don't have a Mac. I'm interested what Apple is doing because of the wider discussion, though. So if someone else can, I'd be grateful :)


If I made any false claims in the article, please let me know ASAP so I can correct them. I was very careful to report facts only and let the readers draw their own conclusions about what is happening here.


> let the readers draw their own conclusions

That's not how this works. A system-level background process is not going to be 'concluded upon' by random readers on the internet.

The only thing you should have written is: "a process made a connection, but I don't know what the process is for or what the connection is for, and I do not know how to investigate it". Your readers could then conclude a bunch:

  - You use little snitch
  - But you don't really know how noisy an OS can be
  - Noise seems scary to you
  - But you don't know why
  - And you don't know how to find out

You could have made your position more credible by doing some extremely basic tests, like checking for strings inside the binary, or finding out what frameworks and libraries it is linked to. That's 2 built-in commands you can run as a normal user, with no arguments or ordering to think about. But if you did that, your position wouldn't be what it is.
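Presumably the two built-in commands meant here are `strings` and `otool -L` (the binary path is the one cited elsewhere in this thread; guarded so the snippet is a no-op off macOS):

```shell
# Inspect the binary without reverse engineering anything.
BIN="/System/Library/PrivateFrameworks/MediaAnalysis.framework/Versions/A/mediaanalysisd"
if [ -f "$BIN" ]; then
  strings "$BIN" | head -n 40   # embedded string constants
  otool -L "$BIN"               # linked frameworks and libraries
else
  echo "not on macOS; $BIN not present"
fi
```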


I have an entire blog post about how noisy macOS is, with pcaps:

https://sneak.berlin/20210202/macos-11.2-network-privacy/

I'm not interested in projecting credibility, only disseminating facts for analysis.


Well, that's not what you did, no matter your intentions.

> Apple Has Begun Scanning Your Local Image Files Without Consent

Begun? Nope, ancient process and system that has been there for many major releases.

Consent? Definitely gave consent, both at the macro and the micro level. You can't actually get up and running or use Finder unless you gave that consent, because it's the first thing that you have to do when you boot up for the first time.

So both of your key statements are measurably false, and your blog is riddled with fantasies and propositions that don't have the facts to back them up.


I see no evidence in your post that mediaanalysisd reaching out to an Apple server is related to CSAM detection. Have you analyzed the traffic?


I'm not saying you made false claims, but I'm not sure that the connection you draw between com.apple.mediaanalysisd and the api.smoot.apple.com API call is fully justified or backed by evidence.

It is interesting to understand what this call is for and what the mediaanalysisd daemon is actually doing/sending, but I'm not sure that just triggering Little Snitch while browsing local documents is enough to draw a direct connection. Correlation, perhaps, but honestly I think that before making the claim that Apple is sending specific document information to their servers, it'd be better to get something a bit more concrete.

Checking the mediaanalysisd process and its network traffic with just Activity Monitor while browsing through pictures, the sent/received data is only bytes in size; I'm not in a position to do a proper network capture, but already I question what those few bytes are and whether they're really related to my media browsing activities or just background noise. There is a fairly significant amount of received traffic overall (close to 200 GiB) compared to sent traffic (13 GiB), so I'm not quite convinced that this correlates to media scanning.

Passing the mediaanalysisd PID from pgrep to lsof shows a lot of fairly normal background processes/daemons like WebKit, AppleSupport, the App Store, etc., so I find myself thinking it's a more general process despite the name, but far more research into actual packet data is needed.
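Roughly, that check can be reproduced like this (macOS only; guarded so it does nothing elsewhere):

```shell
# List open network connections for the media analysis daemon.
PROC="mediaanalysisd"
if PID="$(pgrep -x "$PROC" 2>/dev/null)" && [ -n "$PID" ]; then
  lsof -a -p "$PID" -i -n -P
else
  echo "$PROC not running on this system"
fi
```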

The only mistake I think you've made is that your conclusion prematurely connects file browsing to this daemon, as it seems to handle a lot of stuff besides the implication that it scans documents or that it's specifically related to CSAM.

Basically, more information is needed in my opinion.

Edit: Changed "feel" to "think" in 2nd to last text block to correct my own language issue.


> honestly I think before making the claim that Apple is sending specific document information to their servers, it'd be better to get something a bit more concrete.

I didn't make that claim, because I have not RE'd mediaanalysisd to know precisely what it's doing.

The strings in the binary suggest it does face and animal detection, and processes some sort of "blacklist". As I mentioned, I don't use Photos.app or iCloud so I'm not sure how any of my media would be eligible for analysis. I was literally viewing images in the Finder via spacebar QuickLook (not even in Preview) when it hit the network for the first time. Maybe it's downloading models? Maybe it's sending analytics? Maybe it's sending perceptual hashes? Who knows?

I'm hoping that this post causes someone with the time and skills to do an in-depth RE of the binary to take a closer look at precisely what is happening with regards to images that are not eligible for processing in Photos.app or iCloud.


Pretty much everything in UIKit, Cocoa, etc. that uses native widgets relies on background frameworks and services like searchpartyd, mediaanalysisd, knowledged, mds, touristd, and a whole bunch of other d's. That includes searching, previewing (QuickLook) and thumbnails. None of the applications do this themselves 'internally'; they just use system frameworks. There are at least 1250 private frameworks and 215 shared frameworks, all of them containing libraries, daemons and XPC services, and any of them could be using networking features to act according to their design.

If you disabled SIP and deleted iCloud and Photos, not much would change. Your media panel widgets would break, since they depend on access to the library functions (even if those are empty), but mediaanalysis would still work for all your image renderers. You might think "well, that is wasteful, I never use this feature", but you forget that the product was not designed for your personal taste and use case, but for a much broader market. A market where a crapload of money is made every day, and where a selling point is that there isn't a whole lot for users to fiddle with.


Sneak, I think you're defending your article a little too loosely here.

Your article states that Apple is scanning your files and points to network traffic as evidence, combining these in your summary: Apple scans the data and then tries to reach Apple-owned APIs. You spend a non-trivial amount of the article discussing the CSAM plans Apple had, making it very clear that Apple never said it wouldn't scan and send files/data to law enforcement.

Your article builds a case that Apple is exfiltrating data, be it document information, media analysis, etc., outside of the local machine; I find it difficult to accept this was not the intent, based on how the article is structured, the bullet points, and the summary.

> I was literally viewing images in the Finder via spacebar QuickLook (not even in Preview) when it hit the network for the first time.

Yes, this is likely Live Text and other OCR features done live on the M1/M2 models: https://developer.apple.com/documentation/visionkit/enabling...

There is a lot of live processing done with the M1/M2 boards even without iCloud/Photos.

I understand you think your article presents an objective and curious item that asks the reader to investigate more, but I and other people in this topic are telling you this is not the way your article is understood and interpreted, and we've pointed out how the other conclusion was reached based on your article. An inquisitive article with a call to action would be just that, as other commenters have already posted:

- I see this daemon trying to reach out to Apple APIs

- At this time, I was doing this

- I have not investigated the binary or network traffic

- I find this interesting, but I have not yet drawn a conclusion on it

- I invite others to comment on it more

If the article were presented like that, you wouldn't have so many comments here stating that they find your conclusion premature, nor so many misunderstanding your intention.

If your intent is really to call for investigation, I would suggest that you add an addendum section to the article clarifying your goals and purpose, as it seems that overwhelmingly readers are not taking this interpretation.


Little Snitch has the ability to MITM and display the contents of transfers. I'd suggest you use it and amend the post with details.


Not for URLSessions with pinned keys, and especially not on AMFI-protected binaries you can't patch unless you're in development mode.


well, sounds like he has his work cut out for him then.


If he can't do something like "strings /System/Library/PrivateFrameworks/MediaAnalysis.framework/Versions/A/mediaanalysisd" I don't think any actual investigation of the binary or traffic is the first step of learning for him to take...


You didn't report false claims. You reported a bunch of unrelated claims without any technical research, and then pointed your readers to a conclusion that lacked any technical or journalistic merit.

It's shallow alarmism.


An entirely valid critique, but would you not be alarmed if your computer suddenly started making network requests while you were browsing private images in a local directory, having previously opted out of all network services/features?

At the very least, it's a bug. At worst, it's covert surveillance.


Or it's an entirely valid necessity of a feature you do not understand, that's being driven by a system setting you also do not understand.

The system could be behaving exactly as it should be, but because you failed to do any research before drawing any conclusions, you've decided there are only two explanations when many more could be possible.


Which system setting?


Nobody here is going to do your homework for you. The Venn diagram of people that can and people that want to know but don't know how to do it is not going to have a lot of overlap.


> Apple Has Begun Scanning Your Local Image Files Without Consent

You consent when you agree to use their software. The information is encrypted and stays on your device. One of many notices over the years is here: https://www.apple.com/ios/photos/pdf/Photos_Tech_Brief_Sept_...

And, they've been analyzing your local image files for over a decade for numerous OS purposes.

> The media erroneously reported this as Apple reversing course.

The article you are quoting (whose author you imply has poor reading comprehension) is from December 2022. CSAM scanning was put on pause in 2021. They announced last month that it is no longer being developed.

You criticize Lily's ability to understand Apple's statements, and yet:

> At the beginning of September 2021, Apple said it would pause the rollout of the feature to “collect input and make improvements before releasing these critically important child safety features.” In other words, a launch was still coming. Now the company says that in response to the feedback and guidance it received, the CSAM-detection tool for iCloud photos is dead.

Is this a lack of reading comprehension? Or did you just not read it at all?

> Today, Apple scanned my local files and those scanning programs attempted to talk to Apple APIs, even though I don’t use iCloud, Apple Photos, or an Apple ID.

> This is your first and only warning: Stock macOS now invades your privacy via the Internet when browing [sic] local files, taking actions that no reasonable person would expect to touch the network, with iCloud and all analytics turned off, no Apple apps launched (this happened in the Finder, via spacebar preview), and no Apple ID input. You have been notified of this new reality. You will receive no further warnings on the topic.

You are inferring (and I'm using the term "infer" to be very charitable) that Apple is sending data to its servers based on the media it analyzes. It's clear that the traffic is dominated by inbound data from GET requests (406 B/s) rather than outbound data (6 B/s). IIRC, that API endpoint is for processing your search input before actually searching your local machine for Siri suggestions and the like.

> Integrate this data and remember it: macOS now contains network-based spyware even with all Apple services disabled. It cannot be disabled via controls within the OS: you must used third party network filtering software (or external devices) to prevent it.

I'd call their advertising network-based spyware, but this isn't. And you absolutely can disable these features.

I trust you will update the article now.


You know what you did.


I wonder if Little Snitch gets access to _all_ traffic. I remember several macOS releases ago there was an issue because macOS (Big Sur) moved to a new network filtering API and it broke Little Snitch. I assume that if Apple were dedicated enough, they could keep some traffic from being seen by Little Snitch.


Same thing for me. First time mediaanalysisd has ever connected to the internet. Why? What is it sending?


This submission should be flagged, as there is no evidence to support the claim in the article. The information provided only shows that a system process called "mediaanalysisd" attempts to open a TCP connection to "api.smoot.apple.com".


You're right to assume you don't know what's happening behind the scenes in proprietary software without further investigation. I think the cognitive dissonance is in not following that conclusion to its logical end, which is:

use non-proprietary software; otherwise you can't know what you don't know, and you can't learn what you can't discover, among other tautologies.

What Little Snitch alerts you to isn't even scratching the surface of what's going on behind the scenes on your device. If you would like to get a sense of that, check out FSMonitor and just run it in the background for an hour.


They have not 'begun' this; it's been happening for some time, with the (perhaps uninformed) consent of whoever accepted the license.


Welcome to the cloud, we'll do whatever the f*k we want with your data.



