Hacker News
Exploring EXIF (hturan.com)
189 points by spansoa on Sept 6, 2023 | 33 comments


I'm the author of the osxphotos[0] tool mentioned in the article. For photos in an Apple Photos library, osxphotos gives you access to a rich set of metadata beyond what's in the actual EXIF/IPTC/XMP of the image. Apple performs object classification and other AI techniques on your images but generally doesn't expose the results to the user. For example, photos are categorized by the objects in them (dog, cat, breed of dog, etc.), annotated with rich reverse geolocation info (neighborhood, landmarks, etc.), and given an interesting set of scores such as "overall aesthetic", "pleasant camera tilt", "harmonious colors", etc. These can be queried using osxphotos, either from the command line or in your own Python code (see the API docs[1]).

For example, to find your "best" photos based on overall aesthetic score and add them to the album "Best Photos" you could run:

    osxphotos query --query-eval "photo.score.overall > 0.8" --add-to-album "Best Photos"

To find good photos with trees in them you could try something like:

    osxphotos query --query-eval "photo.score.overall > 0.5" --label Tree --add-to-album "Good Tree Photos"

There's quite a bit of other interesting data in Photos that you can explore with osxphotos. Run `osxphotos inspect` and it will show you all the metadata for whichever photo is currently selected in the Photos app.

[0] https://github.com/RhetTbull/osxphotos

[1] https://rhettbull.github.io/osxphotos/
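
For the "in your own Python code" route, here is a minimal sketch of the same two filters. It assumes photo objects expose `score.overall` and `labels` the way the query expressions above use them, and that `osxphotos.PhotosDB().photos()` returns such objects as the API docs describe. `best_photos` and `load_library_photos` are hypothetical helper names, not part of osxphotos.

```python
from typing import Iterable, List, Optional


def best_photos(photos: Iterable, min_score: float, label: Optional[str] = None) -> List:
    """Keep photos whose overall aesthetic score exceeds min_score.

    If label is given, the photo must also carry that label.
    """
    selected = []
    for photo in photos:
        if photo.score.overall <= min_score:
            continue
        if label is not None and label not in photo.labels:
            continue
        selected.append(photo)
    return selected


def load_library_photos():
    """Load photos from the default Photos library (macOS only).

    Assumption: osxphotos is installed. The import is deferred so the
    filter above can be tried on stand-in objects without it.
    """
    import osxphotos

    return osxphotos.PhotosDB().photos()
```

On a Mac with a Photos library, `best_photos(load_library_photos(), 0.5, label="Tree")` would roughly correspond to the second command, minus the album step.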


That first query is fantastic. Aside from the obvious wedding photos which were indeed aesthetically pleasing, my wife and I were laughing to the point of tears over a "most aesthetic" photo it found of me climbing through a cage suspended between two cliffs in the Alps.

Thanks for the cool tool! I too wish Apple exposed these scores as a filter at the GUI level, but after seeing the output, perhaps I know why they don't.


> the EXIF data marker (FFE1)

This isn't quite right. That's a JPEG application-specific marker (APP1). It indicates that some additional data is embedded in the JPEG file at that point, which should be interpreted in an application-specific manner (e.g. skipped if the program doesn't know how to handle it). That could be Exif tags but could also be other things (like XMP). An APP1 segment can be identified as Exif data by the presence of the "Exif" string immediately after the segment size. (The tags themselves are stored in essentially a TIFF file embedded after the Exif identifier, if you're curious.)
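
To make the distinction concrete, here is a rough Python sketch that walks the header segments of a JPEG and classifies each APP1 payload by its leading identifier. It is simplified: it assumes no fill bytes between segments and that every marker before start-of-scan carries a length field. `iter_segments` and `classify_app1` are hypothetical names chosen for illustration.

```python
import struct
from typing import Iterator, Tuple

SOI = b"\xff\xd8"   # start of image
APP1 = 0xE1         # application-specific segment 1
SOS = 0xDA          # start of scan: compressed image data follows


def iter_segments(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (marker, payload) for each header segment before start-of-scan."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == SOS:
            return
        # The big-endian 16-bit size counts its own two bytes.
        (size,) = struct.unpack(">H", data[pos + 2 : pos + 4])
        yield marker, data[pos + 4 : pos + 2 + size]
        pos += 2 + size


def classify_app1(payload: bytes) -> str:
    """Tell Exif and XMP apart by the identifier that follows the segment size."""
    if payload.startswith(b"Exif\x00\x00"):
        return "exif"
    if payload.startswith(b"http://ns.adobe.com/xap/1.0/\x00"):
        return "xmp"
    return "unknown"
```

Filtering `iter_segments(data)` for `marker == APP1` and passing each payload to `classify_app1` shows that one file can hold several FFE1 segments with different contents.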


FYI, if you didn’t already know: PDF files also have EXIF data. I just learned that recently.


Thanks for sharing this, I’ll need to explore this a little more.


I use exiftool[1] and pdfinfo[2] to add or correct title and author for PDF files. With pdfinfo I can easily generate an index to my PDF file directories.

[1] https://exiftool.org/

[2] https://www.xpdfreader.com//


Be careful about posting your phone camera images publicly if you’re protective of your privacy. Some social media sites strip location and other sensitive EXIF tags, but it’s best not to rely on them to do so.


100%, which is why, as mentioned in the article, iOS trims the EXIF data when uploading images to the web.

What was most alarming to me (I’m not sure why it was a surprise) was that all this rich EXIF data is available to any app that I grant full access to my photo gallery.


Presumably a place like Meta saves all of the EXIF data for itself and serves end users sanitized copies. I bet you could still fingerprint an image and know what actual camera it came from based on imperfections in the optical sensor, though.


Yep, that's why I wrote a tool to cut all metadata from photos

https://cutexif.com/

I use it before uploading photos to public services.

Exif holds a surprising amount of information.


    exiftool -All= image.jpg
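
For anyone curious what stripping involves at the byte level, here is a rough pure-Python sketch that drops APP1 segments (where Exif and XMP usually live) from a JPEG. This is not how exiftool works and it is much narrower than `-All=`: metadata in other segments (for example comments or other APPn blocks) is left alone. `strip_app1` is a hypothetical name, and the sketch assumes a well-formed file with no fill bytes between segments.

```python
import struct


def strip_app1(data: bytes) -> bytes:
    """Return a copy of a JPEG byte string with every APP1 segment removed."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(data[:2])
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: stop parsing, the rest is copied untouched below.
            break
        # The big-endian 16-bit size counts its own two bytes.
        (size,) = struct.unpack(">H", data[pos + 2 : pos + 4])
        end = pos + 2 + size
        if marker != 0xE1:  # keep everything except APP1
            out += data[pos:end]
        pos = end
    out += data[pos:]
    return bytes(out)
```

Work on a copy of the file when trying this, and treat a dedicated tool as the safer option for anything you actually plan to publish.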


I recently learned Windows has this functionality built-in. Right-click > Properties > Details > Remove Properties and Personal Information.

It's saved me from digging out a purpose-specific tool a couple times.


I remember a body positivity blog that let contributors provide nude photos, often taken on mobile, to show what normal bodies look like. The engineer in me wondered “wait, what if…” so I used an online EXIF viewer and yep, the data wasn’t stripped. A map pin right on their house! :o That was the worst mistake I’ve seen with these yet.


I have created a tool called ExifPurge to remove EXIF data from multiple photos in one go.

https://www.exifpurge.com


    exiftool -All= image.jpg


No GNU/Linux build?


I wonder how the GPS coords are actually encoded in the EXIF data, because I have a hard time believing they're actually encoded with as many significant digits as the website displays.

For the example they give, I'm counting 14 significant digits for GPSLatitude, and 16 for GPSLongitude, which is way more specific than a grain of sand, ref https://xkcd.com/2170/

I suppose it's floating point, and of course that brings its own "fun" in displaying in decimal.

At least it also includes GPSPositioningError!


Apparently, each coordinate is an array of 3 rational values representing the degrees, minutes and seconds, where the numerator and denominator are 32-bit unsigned integers (plus a separate field to represent the sign). That's a lot of precision.
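
A small Python sketch of turning that representation into the decimal degrees a website would display. It assumes each rational arrives as a (numerator, denominator) pair and that the sign field is the usual "N"/"S" or "E"/"W" reference letter; `dms_to_decimal` is a hypothetical helper name.

```python
from fractions import Fraction
from typing import Sequence, Tuple

Rational = Tuple[int, int]  # (numerator, denominator), both unsigned


def dms_to_decimal(dms: Sequence[Rational], ref: str) -> float:
    """Convert degrees/minutes/seconds rationals plus a reference letter to decimal degrees."""
    # Fraction keeps the arithmetic exact until the final conversion to float.
    # A zero denominator raises ZeroDivisionError; real files may need a guard for that.
    degrees, minutes, seconds = (Fraction(num, den) for num, den in dms)
    value = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):
        value = -value
    return float(value)
```

The long tail of digits in a display typically comes from this last division step: a value stored as, say, hundredths of a second rarely has a short decimal expansion once divided by 3600.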


EXIF is a binary format. I would expect it to just stuff a double in there. So 16 digits is what you get even if you need less. 32 bit floats aren't precise enough to store latitude/longitude coordinates.
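
As a rough check on the 32-bit float point, this Python sketch round-trips a coordinate through a 32-bit float and estimates the resulting error on the ground. The longitude is a made-up example value, and 111,320 meters per degree is only an approximation for a degree of latitude.

```python
import struct

METERS_PER_DEGREE = 111_320  # rough length of one degree of latitude (approximation)


def float32_round_trip(value: float) -> float:
    """Store a Python float (a double) as a 32-bit float and read it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


longitude = -122.41941550000001  # made-up example coordinate
error_deg = abs(float32_round_trip(longitude) - longitude)
error_m = error_deg * METERS_PER_DEGREE

print(f"rounding error: {error_deg:.2e} degrees, roughly {error_m:.2f} m")
```

For magnitudes between 64 and 128 degrees, a 32-bit float has a spacing of 2^-17 degrees, so the rounding error stays below about half a meter. Whether that counts as precise enough depends on what the coordinates are used for.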


Just look through the plentiful documentation that exists. From a quick search: https://exiftool.org/TagNames/GPS.html


You're right, they don't usually include that many significant digits. For comparison, here is a sample of GPS data I extracted from photos I took on my phone during a cross-country bike trip:

https://github.com/lelandbatey/batey_bike_trip_records/blob/...


The GPS encoding depends on the software that generates the EXIF. In the article's case, with an iPhone, it is reasonably consistent across apps. But EXIF GPS (and other fields) can be very inconsistent across other software, both on mobile phones with cameras and on cameras themselves. In my experience, I have seen the most errors in knock-off devices.

If you are interested in learning more, look into "deadbox" digital forensics.


I know for a fact that, possibly due to a bug, or intentionally, they are not encoded to the accuracy that they could be, and using Apple's own EXIF tools will not let you encode them at full accuracy. I know this as I had to write a tool to re-encode the values into the EXIF.


That is way more information in a single photo than I ever would have guessed.


Interesting post.

From the metadata of a person's photos, you can extract, inter alia:

...whether they took some photos from a plane or helicopter (viewshed, angle and altitude are preserved);

...how conscientious someone is (by looking at their iPhone's uptime, perhaps the over-conscientious person with an extreme uptime is borderline OCD?).


Extreme uptime probably means they haven't installed updates in a while; that's the only reason I reboot my phone anymore.


Somewhat related: I used a tool to extract the embedded JPEG preview images from raw images. It's much faster than actually using software to process the raw files if you just need to share quickly, since your camera already "developed" them :)

http://www.fsoft.it/ERawP/

exiftool apparently can extract these also:

    exiftool -a -b -W %d%f_%t%-c.%s -preview:all dir

Extract all types of preview images (ThumbnailImage, PreviewImage, JpgFromRaw, etc.) from files in directory "dir", adding the tag name to the output preview image file names.

https://exiftool.org/exiftool_pod.html


> We have great tools for searching a single one of these dimensions, but we're severely lacking in combining all of them into an exploratory interface

I'd recommend trying an open-source semantic photo search app I've made, called Queryable. See: https://github.com/mazzzystar/Queryable


Doesn’t this information get stripped from photos when you share them, at least on iOS? That's what I recall experiencing.


> In fact, when you give an iOS app full access to your photo library, you're giving all of this information [including GPS data] away too.

Scary.


Can someone tell me whether any library exists that handles EXIF, XMP, and IPTC and has machine-readable output (JSON)?


ExifTool[0] does all that from the command line. I use it for automating my photo organization workflow and, as a bonus, I use it for reading the metadata of damn near any filetype.

[0]: https://exiftool.org
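
For the JSON part of the question, a minimal Python sketch of calling ExifTool with its `-json` option and indexing the result by file. It assumes `exiftool` is on the PATH; `parse_exiftool_json` and `read_metadata` are hypothetical helper names, and which tags show up depends entirely on the files.

```python
import json
import subprocess
from typing import Dict, List


def parse_exiftool_json(text: str) -> Dict[str, dict]:
    """Index exiftool's JSON array (one object per file) by each entry's SourceFile."""
    return {entry["SourceFile"]: entry for entry in json.loads(text)}


def read_metadata(paths: List[str]) -> Dict[str, dict]:
    """Run `exiftool -json` on the given files and return their tags.

    Assumption: exiftool is installed and on the PATH. check=True raises
    if exiftool reports an error for the run.
    """
    result = subprocess.run(
        ["exiftool", "-json", *paths],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_exiftool_json(result.stdout)
```

Something like `read_metadata(["photo.jpg", "paper.pdf"])` would return one dictionary of tags per file, which covers the "damn near any filetype" use as well.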


Note that although ExifTool is written in Perl, you can run it in "batch mode", which makes it quite fast: only a couple of ms to parse a file. I've written an open source library to manage the subprocesses for you if you're using Node.js (and I also wrote the Ruby variant ages ago):

https://github.com/photostructure/exiftool-vendored.js



