And if you just want to go by the pixel data, look into "perceptual hashing". ht...

eviks · on July 12, 2023

Is there an option to just calculate an image hash (on the image data only, not the full file of image data plus metadata) without any transforms? That way, if it matches, you can be 100% certain it's the same image.
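For illustration, here is a minimal sketch of that idea: an exact hash over decoded pixel data only. It assumes the image has already been decoded into raw bytes by whatever library you use. The function name `pixel_hash` and its parameters are hypothetical, not any particular library's API.

```python
import hashlib
import struct


def pixel_hash(width: int, height: int, mode: str, pixels: bytes) -> str:
    """SHA-256 over decoded pixel data plus its dimensions and channel layout.

    `pixels` is assumed to be the raw, already-decoded sample data
    (e.g. 3 bytes per pixel for mode "RGB"); file metadata never enters the hash.
    """
    h = hashlib.sha256()
    # Hash the geometry and mode too, so a 4x1 and a 2x2 image
    # that happen to share the same byte sequence do not collide.
    h.update(struct.pack(">II", width, height))
    h.update(mode.encode("ascii"))
    h.update(b"\0")
    h.update(pixels)
    return h.hexdigest()
```

A match here means the decoded pixels are byte-for-byte identical (short of a SHA-256 collision). The flip side is that any re-encode with a lossy format, resize, or color adjustment changes the hash, which is the case perceptual hashes are meant for.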

mceachen · on July 11, 2023

Unfortunately, (almost all!) image hashing algorithms don't detect color differences, because they map images to greyscale first. This may be fine for many situations, but such a hash will return the same result for a sepia tint, a full color original with incorrect white balance, and the final result you made after mucking with channels for a couple minutes.

I also found that there really isn't one "best" image hash algorithm. Using _several different_ image hash algos turns out to be only fractionally more expensive at both compute and query time, and substantially improves both precision and recall. I'm using a mean hash, a gradient diff, and a DCT, all rendered from all three CIELAB-based layers, so they're sensitive to both brightness and color differences.
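A rough sketch of the per-channel idea, using only the mean hash for brevity. This is not the actual implementation described above: it assumes each image has already been downscaled and converted into three small 2D grids of channel values (for example L, a, b), and all function names are hypothetical.

```python
def mean_hash(channel):
    """Mean hash: one bit per cell, set when the cell is above the channel mean."""
    cells = [v for row in channel for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def channel_hashes(channels):
    """One mean hash per channel instead of a single greyscale hash."""
    return tuple(mean_hash(c) for c in channels)


def distance(hashes_a, hashes_b):
    """Total Hamming distance summed across all channels."""
    return sum(hamming(a, b) for a, b in zip(hashes_a, hashes_b))
```

Two images with identical lightness but different color channels get the same hash for the first channel and a nonzero total distance, which is the difference a greyscale-only hash would miss.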

rivo · on July 11, 2023

The library I posted uses colour information. It won't map to greyscale first.