A very nice writeup, but (no offense) "tens of seconds" per image? Maybe I haven't understood the technical challenges well enough, but processing an image of 60,000 pixels should surely be much faster than that. What would the cost center of such a process even be - building the superpixels?
We got good results for that specific failure mode with random sampling.
How many samples to take requires some tuning, but you can use the image data itself to help with that: exit the random sampling early once the deviation of the sampled pixels drops below a factor proportional to samples/total.
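Roughly, something like the sketch below (a minimal illustration, assuming a grayscale image as a 2D numpy array; the batch size and the proportionality constant `k` are made-up values, not our actual tuning):

```python
import numpy as np

def sample_pixels_adaptive(image, batch_size=256, k=50.0, rng=None):
    """Randomly sample pixel values, exiting early once the sample's
    standard deviation falls below k * (samples_taken / total_pixels)."""
    rng = np.random.default_rng() if rng is None else rng
    flat = image.ravel()
    total = flat.size
    order = rng.permutation(total)   # sample without replacement, in batches
    samples = []
    for start in range(0, total, batch_size):
        samples.extend(flat[order[start:start + batch_size]])
        taken = len(samples)
        threshold = k * (taken / total)   # grows as more samples are taken
        if np.std(samples) < threshold:   # sample has settled: stop early
            break
    return np.asarray(samples)

# Example: a near-uniform noisy image exits after only a few batches.
rng = np.random.default_rng(0)
img = np.full((200, 300), 128.0) + rng.normal(0, 2, (200, 300))
print(sample_pixels_adaptive(img, rng=rng).size)
```

The point is just that a low-variance image lets you bail out after a small fraction of the pixels, while a high-variance one keeps sampling longer.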