I did this for common plant diseases using TensorFlow on Android last year while learning how to use TF. Even with a dataset of 50,000 images from PlantVillage, I found the accuracy to be not that great on real-world images that aren't in the dataset. I think there are just too many visual similarities between various plants (especially when photographed from different angles) for the network to focus on the right features. Did a short writeup here: http://helminen.co/plant-disease
It could work for a subset of plants, or potentially with a much larger training set - but I think NIR spectral/hyperspectral imaging would be the way forward here with more differentiating data points.
>I think there are just too many visual similarities between various plants
I get the impression that you tried to train a single classifier to diagnose disease in any species in the PlantVillage database.
You might get better results by training a separate classifier for each plant species (or starting with just one species, such as tomato, for which PlantVillage has 10 disease categories). A farmer knows what crop they're growing, so can select the correct classifier when they submit a photo.
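The crop-first routing described above can be sketched roughly like this. This is a pure-Python toy: the crop names, disease labels, and probabilities are invented stand-ins for real fine-tuned per-crop models, not the PlantVillage schema.

```python
# Toy sketch of per-crop routing: the farmer picks the crop, and we only
# run the classifier trained on that crop's diseases. The "classifiers"
# below are stand-ins for real models (e.g. fine-tuned CNNs).

def classify_tomato(image):
    # Stand-in: a real model would return probabilities from a forward pass.
    return {"early_blight": 0.7, "late_blight": 0.2, "healthy": 0.1}

def classify_potato(image):
    return {"late_blight": 0.6, "healthy": 0.4}

# Registry of per-crop classifiers, keyed by the crop the farmer selects.
CLASSIFIERS = {"tomato": classify_tomato, "potato": classify_potato}

def diagnose(crop, image):
    """Route the photo to the selected crop's classifier and return the top label."""
    if crop not in CLASSIFIERS:
        raise ValueError(f"no classifier trained for crop: {crop}")
    probs = CLASSIFIERS[crop](image)
    return max(probs, key=probs.get)
```

The win here is that each classifier only has to separate ~10 disease classes within one species, instead of every disease across every species at once.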
Great suggestion - that's the approach I've been looking at most recently (manual selection of the crop first, then a classifier covering only that crop's diseases), and it does give better results. Still, many plant diseases can look quite similar; more training data once we get it (and more variety in the photos) should help accuracy.
Did you use inputs other than just image for identification? I would have thought that adding location and time of year alone could significantly improve results.
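One simple way to fold in location and time of year, assuming you already have an image embedding from a CNN, is to concatenate the metadata onto the feature vector before the final classifier. A minimal numpy sketch - the 128-dim embedding and the normalisation scheme are assumptions, not anything from the writeup:

```python
import numpy as np

def month_one_hot(month):
    """Encode month 1..12 as a one-hot vector, so season is a learnable signal."""
    v = np.zeros(12)
    v[month - 1] = 1.0
    return v

def build_input(image_features, month, lat, lon):
    """Concatenate CNN image features with seasonal and location features.

    Scaling lat/lon into roughly [-1, 1] keeps them comparable in
    magnitude to the one-hot month entries.
    """
    meta = np.concatenate([month_one_hot(month), [lat / 90.0, lon / 180.0]])
    return np.concatenate([image_features, meta])

feats = np.zeros(128)  # stand-in for a CNN embedding of the photo
x = build_input(feats, month=7, lat=52.5, lon=13.4)
```

The combined vector then feeds whatever classifier head you're training, so diseases that only occur in certain regions or seasons become easier to separate.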
Not an expert -- would this be a result of training data doing a much better job isolating the subject vs a real world photo that may have other plants in the scene? Not that this is a solution in all cases, but is it possible your results would have improved with better real world pictures?
Not necessarily improved, because in real-world photos you can have what I would call "the Voynich effect".
The plants in the Voynich manuscript aren't real - they can't even be assigned to a family - yet they still look strangely familiar to us, because they are "frankenplants". You can have exactly the same problem in photos of wild plants. It only takes the leaves of a climber growing over the flowers of another plant, or different flowers and fruits mixed together, and you have a "new species" - a very tricky one to identify. After scratching their heads for a while, humans can sense that something is wrong; machines normally can't see the problem. An (in)famous case is the photo of two Black people that an image classifier tagged as 'gorilla'.
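One common mitigation for these confusing out-of-distribution inputs is to refuse to label images the model isn't confident about, e.g. by thresholding the top softmax probability. A minimal numpy sketch - the 0.6 threshold is an arbitrary illustration, and max-probability is only a crude confidence proxy:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = np.exp(logits - np.max(logits))
    return z / z.sum()

def predict_or_reject(logits, labels, threshold=0.6):
    """Return the top label, or "unknown" if the model isn't confident enough.

    A "frankenplant" photo mixing two species tends to spread probability
    across classes, so the top probability falls below the threshold and
    the image is flagged instead of being force-labelled.
    """
    probs = softmax(np.asarray(logits, dtype=float))
    best = int(np.argmax(probs))
    if probs[best] < threshold:
        return "unknown"
    return labels[best]
```

It won't catch every mixed-subject photo (a model can be confidently wrong), but it at least gives the app a way to say "retake the picture" rather than guessing.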
Likely, but the whole point of 'AI' is that it should be able to identify flowers that don't look exactly like what it's seen before. If not, it's just a huge lookup table - a memory repository.
This is, of course, a critical challenge in data science and is definitely not a trivial one to solve.
In some cases these are the goals, but dharma1 said the goal was to identify plant disease. If you can improve your results by taking better pictures then it becomes a tradeoff between training someone to take pictures and training someone to identify plant disease.
I think we have a tendency to treat AI as a silver bullet when we should be treating it as a tool we can use to help augment what we're already doing.