Machine learning gives performance without theoretical understanding. Norvig discusses Chomsky on machine learning in linguistics: http://norvig.com/chomsky.html
> I mean actually you could do physics this way, instead of studying things like balls rolling down frictionless planes, which can't happen in nature, if you took a ton of video tapes of what's happening outside my office window, let's say, you know, leaves flying and various things, and you did an extensive analysis of them, you would get some kind of prediction of what's likely to happen next, certainly way better than anybody in the physics department could do. Well that's a notion of success which is I think novel, I don't know of anything like it in the history of science. [from the linked transcript]
> Machine learning gives performance without theoretical understanding.
That is good experimental data, and the role of theorists here is to look at how that performance is achieved and why. For example, one can reasonably suspect there is a good reason why the kernels in a well-trained image-recognition deep learning net look like the receptive fields of neurons in the visual cortex. I'm pretty sure there is some kind of statistical optimality in that, something similar to how the normal distribution is the maximum-entropy distribution for a given standard deviation. In the same way, I'd guess the Gabor shape of a neuron's receptive field is something like a maximum-entropy solution over the set of all possible edges, or something along those lines. The point here is that the great success of deep learning generates a lot of very good data for theorists to consume. You can do only so much theory without good experimental data, and in the decades before computing power became available (and deep learning succeeded as a result), there weren't that many computer vision theory advances to speak of, really.
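To make the Gabor comparison concrete, here's a minimal numpy sketch of what such a kernel looks like: a sinusoidal grating under a Gaussian envelope. The parameter values are just illustrative, not anything claimed to be optimal:

```python
import numpy as np

def gabor_kernel(size=31, wavelength=8.0, theta=0.0, sigma=4.0, phase=0.0):
    """A 2-D Gabor kernel: a cosine grating windowed by a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates to the filter's orientation
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r**2 + y_r**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * x_r / wavelength + phase)
    return envelope * carrier

# A small bank of orientations, roughly like the oriented receptive fields in V1
bank = [gabor_kernel(theta=t) for t in np.linspace(0, np.pi, 4, endpoint=False)]
```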
Here's a problem: you can argue that Gabor filters arise because we design neural nets to encourage them. Gabor filters mostly arise in CNNs or in things otherwise regularized to be like CNNs. Convolutional layers are a form of regularization that restricts the space of models a network can conform to. The Gabor filters are learned, but none of this is evidence that they are globally "optimal", given that a human manually decided whether or not to include convolutions in the first place.
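For what it's worth, the "they show up in CNNs" observation is easy to eyeball yourself. Here is a sketch of dumping the first-layer kernels of an ImageNet-pretrained network, assuming torchvision and matplotlib are installed; the choice of ResNet-18 is arbitrary, any ImageNet-trained CNN would do:

```python
import matplotlib.pyplot as plt
import torchvision.models as models

# Load an ImageNet-pretrained ResNet; its first conv layer has 64 kernels of shape 3x7x7
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
kernels = model.conv1.weight.detach()

# Rescale to [0, 1] for display only
kernels = (kernels - kernels.min()) / (kernels.max() - kernels.min())

fig, axes = plt.subplots(8, 8, figsize=(8, 8))
for ax, k in zip(axes.flat, kernels):
    ax.imshow(k.permute(1, 2, 0).numpy())  # channels-last for imshow
    ax.axis("off")
plt.show()
# Many of the kernels look like oriented, bandpass, Gabor-like patches (plus some color blobs)
```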
It also goes without saying that the phrase "statistically optimal" is meaningless in this specific context. You can claim the filters are part of minimizing the cost function, but, again, you have to be very careful about the chicken-and-egg problem, because humans are the ones who manually craft the cost function.
You might find this 1996 Nature paper by Olshausen & Field interesting. In it, they describe how a coding strategy that maximizes sparseness when representing natural scenes is enough to produce a family of localized, oriented, bandpass receptive fields, like those found in the early visual system of humans.
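A rough modern re-creation of that experiment, as a sketch rather than the paper's exact setup: learn a sparse dictionary for natural-image patches and look at the learned atoms. This assumes a recent scikit-learn; the sample image, patch size, and sparsity penalty are placeholders:

```python
import numpy as np
from sklearn.datasets import load_sample_image
from sklearn.feature_extraction.image import extract_patches_2d
from sklearn.decomposition import MiniBatchDictionaryLearning

# Natural-image patches (grayscale, zero-mean) as training data
img = load_sample_image("china.jpg").mean(axis=2) / 255.0
patches = extract_patches_2d(img, (12, 12), max_patches=20000, random_state=0)
patches = patches.reshape(len(patches), -1)
patches -= patches.mean(axis=1, keepdims=True)

# Sparse coding: reconstruct each patch from a few active dictionary atoms.
# With enough data, the atoms tend to become localized, oriented, and bandpass
# (Gabor-like), which is the Olshausen & Field (1996) result in miniature.
dico = MiniBatchDictionaryLearning(n_components=100, alpha=1.0,
                                   max_iter=500, random_state=0)
dico.fit(patches)
atoms = dico.components_.reshape(-1, 12, 12)  # each entry is one learned basis function
```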
This fits with my point: they imposed restrictions on the model space, namely maximizing sparseness, and they also made several linearity assumptions.
Those papers should be, if not 101, then at least 201 material for a neural net track, as they would help establish a common framework and basis for thinking about, talking about, and analyzing neural net machinery.
> you can argue that Gabor filters arise because we design neural nets to encourage them. Gabor filters mostly arise in CNNs or in things otherwise regularized to be like CNNs.
And do we know why? Usually that would mean some optimality. It should be relatively simple math (back in the day at our university it would have been given to a student as a thesis project, and a couple of months later we'd have it), and it would give us two things: insight into the biological visual cortex (which we suppose also follows some optimality, and now we would have a very good candidate for what that is), and the ability to start the primary convolutional layers from the (optimal set of) Gabors instead of learning them from scratch. Actually, some of the best results I saw 15-20 years ago were produced by simulating the visual cortex through exactly such a construction. And now image-trained deep learning nets converge to the same thing.
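As a sketch of that "start the first layer from Gabors" idea, assuming PyTorch: build a small Gabor bank by hand and copy it into a conv layer's weights. The orientations, wavelengths, and kernel size below are arbitrary placeholders, not an "optimal" set; choosing that set well is exactly the open question:

```python
import numpy as np
import torch
import torch.nn as nn

def gabor(size, wavelength, theta, sigma):
    """Gabor kernel: cosine grating at angle theta under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)

# First conv layer: 16 filters, initialized from a hand-built Gabor bank
# (4 wavelengths x 4 orientations = 16 kernels of size 11x11)
conv = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=11, padding=5, bias=False)
bank = [gabor(11, wl, th, sigma=3.0)
        for wl in (4.0, 6.0, 8.0, 12.0)
        for th in np.linspace(0, np.pi, 4, endpoint=False)]
with torch.no_grad():
    conv.weight.copy_(torch.tensor(np.stack(bank), dtype=torch.float32).unsqueeze(1))
# The layer can then be frozen (conv.weight.requires_grad_(False)) or fine-tuned as usual.
```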
No, we don't know why. You also have little basis to claim the biological visual cortex is optimal. It certainly works, which is enough for people who draw inspiration from it.
That is my point: we have a very good suspicion, plus experimental data (successful deep learning CNNs as well as the visual cortex), that it is optimal, at least within some very wide class if not globally, and that warrants looking for a proof of it. Proven optimality would be very telling, especially for the biological visual cortex. And even if optimality doesn't hold for the CNN Gabors, the reasons why not might be discovered, and that would help in constructing an even better, perhaps optimal, approach.
Suspicion and empirical evidence cannot prove something is optimal. You cannot exhaustively empirically search an infinite space of models. You are seriously misunderstanding the definition of "optimal."
I am not seeing the chicken-and-egg problem. Isn't it always the case that when we consider the optimality of something, it depends on some definitions?
The chicken-and-egg problem is that we design neural nets in a way that encourages Gabor filters to show up. If you chose a different architecture, they wouldn't show up. So taking the presence of Gabor filters as an indication of optimality is sort of begging the question.
In practical terms, performance without understanding can lead to highly surprising/counter-intuitive results when algorithms are applied to real-life problems. This doesn't matter much if you're doing movie suggestions or something like that, but it does matter in many other areas that could benefit from AI/ML.
But this is what the human brain is doing. A large, complex statistical analysis of video that produced accurate predictions would contain an understanding of physics, just as our brain does. What you are essentially doing here is creating a new researcher out of thin air, asking them to work out how to understand video, and then, when they succeed, not bothering to ask how it was done. Then tossing the knowledge in their head away because you didn't come up with it.