Hacker News

Recommendation: Reach out to my colleague James Bergstra, and build out automatic hyperparameter selection. This will make your offering work off-the-shelf, which is what is necessary for it to see wider adoption.

Why? The real pain in the ass when training a deep network is hyperparameter selection.

What is your learning rate? What is your noise level? What is your regularization parameter?

Choosing these values is a far bigger pain than almost everything else combined.

Doing a grid search is intractable; random hyperparameter search is better. You can also use a more sophisticated strategy, like the one Bergstra et al. have proposed.
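For what it's worth, the random-search baseline is only a few lines. Here's a minimal sketch — `train_and_score` is a made-up stand-in (in reality it would train the network and return validation accuracy), and the search ranges are just plausible defaults:

```python
import random

# Hypothetical objective: in practice this trains a deep net with the given
# hyperparameters and returns validation accuracy. Toy stand-in with a known
# optimum near lr=0.01, noise=0.1, l2=1e-4.
def train_and_score(lr, noise, l2):
    return -((lr - 0.01) ** 2 + (noise - 0.1) ** 2 + (l2 - 1e-4) ** 2)

def random_search(n_trials=100, seed=0):
    rng = random.Random(seed)
    best_score, best_params = float("-inf"), None
    for _ in range(n_trials):
        # sample learning rate and L2 penalty on a log scale, noise uniformly
        params = {
            "lr": 10 ** rng.uniform(-4, 0),
            "noise": rng.uniform(0.0, 0.5),
            "l2": 10 ** rng.uniform(-6, -1),
        }
        score = train_and_score(**params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

best_params, best_score = random_search()
print(best_params, best_score)
```

The log-scale sampling matters: learning rates and regularization strengths span orders of magnitude, so sampling them uniformly in linear space wastes most trials.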



I agree that hyper-parameter selection is a huge pain. Personally, though, I am more familiar with the work of Snoek et al. [1] from NIPS in December last year. He even distributes a neat Python package that performs Bayesian optimisation combined with MCMC [2], so that even people like me who are not yet familiar with Gaussian processes can deploy it easily.
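The package wraps all of this up for you, but the core loop is small enough to sketch. Below is a toy, numpy-only illustration of GP-based Bayesian optimisation with expected improvement on a made-up 1-D objective — this is not Snoek et al.'s actual code; the RBF kernel, length scale, grid, and objective are all assumptions for illustration:

```python
import math
import numpy as np

def f(x):
    # Hypothetical expensive objective (think: validation accuracy as a
    # function of one hyperparameter); toy stand-in with optimum at x = 0.3.
    return -(x - 0.3) ** 2

def rbf(a, b, length=0.15):
    # squared-exponential kernel between two 1-D point sets
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # standard GP regression posterior mean and stddev at test points Xs
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y
    var = np.diag(rbf(Xs, Xs) - Ks.T @ Kinv @ Ks)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    # EI for maximization: (mu - best) * Phi(z) + sigma * phi(z)
    z = (mu - best) / sigma
    Phi = 0.5 * (1.0 + np.array([math.erf(v / math.sqrt(2)) for v in z]))
    phi = np.exp(-0.5 * z ** 2) / math.sqrt(2 * math.pi)
    return (mu - best) * Phi + sigma * phi

rng = np.random.default_rng(0)
X = rng.random(3)                 # three random initial evaluations
y = f(X)
grid = np.linspace(0, 1, 200)

for _ in range(10):               # ten Bayesian-optimisation steps
    mu, sigma = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y.max()))]
    X = np.append(X, x_next)
    y = np.append(y, f(x_next))

best_x = X[np.argmax(y)]
print(best_x, y.max())
```

The point of the GP is that each new evaluation is chosen where the model predicts either a high mean or high uncertainty, so you spend far fewer expensive training runs than grid or random search would need.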

[1]: http://arxiv.org/pdf/1206.2944v2

[2]: http://www.cs.toronto.edu/~jasper/software.html



Yeah, it's a really good point.

I haven't played with automatic parameter selection much (but have been seeing more papers on it recently) so I hadn't really considered it all that closely.

While I'd like to give people a fair amount of control over model parameters if they want, it probably is very important that I make things as turnkey as I can. Shouldn't be too tough to hack something together and make it an option during training.

While I'm trying to start things off relatively simply, the overall goal really is to let people create models that act as parts of much larger systems, maybe larger neural nets themselves. A sort of genetic algorithm that spawns new neural networks with random parameters and random connections to previous networks could be kind of neat, and making the base elements of those types of architectures (a single fully connected deep net, for example) easily accessible is a first step towards that goal.
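The "spawn random nets, keep the good ones" idea can be sketched as a tiny evolutionary loop. Everything here is hypothetical: `fitness` stands in for actually training a candidate network and scoring it on validation data, and the config fields are invented for illustration:

```python
import random

def random_config(rng):
    # a randomly spawned network architecture
    return {"layers": rng.choice([2, 3, 4]), "hidden": rng.choice([64, 128, 256])}

def mutate(cfg, rng):
    # spawn a child with a slightly perturbed architecture
    return {
        "layers": min(5, max(1, cfg["layers"] + rng.choice([-1, 0, 1]))),
        "hidden": max(16, cfg["hidden"] + rng.choice([-32, 0, 32])),
    }

def fitness(cfg):
    # stand-in for validation accuracy; imagine it peaks at 3 layers / 128 units
    return -abs(cfg["layers"] - 3) - abs(cfg["hidden"] - 128) / 64

rng = random.Random(42)
population = [random_config(rng) for _ in range(6)]
history = []
for _ in range(10):
    children = [mutate(rng.choice(population), rng) for _ in range(6)]
    # elitist selection: keep the six fittest of parents + children
    population = sorted(population + children, key=fitness, reverse=True)[:6]
    history.append(fitness(population[0]))

print(population[0], history[-1])
```

Because selection is elitist, the best fitness in the population can only go up across generations — the real cost, of course, is that every call to `fitness` would be a full training run.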


Presumably, if you have a GPU-backed cloud DBN, hyperparameter selection is faster than one parameter per day. Also, how do you choose the parameters of the hyperparameter tuner itself? I am never convinced these things work, given the no-free-lunch theorem.



