Saturday 12:54 p.m.–1 p.m. in Colony Ballroom
Hyperopt: A Python library for optimizing the hyperparameters of machine learning algorithms
Hyperopt provides parallel infrastructure and Bayesian optimization algorithms that can tune the hyperparameters of machine learning systems (including pre-processing steps) as well as domain experts can. This talk introduces hyperopt's architecture and usage.
Most machine learning algorithms have hyperparameters that strongly affect end-to-end system performance, and adjusting them to optimize that performance can be a daunting task. Hyperparameters come in many varieties--continuous-valued ones with and without bounds, discrete ones that are either ordered or not, and conditional ones that do not even always apply (e.g., the parameters of an optional pre-processing stage)--so conventional continuous and combinatorial optimization algorithms either do not directly apply, or else operate without leveraging structure in the search space. Typically, hyperparameters are optimized beforehand by domain experts on unrelated problems, or manually for the problem at hand with the assistance of grid search. However, even random search has been shown to be competitive with such manual and grid-based tuning [1].
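One intuition for why random search can compete with grid search is that, when only a few hyperparameters really matter, a grid spends most of its budget repeating the same values along the important axis. The sketch below illustrates this with a hypothetical two-parameter loss (`toy_loss`, invented here for illustration) in which only `x` matters; it is not taken from hyperopt and is not a benchmark.

```python
import itertools
import random


def grid_points(values_per_axis):
    """Return a 2-D grid on [0, 1] x [0, 1] with evenly spaced values per axis."""
    axis = [i / (values_per_axis - 1) for i in range(values_per_axis)]
    return list(itertools.product(axis, axis))


def random_points(n_points, seed=0):
    """Return n_points drawn uniformly at random from [0, 1] x [0, 1]."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n_points)]


def toy_loss(x, y):
    """Hypothetical loss: only x influences the result, y is irrelevant."""
    return (x - 0.37) ** 2


grid = grid_points(4)        # 4 x 4 = 16 trials
rand = random_points(16)     # same budget of 16 trials

# How many different values of the important hyperparameter did each try?
distinct_x_grid = len({x for x, _ in grid})
distinct_x_rand = len({x for x, _ in rand})

best_grid = min(toy_loss(x, y) for x, y in grid)
best_rand = min(toy_loss(x, y) for x, y in rand)
```

With the same budget of 16 trials, the grid explores only 4 distinct values of `x`, whereas random sampling explores 16. Which method finds the lower loss on a given run still depends on the draw.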
Better hyperparameter optimization algorithms (HOAs) are needed for two reasons:
- HOAs formalize the practice of model evaluation, so that benchmarking experiments can be reproduced by different people.
- Learning algorithm designers can deliver flexible, fully configurable implementations (e.g., of deep learning algorithms) to non-experts, so long as they also provide a corresponding HOA.
Hyperopt provides serial and parallelizable HOAs via a Python library [2, 3]. Fundamental to its design is a protocol for communication between (a) the description of a hyperparameter search space, (b) a hyperparameter evaluation function (the machine learning system), and (c) a hyperparameter search algorithm. This protocol allows generic HOAs (such as the bundled "TPE" algorithm) to work on a range of specific search problems. Specific machine learning algorithms (or algorithm families) are implemented as hyperopt search spaces in related projects: Deep Belief Networks [4], convolutional vision architectures [5], and scikit-learn classifiers [6]. This presentation will explain what problem hyperopt solves, how to use hyperopt, and how hyperopt can deliver accurate models from data alone, without operator intervention.
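The three-part protocol described above can be illustrated with a minimal, standard-library-only sketch. Everything in it (`SPACE`, `sample`, `objective`, `random_suggest`, `minimize`, and the tuple-based space encoding) is hypothetical and invented for illustration; it is not hyperopt's API or implementation, and plain random sampling stands in for a model-based algorithm such as TPE.

```python
import math
import random

# (a) Search space: a plain data description, including a conditional branch.
SPACE = {
    "learning_rate": ("loguniform", 1e-4, 1e-1),   # continuous, bounded
    "n_layers": ("randint", 1, 4),                 # ordered discrete, inclusive
    "preprocess": ("choice", [                     # unordered discrete
        {"kind": "none"},
        # n_components only exists when the PCA branch is chosen.
        {"kind": "pca", "n_components": ("randint", 2, 10)},
    ]),
}


def sample(node, rng):
    """Recursively draw one concrete configuration from a space description."""
    if isinstance(node, dict):
        return {key: sample(value, rng) for key, value in node.items()}
    if isinstance(node, tuple):
        tag = node[0]
        if tag == "loguniform":
            return math.exp(rng.uniform(math.log(node[1]), math.log(node[2])))
        if tag == "randint":
            return rng.randint(node[1], node[2])
        if tag == "choice":
            return sample(rng.choice(node[1]), rng)
        raise ValueError("unknown space node: %r" % (tag,))
    return node  # constants pass through unchanged


# (b) Evaluation function: a stand-in for "train a model, return validation loss".
def objective(config):
    loss = (math.log10(config["learning_rate"]) + 2.0) ** 2
    loss += 0.1 * abs(config["n_layers"] - 2)
    if config["preprocess"]["kind"] == "pca":
        loss += 0.01 * abs(config["preprocess"]["n_components"] - 5)
    else:
        loss += 0.05
    return loss


# (c) Search algorithm: proposes the next configuration given the history.
def random_suggest(space, history, rng):
    # Random search ignores the history; a model-based algorithm would use it.
    return sample(space, rng)


def minimize(objective, space, suggest, max_evals, seed=0):
    """Generic driver: works with any space, objective, and suggest function."""
    rng = random.Random(seed)
    history = []
    for _ in range(max_evals):
        config = suggest(space, history, rng)
        history.append((objective(config), config))
    best = min(history, key=lambda trial: trial[0])
    return best, history


best, history = minimize(objective, SPACE, random_suggest, max_evals=50)
```

Because the driver only depends on the three interfaces, swapping in a different `suggest` function changes the search strategy without touching the space or the objective. In hyperopt itself the corresponding pieces are, roughly, a space built from `hp` expressions, the function passed to `fmin`, and the `algo` argument (for example `tpe.suggest`).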
[1] J. Bergstra and Y. Bengio (2012). Random Search for Hyper-Parameter Optimization. Journal of Machine Learning Research 13:281–305. http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf
[2] J. Bergstra, D. Yamins and D. D. Cox (2013). Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. Proc. 30th International Conference on Machine Learning (ICML-13). http://jmlr.csail.mit.edu/proceedings/papers/v28/bergstra13.pdf
[3] Hyperopt: http://jaberg.github.com/hyperopt
[4] ... for Deep Belief Networks: https://github.com/jaberg/hyperopt-dbn
[5] ... for convolutional vision architectures: https://github.com/jaberg/hyperopt-convnet
[6] ... for scikit-learn classifiers: https://github.com/jaberg/hyperopt-sklearn