Presentation: Hyperopt: A Python library for optimizing the hyperparameters of machine learning algorithms

Saturday 12:54 p.m.–1 p.m. in Colony Ballroom

James Bergstra

Audience level: Experienced

Description

Hyperopt provides parallel infrastructure and Bayesian optimization algorithms that can tune the hyperparameters of machine learning systems (including pre-processing steps) as effectively as domain experts can. This talk introduces hyperopt's architecture and usage.

Abstract

Most machine learning algorithms have hyperparameters that strongly affect end-to-end system performance, and adjusting them to optimize that performance can be a daunting task. Hyperparameters come in many varieties: continuous-valued ones with and without bounds, discrete ones that are either ordered or not, and conditional ones that do not always apply (e.g., the parameters of an optional pre-processing stage). As a result, conventional continuous and combinatorial optimization algorithms either do not apply directly or operate without leveraging structure in the search space. Typically, hyperparameters are optimized beforehand by domain experts on unrelated problems, or manually for the problem at hand with the assistance of grid search. However, even random search has been shown to be competitive with this practice [1].
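To make the varieties of hyperparameter concrete, here is a minimal sketch of a mixed search space sampled at random, using only the Python standard library. It is not hyperopt's API: the function name, the hyperparameter names, and every range below are hypothetical choices made for illustration.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration from a small, hypothetical search space."""
    config = {
        # Continuous, bounded: log-uniform between 1e-4 and 1e-1.
        "learning_rate": math.exp(rng.uniform(math.log(1e-4), math.log(1e-1))),
        # Continuous, unbounded: drawn from a normal distribution.
        "init_offset": rng.gauss(0.0, 1.0),
        # Discrete, ordered: an integer number of hidden units.
        "n_hidden": rng.randint(16, 512),
        # Discrete, unordered: a categorical choice.
        "activation": rng.choice(["tanh", "relu", "sigmoid"]),
    }
    # Conditional: n_components exists only when the optional
    # pre-processing stage is switched on.
    if rng.random() < 0.5:
        config["preprocessing"] = {
            "kind": "pca",
            "n_components": rng.randint(2, 64),
        }
    else:
        config["preprocessing"] = {"kind": "none"}
    return config


rng = random.Random(0)  # fixed seed so the draws are repeatable
configs = [sample_config(rng) for _ in range(5)]
for config in configs:
    print(config)
```

The conditional branch is the part that a flat grid or a plain continuous optimizer struggles to express: `n_components` has no meaning in configurations where the pre-processing stage is disabled.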

Better hyperparameter optimization algorithms (HOAs) are needed for two reasons:

- HOAs formalize the practice of model evaluation, so that benchmarking experiments can be reproduced by different people.
- Learning algorithm designers can deliver flexible, fully configurable implementations (e.g., of deep learning algorithms) to non-experts, so long as they also provide a corresponding HOA.

Hyperopt provides serial and parallelizable HOAs via a Python library [2, 3]. Fundamental to its design is a protocol for communication between (a) the description of a hyperparameter search space, (b) a hyperparameter evaluation function (the machine learning system), and (c) a hyperparameter search algorithm. This protocol allows generic HOAs (such as the bundled Tree-structured Parzen Estimator, or "TPE", algorithm) to work across a range of specific search problems. Specific machine learning algorithms (or algorithm families) are implemented as hyperopt search spaces in related projects: Deep Belief Networks [4], convolutional vision architectures [5], and scikit-learn classifiers [6]. This presentation will explain what problem hyperopt solves, how to use hyperopt, and how hyperopt can deliver accurate models from data alone, without operator intervention.
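The three-part separation described above can be sketched in plain Python. This is an illustrative sketch only, not hyperopt's actual interface: `space`, `objective`, `random_suggest`, and `minimize` are hypothetical names, the objective is a toy quadratic standing in for a machine learning system, and random search stands in for a model-based algorithm such as TPE.

```python
import random

# (a) Search space: each name maps to a function that draws one value.
space = {
    "x": lambda rng: rng.uniform(-5.0, 5.0),
    "y": lambda rng: rng.uniform(-5.0, 5.0),
}


# (b) Evaluation function: takes a configuration, returns a loss
# (lower is better). A toy quadratic with its minimum at x=1, y=-2.
def objective(config):
    return (config["x"] - 1.0) ** 2 + (config["y"] + 2.0) ** 2


# (c) Search algorithm: proposes the next configuration given the space
# and the history of (loss, config) pairs. Random search ignores the
# history; a model-based algorithm would use it.
def random_suggest(space, history, rng):
    return {name: draw(rng) for name, draw in space.items()}


def minimize(objective, space, suggest, max_evals, seed=0):
    """Driver loop that only talks to the three components above."""
    rng = random.Random(seed)
    history = []
    for _ in range(max_evals):
        config = suggest(space, history, rng)
        history.append((objective(config), config))
    best = min(history, key=lambda pair: pair[0])
    return best, history


(best_loss, best_config), history = minimize(
    objective, space, random_suggest, max_evals=200
)
print(best_loss, best_config)
```

Because the driver loop touches the search algorithm only through the `suggest` argument, a different algorithm can be swapped in without changing the search space or the evaluation function, which is the property the protocol is meant to provide.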

[1] J. Bergstra and Y. Bengio (2012). Random Search for Hyper-Parameter Optimization. Journal of Machine Learning Research 13:281–305. http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf

[2] J. Bergstra, D. Yamins and D. D. Cox (2013). Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. Proc. 30th International Conference on Machine Learning (ICML-13). http://jmlr.csail.mit.edu/proceedings/papers/v28/bergstra13.pdf

[3] Hyperopt: http://jaberg.github.com/hyperopt

[4] ... for Deep Belief Networks: https://github.com/jaberg/hyperopt-dbn

[5] ... for convolutional vision architectures: https://github.com/jaberg/hyperopt-convnet

[6] ... for scikit-learn classifiers: https://github.com/jaberg/hyperopt-sklearn