Efficient Hold-Out for Subset of Regressors

Authors: Pahikkala T, Suominen H, Boberg J, Salakoski T

Editors: Kolehmainen Mikko, Toivanen Pekka, Beliczynski Bartlomiej

Conference: 9th International Conference on Adaptive and Natural Computing Algorithms

Year: 2009

Series: Lecture Notes in Computer Science

Proceedings: Proceedings of the 9th International Conference on Adaptive and Natural Computing Algorithms (ICANNGA'09)

Book title: Adaptive and Natural Computing Algorithms

Series abbreviation: LECT NOTES COMPUT SC

Volume: 5495

First page: 350

Last page: 359

Number of pages: 10

ISBN: 978-3-642-04920-0

ISSN: 0302-9743

Hold-out and cross-validation are among the most useful methods for model selection and performance assessment of machine learning algorithms. In this paper, we present a computationally efficient algorithm for calculating the hold-out performance of sparse regularized least-squares (RLS) when the method has already been trained on the whole training set. The computational complexity of performing the hold-out is O(|H|^3 + |H|^2 n), where |H| is the size of the hold-out set and n is the number of basis vectors. The algorithm can thus be used to calculate various types of cross-validation estimates efficiently. For example, when m is the number of training examples, the complexities of N-fold and leave-one-out cross-validation are O(m^3/N^2 + (m^2 n)/N) and O(mn), respectively. Further, since sparse RLS can be trained in O(mn^2) time for several regularization parameter values in parallel, the fast hold-out algorithm enables efficient selection of the optimal parameter value.
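
The abstract does not give the algorithm itself, so the following is only a minimal sketch of how a hold-out shortcut with the stated cost can look. It assumes plain ridge regression on an m x n feature matrix with penalty lam * I, which stands in for sparse RLS with n basis vectors and is not necessarily the authors' formulation. It uses the standard identity that the hold-out residual equals (I - S_HH)^{-1} times the residual of the full model, where S is the hat matrix. The function names train_ridge, holdout_predictions and holdout_naive are hypothetical.

```python
import numpy as np

def train_ridge(X, y, lam):
    """Train on all m examples once; O(m n^2). Assumes penalty lam * I."""
    n = X.shape[1]
    C = np.linalg.inv(X.T @ X + lam * np.eye(n))   # n x n
    B = X @ C                                      # m x n cache, reused by every hold-out
    y_hat = B @ (X.T @ y)                          # predictions of the full model
    return B, y_hat

def holdout_predictions(X, y, B, y_hat, H):
    """Predictions for the examples in H by a model trained without H.

    Cost: O(|H|^2 n) to form S_HH plus O(|H|^3) for the linear solve.
    """
    S_HH = B[H] @ X[H].T                           # |H| x |H| block of the hat matrix
    resid = y[H] - y_hat[H]                        # residuals of the full model on H
    return y[H] - np.linalg.solve(np.eye(len(H)) - S_HH, resid)

def holdout_naive(X, y, lam, H):
    """Reference: retrain from scratch without H, then predict on H."""
    mask = np.ones(len(y), dtype=bool)
    mask[H] = False
    Xt, yt = X[mask], y[mask]
    w = np.linalg.solve(Xt.T @ Xt + lam * np.eye(X.shape[1]), Xt.T @ yt)
    return X[H] @ w
```

Under these assumptions, summing the per-fold cost over N folds of size m/N gives the O(m^3/N^2 + (m^2 n)/N) figure quoted above, and |H| = 1 for each of the m examples gives O(mn) for leave-one-out.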