A4 Refereed article in a conference publication
Efficient Hold-Out for Subset of Regressors
Authors: Pahikkala T, Suominen H, Boberg J, Salakoski T
Editors: Kolehmainen Mikko, Toivanen Pekka, Beliczynski Bartlomiej
Conference name: 9th International Conference on Adaptive and Natural Computing Algorithms
Publication year: 2009
Journal: Lecture Notes in Computer Science
Book title: Proceedings of the 9th International Conference on Adaptive and Natural Computing Algorithms (ICANNGA'09)
Journal name in source: ADAPTIVE AND NATURAL COMPUTING ALGORITHMS
Journal acronym: LECT NOTES COMPUT SC
Volume: 5495
First page: 350
Last page: 359
Number of pages: 10
ISBN: 978-3-642-04920-0
ISSN: 0302-9743
Abstract
Hold-out and cross-validation are among the most useful methods for model selection and performance assessment of machine learning algorithms. In this paper, we present a computationally efficient algorithm for calculating the hold-out performance of sparse regularized least-squares (RLS) when the method has already been trained with the whole training set. The computational complexity of performing the hold-out is O(|H|³ + |H|²n), where |H| is the size of the hold-out set and n is the number of basis vectors. The algorithm can thus be used to calculate various types of cross-validation estimates efficiently. For example, when m is the number of training examples, the complexities of N-fold and leave-one-out cross-validation are O(m³/N² + m²n/N) and O(mn), respectively. Further, since sparse RLS can be trained in O(mn²) time for several regularization parameter values in parallel, the fast hold-out algorithm enables efficient selection of the optimal parameter value.
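
To illustrate the flavor of the result, below is a minimal NumPy sketch of a hold-out shortcut for subset-of-regressors RLS. It relies on an exact hold-out identity for ridge-style estimators, f~_H = (I - S_HH)^(-1) (f_H - S_HH y_H), derivable via the matrix inversion lemma, where S is the hat matrix of the sparse RLS fit. The function names, the cached matrix C, and the overall organization are illustrative assumptions, not the authors' pseudocode.

import numpy as np

def train_sparse_rls(K_MB, K_BB, y, lam):
    # Subset-of-regressors RLS: minimize ||K_MB a - y||^2 + lam * a' K_BB a.
    # K_MB: (m, n) kernel values between training examples and basis vectors,
    # K_BB: (n, n) kernel matrix of the basis vectors, y: (m,) labels.
    # Returns the coefficients a and a cache C = G K_BM, where
    # G = (K_BM K_MB + lam K_BB)^(-1); training costs O(mn^2 + n^3).
    G = np.linalg.inv(K_MB.T @ K_MB + lam * K_BB)
    C = G @ K_MB.T          # (n, m), cached for fast hold-out
    a = C @ y
    return a, C

def hold_out_predictions(K_MB, y, a, C, H):
    # Predictions on the hold-out indices H as if H had been left out of
    # training, computed without retraining via
    #   f~_H = (I - S_HH)^(-1) (f_H - S_HH y_H),
    # where S = K_MB C is the hat matrix. Given the cache C, the cost is
    # O(|H|^2 n) for S_HH plus O(|H|^3) for the linear solve.
    f_H = K_MB[H] @ a                     # full-data predictions on H
    S_HH = K_MB[H] @ C[:, H]              # |H| x |H| block of the hat matrix
    rhs = f_H - S_HH @ y[H]
    return np.linalg.solve(np.eye(len(H)) - S_HH, rhs)

Running hold_out_predictions over a partition of {0, ..., m-1} into N folds of size m/N then yields an N-fold cross-validation estimate from a single training run, for a total of N · O((m/N)³ + (m/N)²n) = O(m³/N² + m²n/N), matching the complexity stated in the abstract.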