A4 Refereed article in a conference publication
Efficient Hold-Out for Subset of Regressors
Authors: Pahikkala T, Suominen H, Boberg J, Salakoski T
Editors: Kolehmainen Mikko, Toivanen Pekka, Beliczynski Bartlomiej
Conference name: 9th International Conference on Adaptive and Natural Computing Algorithms
Publication year: 2009
Journal: Lecture Notes in Computer Science
Book title: Proceedings of the 9th International Conference on Adaptive and Natural Computing Algorithms (ICANNGA'09)
Journal name in source: ADAPTIVE AND NATURAL COMPUTING ALGORITHMS
Journal acronym: LECT NOTES COMPUT SC
Volume: 5495
First page: 350
Last page: 359
Number of pages: 10
ISBN: 978-3-642-04920-0
ISSN: 0302-9743
Abstract
Hold-out and cross-validation are among the most useful methods for model selection and performance assessment of machine learning algorithms. In this paper, we present a computationally efficient algorithm for calculating the hold-out performance of sparse regularized least-squares (RLS) when the method has already been trained with the whole training set. The computational complexity of performing the hold-out is O(|H|³ + |H|²n), where |H| is the size of the hold-out set and n is the number of basis vectors. The algorithm can thus be used to calculate various types of cross-validation estimates efficiently. For example, when m is the number of training examples, the complexities of N-fold and leave-one-out cross-validation are O(m³/N² + m²n/N) and O(mn), respectively. Further, since sparse RLS can be trained in O(mn²) time for several regularization parameter values in parallel, the fast hold-out algorithm enables efficient selection of the optimal parameter value.
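
To illustrate the flavor of the result, below is a minimal NumPy sketch of a hold-out shortcut for subset-of-regressors RLS. It relies on an exact hold-out identity for ridge-style estimators, f~_H = (I - S_HH)^(-1) (f_H - S_HH y_H), derivable via the matrix inversion lemma, where S is the hat matrix of the sparse RLS fit. The function names, the cached matrix C, and the overall organization are illustrative assumptions, not the authors' pseudocode.

import numpy as np

def train_sparse_rls(K_MB, K_BB, y, lam):
    # Subset-of-regressors RLS: minimize ||K_MB a - y||^2 + lam * a' K_BB a.
    # K_MB: (m, n) kernel values between training examples and basis vectors,
    # K_BB: (n, n) kernel matrix of the basis vectors, y: (m,) labels.
    # Returns the coefficients a and a cache C = G K_BM, where
    # G = (K_BM K_MB + lam K_BB)^(-1); training costs O(mn^2 + n^3).
    G = np.linalg.inv(K_MB.T @ K_MB + lam * K_BB)
    C = G @ K_MB.T          # (n, m), cached for fast hold-out
    a = C @ y
    return a, C

def hold_out_predictions(K_MB, y, a, C, H):
    # Predictions on the hold-out indices H as if H had been left out of
    # training, computed without retraining via
    #   f~_H = (I - S_HH)^(-1) (f_H - S_HH y_H),
    # where S = K_MB C is the hat matrix. Given the cache C, the cost is
    # O(|H|^2 n) for S_HH plus O(|H|^3) for the linear solve.
    f_H = K_MB[H] @ a                     # full-data predictions on H
    S_HH = K_MB[H] @ C[:, H]              # |H| x |H| block of the hat matrix
    rhs = f_H - S_HH @ y[H]
    return np.linalg.solve(np.eye(len(H)) - S_HH, rhs)

Running hold_out_predictions over a partition of {0, ..., m-1} into N folds of size m/N then yields an N-fold cross-validation estimate from a single training run, for a total of N · O((m/N)³ + (m/N)²n) = O(m³/N² + m²n/N), matching the complexity stated in the abstract.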