Loading…
Iterative column subset selection
Dimensionality reduction is often a crucial step for the successful application of machine learning and data mining methods. One way to achieve said reduction is feature selection. Due to the impossibility of labelling many data sets, unsupervised approaches are frequently the only option. The colum...
Saved in:
Published in: | Knowledge and information systems 2018, Vol.54 (1), p.65-94 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Dimensionality reduction is often a crucial step for the successful application of machine learning and data mining methods. One way to achieve said reduction is feature selection. Due to the impossibility of labelling many data sets, unsupervised approaches are frequently the only option. The column subset selection problem translates naturally to this purpose and has received considerable attention over the last few years, as it provides simple linear models for low-rank data reconstruction. Recently, it was empirically shown that an iterative algorithm, which can be implemented efficiently, provides better subsets than other state-of-the-art methods. In this paper, we describe this algorithm and provide a more in-depth analysis. We carry out numerous experiments to gain insights on its behaviour and derive a simple bound for the norm recovered by the resulting matrix. To the best of our knowledge, this is the first theoretical result of this kind for this algorithm. |
---|---|
ISSN: | 0219-1377 0219-3116 |
DOI: | 10.1007/s10115-017-1115-4 |