This is a MATLAB® object that also requires the Spider Toolbox.
The selection of a subset of input variables is often based on the previous construction of a ranking to order the variables according to a given criterion of relevancy. The objective is then to linearize the search, estimating the quality of subsets containing the topmost ranked variables. An algorithm devised to rank input variables according to their usefulness in the context of a learning task is presented. This algorithm is the result of a combination of simple and classical techniques, like correlation and orthogonalization, which allow the construction of a fast algorithm that also deals explicitly with redundancy. Additionally, the proposed ranker is endowed with a simple polynomial expansion of the input variables to cope with nonlinear problems. The comparison with some state-of-the-art rankers showed that this combination of simple components is able to yield high-quality rankings of input variables. The experimental validation is made on a wide range of artificial data sets and the quality of the rankings is assessed using an ROC-inspired setting, to avoid biased estimations due to any particular learning algorithm.
J. R. Quevedo, A. Bahamonde, and O. Luaces, “A simple and efficient method for variable ranking according to their usefulness for learning,” Comput. Stat. Data Anal., vol. 52, no. 1, 2007.
You can also download a dataset generator and an example of use.
© ML-GROUP, Artificial Intelligence Center, University of Oviedo at Gijón, 2007