
The goal of supervised feature selection is to find a subset of input features that are responsible for predicting output values. The least absolute shrinkage and selection operator (Lasso) allows computationally efficient feature selection based on linear dependency between input features and output values. In this letter, we consider a feature-wise kernelized Lasso for capturing nonlinear input-output dependency. We first show that with particular choices of kernel functions, nonredundant features with strong statistical dependence on output values can be found in terms of kernel-based independence measures such as the Hilbert-Schmidt independence criterion. We then show that the globally optimal solution can be efficiently computed; this makes the approach scalable to high-dimensional problems. The effectiveness of the proposed method is demonstrated through feature selection experiments for classification and regression with thousands of features.

Makoto Yamada
Yahoo Labs, 701 1st Ave., Sunnyvale, CA 94098, U.S.A.
Wittawat Jitkrittum
University College London, Alexandra House, 17 Queen Square, London, WC1N 3AR, U.K.
Leonid Sigal
Disney Research Pittsburgh, Pittsburgh, PA 15213, U.S.A.
Eric P. Xing
Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A.
Masashi Sugiyama
Tokyo Institute of Technology, O-okayama, Meguro-ku, Tokyo 152-8552, Japan
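The feature-wise kernelized Lasso described in the abstract can be illustrated with a rough numpy sketch. This is not the authors' implementation: the fixed Gaussian kernel width, the normalization of the centered Gram matrices, and the projected-gradient solver for the nonnegative Lasso are all simplifying assumptions made here for illustration. The idea is to build one centered Gram matrix per input feature and one for the output, then fit a sparse nonnegative combination of the feature Gram matrices to the output Gram matrix, so that the learned weights act as HSIC-based feature-importance scores.

```python
import numpy as np

def gaussian_gram(x, sigma=1.0):
    # Gram matrix of a Gaussian kernel on a 1-D sample vector
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic_lasso(X, y, lam=0.01, n_iter=1000, lr=0.1):
    """Sketch of a feature-wise kernelized (HSIC) Lasso.

    Minimizes 0.5 * ||Lbar - sum_k alpha_k * Kbar_k||_F^2 + lam * sum_k alpha_k
    subject to alpha >= 0, via projected gradient descent (an assumed solver,
    not the one used in the letter).
    """
    n, d = X.shape
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    A = np.empty((n * n, d))
    for k in range(d):
        Kc = H @ gaussian_gram(X[:, k]) @ H      # centered feature Gram matrix
        A[:, k] = (Kc / np.linalg.norm(Kc)).ravel()
    Lc = H @ gaussian_gram(y) @ H                # centered output Gram matrix
    b = (Lc / np.linalg.norm(Lc)).ravel()
    alpha = np.zeros(d)
    for _ in range(n_iter):
        grad = A.T @ (A @ alpha - b) + lam       # gradient of smooth part + L1 slope
        alpha = np.maximum(alpha - lr * grad, 0.0)  # project onto alpha >= 0
    return alpha
```

Features with strong nonlinear dependence on the output receive positive weights, while irrelevant (and, because the Gram matrices of redundant features are correlated, redundant) features are driven to zero by the L1 penalty.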