Please note this event occurred in the past.
November 08, 2023 4:00 pm - 4:00 pm ET
Statistics and Data Science Seminar Series
LGRT 1681

With the rapid development of modern technology, massive amounts of data with complex patterns are being generated. Gaussian process models, which can easily fit nonlinearity in data, have become increasingly popular. It is often the case that only a few features in a dataset are important or active. However, unlike in classical linear models, it is challenging to identify active variables in Gaussian process models. One of the most commonly used methods for variable selection in Gaussian process models is automatic relevance determination, which is known to be open-ended: there is no rule of thumb for determining the threshold for dropping features, which makes variable selection in Gaussian process models ambiguous. In this work, we propose two variable selection algorithms for Gaussian process models that use artificial nuisance columns as a baseline for identifying the active features. Moreover, the proposed methods work for both regression and classification problems. The algorithms are demonstrated using comprehensive simulation experiments and an application to multi-subject electroencephalography data that studies the alcoholic levels of experimental subjects.

Keywords: Automatic relevance determination, Electroencephalography data, Gaussian process, Principal component analysis, Variable selection

Note:
This seminar is part of the joint colloquium series with the University of Connecticut. The Zoom link for this seminar is https://umass-amherst.zoom.us/j/93295628670.