Please note this event occurred in the past.
November 13, 2024 12:00 pm - 1:30 pm ET
Seminars,
Graduate and Learning Seminars,
Learning Learning,
Department Event
LGRT 1621

Abstract

Researchers in biomedical studies often work with biased samples that are not selected uniformly at random from the population of interest. One example is a case-control study, where cases are over-sampled to study risk factors of rare diseases. While these designs are motivated by specific scientific questions, it is often of interest to use them to pursue secondary lines of investigation. In such secondary analyses, the biased sample can induce a spurious association between an exposure and an outcome when both affect case-control status. This phenomenon is known in the causal inference literature as collider bias. While tests of independence under biased sampling are available, these methods typically do not apply when the number of variables is large.
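As a toy illustration of collider bias (not taken from the talk), the sketch below draws an exposure and an outcome independently, then keeps each unit with a probability that increases with their sum. The logistic selection rule, the `slope` parameter, and the function names are all assumptions made for this example.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=20000, slope=2.0, seed=0):
    """Return (population correlation, correlation in the biased sample)."""
    rng = random.Random(seed)
    # Exposure and outcome are drawn independently in the population.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Hypothetical selection rule: inclusion probability rises with x + y,
    # so both variables affect whether a unit enters the sample.
    kept = [
        (a, b)
        for a, b in zip(x, y)
        if rng.random() < 1.0 / (1.0 + math.exp(-slope * (a + b)))
    ]
    kx = [a for a, _ in kept]
    ky = [b for _, b in kept]
    return pearson(x, y), pearson(kx, ky)


pop_corr, biased_corr = simulate()
print(f"population: {pop_corr:+.3f}, biased sample: {biased_corr:+.3f}")
```

In the full population the correlation should be close to zero, while the selected sample shows a clearly negative correlation, even though the two variables were generated independently.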

In this work, we use the knockoff framework to select important variables from a large set of candidates with replicability guarantees. We show that standard model-X knockoffs fail to control the false discovery rate (FDR) in the presence of biased sampling. We then show that tilting the population distribution by the selection probability, and constructing knockoff variables according to this tilted distribution, allows the knockoff filter to control the FDR. We apply the tilted knockoff method to identify the genetic underpinnings of endophenotypes in a case-control study.
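One way to read the tilting step, in notation that is not from the abstract: suppose $p(x)$ is the population density of the variables and $\pi(x)$ is the probability that a unit with variables $x$ is selected into the sample (with $S = 1$ denoting selection). Both symbols are assumptions introduced here for illustration. By Bayes' rule, the variables in the biased sample then follow the tilted density

```latex
q(x) \;=\; \frac{\pi(x)\, p(x)}{\int \pi(u)\, p(u)\, du},
\qquad \pi(x) = \Pr(S = 1 \mid X = x).
```

Under this reading, knockoffs would be constructed for $q$ rather than $p$, so that they match the distribution the sampled data actually follow. The construction presented in the talk may differ in its details, for example in how the selection probability depends on the outcome.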