Title: Model-free feature screening approaches in the presence of missing response
Host: Associate Professor 曹春正 (Cao Chunzheng)
Time: Thursday, November 12, 2015, 16:00-17:30
Venue: Room 108, 尚贤楼 (Shangxian Building)
School of Mathematics and Statistics
November 11, 2015
Abstract:
It is quite challenging to develop model-free feature screening approaches directly for missing response problems, since the existing standard missing data analysis methods cannot be applied directly to the high-dimensional case. This paper develops some novel methods that borrow information from the missingness indicators, so that any feature screening procedure for ultrahigh-dimensional covariates with full data can be applied to the missing response case.
The first method, called missing indicator imputation screening, is developed by proving that, under some mild conditions, the set of active predictors of interest for the response is a subset of the active predictors for the product of the response and the missingness indicator. As an alternative, a second method, called the Venn diagram based approach, is also developed to obtain a feature screening estimator of the active predictor set of interest. The sure screening property is proven for both methods.
It is also shown that the complete case (CC) approach preserves the sure screening property of any feature screening approach that has it. A simulation study was conducted to compare the proposed methods with the CC approach, and a real data analysis was used to illustrate the proposed method. Both the simulation studies and the real data analysis indicate that the proposed zero imputation feature screening method outperforms the CC method and the Venn diagram based method in some significant cases and is quite competitive in the others.
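To make the idea of screening on the product of the response and the missingness indicator concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the speaker's implementation: the marginal utility is taken to be the absolute Pearson correlation (a stand-in for whatever model-free screening statistic one prefers), and the function names, the simulated data, and the screening size `d` are all hypothetical.

```python
import numpy as np


def marginal_utility(X, y):
    """Absolute Pearson correlation between each column of X and y.

    Assumption: any full-data screening statistic could be plugged in here.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc) / denom


def zero_imputation_screen(X, y, observed, d):
    """Screen on delta * Y: missing responses are replaced by 0."""
    y_imputed = np.where(observed, y, 0.0)
    scores = marginal_utility(X, y_imputed)
    return np.argsort(scores)[::-1][:d]


def complete_case_screen(X, y, observed, d):
    """Screen using only the rows whose response is observed (CC)."""
    scores = marginal_utility(X[observed], y[observed])
    return np.argsort(scores)[::-1][:d]


# Hypothetical simulated data: only features 0 and 1 drive the response.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.standard_normal(n)

# The probability of observing the response depends on feature 2 only.
prob_observed = 1.0 / (1.0 + np.exp(-(1.0 + X[:, 2])))
observed = rng.random(n) < prob_observed
y_with_missing = np.where(observed, y, np.nan)

selected = zero_imputation_screen(X, y_with_missing, observed, d=20)
selected_cc = complete_case_screen(X, y_with_missing, observed, d=20)
```

Both functions return the indices of the `d` top-ranked features, so the two selected sets can be compared directly. The Venn diagram based approach is not sketched here because the abstract does not give its construction.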
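About the speaker: 王启华 (Wang Qihua) is a research professor at the Academy of Mathematics and Systems Science, Chinese Academy of Sciences. He is a recipient of the National Science Fund for Distinguished Young Scholars, a Distinguished Professor of the Ministry of Education's Changjiang Scholars Program, and was selected for the Chinese Academy of Sciences "Hundred Talents Program".
He is an Elected Member of the International Statistical Institute and the chief editor of several international and domestic journals. His research focuses mainly on survival analysis, missing data analysis, high-dimensional data statistics, and nonparametric and semiparametric statistics.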