Academic Space

Frontier Forum on Mathematics, Statistics, and Interdisciplinary Studies: Advanced Academic Lecture No. 37

Lecture title: Interaction Identification and Clique Screening for Classification with Ultra-high Dimensional Discrete Features

Speaker: An Baiguo (安百国)

Time: Friday, November 5, 2021, 14:00-15:00

Venue: Tencent Meeting (ID 574 268 761) and Academic Activity Room 5, Mathematics and Statistics Building (数统楼学术五活动室)

About the speaker:

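  An Baiguo is an associate professor and doctoral supervisor at Capital University of Economics and Business, and head of the Department of Mathematical Statistics in its School of Statistics. An graduated from Northeast Normal University in 2012, was a postdoctoral researcher at the University of North Carolina at Chapel Hill from 2013 to 2015, and has worked at the School of Statistics, Capital University of Economics and Business, since 2016. Research interests include machine learning, high-dimensional complex data analysis, text analysis, and image data analysis.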

Abstract:

  Interactions have greatly influenced recent scientific discoveries, but identifying them is challenging in ultra-high dimensions. In this study, we propose an interaction identification method for classification with ultra-high-dimensional discrete features. We use clique sets to capture interactions among features: features in a common clique have interactions that can be used for classification, and the size of a clique is the number of features involved in the interaction. Hence, our method can handle interactions among more than two feature variables. We propose a Kullback-Leibler divergence-based approach that correctly identifies the clique sets with a probability that tends to 1 as the sample size tends to infinity. A clique screening method is then proposed to filter out clique sets that are useless for classification, and the strong sure screening property can be guaranteed. Finally, a clique naive Bayes classifier is proposed for classification. Numerical studies demonstrate that our proposed approach performs very well.
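To give a feel for why a divergence computed on a group of features can reveal an interaction that single-feature scores miss, here is a minimal Python sketch. It is an illustration under stated assumptions and not the speaker's actual procedure: the abstract gives no formulas, so the symmetrised Kullback-Leibler score, the Laplace smoothing, the binary labels, and the function names `kl_score` and `clique_score` are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def class_conditional_dist(values, labels, target, support, alpha=1.0):
    """Laplace-smoothed distribution of a discrete variable within one class."""
    counts = Counter(v for v, y in zip(values, labels) if y == target)
    total = sum(counts.values()) + alpha * len(support)
    return {s: (counts[s] + alpha) / total for s in support}


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same finite support."""
    return sum(p[s] * math.log(p[s] / q[s]) for s in p)


def kl_score(values, labels):
    """Hypothetical score: symmetrised KL divergence between the
    class-0 and class-1 distributions of a discrete variable."""
    support = sorted(set(values))
    p = class_conditional_dist(values, labels, 0, support)
    q = class_conditional_dist(values, labels, 1, support)
    return kl_divergence(p, q) + kl_divergence(q, p)


def clique_score(columns, labels):
    """Score a candidate clique by treating the tuple of its features
    as one joint discrete variable (an assumption of this sketch)."""
    joint = list(zip(*columns))
    return kl_score(joint, labels)


# Toy data: the label is the XOR of two binary features, so neither
# feature is informative alone, but the pair determines the class.
x1 = [0, 0, 1, 1] * 10
x2 = [0, 1, 0, 1] * 10
y = [0, 1, 1, 0] * 10

print(kl_score(x1, y))            # marginal score of x1: 0.0
print(kl_score(x2, y))            # marginal score of x2: 0.0
print(clique_score([x1, x2], y))  # joint score of {x1, x2}: clearly positive
```

On this toy data both marginal scores are zero while the joint score is large, which is the kind of signal a clique-based screen is meant to pick up. A real ultra-high-dimensional setting would also need a strategy for choosing which candidate cliques to score, which this sketch does not address.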