Title: Asympirical Analysis: A New Paradigm for Data Science
Speaker bio: Prof. Ping Ma is a Professor of Statistics and co-directs the big data analytics lab at the University of Georgia, USA. He was a Beckman Fellow at the Center for Advanced Study at the University of Illinois at Urbana-Champaign, a Faculty Fellow at the US National Center for Supercomputing Applications, and a recipient of the US National Science Foundation CAREER Award. His paper won the best paper award of the Canadian Journal of Statistics in 2011. He serves on multiple editorial boards, including those of the Journal of the American Statistical Association and Statistical Applications in Genetics and Molecular Biology. He is a fellow of the American Statistical Association.
Abstract: Large samples are now generated routinely from various sources. Classical statistical and analytical methods are not well equipped to analyze such large samples because of their high computational cost.
In this talk, I will present an asympirical (asymptotic + empirical) analysis for large samples. The proposed method can significantly reduce computational costs for high-dimensional and large-scale data. We show that the estimator based on the proposed method achieves the optimal convergence rate. Extensive simulation studies will be presented to demonstrate the numerical advantages of our method over competing methods. I will further illustrate the empirical performance of the proposed approach using two real data examples.
Title: Reference-Free Learning with Multiple Metagenomic Samples
Speaker bio: Prof. Wenxuan Zhong is a Professor of Statistics and the founding director of the big data analytics lab at the University of Georgia. She is also a founding member of the Georgia Institutes of Informatics. She serves on multiple NSF and NIH panels. Her research focuses on developing statistical methodology and theory to address the striking new phenomena that emerge in the big data regime.
Abstract: A major goal of metagenomics is to identify and study the entire collection of microbial species in a set of targeted samples. In this talk, I will present a novel statistical metagenomic algorithm that simultaneously identifies microbial species and estimates their abundances without using reference genomes. Compared to reference-free methods based primarily on k-mer distributions or coverage information, the proposed approach achieves higher species binning accuracy and is particularly powerful when sequencing coverage is low. I will demonstrate the performance of this new method through both simulation and real metagenomic studies.