EECS500 Seminar

Xiang Zhang
Mining High-Throughput Biological Data
University of North Carolina at Chapel Hill
White 411
11:30am - 12:30pm
April 26, 2011

Advanced biotechnologies have made high-throughput data collection feasible in humans and in model organisms. The availability of such data holds promise for dissecting complex biological processes, but making sense of the flood of biological data poses great statistical and computational challenges.
In this talk, I will discuss the problem of finding gene-gene interactions in high-throughput genetic data. Finding genetic interactions is an important biological problem, since many common diseases are caused by the joint effects of genes. Previously, finding genetic interactions at the whole-genome scale was considered intractable because of the enormous search space, and the problem was commonly addressed with heuristics that do not guarantee an optimal solution. I will show that by utilizing an upper bound of the test statistic and indexing the data effectively, we can dramatically prune the search space and reduce the computational burden. Moreover, our algorithms are guaranteed to find the optimal solution. In addition to handling specific statistical tests, our algorithms can be applied to a wide range of study types by utilizing convexity, a property shared by many commonly used statistics.
I will also briefly survey my work on modeling gene regulatory networks using gene expression data and finding local latent patterns that are hidden in the subspaces of high-dimensional data.
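
As a rough illustration of the bound-based pruning idea mentioned in the abstract, the following minimal sketch searches for the pair of binary markers whose conjunction has the largest chi-square association with a case/control label. It is not the speaker's algorithm: the data layout (one Python integer bitmask per marker), the choice of the AND of two markers as the interaction pattern, and all function names are assumptions made for this example. The bound relies on the fact that the chi-square statistic is convex in the (cases, controls) counts of a pattern, so its maximum over every sub-pattern of a marker is attained at a corner of the feasible rectangle.

```python
import random
from itertools import combinations


def chi2(a, b, n_case, n_ctrl):
    """Chi-square of the 2x2 table where a cases and b controls carry a pattern.

    Assumes n_case > 0 and n_ctrl > 0.
    """
    n, s = n_case + n_ctrl, a + b
    if s == 0 or s == n:  # degenerate table, no measurable association
        return 0.0
    return n * (a * n_ctrl - b * n_case) ** 2 / (n_case * n_ctrl * s * (n - s))


def counts(mask, case_mask):
    """Return (cases, controls) among the samples whose bit is set in mask."""
    a = bin(mask & case_mask).count("1")
    return a, bin(mask).count("1") - a


def upper_bound(mask, case_mask, n_case, n_ctrl):
    """Bound on chi2 for any pattern contained in mask (convexity argument).

    A sub-pattern has counts (a', b') with 0 <= a' <= a and 0 <= b' <= b, and a
    convex function on that rectangle peaks at a corner; the (0, 0) corner is 0.
    """
    a, b = counts(mask, case_mask)
    return max(chi2(a, 0, n_case, n_ctrl),
               chi2(0, b, n_case, n_ctrl),
               chi2(a, b, n_case, n_ctrl))


def best_pair_pruned(snps, case_mask, n_samples):
    """Exact best pair; skips pairs whose bound cannot beat the best so far."""
    n_case = bin(case_mask).count("1")
    n_ctrl = n_samples - n_case
    ub = [upper_bound(m, case_mask, n_case, n_ctrl) for m in snps]
    order = sorted(range(len(snps)), key=lambda i: ub[i], reverse=True)
    best_score, best_pair, evaluated = -1.0, None, 0
    for pos, i in enumerate(order):
        if ub[i] <= best_score:  # no remaining pair can improve the result
            break
        for j in order[pos + 1:]:
            if ub[j] <= best_score:  # the pair's score is at most ub[j]
                break
            a, b = counts(snps[i] & snps[j], case_mask)
            score = chi2(a, b, n_case, n_ctrl)
            evaluated += 1
            if score > best_score:
                best_score, best_pair = score, (i, j)
    return best_score, best_pair, evaluated


def best_pair_brute_force(snps, case_mask, n_samples):
    """Reference answer: score every pair without pruning."""
    n_case = bin(case_mask).count("1")
    n_ctrl = n_samples - n_case
    return max(chi2(*counts(snps[i] & snps[j], case_mask), n_case, n_ctrl)
               for i, j in combinations(range(len(snps)), 2))


def random_dataset(n_snps, n_samples, seed=0):
    """Synthetic data: the first half of the samples are labeled as cases."""
    rng = random.Random(seed)
    case_mask = (1 << (n_samples // 2)) - 1
    snps = [rng.getrandbits(n_samples) for _ in range(n_snps)]
    return snps, case_mask, n_samples
```

Because pairs are visited in decreasing order of their bounds, the search can stop as soon as a bound falls to or below the best score found, yet it returns the same maximum as the exhaustive scan. How many pairs are actually skipped depends on the data; the sketch makes no claim about the speedups reported in the talk.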


Xiang Zhang is a Ph.D. candidate in the Department of Computer Science at the University of North Carolina at Chapel Hill, where he is advised by Dr. Wei Wang. His research focuses on data mining, bioinformatics, and databases. Working closely with biologists and statisticians, he has developed effective and efficient techniques for analyzing high-throughput genetic and genomic data. His research has won a best student paper award at ICDE 2008 and a best research paper award at SIGKDD 2008. He is the recipient of a Microsoft Research Ph.D. Fellowship for 2009-2011.