Seminar abstract

Stability in Multi-Clustering

Jian Pei
Professor, IEEE Fellow
Simon Fraser University, Canada

Abstract: Many high-dimensional data sets are like ambigrams: their meaning may change when they are viewed or interpreted from different perspectives. Multi-clustering is the art of understanding such ambigram data sets comprehensively. Although a few multi-clustering methods have been proposed, a fundamental problem remains: are there multiple clusterings hidden in a data set, and, if so, how many? In this talk, I will present a breakthrough from my group. Essentially, we introduce to multi-clustering a stability property based on the Laplacian eigengap, and prove that the larger the eigengap, the more stable the clustering solution. Based on this stability property, we develop a novel multi-clustering method that automatically identifies a certain number (one or multiple) of stable but different clustering solutions by exploring the corresponding feature subspaces. We formulate the problem as an optimization framework that is non-convex. Surprisingly, this non-convexity turns out to be desirable and is exploited appropriately in our problem. Our comprehensive empirical studies on both synthetic and real data sets verify the proposed method in several aspects: the number of hidden clusterings, the differences between solutions, and their feature subspaces.
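
For readers unfamiliar with the Laplacian eigengap mentioned in the abstract, the following is a minimal sketch of how that quantity can be computed for a given number of clusters k. It is not the method presented in the talk. The function name `laplacian_eigengap`, the Gaussian affinity, the bandwidth `sigma`, and the choice of the symmetric normalized Laplacian are all assumptions made for illustration only.

```python
import numpy as np

def laplacian_eigengap(X, k, sigma=1.0):
    """Gap between the k-th and (k+1)-th smallest eigenvalues of the
    symmetric normalized graph Laplacian (illustrative helper, hypothetical name).

    Assumes a fully connected Gaussian affinity graph in which every
    point has non-zero total affinity to the others.
    """
    # Pairwise squared Euclidean distances between all rows of X.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Gaussian affinity matrix with no self-loops.
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(L)
    return eigvals[k] - eigvals[k - 1]

# Toy comparison: two well-separated groups versus unstructured points.
rng = np.random.default_rng(0)
two_blobs = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
                       rng.normal(5.0, 0.1, (20, 2))])
noise = rng.uniform(0.0, 5.0, (40, 2))

gap_blobs = laplacian_eigengap(two_blobs, k=2)
gap_noise = laplacian_eigengap(noise, k=2)
print(gap_blobs, gap_noise)
```

In this toy setting, the data with a clear two-group structure should yield a larger eigengap at k = 2 than the unstructured data, which is the intuition behind using the eigengap as a stability indicator.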

Bio: Jian Pei is currently the Canada Research Chair in Big Data Science and a professor in the School of Computing Science and the Department of Statistics and Actuarial Science at Simon Fraser University, Canada. He received his Ph.D. from the same school in 2002 under the supervision of Dr. Jiawei Han. His research focuses on developing effective and efficient data analysis techniques for novel data-intensive applications. He has published prolifically and is one of the most-cited authors in data mining. He has received a series of prestigious awards. He is also active in providing consulting services to industry and in transferring his group's research outcomes to industry and applications. He is an editor of several esteemed journals in his areas and a passionate organizer of the premier academic conferences and initiatives defining the frontiers of those areas. He is an IEEE Fellow.
