
Invited Talks:


Title: Colorful Social Networks: Inferring Social Ties in Large-scale Networks

Dr. Jie Tang

Department of Computer Science and Technology, Tsinghua University

Homepage: http://keg.cs.tsinghua.edu.cn/persons/tj/

Abstract:
Information networks contain abundant knowledge about relationships among people or entities. Unfortunately, this kind of knowledge is often hidden in a network where different kinds of relationships are not explicitly categorized. An interesting question is: to what extent can we infer the social ties between people from their historical behavior? In this talk, I am going to introduce two of our works on relation mining. The first mines advisor-advisee relationships from research publication networks. We propose a time-constrained probabilistic factor graph model (TPFG) to formalize the problem in a probabilistic framework. Experimental results show that the proposed approach infers advisor-advisee relationships efficiently and achieves state-of-the-art accuracy (80-90%) without any supervised information. The second problem is how to generalize the probabilistic model to deal with different types of social networks. For this, we will further introduce a partially labeled factor graph model.

Short Biography:
Jie Tang is an associate professor at the Department of Computer Science and Technology, Tsinghua University. His main research interests include social network mining and fundamental learning theories. He has published over 100 research papers in major international journals and conferences, including KDD, IJCAI, SIGMOD, ACL, ISWC, Machine Learning Journal, TKDD, TKDE, JWS and JoDS. He serves as Publications Co-Chair of SIGKDD'11, Program Chair of ADMA'11, Poster Chair of WSDM'11, Publicity Co-Chair of ICDM'11, Area Chair of ECML/PKDD'11, Publicity Co-Chair of SocInfo'11, and Vice PC Chair of WI'10-11, and as a PC member of more than 50 major international conferences. He serves as an editor of Journal of Software, Semantic Web Journal, and Journal of Advances in Information Technology, and as a guest editor of the TKDD special issue on large-scale data mining and the TIST special issue on Computational Models of Collective Intelligence in the Social Web.


Title: Cross-domain Classification: Generative vs. Discriminative Models

Dr. Ping Luo

HP Labs

Homepage: http://www.hpl.hp.com/people/ping_luo/

Abstract:
In many emerging real-world applications, new test data for classification usually come from fast-evolving data domains. This introduces the problem of cross-domain classification, which considers the difference in data distribution among data domains. It is widely believed that discriminative models are preferred to generative ones for traditional classification problems, where training and testing data are from the same data distribution. However, in this talk we will show how generative models can be fully utilized to outperform discriminative ones for cross-domain classification, and analyze why this happens.
Specifically, this study is inspired by two observations about the data distributions across domains. First, different domains may use different keywords to express the same concept. Second, the association between the concepts and document classes may be stable across domains. These two aspects are, respectively, the distinction and the commonality across data domains. Along this line we propose a generative model, Collaborative Dual-PLSA (CD-PLSA), to simultaneously capture both the domain distinction and commonality among multiple domains. The shared commonality intertwines with the distinctions over multiple domains, and is also used as the bridge for knowledge transformation. We conduct extensive experiments over hundreds of classification tasks with multiple source domains and multiple target domains to validate the superiority of the proposed CD-PLSA model over existing state-of-the-art methods. In particular, we show that CD-PLSA is more tolerant of distribution differences than other methods. Finally, we argue that generative models explicitly model the data distribution differences among data domains via the joint probability, which helps to improve accuracy for cross-domain classification.

Short Biography:
Ping Luo is currently a Research Scientist at Hewlett-Packard Labs, China. He received the Ph.D. degree in computer science from the Institute of Computing Technology, Chinese Academy of Sciences. His general area of research is knowledge discovery and machine learning. He has published several papers in prestigious refereed journals and conference proceedings, such as IEEE Transactions on Information Theory, IEEE Transactions on Knowledge and Data Engineering, Journal of Parallel and Distributed Computing, ACM SIGKDD, ACM CIKM, and IJCAI. He is the recipient of the 2009 Doctoral Dissertation Award of the China Computer Federation. Dr. Luo is a member of the IEEE Computer Society and the ACM.