Introduction to Data Mining            

[Course Information] [Slides] [Reading Materials] [Assignments] [Data Mining Practice]

Course Information 
  Course Number: 22010230
  For: Undergraduate students of the Department of Computer Science and Technology, Nanjing University.
  Number of Students: 150
  Classroom: I-106, Main Teaching Building, Xianlin Campus
  Time: 10:10 -- 12:00, every Tuesday
  Office Hours: 13:30 -- 14:30, every Tuesday (Rm 917)
  Textbook: J. Han, M. Kamber, J. Pei. Data Mining: Concepts and Techniques, 3rd edition. Elsevier Inc., 2012.
  Main Reference Books:
    [1] C. C. Aggarwal. Data Mining: The Textbook. Springer, 2015.
    [2] P.-N. Tan, M. Steinbach, V. Kumar. Introduction to Data Mining. Addison-Wesley, 2006.
    [3] I. H. Witten, E. Frank. Data Mining: Practical Machine Learning Tools and Techniques, 3rd edition. Morgan Kaufmann Publishers, 2011.
    [4] D. Hand, H. Mannila, P. Smyth. Principles of Data Mining. MIT Press, Cambridge, MA, 2001.
  Grading: Final exam (30%) + Assignments (30%) + Data Mining Practice (40%)
  TAs: Mr. Yi-Fan Ma, Mr. Wen-Yu Tang, and Mr. Yang Cao


Slides      (Slides will appear on this page after the corresponding chapter has been covered in class.)
               Part 1:    Introduction
               Part 2:    Data Cube and OLAP
               Part 3:    Data Preprocessing
               Part 4:    Association
               Part 5:    Prediction
               Part 6:    Clustering
               Part 7:    Outlier Analysis
               Part 8:    Mining the Text and Web Data

Reading Materials
              Matrix Cookbook

Assignments
              Assignment 1 is released; please check the slides of Part 3. (Due in class at 12:00, Apr. 9.)
              Assignment 2 is released; please check the slides of Part 4. (Due at 23:59, Apr. 23.)   [data & requirement] [Submission Links]
              Assignment 3 is released; please check the slides of Part 5. (Due at 23:59, May 24.)   [data & requirement] [Submission Links]

Data Mining Practice 
       The data mining practice applies data mining techniques to recommend suitable emojis for text messages. Detailed information can be found here. (Reports due at 23:59, Jun. 18.)