PAKDD 2007 Data Mining Competition

Participation Guidelines

The real-world dataset for this competition has been provided by a consumer finance company (the company) with the aim of finding better solutions for a cross-selling business problem. By registering for this competition you:

  • acknowledge that you may only use the data for the purpose of entering this competition;
  • acknowledge that your submission may be given to, and used by, the company;
  • grant the company and its affiliates a perpetual, worldwide, non-exclusive, irrevocable and royalty free licence to use, modify, communicate to the public and adapt and combine your submission for their own business purposes, without further reference to you.

Open Category - There is only one entry category for this year's competition. Team entries (including coaches) are permitted.

Free Registration - Potential competition participants are required to register the following information via the "Online Registration" page:

(1) Full Name, (2) Contact Postal Address, (3) Contact Telephone Number (including country/area codes), (4) Contact Email Address, (5) Occupation (optional), and (6) Company/Institution (optional).

For team entries, please note that registration and contact information will need to be submitted online for each member of your team.

Upon successful registration, the competition modeling and prediction datasets can be obtained from the downloads section (in the participant login area). The datasets will be available in both Excel97 format and tab-delimited text file format. A data dictionary will also be provided, although not all variable codings will be explained. Note also that the datasets might contain missing or invalid values. At the request of the dataset donor, this dataset is not to be made publicly available or redistributed after the competition.
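Because the datasets might contain missing or invalid values, a first pass over the tab-delimited file can be useful. The following is a minimal sketch in Python, not an official tool: the function name `count_missing`, the sample columns "Age" and "Income", and the set of tokens treated as missing are all assumptions made for illustration. The real column names and codings come from the data dictionary.

```python
import csv
import io
from collections import Counter


def count_missing(handle, missing_tokens=("", "NA", "?")):
    """Count missing-looking values per column in a tab-delimited file.

    The tokens treated as missing are an assumption; adjust them after
    reading the data dictionary.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    missing = Counter()
    rows = 0
    for row in reader:
        rows += 1
        for name, value in row.items():
            if name is None:
                continue  # extra fields beyond the header row
            # DictReader yields None for fields absent from a short row.
            if value is None or value.strip() in missing_tokens:
                missing[name] += 1
    return rows, dict(missing)


# Hypothetical three-row sample standing in for the real modeling dataset.
sample = "Customer_ID\tAge\tIncome\n1\t34\t52000\n2\t\t61000\n3\t41\tNA\n"
rows, missing = count_missing(io.StringIO(sample))
print(rows, missing)
```

For the real file, the `io.StringIO(sample)` argument would be replaced by a file opened with `open(path, newline="")`.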

Misclassification cost functions (marketing costs and potential gains) will not be provided. Participants should also not assume that the proportions of the target flag in the overall customer database population, in the modeling sample, and in the prediction sample are the same.


1. Participants are required to submit, via the "Online Submission" page, a tab-delimited text file or an Excel97 file containing two fields: the numeric "Customer_ID" field and the numeric "Target_Score" field for each of the 8,000 customers in the prediction dataset. A file with fewer than 8,000 scores will not be considered a complete submission. Normalization or scaling of scores is not required.

2. Entrants are also required to submit a technical paper style report (in Word or PDF format, minimum one page, maximum 3,000 words) that includes:

a) Walk-through of any data preparation step(s) that were applied

b) Explanation of modeling technique(s) used

c) Summary of scoring model results (e.g. final model parameters and model assessment)

d) Discussion on what business insights can be interpreted from the scoring model results.

These reports are meant primarily for explaining your work to the general public and to the company donating the dataset, and will be published on the "online Results" page from 1 May 2007. Participation in this competition implies that you agree to your report(s) being published online. Note that the committee may also assess qualitative aspects of these reports if it deems a tie-breaker necessary.
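The score file described in item 1 above (two numeric fields, one row per customer, 8,000 rows) can be checked before upload. The sketch below is one possible way to write and validate a tab-delimited submission in Python. The helper names `write_submission` and `check_submission` are hypothetical, and the header row is an assumption: the guidelines name the two fields but do not say whether a header line is expected.

```python
import csv
import io

EXPECTED_ROWS = 8000  # customers in the prediction dataset, per the guidelines
HEADER = ["Customer_ID", "Target_Score"]  # header row is an assumption


def write_submission(scores, handle):
    """Write (customer_id, score) pairs as tab-delimited text."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    for customer_id, score in scores:
        writer.writerow([customer_id, score])


def check_submission(handle, expected_rows=EXPECTED_ROWS):
    """Return a list of problems; an empty list means the file looks complete."""
    reader = csv.reader(handle, delimiter="\t")
    problems = []
    header = next(reader, None)
    if header != HEADER:
        problems.append(f"unexpected header: {header}")
    seen = set()
    for line_no, row in enumerate(reader, start=2):
        if len(row) != 2:
            problems.append(f"line {line_no}: expected 2 fields, got {len(row)}")
            continue
        customer_id, score = row
        try:
            # Both fields must be numeric; scores need no scaling.
            float(customer_id)
            float(score)
        except ValueError:
            problems.append(f"line {line_no}: non-numeric value")
            continue
        if customer_id in seen:
            problems.append(f"line {line_no}: duplicate Customer_ID {customer_id}")
        seen.add(customer_id)
    if len(seen) != expected_rows:
        problems.append(f"expected {expected_rows} customers, found {len(seen)}")
    return problems


# A complete file: 8,000 made-up customers with arbitrary unscaled scores.
complete = io.StringIO()
write_submission(((i, i / 8000) for i in range(1, 8001)), complete)
complete.seek(0)
complete_problems = check_submission(complete)

# An incomplete file: only three customers.
short = io.StringIO()
write_submission([(1, 0.2), (2, 0.9), (3, 0.5)], short)
short.seek(0)
short_problems = check_submission(short)

print(complete_problems)
print(short_problems)
```

The check flags an incomplete file rather than raising an error, so every problem in a draft submission can be listed in one pass.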


Copyright ©2007 LeVis Group, Shanghai University
Webmaster : Li Dan - Shanghai University (China)