The 11th Pacific-Asia Knowledge Discovery
and Data Mining conference (PAKDD 2007) is pleased to host another
data mining competition, co-organized by the Singapore Institute of
Cross-Selling Problem Summary
A real-world dataset for this year's competition was donated to us by a
consumer finance company with the aim of possibly finding better
solutions for a cross-selling business problem.
The company currently has a customer base of credit card customers as well as a customer base of home loan (mortgage) customers. Both of these products have been on the market for many years, although for some reason the overlap between these two customer bases is currently very small. The company would like to make use of this opportunity to cross-sell home loans to its credit card customers, but the small size of the overlap presents a challenge when trying to develop an effective scoring model to predict potential cross-sell take-ups.
A modeling dataset of 40,700 customers with 40 modeling variables (as of the point of application for the company's credit card), plus a target variable, will be provided to the participants. This is a sample of customers who opened a new credit card with the company within a specific 2-year period and who did not have an existing home loan with the company. The categorical target variable "Target_Flag" will have a value of 1 if the customer then opened a home loan with the company within 12 months after opening the credit card (700 random samples), and a value of 0 otherwise (40,000 random samples).
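To make the class imbalance in the modeling dataset concrete, the short calculation below works out the share of positive cases from the two sample counts quoted above. It is only an illustration of the arithmetic; the variable names are our own, and the resulting rate describes the modeling sample, not necessarily the company's full customer base.

```python
# Sample counts taken from the dataset description above.
N_POSITIVE = 700      # Target_Flag = 1
N_NEGATIVE = 40_000   # Target_Flag = 0

n_total = N_POSITIVE + N_NEGATIVE
positive_rate = N_POSITIVE / n_total
negatives_per_positive = N_NEGATIVE / N_POSITIVE

print(f"Total customers:        {n_total}")
print(f"Positive rate:          {positive_rate:.2%}")
print(f"Negatives per positive: {negatives_per_positive:.1f}")
```

Fewer than 2 in 100 customers in the sample carry a positive flag, which is why a model that simply predicts 0 for everyone would look accurate while being useless for cross-selling.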
A prediction dataset (8,000 sampled cases) will also be provided to
the participants with similar variables but with the target variable
withheld.
The data mining task is to produce a score for each customer in the prediction dataset, indicating a credit card customer's propensity to take up a home loan with the company (the higher the score, the higher the propensity).
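As a rough illustration of what such a scoring task can look like in practice, the sketch below fits a small class-weighted logistic regression and then scores unseen customers. Everything in it is an assumption made for illustration: the two numeric features, the toy rows, the function names and the choice of logistic regression are hypothetical and are not part of the competition data or a prescribed method.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, epochs=500, lr=0.1):
    """Fit weights by batch gradient descent on a class-weighted log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Up-weight the rare class so both classes contribute equally in total.
    class_weight = {1: len(labels) / (2.0 * n_pos),
                    0: len(labels) / (2.0 * n_neg)}
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = class_weight[y] * (p - y)
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def score(rows, weights, bias):
    """Return one propensity score in [0, 1] per customer row."""
    return [sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            for x in rows]


# Hypothetical toy data: two made-up numeric features per customer.
train_rows = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4],
              [0.4, 0.2], [0.9, 0.8], [0.8, 0.9]]
train_labels = [0, 0, 0, 0, 0, 1, 1]          # stand-in for Target_Flag
predict_rows = [[0.85, 0.9], [0.15, 0.2], [0.5, 0.5]]  # target withheld

weights, bias = fit_logistic(train_rows, train_labels)
scores = score(predict_rows, weights, bias)

# Rank customers from highest to lowest propensity.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
for i in ranking:
    print(f"customer {i}: score {scores[i]:.3f}")
```

The class weights are one simple way to keep the rare positive cases from being swamped by the negatives; only the ordering of the scores matters for the task as stated, since a higher score should mean a higher propensity.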