Distinguishing the Forest from the TREES: A Comparison of Tree-Based Data Mining Methods

By Richard A. Derrig, Louise A. Francis

Decision trees, also referred to as classification and regression trees or C&RT, are among the most commonly used data mining techniques. Several newer decision tree methods are based on ensembles or networks of trees and carry names like TreeNet and Random Forest. Viaene et al. compared several data mining procedures, including tree methods and logistic regression, for modeling expert opinion of fraud/no fraud using a small fixed data set of fraud indicators or “red flags.” They found that simple logistic regression matched expert opinion on fraud/no fraud as well as the more sophisticated procedures did. In this paper we introduce some publicly available regression tree approaches and explain how they are used to model four proxies for fraud in insurance claim data. We find that all of the methods provide some explanatory value, or lift, from the available variables, with significant differences in fit among the methods and among the four targets. All modeling outcomes are compared to logistic regression, as in Viaene et al., and some model/software combinations do significantly better than the logistic model.

Keywords: Fraud, data mining, ROC curve, claim investigation, decision trees
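
The abstract describes comparing tree-based models against logistic regression, and the keywords mention the ROC curve. As a rough illustration of how two sets of model scores might be compared on that basis, the sketch below computes the area under the ROC curve using the standard pairwise (Mann-Whitney) formulation. The function name, the toy labels, and both score lists are made up for illustration and are not taken from the paper; this is not presented as the authors' procedure.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 for the positive class (e.g. a fraud proxy), 0 otherwise.
    scores: model scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy, made-up data for illustration only (not from the paper).
labels = [1, 0, 1, 0, 0, 1, 0, 1]
tree_scores = [0.9, 0.2, 0.8, 0.4, 0.1, 0.7, 0.3, 0.6]        # hypothetical tree model
logistic_scores = [0.7, 0.3, 0.6, 0.65, 0.2, 0.5, 0.4, 0.55]  # hypothetical logistic model

auc_tree = roc_auc(labels, tree_scores)
auc_logistic = roc_auc(labels, logistic_scores)
print(f"tree AUC = {auc_tree:.3f}, logistic AUC = {auc_logistic:.3f}")
```

An AUC of 0.5 corresponds to scores that rank claims no better than chance, and 1.0 to a perfect ranking, so a higher value for one model on the same target suggests a better fit by this measure.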

Related Documents:


Derrig, Richard A., and Louise A. Francis, "Distinguishing the Forest from the TREES: A Comparison of Tree-Based Data Mining Methods," Variance 2:2, 2008, pp. 184-208.

Mission Statement

Variance (ISSN 1940-6452) is a peer-reviewed journal published by the Casualty Actuarial Society to disseminate work of interest to casualty actuaries worldwide. The focus of Variance is original practical and theoretical research in casualty actuarial science. Significant survey or similar articles are also considered for publication. Membership in the Casualty Actuarial Society is not a prerequisite for submitting papers to the journal, and submissions by non-CAS members are encouraged.