ARPHA Preprints, doi: 10.3897/arphapreprints.e69590
An Ensemble Model for Financial Statement Fraud Detection
Ahmed M. Khedr, Magdi El Bannany, Sakeena Kanakkayil
‡ University of Sharjah, Sharjah, United Arab Emirates
Open Access
Abstract

Financial statement fraud is the deliberate furnishing and/or reporting of incorrect figures. It has become a major economic and social concern as the global market witnesses an upsurge in financial accounting fraud, costing businesses billions of dollars a year. Identifying companies that manipulate financial statements remains a challenge for auditors, as fraud strategies have become increasingly sophisticated over the years. We evaluate machine learning techniques for financial statement fraud detection, particularly a powerful ensemble technique, the XGBoost algorithm, on a sample of companies drawn from the MENA region. Class imbalance in the dataset is addressed by applying the SMOTE algorithm. We found that XGBoost outperformed the other algorithms in this study: Logistic Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), AdaBoost, and Random Forest. The XGBoost algorithm is then optimised to improve its performance further.

Keywords
Financial Statement Fraud (FSF), Middle East and North Africa (MENA), Machine Learning (ML), XGBoost
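
The abstract states that class imbalance is addressed with SMOTE. As a rough illustration of the idea only, and not the authors' implementation, the sketch below creates synthetic minority-class (fraud) samples by interpolating between a minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy two-feature data are assumptions made for this example; a real study would more likely rely on an established library implementation.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples, each placed on the line segment
    between a minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two made-up financial ratios per company.
fraud = [[0.9, 0.1], [1.0, 0.2], [0.8, 0.15]]
non_fraud = [[0.1 * i, 1.0 - 0.05 * i] for i in range(10)]

# Oversample the fraud class until both classes have the same size.
new_fraud = smote(fraud, n_new=len(non_fraud) - len(fraud), k=2)
balanced_fraud = fraud + new_fraud
```

Because every synthetic point lies between two real fraud samples, the oversampled class stays within the region the original fraud cases occupy rather than simply duplicating them. Oversampling of this kind is normally applied to the training split only, so that evaluation data contains no synthetic cases.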