Facing Imbalanced Data–Recommendations for the Use of Performance Metrics
Abstract
Recognizing facial action units (AUs) is important for situation analysis and automated video annotation. Previous work has emphasized face tracking and registration and the choice of features and classifiers. Relatively neglected is the effect of imbalanced data on action unit detection. While the machine learning community has become aware of the problem of skewed data for training classifiers, little attention has been paid to how skew may bias performance metrics. To address this question, we conducted experiments using both simulated classifiers and three major databases that differ in size, type of FACS coding, and degree of skew. We evaluated the influence of skew on both threshold metrics (Accuracy, F-score, Cohen's kappa, and Krippendorff's alpha) and rank metrics (area under the receiver operating characteristic (ROC) curve and area under the precision-recall curve). With the exception of area under the ROC curve, all were attenuated by skewed distributions, in many cases dramatically so. While ROC was unaffected by skew, precision-recall curves suggest that ROC may mask poor performance. Our findings suggest that skew is a critical factor in evaluating performance metrics. To avoid or minimize skew-biased estimates of performance, we recommend reporting skew-normalized scores along with the obtained ones.
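The effect summarized in the abstract can be illustrated with a small Python sketch. It is not code from the paper: the function names, the fixed hit and false-alarm rates, and the class sizes are hypothetical choices made for illustration. The sketch also assumes that skew is the ratio of negative to positive examples and that a skew-normalized score is obtained by dividing the negative-class counts of the confusion matrix by that ratio before computing the metric.

```python
def confusion(tpr, fpr, n_pos, n_neg):
    """Expected confusion-matrix counts for a classifier with fixed rates."""
    tp = tpr * n_pos
    fn = n_pos - tp
    fp = fpr * n_neg
    tn = n_neg - fp
    return tp, fp, fn, tn


def f1(tp, fp, fn, tn):
    """F1 score; true negatives do not enter the formula."""
    return 2 * tp / (2 * tp + fp + fn)


def kappa(tp, fp, fn, tn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return (p_observed - p_chance) / (1 - p_chance)


def skew_normalize(tp, fp, fn, tn):
    """Assumed normalization: rescale negative-class counts to a skew of 1."""
    skew = (fp + tn) / (tp + fn)  # negatives per positive
    return tp, fp / skew, fn, tn / skew


# Same hypothetical classifier (80% hit rate, 10% false-alarm rate),
# evaluated on a balanced set and on a set with 50 negatives per positive.
balanced = confusion(0.8, 0.1, n_pos=1000, n_neg=1000)
skewed = confusion(0.8, 0.1, n_pos=100, n_neg=5000)
normalized = skew_normalize(*skewed)

for name, cm in [("balanced", balanced), ("skewed", skewed),
                 ("skew-normalized", normalized)]:
    print(f"{name:16s} F1={f1(*cm):.3f}  kappa={kappa(*cm):.3f}")
```

Under these assumptions the classifier's hit and false-alarm rates are identical in both data sets, yet F1 and kappa drop sharply on the skewed set. After normalization the scores match the balanced case, which is why reporting both the obtained and the normalized values makes results easier to compare across databases.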
BibTeX
@workshop{Jeni-2013-119675,
author = {Laszlo A. Jeni and Jeffrey F. Cohn and Fernando De la Torre},
title = {Facing Imbalanced Data--Recommendations for the Use of Performance Metrics},
booktitle = {Proceedings of ACII '13 Workshops},
year = {2013},
month = {September},
pages = {245--251},
}