Weak Supervision for Affordable Modeling of Electrocardiogram Data
Abstract
Analysing electrocardiograms (ECGs) is an inexpensive and non-invasive, yet powerful way to diagnose heart disease. ECG studies that use machine learning to automatically detect abnormal heartbeats have so far depended on large, manually annotated datasets. While collecting vast amounts of unlabeled data can be straightforward, the point-by-point annotation of abnormal heartbeats is tedious and expensive. We explore the use of multiple weak supervision sources to learn diagnostic models of abnormal heartbeats via human-designed heuristics, without using ground truth labels on individual data points. Our work is among the first to define weak supervision sources directly on time series data. Results show that with as few as six intuitive time series heuristics, we are able to infer high-quality probabilistic label estimates for over 100,000 heartbeats with little human effort, and use the estimated labels to train competitive classifiers evaluated on held-out test data.
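To make the idea of heuristic weak supervision on time series concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes each heartbeat is described only by the RR intervals before and after it. The three heuristics, their thresholds (0.8, 1.2, 0.1), and all function names are hypothetical. The votes are combined by simple vote averaging, which stands in for whatever label model the paper actually uses.

```python
import numpy as np

# Vote values used by every heuristic (labeling function).
ABSTAIN, NORMAL, ABNORMAL = -1, 0, 1


def lf_short_rr(rr_prev, rr_median):
    """Hypothetical heuristic: a beat that arrives much earlier than usual is suspicious."""
    return ABNORMAL if rr_prev < 0.8 * rr_median else ABSTAIN


def lf_long_pause(rr_next, rr_median):
    """Hypothetical heuristic: a long pause after the beat is suspicious."""
    return ABNORMAL if rr_next > 1.2 * rr_median else ABSTAIN


def lf_regular(rr_prev, rr_next, rr_median):
    """Hypothetical heuristic: both neighbouring intervals close to the median look normal."""
    tol = 0.1 * rr_median
    near = abs(rr_prev - rr_median) < tol and abs(rr_next - rr_median) < tol
    return NORMAL if near else ABSTAIN


def vote_matrix(rr):
    """Apply all heuristics to each interior beat of an RR-interval series (seconds).

    Beat i sits between rr[i] and rr[i + 1], so len(rr) - 1 beats are scored.
    """
    rr = np.asarray(rr, dtype=float)
    med = np.median(rr)
    rows = [
        [lf_short_rr(prev, med), lf_long_pause(nxt, med), lf_regular(prev, nxt, med)]
        for prev, nxt in zip(rr[:-1], rr[1:])
    ]
    return np.array(rows)


def soft_labels(votes):
    """Estimate P(abnormal) as the share of non-abstaining votes that say ABNORMAL.

    Beats on which every heuristic abstains get an uninformative 0.5.
    """
    n_active = (votes != ABSTAIN).sum(axis=1)
    n_abnormal = (votes == ABNORMAL).sum(axis=1)
    return np.where(n_active > 0, n_abnormal / np.maximum(n_active, 1), 0.5)


if __name__ == "__main__":
    rr_intervals = [0.8, 0.8, 0.5, 1.1, 0.8, 0.8]  # made-up example, not real ECG data
    print(soft_labels(vote_matrix(rr_intervals)))
```

The resulting soft labels could then serve as training targets for a downstream classifier. Vote averaging treats every heuristic as equally reliable, whereas a learned label model would weight heuristics by their estimated accuracy.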
This work was partially supported by the Defense Advanced Research Projects Agency award FA8750-17-2-0130, and by the Space Technology Research Institutes grant from the National Aeronautics and Space Administration’s Space Technology Research Grants Program.
BibTeX
@conference{Goswami-2022-131260,
author = {Mononito Goswami and Benedikt Boecking and Artur Dubrawski},
title = {Weak Supervision for Affordable Modeling of Electrocardiogram Data},
booktitle = {Proceedings of American Medical Informatics Association Annual Symposium (AMIA '21)},
year = {2022},
month = {February},
pages = {536--545},
}