Machine Learning Parallelism Could Be Adaptive, Composable and Automated

NSH 3305

Abstract: In recent years, the pace of innovation in the field of machine learning has accelerated. To cope with the sheer computational complexity of training large ML models on large datasets, researchers in SysML have created algorithms and systems that parallelize ML training and inference over multiple CPUs or GPUs, or even multiple computing nodes [...]