Fast Distribution To Real Regression
Abstract
We study the problem of distribution-to-real regression, where one aims to regress a mapping f that takes a distribution as its input covariate, P ∈ \mathcal{I} (for a non-parametric family of distributions \mathcal{I}), and outputs a real-valued response Y = f(P) + ε. This setting was recently studied in Póczos et al. (2013), where the “Kernel-Kernel” estimator was introduced and shown to have a polynomial rate of convergence. However, the cost of evaluating a new prediction with the Kernel-Kernel estimator scales as Ω(N), where N is the number of training instances. This creates a dilemma: a large amount of data may be necessary for a low estimation risk, but the computational cost of estimation becomes infeasible when the dataset is too large. To this end, we propose the Double-Basis estimator, which aims to alleviate this big data problem in two ways: first, the Double-Basis estimator is shown to have a computational complexity that is independent of the number of instances N when evaluating new predictions after training; second, the Double-Basis estimator is shown to have a fast rate of convergence for a general class of mappings f ∈ \mathcal{F}.
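To make the complexity claim concrete, the sketch below shows one way a predictor can have a per-query cost that does not depend on N: each input distribution is summarized by a fixed-length vector of basis coefficients, that vector is mapped through a fixed number of random features, and a linear model is fit on those features. Everything here is an illustrative assumption rather than the construction from the paper: the cosine basis on [0, 1], the random Fourier features, the ridge penalty, the Beta-distributed toy data, and all names (`cosine_basis_coeffs`, `random_features`, `psi`) are hypothetical choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine_basis_coeffs(sample, n_basis):
    """Estimate projection coefficients of a density on [0, 1] onto a cosine
    basis from a sample, using the empirical mean of each basis function."""
    k = np.arange(n_basis)
    phi = np.sqrt(2.0) * np.cos(np.pi * k[None, :] * sample[:, None])
    phi[:, 0] = 1.0  # the constant basis function
    return phi.mean(axis=0)

def random_features(coeffs, W, b):
    """Map coefficient vectors to D random Fourier features."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(coeffs @ W + b)

# Toy data (assumed): N distributions Beta(a, 2), observed through samples,
# with the response Y equal to the distribution mean plus noise.
N, n_per_dist, n_basis, D, lam = 200, 300, 10, 100, 1e-3
a_params = rng.uniform(1.0, 5.0, size=N)
samples = [rng.beta(a, 2.0, size=n_per_dist) for a in a_params]
y = a_params / (a_params + 2.0) + 0.01 * rng.normal(size=N)

# Training: the cost grows with N, but the result is a single vector of length D.
A = np.stack([cosine_basis_coeffs(s, n_basis) for s in samples])  # (N, n_basis)
W = rng.normal(scale=2.0, size=(n_basis, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)
Z = random_features(A, W, b)                                      # (N, D)
psi = np.linalg.solve(Z.T @ Z + lam * np.eye(D), Z.T @ y)         # (D,)

# Prediction for a new distribution touches only its own sample, W, b, and psi.
new_sample = rng.beta(3.0, 2.0, size=n_per_dist)
z_new = random_features(cosine_basis_coeffs(new_sample, n_basis), W, b)
y_hat = float(z_new @ psi)
print(y_hat)
```

In this sketch a prediction costs on the order of n_per_dist × n_basis + n_basis × D operations regardless of how many training distributions were used, whereas a kernel smoother over the training set would have to visit all N training instances for every query.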
BibTeX
@conference{Oliva-2014-119784,
author = {J. Oliva and W. Neiswanger and B. Poczos and J. Schneider and E. Xing},
title = {Fast Distribution To Real Regression},
booktitle = {Proceedings of 17th International Conference on Artificial Intelligence and Statistics (AISTATS '14)},
year = {2014},
month = {April},
volume = {33},
pages = {706--714},
}