Proper regularizers for semi-supervised learning

Dejan Slepcev, Carnegie Mellon University
April 13, 2022, 4:10-5:00 PM, https://berkeley.zoom.us/j/186935273

We will discuss a standard problem of semi-supervised learning: given a data set (viewed as a point cloud in a Euclidean space) with a small number of labeled points, the task is to extrapolate the label values to the whole data set. To utilize the geometry of the data set, one creates a graph by connecting nodes that are sufficiently close. Many standard approaches rely on minimizing graph-based functionals that reward agreement with the labels and regularity of the estimator. Choosing a good regularizer leads to questions about the relation between discrete functionals in the random setting and continuum nonlocal and differential functionals. We will discuss how insights about this relation provide ways to properly choose the functionals for semi-supervised learning and to appropriately set the weights of the graph so that information propagates in a desirable way from the labeled points. Theoretical results, numerical illustrations, and performance on standard test data sets will be presented.
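To make the setup concrete, here is a minimal sketch of one classical instance of the graph-based approach described above: Laplacian (Dirichlet-energy) regularization, where one minimizes the sum of weighted squared differences over graph edges subject to agreement with the given labels. This is an illustrative baseline, not the specific regularizers of the talk; the graph construction (an epsilon-neighborhood graph with Gaussian weights) and all parameter values are assumptions for the example.

```python
import numpy as np

def laplacian_ssl(X, labeled_idx, labels, eps=0.5, sigma=0.25):
    """Extend labels to the whole point cloud by minimizing the graph
    Dirichlet energy sum_{ij} w_ij (u_i - u_j)^2 subject to u = labels
    on the labeled points (the harmonic extension).

    X           : (n, d) array, the point cloud
    labeled_idx : indices of labeled points
    labels      : label values at those indices
    eps, sigma  : connectivity radius and Gaussian weight scale
                  (illustrative choices, not prescribed by the talk)
    """
    n = len(X)
    # Pairwise squared distances; connect points within radius eps
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-D2 / (2 * sigma ** 2)) * (D2 <= eps ** 2)
    np.fill_diagonal(W, 0.0)
    # Unnormalized graph Laplacian L = D - W
    L = np.diag(W.sum(axis=1)) - W
    u = np.zeros(n)
    u[labeled_idx] = labels
    free = np.setdiff1d(np.arange(n), labeled_idx)
    # First-order optimality: L[free, free] u_free = -L[free, labeled] u_labeled
    u[free] = np.linalg.solve(L[np.ix_(free, free)],
                              -L[np.ix_(free, labeled_idx)] @ np.asarray(labels))
    return u
```

For example, on two well-separated 1D clusters with one labeled point each, the harmonic extension assigns each cluster its labeled value, illustrating how label information propagates through the graph geometry. The questions in the talk concern how such functionals and weights should be chosen so this propagation remains well behaved as the number of points grows and the labeled fraction shrinks.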