Gabriel Peyré: Training an MLP with a single hidden layer is the evolution of a sparse probability measure (a sum of Diracs) over the neurons' parameter domain (here 2D for regressing a 1D function: slope + position of the ridge). https://arxiv.org/abs/1412.8690 https://t.co/2nePwSIQIP
9 replies, 682 likes
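The picture in the tweet can be sketched numerically: each hidden ReLU neuron is a particle (slope w_j, position b_j) in a 2D domain, and gradient descent on the regression loss moves these particles around. This is a minimal illustrative sketch, not Peyré's code; the target function, learning rate, and the choice to fix the output weights (so only the 2D neuron parameters evolve, matching the "measure over a 2D domain" view) are my assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: a smooth 1D function to regress (an assumed example).
xs = np.linspace(-1.0, 1.0, 200)
ys = np.sin(np.pi * xs)

m = 50                                   # hidden neurons = Dirac masses
w = rng.normal(size=m)                   # slopes of the ridges
b = rng.uniform(-1.0, 1.0, size=m)       # positions of the ridge kinks
c = rng.choice([-1.0, 1.0], size=m) / m  # fixed signed output weights (uniform mass)

def relu(z):
    return np.maximum(z, 0.0)

lr = 10.0
for step in range(8000):
    pre = np.outer(xs, w) - b            # (n, m) pre-activations w_j * x - b_j
    h = relu(pre)                        # hidden features
    pred = h @ c                         # network output
    err = pred - ys                      # residual
    mask = (pre > 0).astype(float)       # ReLU derivative
    # Gradients of 0.5 * mean(err^2) w.r.t. each particle's (w_j, b_j):
    gw = (err[:, None] * mask * xs[:, None] * c).mean(axis=0)
    gb = (-err[:, None] * mask * c).mean(axis=0)
    w -= lr * gw
    b -= lr * gb

mse = np.mean((relu(np.outer(xs, w) - b) @ c - ys) ** 2)
print(f"final MSE: {mse:.4f}")
```

Only (w_j, b_j) are updated, so the training trajectory really is a cloud of m points drifting in the 2D parameter plane, which is the measure-evolution viewpoint of the linked paper.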
Ally Jr: @ailabtz FUN FACT: A fully connected neural network with a single hidden layer (with an iid prior over weights) of infinite width is equivalent to a Gaussian process!
Another fun fact: @doctor_elsa uses Gaussian processes for time-series prediction to detect when interventions occurred.
0 replies, 1 likes
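The infinite-width fact in the reply can be checked numerically: with iid Gaussian weights, the hidden units are iid, so the covariance of the network output at two inputs converges to a closed-form kernel. For ReLU activations that limit is the order-1 arc-cosine kernel (Cho & Saul); the specific inputs below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def arccos_kernel(x, y):
    # Order-1 arc-cosine kernel: E_w[relu(w.x) * relu(w.y)] for w ~ N(0, I).
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    cos = np.clip(x @ y / (nx * ny), -1.0, 1.0)
    theta = np.arccos(cos)
    return nx * ny * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

x = np.array([1.0, 0.5])   # example inputs (a coord could serve as a bias feature)
y = np.array([1.0, -0.3])

# Monte Carlo estimate of the wide-network covariance: hidden units are iid,
# so the output covariance is a sample mean over random neurons.
m = 200_000
W = rng.normal(size=(m, 2))
hx = np.maximum(W @ x, 0.0)
hy = np.maximum(W @ y, 0.0)
mc = (hx * hy).mean()

print(f"Monte Carlo: {mc:.4f}   analytic: {arccos_kernel(x, y):.4f}")
```

As the width m grows, the Monte Carlo estimate concentrates on the analytic kernel value, which is the Gaussian-process limit the reply refers to.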
Found on Feb 08 2020 at https://arxiv.org/pdf/1412.8690.pdf