hardmaru: Do We Need Zero Training Loss After Achieving Zero Training Error?
By not letting the training loss go to zero, the model will "random walk" at the same non-zero loss and drift into a region with a flat loss landscape, which leads to better generalization.
14 replies, 784 likes
Shawn Presser: Most underrated ML hack of this century:
Loss getting too low?
loss = abs(loss - x) + x
where x is a value like 0.2.
Presto, your loss is no longer <0.2.
Set it to whatever you want. It completely stabilized our biggan runs.
This is "flood loss" https://arxiv.org/abs/2002.08709
13 replies, 494 likes
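The one-liner in the tweet above can be sketched as a plain Python function. This is a minimal illustration of the "flood loss" idea, not the paper's reference implementation; the flood level 0.2 is just the example value from the thread, and the PyTorch form in the comment assumes the usual tensor API.

```python
# Minimal sketch of the "flood loss" trick (Ishida et al., 2020,
# arXiv:2002.08709). The flood level b is the only hyperparameter.

def flood(loss: float, b: float = 0.2) -> float:
    # For loss > b this is the identity (same value, and in an autograd
    # framework the same gradient); for loss < b it reflects the loss
    # above b, flipping the gradient sign so the optimizer does gradient
    # *ascent* back toward the flood level.
    return abs(loss - b) + b

# In a PyTorch training step the same line would read:
#   loss = (criterion(output, target) - b).abs() + b

print(flood(0.8))   # above the flood level: unchanged
print(flood(0.05))  # below the flood level: reflected to 2*b - loss
```

Either way, the effective loss can never fall below b, which is exactly why "your loss is no longer <0.2" in the tweet.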
tsauri: @ericjang11 @karpathy well, only recently we learned we can "freeze" the training loss with something like
loss = (loss - 0.2).abs() + 0.2
0 replies, 26 likes
YipingLu_2prime: And you can see epoch-wise double descent here
0 replies, 4 likes
Found on Apr 20 2020 at https://arxiv.org/pdf/2002.08709.pdf