It's Not a Lottery, It's a Race: Understanding How Gradient Descent Adapts the Network's Capacity to the Task
概要
arXiv:2602.04832v2 Announce Type: replace-cross Abstract: Our theoretical understanding of neural networks is lagging behind their empirical success. One of the important unexplained phenomena is why and how, during the process of training with gradient descent, the theoretical capacity of neural n…