Posts
A Primer on Implicit Regularization
The way we parameterize our model strongly affects the gradients and the optimization trajectory. This biases the optimization process towards certain kinds of solutions, which could explain why our deep models generalize so well.
Over-Parameterization and Optimization III - From Quadratic to Deep Polynomials
The promise of projection-based optimization in the canonical space leads us on a journey to generalize the shallow model to deep architectures. The journey is only partially successful, but there are some nice views along the way.
Over-Parameterization and Optimization II - From Linear to Quadratic
If the canonical representation of the network has a nicer optimization landscape than the deep parameterization, could we use it to get better optimization algorithms for a non-linear neural network?
Over-Parameterization and Optimization I - A Gentle Start
Deep, over-parameterized networks have a crazy loss landscape and it’s hard to say why SGD works so well on them. Looking at a canonical parameter space may help.
Discriminative Active Learning
Labeling data is hard work, which makes active learning something worth knowing about… In this blog series I introduce the basic concepts of active learning, review today's state-of-the-art algorithms, and develop a new active learning algorithm.