Posts

  • A Primer on Implicit Regularization

    The way we parameterize our model strongly affects the gradients and the optimization trajectory. This biases the optimization process towards certain kinds of solutions, which could explain why our deep models generalize so well.

  • Over-Parameterization and Optimization III - From Quadratic to Deep Polynomials

    The promise of projection-based optimization in the canonical space leads us on a journey to generalize the shallow model to deep architectures. The journey is only partially successful, but there are some nice views along the way.

  • Over-Parameterization and Optimization II - From Linear to Quadratic

    If the canonical representation of the network has a nicer optimization landscape than the deep parameterization, could we use it to get better optimization algorithms for a non-linear neural network?

  • Over-Parameterization and Optimization I - A Gentle Start

    Deep, over-parameterized networks have a crazy loss landscape, and it’s hard to say why SGD works so well on them. Looking at a canonical parameter space may help.

  • Discriminative Active Learning

    Labeling data is hard work, which means active learning is something worth knowing about… In this blog series I introduce the basic concepts of active learning, review today’s state-of-the-art algorithms, and develop a new active learning algorithm.