Second day: follow-up on Mallat’s talk
Let us start by summarizing, very schematically, the end of Stéphane’s talk. The idea is now that our classes no longer feature simple geometric invariances (think of images of beavers). This is where things get tricky but potentially even more interesting. First, taking convolutions along geometric deformations is simplified by switching to a Haar wavelet transform. In the Haar transform, you only need to compute sums and differences between pairs of pixels; in the classical transform, those pairs are adjacent points on the signal grid. Haar scattering generalizes this by allowing these operations to be applied to any pair of points. The question then becomes: how do we choose those pairs? This seems combinatorially hopeless. Stéphane’s argument is that the pairing can be learned by requiring that it reduce space volume while maintaining inter-class margins. Interestingly, though this is beyond the scope of this summary, the problem can be turned into a convex optimization. Comparing the scattering transform to some of the most recent deep nets shows that with only two layers the performance is comparable, at much lower training cost. The most sophisticated deep nets then add extra layers that differ from the first ones, and it is not clear what these try to achieve.
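To make the pairing operation concrete, here is a minimal sketch of one Haar scattering layer in Python. It assumes the pairing is already given (in the talk it would be learned); the function name and the toy signal are illustrative, not from the talk. Each pair of points is replaced by its sum and the absolute value of its difference, so one layer halves the number of nodes while doubling the number of channels.

```python
import numpy as np

def haar_scattering_layer(x, pairs):
    """One Haar scattering layer: for each index pair (i, j), keep the
    sum x[i] + x[j] and the absolute difference |x[i] - x[j]|.
    The pairing `pairs` is assumed given; in Haar scattering it is learned."""
    sums = np.array([x[i] + x[j] for i, j in pairs])
    diffs = np.array([abs(x[i] - x[j]) for i, j in pairs])
    return np.concatenate([sums, diffs])

x = np.array([4.0, 2.0, 7.0, 1.0])
# Classical Haar pairing: adjacent points on the signal grid.
out = haar_scattering_layer(x, [(0, 1), (2, 3)])
# sums = [6, 8], absolute differences = [2, 6]
```

With the grid-adjacent pairing this reduces to the usual Haar transform (up to the modulus); the generalization is simply that `pairs` may connect arbitrary points.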
Large random matrices
Jamal Najim gave a wonderful first course on large random matrices. This was an introduction to celebrated results, like Wigner’s semicircle law, which describes the limiting density of eigenvalues of large random symmetric matrices. Jamal then took us through the proof of the Marchenko-Pastur theorem, which provides the asymptotic density of eigenvalues for large covariance matrices. This was really a tour de force, since the proof is rather complex, but Najim highlighted its architecture (a subtle mix of probability and algebra, glued together by the Stieltjes transform) with great pedagogy.
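Both laws are easy to check numerically, which makes a nice companion to the course. The sketch below (my own illustration, not from the lecture) draws a large symmetric Gaussian matrix, normalized so its eigenvalue density approaches the semicircle law on [-2, 2], and a sample covariance matrix, whose spectrum should concentrate on the Marchenko-Pastur support [(1 - sqrt(c))^2, (1 + sqrt(c))^2] with c the dimension-to-sample ratio.

```python
import numpy as np

rng = np.random.default_rng(0)

# Wigner: symmetric matrix with centered Gaussian entries of variance
# 1/n off the diagonal; the eigenvalue density tends to the semicircle
# law supported on [-2, 2].
n = 1000
a = rng.standard_normal((n, n))
w = (a + a.T) / np.sqrt(2 * n)
eig_w = np.linalg.eigvalsh(w)

# Marchenko-Pastur: covariance of n-dimensional white noise estimated
# from m samples; for c = n/m the limiting spectral density is
# supported on [(1 - sqrt(c))^2, (1 + sqrt(c))^2].
m = 4000
c = n / m  # c = 0.25 here
x = rng.standard_normal((n, m))
s = x @ x.T / m
eig_s = np.linalg.eigvalsh(s)

lo, hi = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
frac_w = np.mean(np.abs(eig_w) <= 2.05)              # near 1 for large n
frac_s = np.mean((eig_s >= lo - 0.05) & (eig_s <= hi + 0.05))
```

The small 0.05 margins account for finite-n edge fluctuations; as n grows, both fractions tend to one.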