On tiny steps and giant leaps

Following up on the previous post. After the initial shock and awe, we started to react. The first thing we wondered was wether we (the parents) had misunderstood the opinions of the various specialists we had met. We had a simple question:

Is the public school the best educational system for our son, to allow him to learn, socialize and, simply put, ensure he develops into a happy teenager, accepted for all he is?

Fortunately, we are not on our own when facing such questions. We therefore turned to our son’s network – his “I can network“.

The network is the most important tool for kids on the spectrum and for their parents. It is a team of specialists, educators, therapists, coaches – whoever can help your child on any specific aspect of his ASD. The network is typically assembled by the parents and it takes a lot of energy to set it up, keep it alive, but it is a fundamental component of early life on the spectrum. Our son’s network kept growing over the years and now includes his beloved speech therapist, a psychologist from the newly founded Center for Autism, two ASD specialists who focus on learning and behaviour, our pediatrician, the in-class aid from the county school system and soon an ergotherapist.

We asked the question above to each member of the network and we received the same simple answer: yes, your son has all the necessary intellectual skills to learn and benefit from the school environment. But the school needs to adapt to him, much more than the other way round. Just hearing this provided the necessary energy boost and relief we needed to keep going. We planned a meeting with the school authorities, teachers and the whole network. A member of the network was mandated to visit the school, attend a few hours in class, observe and give feedback. The report was clear and interesting: the school was still largely clinging to the official programme and methods and expecting our son to … to change I guess (I almost wrote “to heal”, from their perspective). It was also clear that the staff was simply ignorant and unsure if they should ask any sort of help. The teacher, who has 20 years experience and is a great teacher I hear, was just baffled by the challenge. She thought she just needed to excel in the traditional methods and dealing with our son was just way out of her comfort zone. More on that later – it is an interesting observation.

The day of the big meeting came. Thanks to discussing with our network, we were prepared. We accepted that the network met first with the school authorities before seeing us: it is easier for the school to hear criticisms if the parents are not in the room. When called in, I first talked at length, giving a positive spin to the conversation. That we were convince the school was the right place and that, given sufficient inputs and with a bit of flexibility, the teacher would just be great. That this was no ordinary education, but also that the outcome would not be ordinary: we were in to literally offer a chance at a better life to a 6 years old child. That the other kids would benefit from the whole experience by learning to appreciate and accept difference. That we did not care if our son didn’t follow the whole program. That we understood the challenges of the school and would do all the possible lobbying work so that authorities provide true means for their mission. I then took a deep breath and talked to the teacher. I told her I could feel she was lost, that she had the impression of being powerless and useless. “My last thought every evening before I sleep”, I told, ” is the same. I think that I must be the most miserable, useless, powerless father for my child. Imagine your frustration and amplify it by the feelings of a father for his son. But “, I continued, ” every morning I wake up wondering what I can do for him, and that’s getting me through the day because I really do help him”. You see, one thing I learned as the father of an ASD kid is a golden rule for survival: you have to let go off that feeling of guilt for it is useless. You’re not going to help your child this way. You have to keep moving, again and again and appreciate all the tiny steps. Because they sum up to giant leaps for our sons.

Here is what I am asking of you”, I told her, “help him take these daily tiny steps. And in the end he will take you on the next giant leap.

The outcome was quite positive. The director of the school agreed to propose a specific programme for our son, with the help of the network. The teacher agreed to learn and adapt as much as possible, which is what we were looking for.

Not everything is rosy. To help the transition, our son will only be part time in school, roughly 14 periods (say 50% of the time). We take him out for various therapies and he stays home the rest of the time. This means one of us (my wife or I) have to be home pretty much 100%. My wife has completely sacrificed her career and will continue to do so  -even though this is a point of disagreement between us. She was in the last stage of interviews with the research center of a big internet company whose name starts with G and offers the best jobs in Europe. The way she handles these sacrifices is just unreal.

But hey, tiny steps is all we’re asking for !


The Immobile Migrants

[File this post under awareness – I am not asking anything else]

If you’re a parent, you’ll agree with me that it is hard to imagine worse than a lack of present and future for your kids. No little friends coming at home. No birthday parties invitations. No school play. No soccer matches on Sundays. No graduation ceremony. No first girlfriend. No first job and the joy (sweet agony!) of seeing your loved one grow into an independent adult, ready to play their part.

Yesterday, we had a meeting at our son’s school during which his future was taken away from him and from us. My 6 years old son is on the autism spectrum. My beautiful, funny, mischievous son is about to enter limbos unless his mother and I fight violently enough to give him a chance at a decent life DESPITE public authorities. Here is what happened. Our son spent two years in a private school, inside a regular class. We funded a 100% in class assistance to help him focus and interact with the teacher and other kids. This was part of a whole network of therapists around him. He made a lot of progress and followed the cursus, learning to count, do basic operations, read and write (though his writing is not good because he does not have a good grip – working on it). After these two years, kids enter a new cycle and we had to decide wether or not to stay in the private system or move to the public one. Costs aside, the public system seemed a good choice because they have a mission of being inclusive with neuro-atypical kids and they have a privileged access to the public health system. On paper it looked excellent and our son would join his older sister at her school. We met with the school’s dean, state school inspector and part of our self-established network. We, and the people taking care of him, described our kid and his challenges. The school told us they had some experience and even had a teacher who volunteered. It was almost too good to be true.

Here is a story that keeps repeating itself over and over: the public authorities set up a policy, politicians will communicate how wonderful it is. But in practice no means will be invested. In practice, the school has no experience, no dedicated staff, no program. They were thinking that they would just welcome our son and he would simply have to learn to behave like a “normal” kid. If you have a child on the spectrum you see where this is going, don’t you ? We did not see it coming. We had prepared him to school over the summer, visited the place. We set up a schedule. He happily started the year and we thought everything was alright. Unfortunately it wasn’t. It started with his older sister (9 years old) bursting into tears at home, explaining us that he is always left on his own during breaks. Some teachers forbade her to play with her brother because he “had to learn to play with others”. Then he had a full meltdown at school. If you have a kid on the spectrum, you know what I am talking about and I won’t describe it here. We were called to the school and discussed with the teacher and the school dean. What unfolded was pure disaster.

Except a few hours per week (!!!) where an assistant (with no qualification whatsoever) helped him, nothing was done to help our son integrate smoothly. No schedule, no program, no hint at a method. He was just supposed to adapt all by himself. Of course it did not work, but the school did not tell us for a full 3 weeks. During breaks he was on his own with the noise from 200 kids around. The teacher thought it is not a good idea if he uses his headphones. The school felt it was fine because “he seemed happy just walking around a bench”. In fact he was just trying to manage his anxiety until it was too much and he melted down.

In class, with no specific support he regressed. They quickly stopped trying to work with him and let him spend most of his time drawing, instead of the exercices he was perfectly capable of 6 months ago. They described him as a disturbance for the other kids because he was noisy.  Of course, he was left on his own with no instruction and support: typically a kid with ASD freaks out and does typical routines to re-assure himself in this situation. He had to completely lose it for the school to finally tell us something was wrong. At home he was getting back to his known universe and was ok, until the meltdown – it took a few days to reassure him.

When we met with the teacher and dean and discovered what had happened for 3 weeks, we were in shock. But the worse was to hear our boy being described as a disturbance. They were not asking for help or input. They were telling us to take him back and place him in a specialised institution. Gone were the good will and smiles of the politicians, the grand talks of the school being open.

We have since called and met various specialists. Did we do something wrong? Did we misunderstand that the best place for our son was a regular school ? Did we misunderstand that the school ought to put a program in place for him ? Why didn’t they do an extra effort at the beginning and then gradually decrease support ? Were the evaluations of his cognitive abilities completely wrong ?

I can only compare the life of autistic kids and their family to those migrants trying to reach a better future in Europe these days*. Every day and in every aspect of our son’s life, we face an ocean to cross. We hurl ourselves into these raging waters, with little or no support. We fail. We sink. We cling to whatever we can. And start again. And yet, if we manage to cross we are only met with contempt and exclusion. We are the immobile migrants. Ours is a fight through time and against time as we flee ignorance and segregation.


*Don’t get me wrong: I am not in any way comparing our situation to that of migrants in Europe. These people live through a considerably more dramatic situation. I can not even begin to imagine the courage it takes to take your family through war, deserts and the raging seas in hope of a better future. They are the true heroes of our time, and one day shall be recognised as such.

Peyresq 2014: Last day



The 2014 edition of the Peyresq summer school just finished. Before the coda, let us summarize the very last talk of Jamal Najim on his series on large random matrices. Jamal’s talk was focused on applications of results he described us in the previous day. I will focus on the application to direction of arrival estimation (DOA), although Jamal also discussed MIMO communications.
In DOA the goal is to estimate the directions to r sources from n measurements that mix the sources linearly with phase factors and additive noise:
\vec{y} = \sum_{\ell=1}^{r} \vec{a}(\varphi_\ell)s_\ell + \sigma \vec{w}
The classical MUSIC algorithm uses the fact that one can obtain a characterization of the signal subspace from the measurements~:
\vec{a}^* (\varphi) \bigl( I_N - \Pi_N \bigr) \vec{a} (\varphi) = 0   \Leftrightarrow \varphi \in \{\varphi_1, ..., \varphi_r\}
where \Pi_N is the ortho projection on the span of the \vec{a}(\varphi_\ell. When the sources are drawn at random, this argument is replaced by an estimation using the ortho projection on the eigenvectors of the measurements covariant matrix since~:
\frac{1}{n} \mathbb{E}\{Y_N Y_N^*\} =   A_N(\vec{\varphi})  \mathbb{E}\{ \frac{S_N S_N^*}{n}\}   A^*_N(\vec{\varphi})  + \sigma^2 I_N  .
This naturally leads to replacing \Pi_N with the projection on the first r eigenvectors of the sample covariance matrix. Except that this is not a super good idea: modeling the sources by complex gaussian variables, one obtains a multiplicative spiked model with a rank r perturbation for the observations and we have learned with Jamal that the sample covariance matrix is not a good estimator of the covariance in this case. Fortunately the theory of large random matrices tells us how we can construct a consistent estimator which turns out to involve if a simple de-biasing correction. Jamal then showed with simulations how and when this method improves on the original MUSIC.

This was the conclusion of an absolutely marvelous week at Peyresq. I could not seriously express with words how much I enjoyed the week: I learned a lot of new things, discussed with many, got constructive feedback on my own work and this in turn generated many ideas I now want to work on. Every workshop should be like this instead of gigantic waste of time, energy and money (a whole week at Peyresq will set you back much less than the registration at those big conferences).

PeyresqThere is always a bit of saudade when such an excellent week ends. But here is an enormous thank you to :
. Patrick and Cédric who organized the event and made sure it was nothing short of perfect
. The Peyresq local team who makes it such a wonderful place to stay at
. All participants, speakers and students: you guys stay cool and keep up the good work!

Fifth day at Peyresq, continued



This post is courtesy of Pierre Borgnat, a huge thanks !


And what about Pierre Vandergheynst’s lectures ? Spread in three sessions, Pierre discussed about the recent advances in Harmonic Analysis on Graphs an Networks. The plan of the lectures was already given on the blog and I will cover a few salient points. In short, this was a marvellous series of lectures, especially given the necessary flexibility that Pierre displayed, when continuing his lecture on the black board due an electricity outage — then improvise a follow-up with both blackboard and a slides due to popular insistence!

First session on the second day, Pierre covered all basics of Spectral Graph Theory and all participants had a refresh about Laplacian matrices, spectrum of a graph, spectral cuts, spectral clustering and graph embedding using Laplacian eigenmaps. A nice example was shown for matching points clouds of a warped surface, for instance the shape of a map in point clouds — unfortunately, this would not work well if someone with his hand on the top of his headTake that silhoutte matching algorithm
Then, attacking the main core of the lectures, Pierre defined operators to study signals indexed on graphs: gradients, regularity, total variation. A central point is the intriguing and important notion coherence of a graph (the largest absolute value of the components of the eigenvectors of the Laplacian) — more on that later on.
Using functional calculus on graphs, it’s possible to define operators through the spectral domain so as, for instance, solve diffusion equation on a graph. The good thing is that it is now possible to give meaning to what is a filtering of graph signals (with examples of denoising or recommendation system). End of the first session.

Third day, second lecture: no electricity… And so what ? Pierre improvised on the blackboard. In the lectures, he showed that it is possible to define first operators to localise a kernel on a graphs and then how it can be interpreted as a translation so as to propose a generalisation of the convolution operator. Still, the “agonizing limit of intuition” limits some of the properties: the translation operator is not unitary, when its coherence departs from 1/\sqrt{N}. That might problematic in specific situations, or will that be an asset to study the local topology of the graphs? (we would be salved when studying graphs on manifolds or accepting a group of translation — these are developments not covered here). With all that, it was direct to define the spectral graph wavelets that provide a continuous and invertible transform using function localized both in frequency and vertices. The good points is that it’s possible to obtain a frame, even a tight frame; that they have nice localisation properties; and finally that implementation can be efficient using Chebyshev polynomials (“love Chebyshev !”). Better, a python toolbox is soon enough available… (the SGWT toolbox in matlab already exists).

Last day at Peyresq: two more hours par Pierre in which we learned how to put coarsen both signals and graphs by putting wavelets in a pyramidal algorithm. Fist ingredient: downsample the graphs using the nodal domains of the (or one of the) eigenvectors with largest eigenvalues of the Laplacian. Second ingredient: filter the signals (easy, remember that we learnt about graph filters and wavelets). Third ingredient: coarsen the graph y keeping only half of the nodal domains, and transform the graph using the Kron reduction (that should activate your memories of basic electricity). A major property is that the Kron reduction algebraically written as the Schur complement of a matrix) preserves properties of the Laplacian and it is useful to compute spline-like interpolation on a signal. This allows us to see the difference between the filtered signal retained on the coarsened graph and the original signal on the original graph (having all the vertices). With all that, you have a sound Laplacian pyramid to filter and coarsen graph signals. To make all that more efficient, the sparsification of the coarsened graphs is useful, using the work of Spielman and Srivastava (second side message of the lecture: once you now all about Chebyshev polynomials, read this paper in SIAM J. Comp. of 2011). The conclusion is that many applications about filtering on graphs, learning on graphs, visualisation of graphs, and so on, just wait in our reach. At the end of the course, it was a very motivating series of lectures.

What was skipped in the lectures? Materials were given in the slides about manifold learning with graph spectral theory or vertex-frequency analysis (in the spirit of the time-frequency analysis or Gagor transform — yet on graphs). I advise all of you to read the nice paper (on ArXiv) by David Shuman, Benjamin Ricaud and P. V. on the latter subject, and to discuss with Pierre about the former one.

Fifth day at Peyresq




This post is courtesy of Johan and Nathanaël – a big thanks !

Today Olivier Levêque made a very nice presentation about results of
large random matrix theory applied to Multi Input Multi Output (MIMO)
antenna systems and large random networks.

He started with a short introduction to information theory, especially
to capacity of a communication channel, on the blackboard, with style.
Olivier then showed a very powerful result in the field of MIMO
communications, which states that when the number $\latex n$ of antennas
grows we get a capacity or order $\latex n$, or

C_n = n \int_0^4   \log ( 1 + Px) q(x) dx .

This result is powerful since it means that theoretically the capacity
grows linarly with the number of antennas (which is a significant
improvement of the previously believed limiting order

C = log(1 +   Pn^2)

for multi antennas systems).

Olivier also enphasized that we can achieve this capacity without
increasing the power P, but simply by distributing it in a clever way,
which can be done using water-filling techniques. He concluded the first
part of the lecture by reminding that, although this result is
demonstrated for n goes to infinity, it actually works already for small
values of n, which explains the joke common amongst larse random matrix
people which states : 8 = \infty .

The second main part of Oliver’s talk dealed with large self-organized
wireless communication networks. The main question proposed to answer
was: assuming that n users want to communicate in a given area with an
O(1) density, what is the rate R_n they can hope to achieve ?
And in relation to this statement, how does the rate scale with n ?

In a setup where we wish to establish n communication channels, we use a
model where the signals that are received depend on the gain between two
nodes divided by the distance, with an additive white gaussian noise.
Note that the gain is i.i.d. and follows a complex gaussian
distribution. Finally the power of emission per user is limited and has
a given value P. For this model we would like to achieve R_n = O(1)

Following this model, a very naive scheme is to dispatch the n different
pairwise communication instances into n different time slots (or
equivalently n different frequency bands). In this case we achieve
R_n = O(\frac{1}{n}) which is really bad for n large.

The second approach uses multi-hop communication. And in this case,
Olivier explained that due to an overload of the users in the center of
the network the limiting rate is R_n = O(\frac{1}{\sqrt{n}} ,
which is better but still not good.

Finally, Oliver presented a scheme coined Distributed MIMO
communications which proceeds in an iterative way. At each stage, a user
broadcasts its message to his neighborhood, which will then send the
message to the neighborhood of the destination user, which receives all
the messages from its own neighbors. We can then recursively use this
scheme for the communication within the neighborhood. In this case,
Oliver showed that the limiting rate for large n with k recursions is
R_n = O(n^{\frac{-1}{k+1}}) which tends to O(1) when
k goes to infinity.

Oliver concluded his lecture by explaining a controversy about this
model, which states that considering gains to be i.i.d. is in fact
unrealistic. Some authors indeed claims that for this setup we can only
hope to have O(2n) i.i.d. random variables (instead of the
O(n^2) that are needed to prove the limits presented above. And
this would bring down the rate back to O(\frac{1}{\sqrt{n}}).
The final word provided a way to avoid this problem simply by stating
that practical networks tend to have a lower density, and that we can
achieve interesting rates given that the area is large enough.

Fourth day at Peyresq


, , ,

This morning we first had a talk by Pierre Borgnat on the use of data analytics for bike renting systems, which gives you access to various rhythms of a city.

We then all went for a super nice trekking around Peyresq, in pristine weather conditions: talk about perfection …


In the afternoon Pierre Borgnat gave a more technical two hours lecture on multiscale community detection and dynamic graphs. Pierre first convinced us that spectral clustering is not necessarily very good for community detection because there is no measure of partition quality and no reference to a null hypothesis (no community). This is why modularity was introduced. Modularity measures how much your graph departs from a null hypothesis where your graph is random with the same degrees. Optimizing modularity is difficult (NP complete?) but there exists a greedy algorithm (the Louvain method, what else?) that solves it, approximately, in polynomial time. Pierre then set on a challenging agenda. He first discussed cases where one really needs a multiscale definition of communities. In fact this is quite often the case since they tend to agglomerate at various scales and allow for rich interpretation. Borgnat used a wavelet transform transform on graphs to give an original and intuitive definition of what multiscale communities are. One of the niceties is that the method comes with a fast algorithm that scales up to fairly large graphs. Another important point stressed by Pierre is the definition of a stability criterion that allows to judge partitions. Pierre showed various applications and a second talk by Rasha Boulos even used the technique in genomics in a very interesting way. In the second part of his talk, Pierre discussed dynamical graph models. This is a very emerging domain where a lot remains to be done. Pierre wet our appetite with interesting ideas that allow to detect and track patterns in dynamical applications.




Among the great things at the Peyresq summer school is the wonderful atmosphere. Of course, staying in a stunningly beautiful place helps. Here is the french community packed in front the village’s only TV set, agonisingly waiting for France to score against Ecuador.

Watching the game at Peyresq

Yesterday morning was also fun thanks to a power outage that left us without beamer. I improvised my talk on the blackboard, with a little help from Patrick and Olivier.
Holding the screen

Third day at Peyresq



I am getting a little behind schedule, so let us start by summarizing the last talk given yesterday by Jean-François Cardoso.


Big Bang, Big Data

Jean-François gave a wonderful introduction to the Planck mission. For those who don’t know Planck, it is in my opinion one of the most amazing feat of human intelligence in this century. Planck is an ESA satellite measuring with unprecendented precision the Cosmic Microwave Background (CMB). The CMB is in some sense the very first picture of the universe: right after the big bang, the universe was a very hot plasma of charged particles in which light could not travel freely – any photon was constantly scattered by interactions with these charged particles. But roughly after 380000 years, because of expansion, the universe cooled down to a critical temperature at which electrons and nuclei can bind into neutral atoms. At that point, photons were allowed to travel uninjured and we are still receiving them today in the form of a 3K microwave radition that was first detected by Penzas and Wilson in the 60s (although it had been predicted by Gamow I believe). This CMB is an amazingly important object because it carries a lot of information about the universe in its early stage, but also about what happened afterwards. Indeed, Jean-François explained that tiny temperature fluctuations (of only 10^-5 relative amplitude) in the CMB are related to little density inhomogeneities in the plasma that are responsible, by gravitational effects, for the formation of galaxies and the structure of the universe as we know it. Simply put, this is our cosmic DNA. There are many challenges in processing the CMB as explained by Cardoso. Jean-François focused on his work that aimed at using the various frequency bands of the observations to separate the CMB from various pesky foreground contamination. Jean-François used a source separation technique that works so well it is now the de facto technique used to create CMB maps. At the heart of the technique is the use of Fourier techniques to exploit the knowledge of the CMB power spectral density and separate it from contamination.

Second day at Peyresq


, ,

Second day: follow-up on Mallat’s talk

Let us start by summarizing, very schematically, the end of Stéphane’s talk. The idea is now that our classes do not feature simple geometric invariances (think of images of beavers). This is where things get tricky but potentially even more interesting. First, taking convolutions along geometric deformations will be simplified by switching to a Haar wavelet transform. In the Haar transform, you only need to compute averages and differences between pairs of pixels – they are taken aligned on the signal grid in the classical transform. For Haar scattering, this is generalized by allowing to apply these operations to any pair of points. The question is now how do we choose those pairs ? This seems combinatorially hopeless … Stéphane’s argument is that this pairing can be learned by imposing that we reduce space volume while maintaining inter class margins. Interestingly, but this is out of the scope of this summary, this can be turned into a convex optimization problem. Comparing the scattering transform to some of the most recent deep nets shows that with only two layers performances are comparable but with much less training complexity. The most sophisticated deep nets then add extra layers that are not exactly similar to the first ones and it is not clear what these try to achieve.

Large random matrices



Jamal Najim gave a wonderful first course on large random matrices. This was an introduction to celebrated results, like Wigner’s semi-circle law that controls the density of eigenvalues of large random symmetric matrices. Jamal then took us through the proof of the Marcenko-Pastur which provides the asymptotic density of eigenvalues for large covariance matrices. This was really a tour de force since the proof is rather complex, but Najim highlighted its architecture (a subtle mix of probability and algebra, glued with the Stieltjes transform) with great pedagogy.