Stefano Sarao Mannelli

I am Stefano Sarao Mannelli, a Senior Research Fellow working with Andrew Saxe at the Gatsby Computational Neuroscience Unit and the Sainsbury Wellcome Centre (SWC), University College London. Before my current position, I was a postdoctoral researcher at the University of Oxford (also with Andrew Saxe), and I obtained a Ph.D. in physics applied to machine learning at the University of Paris-Saclay, supervised by Lenka Zdeborová. My research analyses machine learning problems using a model-based approach: the complexity of a problem is reduced to obtain a parsimonious, solvable model that still captures the phenomenon of interest. In previous work, I have applied variations of this approach to study problems in learning such as transfer learning, continual learning, and curriculum learning; a toy illustration of the approach is sketched below.
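As a minimal sketch of what such a parsimonious solvable model can look like (this example is illustrative and not taken from any specific paper below; all names and parameter values are hypothetical), consider the classic teacher-student setting: a fixed "teacher" network generates labels, a "student" network learns them by online SGD, and in the high-dimensional limit the generalisation error can be tracked analytically through a few order parameters.

```python
# Illustrative teacher-student sketch (hypothetical example, not from the papers below).
# A fixed teacher perceptron labels fresh Gaussian inputs; a student perceptron
# learns online by SGD. In high dimensions the generalisation error reduces to a
# simple function of the teacher-student overlap, which we estimate directly here.
import numpy as np

rng = np.random.default_rng(0)
d = 500                                   # input dimension (illustrative value)
teacher = rng.standard_normal(d)          # fixed teacher weights w*
student = rng.standard_normal(d) * 0.01   # student weights w, small initialisation
lr = 0.5                                  # learning rate (illustrative value)

for step in range(20_000):
    x = rng.standard_normal(d) / np.sqrt(d)  # fresh i.i.d. input: the online setting
    y = teacher @ x                          # noiseless teacher label
    err = student @ x - y                    # student's prediction error
    student -= lr * err * x                  # one SGD step on the squared loss

# For Gaussian inputs, the generalisation error is governed by ||w - w*||^2 / d.
eg = np.sum((student - teacher) ** 2) / d
print(f"generalisation error ~ {eg:.2e}")
```

Because each input is drawn fresh, the dynamics of the error admit a closed-form description as the dimension grows, which is what makes models of this kind analytically tractable while still exhibiting phenomena like learning curves and transfer effects.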

Contact information

Publications

Optimal Protocols for Continual Learning via Statistical Physics and Control Theory
A meta-learning framework for rationalizing cognitive fatigue in neural systems
Bias in Motion: Theoretical Insights into the Dynamics of Bias in SGD Training
Tilting the Odds at the Lottery: the Interplay of Overparameterisation and Curricula in Neural Networks
Why Do Animals Need Shaping? A Theory of Task Composition and Curriculum Learning
The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions
Optimal transfer protocol by incremental layer defrosting
An Analytical Theory of Curriculum Learning in Teacher-Student Networks
Bias-inducing geometries: an exactly solvable data model with fairness implications
Maslow's Hammer for Catastrophic Forgetting: Node Re-Use vs Node Activation
Probing transfer learning with a model of synthetic correlated datasets
Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problems
Epidemic mitigation by statistical inference from contact tracing data
Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions
Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval
Thresholds of descending algorithms in inference problems
Marvels and pitfalls of the Langevin algorithm in noisy high-dimensional inference
Who is Afraid of Big Bad Minima? Analysis of gradient-flow in spiked matrix-tensor models
Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models