Stefano Sarao Mannelli

I am a tenure-track Assistant Professor in the Data Science and AI division of the Computer Science department at Chalmers University of Technology and Gothenburg University. My research interests lie in building a fundamental understanding of learning in ML systems, with a particular focus on bias generation and amplification.

Research Directions

  • Bias generation and amplification in ML systems. What lies behind the biases of an ML system? What is the impact of our design choices?
  • Learning differences in biological and artificial neural networks. Curriculum learning, continual learning, transfer learning: the same concepts, yet completely different results in animals and machines. Why?
  • Optimisation in rough landscapes. Connecting dynamics and landscape properties in optimisation.

Keen to know more? Check out the research page or read my most recent publications.

Latest News

Oct 2024
Exciting news! Congratulations to John Hopfield and Geoffrey Hinton for winning the 2024 Nobel Prize in Physics! As part of this community, I can’t help but feel a sense of pride and joy. In these two articles, we discuss the achievement with colleagues both at our division and at GU.
Sept 2024
Our paper on bias evolution under SGD dynamics (preprint here) has been accepted to NeurIPS 2024! Congrats to everyone, and especially to Anchit Jain for leading the project!
Sept 2024
First day at Chalmers as Assistant Professor!
Aug 2024
Today starts the new edition of Analytical Connectionism at the Flatiron Institute in NYC. We have a great set of speakers, TAs and project mentors! Two exciting weeks ahead!
Jul 2024
Our paper on transfer learning, probing the dependence of the best source representation on data abundance and similarity, has been accepted to TMLR! Congrats everybody!
Jul 2024
I am excited to announce two available positions: one for a PhD student and one for a Postdoc. Click the links to learn more and apply!
Jun 2024
I am delighted that our paper on cognitive fatigue has been accepted to CogSci 2024 as an oral contribution!
May 2024
Last week to apply for the 2nd edition of Analytical Connectionism! The application deadline is Friday the 17th.
May 2024
I'll soon open two positions, one for a Postdoc and one for a PhD student. Stay tuned!
May 2024
ICML decisions are out, and two of my papers have been accepted!

You can find all the news here.

Recent publications

  • Optimal Protocols for Continual Learning via Statistical Physics and Control Theory

    Francesco Mori, Stefano Sarao Mannelli, Francesca Mignacco
    Artificial neural networks often struggle with catastrophic forgetting when learning multiple tasks sequentially, as training on new tasks degrades the performance on previously learned ones. Recent theoretical work has addressed this issue by analysing learning curves in synthetic frameworks under predefined training protocols. However, these protocols relied on heuristics and lacked a solid theoretical foundation assessing their optimality. In this paper, we fill this gap by combining exact equations for training dynamics, derived using statistical physics techniques, with optimal control methods. We apply this approach to teacher-student models for continual learning and multi-task problems, obtaining a theory for task-selection protocols maximising... [Read Article] (A toy numerical sketch of this two-task setup appears after the publication list below.)
  • A meta-learning framework for rationalizing cognitive fatigue in neural systems

    Yujun Li, Rodrigo Carrasco-Davis, Younes Strittmatter, Stefano Sarao Mannelli, Sebastian Musslick
    The ability to exert cognitive control is central to human brain function, facilitating goal-directed task performance. However, humans exhibit limitations in the duration over which they can exert cognitive control, a phenomenon referred to as cognitive fatigue. This study explores a computational rationale for cognitive fatigue in continual learning scenarios: cognitive fatigue serves to limit the extended performance of one task to avoid the forgetting of previously learned tasks. Our study employs a meta-learning framework, wherein cognitive control is optimally allocated to balance immediate task performance against the forgetting of other tasks. We demonstrate that this model replicates common patterns of... [Read Article]
  • Bias in Motion: Theoretical Insights into the Dynamics of Bias in SGD Training

    Anchit Jain, Rozhin Nobahari, Aristide Baratin, Stefano Sarao Mannelli
    Machine learning systems often acquire biases by leveraging undesired features in the data, impacting accuracy variably across different sub-populations. Current understanding of bias formation mostly focuses on the initial and final stages of learning, leaving a gap in knowledge regarding the transient dynamics. To address this gap, this paper explores the evolution of bias in a teacher-student setup modeling different data sub-populations with a Gaussian-mixture model. We provide an analytical description of the stochastic gradient descent dynamics of a linear classifier in this setting, which we prove to be exact in high dimension. Notably, our analysis reveals how different properties... [Read Article] (See the sketch after this list for a toy version of this setup.)
  • Tilting the Odds at the Lottery: the Interplay of Overparameterisation and Curricula in Neural Networks

    Stefano Sarao Mannelli, Yaraslau Ivashinka, Andrew Saxe, Luca Saglietti
    A wide range of empirical and theoretical works have shown that overparameterisation can amplify the performance of neural networks. According to the lottery ticket hypothesis, overparameterised networks have an increased chance of containing a sub-network that is well-initialised to solve the task at hand. A more parsimonious approach, inspired by animal learning, consists in guiding the learner towards solving the task by curating the order of the examples, i.e. providing a curriculum. However, this learning strategy seems to bring little benefit in deep learning applications. In this work, we propose an analytical study that connects curriculum learning and overparameterisation. In... [Read Article]
  • Why Do Animals Need Shaping? A Theory of Task Composition and Curriculum Learning

    Jin Hwa Lee, Stefano Sarao Mannelli, Andrew Saxe
    Diverse studies in systems neuroscience begin with extended periods of training known as 'shaping' procedures. These involve progressively studying component parts of more complex tasks, and can make the difference between learning a task quickly, slowly or not at all. Despite the importance of shaping to the acquisition of complex tasks, there is as yet no theory that can help guide the design of shaping procedures, or more fundamentally, provide insight into its key role in learning. Modern deep reinforcement learning systems might implicitly learn compositional primitives within their multilayer policy networks. Inspired by these models, we propose and analyse... [Read Article]
  • Explore other publications here.
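
Curious about the continual-learning setup studied in "Optimal Protocols for Continual Learning"? Below is a minimal sketch, not the paper's method (the paper derives exact training dynamics via statistical physics and optimises the protocol with control theory), but a crude numerical stand-in: a linear student learns two teacher tasks under two fixed heuristic task-selection protocols, sequential and interleaved. The dimension, learning rate, step counts, and protocol choices are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
d, steps = 200, 4000
lr = 1.0 / d                      # keep lr * d below 2 for SGD stability
teachers = [rng.standard_normal(d) / np.sqrt(d) for _ in range(2)]

def sgd_step(w, task):
    """One online-SGD step on the squared loss for the given task."""
    x = rng.standard_normal(d)
    y = x @ teachers[task]        # noiseless linear-regression target
    return w - lr * (x @ w - y) * x

def test_error(w, task, n=2000):
    """Mean squared error of the student w on fresh data from `task`."""
    X = rng.standard_normal((n, d))
    return np.mean((X @ (w - teachers[task])) ** 2)

def run(protocol):
    """Train from scratch, picking each step's task with `protocol(step)`."""
    w = np.zeros(d)
    for t in range(steps):
        w = sgd_step(w, protocol(t))
    return [test_error(w, k) for k in range(2)]

protocols = {
    "sequential":  lambda t: 0 if t < steps // 2 else 1,  # task 1, then task 2
    "interleaved": lambda t: t % 2,                       # alternate every step
}
for name, proto in protocols.items():
    e1, e2 = run(proto)
    print(f"{name:12s} final errors: task 1 = {e1:.3f}, task 2 = {e2:.3f}")
```

With these settings the sequential protocol typically ends with near-zero error on the second task but a large error on the first (catastrophic forgetting), while interleaving trades a moderate error on both; the paper replaces such heuristics with protocols that are provably optimal in its setting.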
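
Similarly, here is a minimal sketch of the teacher-student Gaussian-mixture setup analysed in "Bias in Motion". Again this is an illustrative toy rather than the paper's code: two sub-populations share a linear teacher, and a linear student trained with online SGD is monitored for per-group accuracy over training. The group names ("majority"/"minority"), mixture parameters, hinge loss, and learning rate are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 500                                         # high-dimensional regime
teacher = rng.standard_normal(d) / np.sqrt(d)   # labelling rule shared by all data
mu = rng.standard_normal(d) / np.sqrt(d)        # mean direction separating groups

# Hypothetical sub-populations: a large low-variance group and a small
# high-variance one, shifted in opposite directions along mu.
groups = {
    "majority": dict(p=0.8, shift=+1.0, sigma=1.0),
    "minority": dict(p=0.2, shift=-1.0, sigma=2.0),
}

def sample(n):
    """Draw n points from the Gaussian mixture, labelled by the teacher."""
    is_maj = rng.random(n) < groups["majority"]["p"]
    x = np.empty((n, d))
    for name, mask in (("majority", is_maj), ("minority", ~is_maj)):
        g = groups[name]
        x[mask] = g["shift"] * mu + g["sigma"] * rng.standard_normal((mask.sum(), d))
    return x, np.sign(x @ teacher), is_maj

w = np.zeros(d)                          # linear student
lr = 0.5
x_test, y_test, g_test = sample(5000)

for step in range(1, 3001):
    x, y, _ = sample(1)                  # online SGD: one fresh sample per step
    if y[0] * (x[0] @ w) < 1:            # hinge-loss gradient step
        w += (lr / d) * y[0] * x[0]
    if step % 500 == 0:                  # track accuracy separately per group
        pred = np.sign(x_test @ w)
        for name, mask in (("majority", g_test), ("minority", ~g_test)):
            acc = (pred[mask] == y_test[mask]).mean()
            print(f"step {step:5d}  {name} accuracy: {acc:.3f}")
```

The two per-group accuracies typically evolve at different speeds during training; that transient gap is the kind of effect the paper characterises exactly, whereas here it only surfaces qualitatively.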