StefSM Lab

We are a research group based in the Data Science and AI Division of the Department of Computer Science and Engineering at Chalmers University of Technology and University of Gothenburg.

Our mission is to develop a fundamental understanding of learning in artificial and biological systems, with a focus on the origins of bias, generalisation, and optimisation dynamics in machine learning. We combine tools from statistical physics, neuroscience, and deep learning theory.

  • Bias generation and amplification in ML systems. What lies behind the biases of an ML system? What is the impact of our design choices?
  • Learning differences in biological and artificial neural networks. Curriculum learning, continual learning, transfer learning — same concepts but very different results in animals and machines. Why?
  • Optimisation in rough landscapes. Connecting dynamics and landscape properties in optimisation.

Keen to know more? Check out the research page or browse our recent publications.

The group is led by Stefano Sarao Mannelli, Assistant Professor at Chalmers and Gothenburg University, and Visiting Lecturer at the University of the Witwatersrand. Meet the rest of the lab members here.

Latest News

May 2025
Next week, on May 26th-27th, we will host the Workshop in Advancements in High-Dimensional Methods for Machine Learning. We are looking forward to the event!
Apr 2025
Save the date! Stefano and Flavio are organising two events on statistical physics of learning here in Gothenburg! This is unfortunately a late announcement for the first event (it took place on April 1st), but there is still time to join us for our Workshop in Advancements in High-Dimensional Methods for Machine Learning on May 26-27.
Apr 2025
We are very happy to meet the local cognitive science community at the G-Cog Seminar Series. Stefano will talk about our recent work on curriculum learning, accepted at the CogSci conference, and what modern ML theory can tell us about it.
Mar 2025
Stefano will talk about our work on learning with prior experience, and in particular on optimal control in continual learning, at the workshop AI for active matter - From plankton to robots.
Mar 2025
Great news! Stefano has been appointed Visiting Lecturer at Wits University. The collaboration with Wits goes back several years, so it is nice to see it formally recognised. Stefano and the CAandL Lab are organising a master's-level course on ML theory for 2026!
Mar 2025
It is a great pleasure to meet the statistical physics community of the Nordic countries at the Nordic Workshop on Statistical Physics. Stefano will talk about how StatPhys tools can shed light on ML problems, and in particular about how we can understand bias generation and amplification in ML.
Feb 2025
Applications are open for the 3rd edition of Analytical Connectionism! This time we are back in London, from Aug 25th to Sep 5th. The application deadline is April 18th.
Dec 2024
Stefano will present our NeurIPS paper on bias dynamics at the FAIMI Workshop this week!

You can find all the news here.

Recent publications

  • Curriculum learning in humans and neural networks

    Younes Strittmatter*, Stefano Sarao Mannelli*, Miguel Ruiz-Garcia, Sebastian Musslick, Markus Wolfgang Hermann Spitzer
    The sequencing of training trials can significantly influence learning outcomes in humans and neural networks. However, studies comparing the effects of training curricula between the two have typically focused on the acquisition of multiple tasks. Here, we investigate curriculum learning in a single perceptual decision-making task, examining whether the behavior of a parsimonious network trained on different curricula would be replicated in human participants. Our results show that progressively increasing task difficulty during training facilitates learning compared to training at a fixed level of difficulty or at random. Furthermore, a sequence designed to hamper learning in a parsimonious neural network... [Read Article]
  • A Theory of Initialisation's Impact on Specialisation

    Devon Jarvis, Sebastian Lee, Clémentine Carla Juliette Dominé, Andrew M Saxe, Stefano Sarao Mannelli
    Prior work has demonstrated a consistent tendency in neural networks engaged in continual learning tasks, wherein intermediate task similarity results in the highest levels of catastrophic interference. This phenomenon is attributed to the network's tendency to reuse learned features across tasks. However, this explanation heavily relies on the premise that neuron specialisation occurs, i.e. the emergence of localised representations. Our investigation challenges the validity of this assumption. Using theoretical frameworks for the analysis of neural networks, we show a strong dependence of specialisation on the initial condition. More precisely, we show that weight imbalance and high weight entropy can favour... [Read Article]
  • Optimal Protocols for Continual Learning via Statistical Physics and Control Theory

    Francesco Mori, Stefano Sarao Mannelli, Francesca Mignacco
    Artificial neural networks often struggle with catastrophic forgetting when learning multiple tasks sequentially, as training on new tasks degrades the performance on previously learned ones. Recent theoretical work has addressed this issue by analysing learning curves in synthetic frameworks under predefined training protocols. However, these protocols relied on heuristics and lacked a solid theoretical foundation assessing their optimality. In this paper, we fill this gap by combining exact equations for training dynamics, derived using statistical physics techniques, with optimal control methods. We apply this approach to teacher-student models for continual learning and multi-task problems, obtaining a theory for task-selection protocols maximising... [Read Article]
  • A meta-learning framework for rationalizing cognitive fatigue in neural systems

    Yujun Li, Rodrigo Carrasco-Davis, Younes Strittmatter, Stefano Sarao Mannelli, Sebastian Musslick
    The ability to exert cognitive control is central to human brain function, facilitating goal-directed task performance. However, humans exhibit limitations in the duration over which they can exert cognitive control, a phenomenon referred to as cognitive fatigue. This study explores a computational rationale for cognitive fatigue in continual learning scenarios: cognitive fatigue serves to limit the extended performance of one task to avoid the forgetting of previously learned tasks. Our study employs a meta-learning framework, wherein cognitive control is optimally allocated to balance immediate task performance with forgetting of other tasks. We demonstrate that this model replicates common patterns of... [Read Article]
  • Bias in Motion: Theoretical Insights into the Dynamics of Bias in SGD Training

    Anchit Jain, Rozhin Nobahari, Aristide Baratin, Stefano Sarao Mannelli
    Machine learning systems often acquire biases by leveraging undesired features in the data, impacting accuracy variably across different sub-populations. Current understanding of bias formation mostly focuses on the initial and final stages of learning, leaving a gap in knowledge regarding the transient dynamics. To address this gap, this paper explores the evolution of bias in a teacher-student setup modeling different data sub-populations with a Gaussian-mixture model. We provide an analytical description of the stochastic gradient descent dynamics of a linear classifier in this setting, which we prove to be exact in high dimension. Notably, our analysis reveals how different properties... [Read Article]
  • Explore other publications here.