[StefSM Lab group photo]

We are a research group based in the Data Science and AI Division of the Department of Computer Science and Engineering at Chalmers University of Technology and the University of Gothenburg.

Our mission is to develop a fundamental understanding of learning in artificial and biological systems, with a focus on the origins of bias, generalisation, and optimisation dynamics in machine learning. We combine tools from statistical physics, neuroscience, and deep learning theory.

  • Bias generation and amplification in ML systems. Where do the biases of an ML system come from, and how do our design choices affect them?
  • Learning differences in biological and artificial neural networks. Curriculum learning, continual learning, transfer learning — same concepts but very different results in animals and machines. Why?
  • Optimisation in rough landscapes. Connecting the dynamics of optimisation to the geometric properties of the loss landscape.

Keen to know more? Check out the research page or browse our recent publications.

The group is led by Stefano Sarao Mannelli, Assistant Professor at Chalmers and the University of Gothenburg and Visiting Lecturer at the University of the Witwatersrand. It comprises Flavio Nicoletti (Postdoc), Chenxiao Ma (PhD student), Loek van Rossem (PhD student), and Jie Huang (Master's student). Learn more about our group here.

Latest News

May 2026
New preprint from the group! Flavio, Chenxiao, Enrico, Luca, and Stefano released The Interplay of Data Structure and Imbalance in the Learning Dynamics of Diffusion Models, studying how class structure and imbalance shape generalisation and memorisation in diffusion models.
May 2026
ICML decisions are out, and we are excited to have two accepted papers! Jie, Bruno, and Stefano's work on local minima in high-dimensional two-layer ReLU networks was accepted, and Devon, Richard, Benjamin, Steven, and Stefano's position paper on model collapse and low-resource communities was accepted as a Spotlight.
May 2026
Great news! Stefano has been awarded a Vetenskapsfond grant from Wilhelm och Martina Lundgrens stiftelser, supporting the group's research on the theoretical foundations of machine learning.
Mar 2026
Stefano presented the group's research goals and recent results at the MIND Institute at the University of the Witwatersrand.
Mar 2026
Chenxiao and Flavio presented our work on diffusion models and data imbalance at the RAIL Lab and CAANDL, respectively.
Feb 2026
Come visit us this June! Flavio and Stefano are organising the workshop Mathematical Foundations of AI at Chalmers on June 8-10. Many high-profile speakers are joining us to discuss recent theoretical advances in Diffusion Models, Transformers, and Associative Memories.
Feb 2026
Flavio is presenting our ongoing work On class unbalance in diffusion models as a spotlight talk at the 15th Nordic Workshop on Statistical Physics at NORDITA!
Jan 2026
Great news! New positions opening soon. We have been awarded the AI Alignment Project, securing nearly £1M in funding. This will support the hiring of two Postdoctoral Researchers and one Research Assistant to work on developing the theoretical foundations of AI safety.
Nov 2025
Stefano will present our work on bias dynamics at the MLLS seminars in Copenhagen this week!
Sep 2025
EurIPS Workshop decisions are out, and we are excited to see our workshop "Unifying Perspectives on Learning Biases" accepted at the conference! Interested in the topic? Send us your contribution!

You can find all the news here.

Recent publications

  • The Interplay of Data Structure and Imbalance in the Learning Dynamics of Diffusion Models

    Flavio Nicoletti, Chenxiao Ma, Enrico Ventura, Luca Saglietti, Stefano Sarao Mannelli
    Real-world datasets differ across classes in both structure and frequency, but most theory for diffusion models assumes homogeneous data. This work develops a high-dimensional analytical framework for class-dependent learning in score-based diffusion models. Using a random-features model trained on Gaussian mixtures, the paper characterizes how class variance, centroid geometry, and sampling imbalance shape the timing of generalization and memorization. The analysis predicts that diffusion models may memorize some classes while others remain underlearned, and the theory is validated with U-Net experiments on Fashion MNIST. [Read Article]
  • Position: The Stochastic Parrot in the Coal Mine. Model Collapse is a Threat to Low-Resource Communities

    Devon Jarvis, Richard Klein, Benjamin Rosman, Steven James, Stefano Sarao Mannelli
    Model collapse can degrade generative models when they are trained on outputs from earlier models. This position paper argues that the problem compounds existing concerns around large language models, including cultural bias, data degradation, environmental cost, and inefficient resource use. The authors highlight how these dynamics can disproportionately harm low-resource and marginalized communities by reducing training efficiency and shifting generated data away from the tails of real data distributions. The paper concludes with mitigation directions and a call to treat model collapse as a democratization risk. [Read Article]
  • Sharp description of local minima in the loss landscape of high-dimensional two-layer ReLU neural networks

    Jie Huang, Bruno Loureiro, Stefano Sarao Mannelli
    We study the population loss landscape of two-layer ReLU networks in a realisable teacher-student setting with Gaussian covariates. The work shows that local minima admit an exact low-dimensional representation through summary statistics, giving a sharp and interpretable description of the landscape. It also links local minima to attractive fixed points of one-pass SGD dynamics and characterizes how overparameterisation changes the geometry of minima and the accessibility of global solutions. [Read Article]
  • Thinking of Neural Networks Like a Physicist: The Statistical Physics of Machine Learning

    Kai Jappe Sandbrink, Stefano Sarao Mannelli, Florent Krzakala
    This pedagogical paper introduces statistical-physics approaches to machine learning, based on material presented at Analytical Connectionism 2023. It reviews how tools such as the replica method and approximate message passing illuminate unsupervised learning problems, then turns to supervised learning and neural-network dynamics in lazy-learning and feature-learning regimes. The paper closes by connecting these ideas to current research directions and cognitive psychology. [Read Article]
  • Bias-inducing geometries: an exactly solvable data model with fairness implications

    Stefano Sarao Mannelli, Federica Gerace, Negar Rostamzadeh, Luca Saglietti
    Machine learning (ML) may be oblivious to human bias but it is not immune to its perpetuation. Marginalisation and iniquitous group representation are often traceable in the very data used for training, and may be reflected or even enhanced by the learning models. In the present work, we aim at clarifying the role played by data geometry in the emergence of ML bias. We introduce an exactly solvable high-dimensional model of data imbalance, where parametric control over the many bias-inducing factors allows for an extensive exploration of the bias inheritance mechanism. Through the tools of statistical physics, we analytically characterise... [Read Article]
Explore other publications here.