Latest News
You can find all the news here.
Recent publications
Real-world datasets differ across classes in both structure and frequency, but most theory for diffusion models assumes homogeneous data. This work develops a high-dimensional analytical framework for class-dependent learning in score-based diffusion models. Using a random-features model trained on Gaussian mixtures, the paper characterizes how class variance, centroid geometry, and sampling imbalance shape the timing of generalization and memorization. The analysis predicts that diffusion models may memorize some classes while others remain underlearned, and the theory is validated with U-Net experiments on Fashion MNIST. [Read Article]
Position: the Stochastic Parrot in the Coal Mine. Model Collapse is a Threat to Low-Resource Communities
Devon Jarvis, Richard Klein, Benjamin Rosman, Steven James, Stefano Sarao Mannelli
Model collapse can degrade generative models when they are trained on outputs from earlier models. This position paper argues that the problem compounds existing concerns around large language models, including cultural bias, data degradation, environmental cost, and inefficient resource use. The authors highlight how these dynamics can disproportionately harm low-resource and marginalized communities by reducing training efficiency and shifting generated data away from the tails of real data distributions. The paper concludes with mitigation directions and a call to treat model collapse as a democratization risk. [Read Article]

Sharp description of local minima in the loss landscape of high-dimensional two-layer ReLU neural networks
Jie Huang, Bruno Loureiro, Stefano Sarao Mannelli
We study the population loss landscape of two-layer ReLU networks in a realisable teacher-student setting with Gaussian covariates. The work shows that local minima admit an exact low-dimensional representation through summary statistics, giving a sharp and interpretable description of the landscape. It also links local minima to attractive fixed points of one-pass SGD dynamics and characterizes how overparameterisation changes the geometry of minima and the accessibility of global solutions. [Read Article]

Thinking of Neural Networks Like a Physicist: The Statistical Physics of Machine Learning
Kai Jappe Sandbrink, Stefano Sarao Mannelli, Florent Krzakala
This pedagogical paper introduces statistical-physics approaches to machine learning, based on material presented at Analytical Connectionism 2023. It reviews how tools such as the replica method and approximate message passing illuminate unsupervised learning problems, then turns to supervised learning and neural-network dynamics in lazy-learning and feature-learning regimes. The paper closes by connecting these ideas to current research directions and cognitive psychology. [Read Article]

Bias-inducing geometries: an exactly solvable data model with fairness implications
Stefano Sarao Mannelli, Federica Gerace, Negar Rostamzadeh, Luca Saglietti
Machine learning (ML) may be oblivious to human bias but it is not immune to its perpetuation. Marginalisation and iniquitous group representation are often traceable in the very data used for training, and may be reflected or even enhanced by the learning models. In the present work, we aim at clarifying the role played by data geometry in the emergence of ML bias. We introduce an exactly solvable high-dimensional model of data imbalance, where parametric control over the many bias-inducing factors allows for an extensive exploration of the bias inheritance mechanism. Through the tools of statistical physics, we analytically characterise... [Read Article]
Explore other publications here.