🚀 Open Positions: AI Safety
We are looking for two Postdoctoral Researchers to join the lab and work on the theoretical foundations of AI Safety. The positions are part of a larger collaboration with Andrew Saxe and Jin Hwa Lee at UCL and Principia!
View Postdoc Vacancies & Apply
Note: We will soon be opening a Research Assistant position focused on testing these theoretical results on frontier models. Stay tuned!
Latest News
You can find all the news here.
Recent publications
- Machine learning (ML) may be oblivious to human bias, but it is not immune to its perpetuation. Marginalisation and iniquitous group representation are often traceable in the very data used for training, and may be reflected or even enhanced by the learning models. In the present work, we aim to clarify the role played by data geometry in the emergence of ML bias. We introduce an exactly solvable high-dimensional model of data imbalance, where parametric control over the many bias-inducing factors allows for an extensive exploration of the bias inheritance mechanism. Through the tools of statistical physics, we analytically characterise... [Read Article]
Curriculum learning in humans and neural networks
Younes Strittmatter*, Stefano Sarao Mannelli*, Miguel Ruiz-Garcia, Sebastian Musslick, Markus Wolfgang Hermann Spitzer
The sequencing of training trials can significantly influence learning outcomes in humans and neural networks. However, studies comparing the effects of training curricula between the two have typically focused on the acquisition of multiple tasks. Here, we investigate curriculum learning in a single perceptual decision-making task, examining whether the behavior of a parsimonious network trained on different curricula would be replicated in human participants. Our results show that progressively increasing task difficulty during training facilitates learning compared to training at a fixed level of difficulty or at random. Furthermore, a sequence designed to hamper learning in a parsimonious neural network... [Read Article]
A Theory of Initialisation's Impact on Specialisation
Devon Jarvis, Sebastian Lee, Clémentine Carla Juliette Dominé, Andrew M Saxe, Stefano Sarao Mannelli
Prior work has demonstrated a consistent tendency in neural networks engaged in continual learning tasks, wherein intermediate task similarity results in the highest levels of catastrophic interference. This phenomenon is attributed to the network's tendency to reuse learned features across tasks. However, this explanation heavily relies on the premise that neuron specialisation occurs, i.e. the emergence of localised representations. Our investigation challenges the validity of this assumption. Using theoretical frameworks for the analysis of neural networks, we show a strong dependence of specialisation on the initial condition. More precisely, we show that weight imbalance and high weight entropy can favour... [Read Article]
Optimal Protocols for Continual Learning via Statistical Physics and Control Theory
Francesco Mori, Stefano Sarao Mannelli, Francesca Mignacco
Artificial neural networks often struggle with catastrophic forgetting when learning multiple tasks sequentially, as training on new tasks degrades the performance on previously learned ones. Recent theoretical work has addressed this issue by analysing learning curves in synthetic frameworks under predefined training protocols. However, these protocols relied on heuristics and lacked a solid theoretical foundation assessing their optimality. In this paper, we fill this gap by combining exact equations for training dynamics, derived using statistical physics techniques, with optimal control methods. We apply this approach to teacher-student models for continual learning and multi-task problems, obtaining a theory for task-selection protocols maximising... [Read Article]
How to choose the right transfer learning protocol? A qualitative analysis in a controlled set-up
Federica Gerace, Diego Doimo, Stefano Sarao Mannelli, Luca Saglietti, Alessandro Laio
Transfer learning is a powerful tool enabling model training with limited amounts of data. This technique is particularly useful in real-world problems where data availability is often a serious limitation. The simplest transfer learning protocol is based on "freezing" the feature-extractor layers of a network pre-trained on a data-rich source task, and then adapting only the last layers to a data-poor target task. This workflow is based on the assumption that the feature maps of the pre-trained model are qualitatively similar to the ones that would have been learned with enough data on the target task. In this work, we... [Read Article]
Explore other publications here.