Stefano Sarao Mannelli

I am Stefano Sarao Mannelli, a tenure-track Assistant Professor in the Data Science and AI division of the Computer Science department at Chalmers University of Technology and Gothenburg University, and a Visiting Lecturer at the University of the Witwatersrand. Prior to my current position, I worked as a postdoc with Andrew Saxe at University College London and the University of Oxford. I obtained a Ph.D. in Physics applied to Machine Learning at the University of Paris-Saclay, supervised by Lenka Zdeborová. My research focuses on analysing machine learning problems using a model-based approach, where the complexity of the problem is reduced to obtain a parsimonious, solvable model that still captures the phenomenon of interest. In previous work, I applied several variations of this approach to study problems in learning, such as transfer learning, continual learning, and curriculum learning.

Contacts and information

Curriculum Vitae

> Research Experience

WASP-AI Assistant Professor Tenure Track, Department of Computer Science and Engineering, Chalmers University of Technology and Gothenburg University, Gothenburg, Sweden
Sept. 2024 – Present
Senior Research Fellow, Gatsby Computational Neuroscience Unit and Sainsbury Wellcome Centre, University College London, London, UK
Jan. 2023 – Aug. 2024
Supervisor: Andrew Saxe
Research Fellow, Gatsby Computational Neuroscience Unit and Sainsbury Wellcome Centre, University College London, London, UK
Jun. 2021 – Dec. 2022
Supervisor: Andrew Saxe
Research Fellow, Department of Experimental Psychology, University of Oxford, Oxford, UK
Oct. 2020 – Jun. 2021
Supervisor: Andrew Saxe
Visiting Researcher, New York University (NYU), New York, USA
Mar. 2020 – Apr. 2020
Visiting Researcher, Kavli Institute for Theoretical Physics (KITP), University of California Santa Barbara, Santa Barbara, USA
Jan. 2019 – Mar. 2019
Visiting Researcher, Duke University, Durham, USA
Feb. 2018
Research Intern, IPhT, CEA, Saclay, France
Mar. 2017 – Jul. 2017
Supervisor: Lenka Zdeborová
Research Intern, CMLA, ENS Cachan, Cachan, France
Jan. 2016 – Apr. 2016
Supervisor: Nicolas Vayatis

> Education

Ph.D. in Theoretical Physics, IPhT, CEA, Saclay, France
Oct. 2017 – Oct. 2020
Supervisor: Lenka Zdeborová
M.Sc. in Electronic Engineering, Politecnico di Torino, Turin, Italy
Sept. 2016 – Oct. 2017
Grade: 110/110 cum laude
M.Sc. in Physics of Complex Systems, Politecnico di Torino – SISSA, Turin, Italy
Sept. 2014 – Jul. 2016
Grade: 110/110 cum laude
M2 in Physique Théorique, Paris Diderot, UPMC, and ENS Cachan, Paris, France
Sept. 2014 – Jul. 2016
Alta Scuola Politecnica Diploma, Politecnico di Torino and Politecnico di Milano, Italy
Sept. 2014 – Jul. 2016
M.Sc. in Engineering Physics, Politecnico di Milano, Milan, Italy
Sept. 2014 – Jun. 2017
Grade: 110/110 cum laude
B.Sc. in Mathematics for Engineering, Politecnico di Torino, Turin, Italy
Sept. 2011 – Jul. 2014
Grade: 110/110 cum laude

> Organisation of Events

KITP Program on Statistical Physics & Neurobiology of Learning in High-Level Cognition – Organizer (future event), Santa Barbara, USA
Jun.–Jul. 2027
Conference on the Neuroscientific and Psychological Theories of Development in the Deep Learning Era – Organizer (future event), Santa Barbara, USA
Jul. 2027
School on Analytical Connectionism – Organizer (future event), London, UK
Aug. 2025
Secured £25K funding
Workshop in Advancements in High-Dimensional Methods for Machine Learning – Organizer
May 2025
Secured 135,500 SEK funding
Introduction to Statistical Physics for ML Theory – Organizer
Apr. 2025
School on Analytical Connectionism – Organizer, New York, USA
Aug. 2024
Secured $152K funding
Workshop on Bridging Analytical and Experimental Insights into Representational Change (CoSyNe) – Organizer, Lisbon, Portugal
Mar. 2024
Workshop on Analytical Approaches for Neural Network Dynamics – Organizer, Paris, France
Oct. 2023
Secured €10K funding
School and Workshop on Analytical Connectionism – Organizer, London, UK
Aug.–Sep. 2023
Secured £42K funding
Workshop on Communication Across Communities in Machine Learning Research and Practice (FAccT 2022) – Organizer, Seoul, South Korea
Jun. 2022
Workshop on Science and Engineering of Deep Learning (ICLR 2021) – Organizer, Virtual
May 2021

> Research Projects

Bias Generation and Amplification
Developing a theoretical framework to identify the key factors contributing to ML misbehaviour against population subgroups.
Bridging Biological and Artificial Neural Networks
Investigating the learning differences between biological and artificial neural networks to improve both NN models and our understanding of the brain.
Connecting Dynamics and Landscapes in High Dimensions
Reconciling and identifying potential limitations of loss landscape properties in relation to the asymptotic performance of ML systems.

> Students and Postdocs

Loek Von Rossem – PhD student (3rd year)
Co-supervisor: Andrew Saxe
Chenxiao Ma – PhD student
Start: Feb. 2025
Flavio Nicoletti – Postdoc
Start: Mar. 2025

> Grants and Awards

Academic Grant – CM Lerici Foundation
Jan. 2025
Amount: 60,000 SEK
Travel Grant – Guarantor of Brains
Jul. 2024
Amount: £1,000
Travel Grant – G-Research
Feb. 2024
Amount: £1,400
UK–IT Trustworthy AI Exchange Programme – The Alan Turing Institute
Nov. 2023
Amount: £4,500
SCGB Conference Awards – Simons Foundation
Mar. 2023
Amount: £3,000
Ph.D. Scholarship – CEA
Sept. 2017
Mobility Grant "Tesi su proposta" – Politecnico di Torino
Feb. 2017 & Feb. 2016
Amount: €2,200 each
Accommodation Scholarship "Borsa Talenti" – Fondazione CEUR
Sept. 2011, 2012 & 2013
3rd Prize – Individual Math Competition Alfa Class – Fondazione CRT
Sept. 2012
Amount: €500

> Invited Talks

Statistical Physics & Machine Learning: Moving Forward
Cargese, France
Aug. 2025
6th Youth in High-Dimensions: Recent Progress in Machine Learning, High-Dimensional Statistics and Inference & How creative is Generative AI? Perspectives from Science and Philosophy
Trieste, Italy
Jul. 2025
Workshop on Spurious Correlation and Shortcut Learning: Foundations and Solutions (ICLR 2025)
Singapore
Apr. 2025
AI for active matter: From plankton to robots
Virtual
Mar. 2025
14th Nordic Workshop on Statistical Physics: Biological, Complex and Non-Equilibrium Systems
Stockholm, Sweden
Mar. 2025
Fairness of AI in Medical Imaging
Virtual
Dec. 2024
At the Crossroads of AI and Cognitive Science
Gothenburg, Sweden
Nov. 2024
Informal Workshop on the Problem of Class Imbalance
Paris, France
Nov. 2024
2024 CHAIR Structured Learning Workshop
Gothenburg, Sweden
Oct. 2024

> Positions of Responsibility

Organizer, Colloquium Series – Department of Computer Science and Engineering, Chalmers University of Technology
Academic Year 2025–2026
Organizer, Neuroscience External Seminars – Gatsby Computational Neuroscience Unit
Academic Year 2022–2023
Organizer, SaxeLab Seminars
Academic Year 2021–2022
Area Chair
Venues:
- ICLR
- ICML 2025 Workshop on High-dimensional Learning Dynamics
Reviewer
Venues:
- NeurIPS
- ICML
- ICLR
- SciPost
- JSTAT
- JPhysA
- Physica A
- PNAS
- TMLR
- PRE

Publications

Curriculum learning in humans and neural networks
Younes Strittmatter*, Stefano Sarao Mannelli*, Miguel Ruiz-Garcia, Sebastian Musslick, Markus Wolfgang Hermann Spitzer
CogSci 2025

The sequencing of training trials can significantly influence learning outcomes in humans and neural networks. However, studies comparing the effects of training curricula between the two have typically focused on the acquisition of multiple tasks. Here, we investigate curriculum learning in a single perceptual decision-making task, examining whether the behavior of a parsimonious network trained on different curricula would be replicated in human participants. Our results show that progressively increasing task difficulty during training facilitates learning compared to training at a fixed level of difficulty or at random. Furthermore, a sequence designed to hamper learning in a parsimonious neural network... [Read Article]
A Theory of Initialisation's Impact on Specialisation
Devon Jarvis, Sebastian Lee, Clémentine Carla Juliette Dominé, Andrew M Saxe, Stefano Sarao Mannelli
ICLR 2025

Prior work has demonstrated a consistent tendency in neural networks engaged in continual learning tasks, wherein intermediate task similarity results in the highest levels of catastrophic interference. This phenomenon is attributed to the network's tendency to reuse learned features across tasks. However, this explanation heavily relies on the premise that neuron specialisation occurs, i.e. the emergence of localised representations. Our investigation challenges the validity of this assumption. Using theoretical frameworks for the analysis of neural networks, we show a strong dependence of specialisation on the initial condition. More precisely, we show that weight imbalance and high weight entropy can favour... [Read Article]
Optimal Protocols for Continual Learning via Statistical Physics and Control Theory
Francesco Mori, Stefano Sarao Mannelli, Francesca Mignacco
ICLR 2025

Artificial neural networks often struggle with catastrophic forgetting when learning multiple tasks sequentially, as training on new tasks degrades the performance on previously learned ones. Recent theoretical work has addressed this issue by analysing learning curves in synthetic frameworks under predefined training protocols. However, these protocols relied on heuristics and lacked a solid theoretical foundation assessing their optimality. In this paper, we fill this gap combining exact equations for training dynamics, derived using statistical physics techniques, with optimal control methods. We apply this approach to teacher-student models for continual learning and multi-task problems, obtaining a theory for task-selection protocols maximising... [Read Article]
A meta-learning framework for rationalizing cognitive fatigue in neural systems
Yujun Li, Rodrigo Carrasco-Davis, Younes Strittmatter, Stefano Sarao Mannelli, Sebastian Musslick
CogSci 2024 (Oral)

The ability to exert cognitive control is central to human brain function, facilitating goal-directed task performance. However, humans exhibit limitations in the duration over which they can exert cognitive control, a phenomenon referred to as cognitive fatigue. This study explores a computational rationale for cognitive fatigue in continual learning scenarios: cognitive fatigue serves to limit the extended performance of one task to avoid the forgetting of previously learned tasks. Our study employs a meta-learning framework, wherein cognitive control is optimally allocated to balance immediate task performance with forgetting of other tasks. We demonstrate that this model replicates common patterns of... [Read Article]
Bias in Motion: Theoretical Insights into the Dynamics of Bias in SGD Training
Anchit Jain, Rozhin Nobahari, Aristide Baratin, Stefano Sarao Mannelli
Accepted to NeurIPS 2024

Machine learning systems often acquire biases by leveraging undesired features in the data, impacting accuracy variably across different sub-populations. Current understanding of bias formation mostly focuses on the initial and final stages of learning, leaving a gap in knowledge regarding the transient dynamics. To address this gap, this paper explores the evolution of bias in a teacher-student setup modeling different data sub-populations with a Gaussian-mixture model. We provide an analytical description of the stochastic gradient descent dynamics of a linear classifier in this setting, which we prove to be exact in high dimension. Notably, our analysis reveals how different properties... [Read Article]
Tilting the Odds at the Lottery: the Interplay of Overparameterisation and Curricula in Neural Networks
Stefano Sarao Mannelli, Yaraslau Ivashinka, Andrew Saxe, Luca Saglietti
ICML 2024

A wide range of empirical and theoretical works have shown that overparameterisation can amplify the performance of neural networks. According to the lottery ticket hypothesis, overparameterised networks have an increased chance of containing a sub-network that is well-initialised to solve the task at hand. A more parsimonious approach, inspired by animal learning, consists in guiding the learner towards solving the task by curating the order of the examples, i.e. providing a curriculum. However, this learning strategy seems to be hardly beneficial in deep learning applications. In this work, we propose an analytical study that connects curriculum learning and overparameterisation. In... [Read Article]
Why Do Animals Need Shaping? A Theory of Task Composition and Curriculum Learning
Jin Hwa Lee, Stefano Sarao Mannelli, Andrew Saxe
ICML 2024

Diverse studies in systems neuroscience begin with extended periods of training known as 'shaping' procedures. These involve progressively studying component parts of more complex tasks, and can make the difference between learning a task quickly, slowly or not at all. Despite the importance of shaping to the acquisition of complex tasks, there is as yet no theory that can help guide the design of shaping procedures, or more fundamentally, provide insight into its key role in learning. Modern deep reinforcement learning systems might implicitly learn compositional primitives within their multilayer policy networks. Inspired by these models, we propose and analyse... [Read Article]
The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions
Nishil Patel, Sebastian Lee, Stefano Sarao Mannelli, Sebastian Goldt, Andrew Saxe
Accepted to PRX

Reinforcement learning (RL) algorithms have proven transformative in a range of domains. To tackle real-world domains, these systems often use neural networks to learn policies directly from pixels or other high-dimensional sensory input. By contrast, much theory of RL has focused on discrete state spaces or worst-case analysis, and fundamental questions remain about the dynamics of policy learning in high-dimensional settings. Here, we propose a solvable high-dimensional model of RL that can capture a variety of learning protocols, and derive its typical dynamics as a set of closed-form ordinary differential equations (ODEs). We derive optimal schedules for the learning rates... [Read Article]
Optimal transfer protocol by incremental layer defrosting
Federica Gerace, Diego Doimo, Stefano Sarao Mannelli, Luca Saglietti, Alessandro Laio

Transfer learning is a powerful tool enabling model training with limited amounts of data. This technique is particularly useful in real-world problems where data availability is often a serious limitation. The simplest transfer learning protocol is based on "freezing" the feature-extractor layers of a network pre-trained on a data-rich source task, and then adapting only the last layers to a data-poor target task. This workflow is based on the assumption that the feature maps of the pre-trained model are qualitatively similar to the ones that would have been learned with enough data on the target task. In this work, we... [Read Article]
An Analytical Theory of Curriculum Learning in Teacher-Student Networks
Luca Saglietti*, Stefano Sarao Mannelli*, Andrew Saxe
NeurIPS 2022

In animals and humans, curriculum learning (presenting data in a curated order) is critical to rapid learning and effective pedagogy. A long history of experiments has demonstrated the impact of curricula in a variety of animals but, despite its ubiquitous presence, a theoretical understanding of the phenomenon is still lacking. Surprisingly, in contrast to animal learning, curricula strategies are not widely used in machine learning and recent simulation studies reach the conclusion that curricula are moderately effective or ineffective in most cases. This stark difference in the importance of curriculum raises a fundamental theoretical question: when and why does curriculum... [Read Article]
Bias-inducing geometries: an exactly solvable data model with fairness implications
Stefano Sarao Mannelli, Federica Gerace, Negar Rostamzadeh, Luca Saglietti

Machine learning (ML) may be oblivious to human bias but it is not immune to its perpetuation. Marginalisation and iniquitous group representation are often traceable in the very data used for training, and may be reflected or even enhanced by the learning models. In the present work, we aim at clarifying the role played by data geometry in the emergence of ML bias. We introduce an exactly solvable high-dimensional model of data imbalance, where parametric control over the many bias-inducing factors allows for an extensive exploration of the bias inheritance mechanism. Through the tools of statistical physics, we analytically characterise... [Read Article]
Maslow's Hammer for Catastrophic Forgetting: Node Re-Use vs Node Activation
Sebastian Lee, Stefano Sarao Mannelli, Claudia Clopath, Sebastian Goldt, Andrew Saxe
ICML 2022

Continual learning - learning new tasks in sequence while maintaining performance on old tasks - remains particularly challenging for artificial neural networks. Surprisingly, the amount of forgetting does not increase with the dissimilarity between the learned tasks, but appears to be worst in an intermediate similarity regime. In this paper we theoretically analyse both a synthetic teacher-student framework and a real data setup to provide an explanation of this phenomenon that we name Maslow's hammer hypothesis. Our analysis reveals the presence of a trade-off between node activation and node re-use that results in worst forgetting in the intermediate regime. Using... [Read Article]
Probing transfer learning with a model of synthetic correlated datasets
Federica Gerace, Luca Saglietti, Stefano Sarao Mannelli, Andrew Saxe, Lenka Zdeborová
Machine Learning: Science and Technology

Transfer learning can significantly improve the sample efficiency of neural networks, by exploiting the relatedness between a data-scarce target task and a data-abundant source task. Despite years of successful applications, transfer learning practice often relies on ad-hoc solutions, while theoretical understanding of these procedures is still limited. In the present work, we re-think a solvable model of synthetic data as a framework for modeling correlation between data-sets. This setup allows for an analytic characterization of the generalization performance obtained when transferring the learned feature map from the source to the target task. Focusing on the problem of training two-layer networks... [Read Article]
Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problems
Stefano Sarao Mannelli, Pierfrancesco Urbani
NeurIPS 2021

The optimization step in many machine learning problems rarely relies on vanilla gradient descent but it is common practice to use momentum-based accelerated methods. Despite these algorithms being widely applied to arbitrary loss functions, their behaviour in generically non-convex, high dimensional landscapes is poorly understood. In this work, we use dynamical mean field theory techniques to describe analytically the average dynamics of these methods in a prototypical non-convex model: the (spiked) matrix-tensor model. We derive a closed set of equations that describe the behaviour of heavy-ball momentum and Nesterov acceleration in the infinite dimensional limit. By numerical integration of these... [Read Article]
Epidemic mitigation by statistical inference from contact tracing data
Antoine Baker, Indaco Biazzo, Alfredo Braunstein, Giovanni Catania, Luca Dall’Asta, Alessandro Ingrosso, Florent Krzakala, Fabio Mazza, Marc Mézard, Anna Paola Muntoni, Maria Refinetti, Stefano Sarao Mannelli, Lenka Zdeborová
Proceedings of the National Academy of Sciences

Contact tracing is an essential tool to mitigate the impact of a pandemic, such as the COVID-19 pandemic. In order to achieve efficient and scalable contact tracing in real time, digital devices can play an important role. While a lot of attention has been paid to analyzing the privacy and ethical risks of the associated mobile applications, so far much less research has been devoted to optimizing their performance and assessing their impact on the mitigation of the epidemic. We develop Bayesian inference methods to estimate the risk that an individual is infected. This inference is based on the list... [Read Article]
Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions
Stefano Sarao Mannelli, Eric Vanden-Eijnden, Lenka Zdeborová
NeurIPS 2020

We study the dynamics of optimization and the generalization properties of one-hidden layer neural networks with quadratic activation function in the overparametrized regime where the layer width m is larger than the input dimension d. We consider a teacher-student scenario where the teacher has the same structure as the student with a hidden layer of smaller width m*<=m. We describe how the empirical loss landscape is affected by the number n of data samples and the width m* of the teacher network. In particular we determine how the probability that there be no spurious minima on the empirical loss depends... [Read Article]
Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval
Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová
NeurIPS 2020

Despite the widespread use of gradient-based algorithms for optimising high-dimensional non-convex functions, understanding their ability to find good minima instead of being trapped in spurious ones remains to a large extent an open problem. Here we focus on gradient flow dynamics for phase retrieval from random measurements. When the ratio of the number of measurements over the input dimension is small the dynamics remains trapped in spurious minima with large basins of attraction. We find analytically that above a critical ratio those critical points become unstable developing a negative direction toward the signal. By numerical experiments we show that in... [Read Article]
Thresholds of descending algorithms in inference problems
Stefano Sarao Mannelli, Lenka Zdeborová
Journal of Statistical Mechanics: Theory and Experiment

We review recent works (Sarao Mannelli et al 2018 arXiv 1812.09066, 2019 Int. Conf. on Machine Learning 4333–42, 2019 Adv. Neural Information Processing Systems 8676–86) on analyzing the dynamics of gradient-based algorithms in a prototypical statistical inference problem. Using methods and insights from the physics of glassy systems, these works showed how to understand quantitatively and qualitatively the performance of gradient-based algorithms. Here we review the key results and their interpretation in non-technical terms accessible to a wide audience of physicists in the context of related works. [Read Article]
Marvels and pitfalls of the Langevin algorithm in noisy high-dimensional inference
Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová
Physical Review X

Gradient-descent-based algorithms and their stochastic versions have widespread applications in machine learning and statistical inference. In this work, we carry out an analytic study of the performance of the algorithm most commonly considered in physics, the Langevin algorithm, in the context of noisy high-dimensional inference. We employ the Langevin algorithm to sample the posterior probability measure for the spiked mixed matrix-tensor model. The typical behavior of this algorithm is described by a system of integrodifferential equations that we call the Langevin state evolution, whose solution is compared with the one of the state evolution of approximate message passing (AMP). Our... [Read Article]
Who is Afraid of Big Bad Minima? Analysis of gradient-flow in spiked matrix-tensor models
Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová
NeurIPS 2019 (Spotlight)

Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones. Here we present a quantitative theory explaining this behaviour in a spiked matrix-tensor model. Our framework is based on the Kac-Rice analysis of stationary points and a closed-form analysis of gradient-flow originating from statistical physics. We show that there is a well defined region of parameters where the gradient-flow algorithm finds a good global minimum despite... [Read Article]
Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models
Stefano Sarao Mannelli, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová
ICML 2019

In this work we analyse quantitatively the interplay between the loss landscape and performance of descent algorithms in a prototypical inference problem, the spiked matrix-tensor model. We study a loss function that is the negative log-likelihood of the model. We analyse the number of local minima at a fixed distance from the signal/spike with the Kac-Rice formula, and locate trivialization of the landscape at large signal-to-noise ratios. We evaluate analytically the performance of a gradient flow algorithm using integro-differential PDEs as developed in physics of disordered systems for the Langevin dynamics. We analyze the performance of an approximate message passing... [Read Article]
