Pranshu Malviya
AI Research Scientist at DRW
Research
I recently defended my PhD at Mila / Polytechnique Montreal, where I worked with Prof. Sarath Chandar on continual learning and neural network optimization. I am moving to DRW as an AI Research Scientist.
My research studies how models adapt under distribution shift as new tasks or data arrive over time, and how loss landscape geometry can guide learning toward minima that generalize better. Highlights include Manifold Metric (CoLLAs 2025, Oral), Lookbehind-SAM (ICML 2024), and Critical Momenta (TMLR 2024).
These projects owe a great deal to collaborators, especially Aristide Baratin (Samsung SAIT AI Lab Montreal), Razvan Pascanu (Google DeepMind), Quentin Fournier (Mila), and Gonçalo Mordido, as well as many other co-authors.
Before Montreal, I completed my MS at IIT Madras with Prof. Balaraman Ravindran and Prof. Sarath Chandar at RBCDSAI. There I worked on TAG (CoLLAs 2022) and Causal Fairness (ACML 2021).
Updates
Defended PhD thesis at Mila / Polytechnique Montreal
Completed my PhD with Prof. Sarath Chandar; moving to DRW as an AI Research Scientist.
New preprint: CoPeP on continual pretraining for protein language models
Benchmarking how protein language models handle continual pretraining. Joint work with Darshan Patil, Mathieu Reymond, Quentin Fournier, and Sarath Chandar.
Joined DRW as AI Research Scientist Intern
Working on AI/ML research at DRW in Montreal.
Paper accepted at CoLLAs 2025 (Oral): Manifold Metric
A loss landscape approach for predicting model performance.
Awarded PBEEE Doctoral Research Scholarship by FRQNT Quebec
Fonds de recherche du Québec — Nature et technologies doctoral scholarship.
Papers
Manifold Metric: A Loss Landscape Approach for Predicting Model Performance
Using loss landscape geometry to predict model generalization without held-out data.
Authors: P. Malviya, J. Huang, A. Baratin, Q. Fournier, S. Chandar
Lookbehind-SAM: k steps back, 1 step forward
An efficient extension to Sharpness-Aware Minimization that leverages historical gradient information.
Authors: G. Mordido, P. Malviya, A. Baratin, S. Chandar
Promoting Exploration in Memory-Augmented Adam using Critical Momenta
A memory-augmented optimizer that stores and retrieves critical momenta to promote exploration in the loss landscape.
Authors: P. Malviya, G. Mordido, A. Baratin, R. Babanezhad, J. Huang, S. Lacoste-Julien, R. Pascanu, S. Chandar
TAG: Task-based Accumulated Gradients for Lifelong Learning
A gradient accumulation method for continual learning that mitigates catastrophic forgetting.
Authors: P. Malviya, B. Ravindran, S. Chandar
An Introduction to Lifelong Supervised Learning
A comprehensive primer on lifelong (continual) supervised learning, surveying the task-incremental, class-incremental, and domain-incremental settings.
Authors: S. Sodhani, M. Faramarzi, S.V. Mehta, P. Malviya, M. Abdelsalam, J. Rajendran, S. Chandar
Roles
DRW
Polytechnique Montreal
NPTEL, IIT Madras
Beyond Work
Photos from travel and hikes, sketches, book notes, and reading log.
Contact
Want to chat? Feel free to reach out via email.