I’m an Assistant Professor of Computer Science at Johns Hopkins, and a part-time Visiting Scientist at Abridge.
My research lies at the intersection of causality and machine learning, with the goal of enabling reliable decision-making and prediction in high-risk domains like healthcare. For more details on my research, see below; you can also see my CV. Previously, I was a postdoc at Carnegie Mellon University, working with Zack Lipton, and I completed my PhD in Computer Science at MIT, working with David Sontag.
If you are interested in working with me as a PhD student or postdoc, please see Joining the Group.
Research Overview
When can we rely on machine learning in high-risk domains like healthcare? In the long term, we want machine learning systems to be as reliable as any FDA-approved medication or diagnostic test.
This goal is complicated by the need for causal reasoning and robust performance: To support decision-making, we want to draw causal conclusions about the impact of model recommendations (e.g., will recommending a particular drug lead to better patient outcomes?). Moreover, we want our models to perform well across different hospitals and patient populations, including those that differ from the hospitals and populations seen during model development.
These objectives complicate the development and validation of reliable models, as they run into limitations of what our data can tell us without further assumptions. For instance, we only observe outcomes for the treatments that were actually prescribed to patients, not all possible treatments. Similarly, we do not observe performance on every conceivable hospital where a model might be deployed, but only on the (typically much more limited) data we have access to.
I approach these challenges using tools from causality and statistics, often incorporating external knowledge into the process of model validation and design.
Selected publications (Full List)
Medical Adaptation of Large Language and Vision-Language Models: Are We Making Progress?
Daniel P. Jeong, Saurabh Garg, Zachary C. Lipton, Michael Oberst
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
[paper], [extended version]
Auditing Fairness under Unobserved Confounding
Emily Byun, Dylan Sam, Michael Oberst, Zachary Lipton, Bryan Wilder
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
[paper]
Evaluating Robustness to Dataset Shift via Parametric Robustness Sets
Nikolaj Thams*, Michael Oberst*, David Sontag
Neural Information Processing Systems (NeurIPS), 2022
[paper], [code] *Equal Contribution
Regularizing towards Causal Invariance: Linear Models with Proxies
Michael Oberst, Nikolaj Thams, Jonas Peters, David Sontag
International Conference on Machine Learning (ICML), 2021
[paper], [video], [slides], [poster], [code]
A Decision Algorithm to Promote Outpatient Antimicrobial Stewardship for Uncomplicated Urinary Tract Infection
Sanjat Kanjilal, Michael Oberst, Sooraj Boominathan, Helen Zhou, David C. Hooper, David Sontag
Science Translational Medicine, 2020
[article], [code], [dataset]