
Bioinformatics Training and Education Program

From Genes to Patients - and Back to Hypotheses: Foundation Models and AI Agents for Multi-Scale Biomedical Discovery

Distinguished Speakers Seminar Series


When: February 26, 2026, 1:00 pm - 2:00 pm

Seminar Series Details:

Presented By:
Jure Leskovec, PhD (Stanford University)
Where:
Online
Organized By:
BTEP

About Jure Leskovec, PhD (Stanford University)

Jure Leskovec is Professor of Computer Science at Stanford University, Chief Scientist at Pinterest, and an Investigator at the Chan Zuckerberg Biohub. Dr. Leskovec co-founded the machine learning startup Kosei, which was later acquired by Pinterest. His research focuses on machine learning and data mining of large social, information, and biological networks. Computation over massive data is at the heart of his research and has applications in computer science, social sciences, marketing, and biomedicine. This research has won several awards, including the Lagrange Prize, a Microsoft Research Faculty Fellowship, the Alfred P. Sloan Fellowship, and numerous best paper and test-of-time awards. It has also been featured in popular press outlets such as the New York Times and the Wall Street Journal. Leskovec received his bachelor's degree in computer science from the University of Ljubljana, Slovenia, earned his PhD in machine learning from Carnegie Mellon University, and completed postdoctoral training at Cornell University.

About this Class

Scientific discovery is increasingly limited not by data availability, but by our ability to integrate evidence, generate hypotheses, and iteratively test them at scale. Recent advances in foundation models and large language models suggest a new paradigm: AI systems that not only model data, but actively participate in the scientific process as agents. In this talk, I will present a unified view of our recent work on foundation models and agentic systems that aim to make biomedical knowledge transferable, multi-scale, and scientifically testable.

First, I will discuss Universal Cell Embeddings (UCE), a self-supervised foundation model that produces robust, annotation-free cell representations that generalize across datasets and species, enabling zero-shot transfer for single-cell biology without per-dataset retraining. Building on this “universal” cell representation layer, I will introduce PULSAR, a multi-scale, multicellular architecture that explicitly propagates information from genes to cells to multicellular systems, yielding unified donor-level representations for tasks such as disease classification, biomarker prediction, and forecasting future clinical events in the human immune system.
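To make the ideas of zero-shot transfer and multi-scale aggregation concrete, the short sketch below uses random vectors as stand-ins for cell embeddings. It is purely illustrative and does not use the actual UCE or PULSAR models or their interfaces: it shows only how labels could be transferred to a new dataset by nearest-neighbor lookup in a shared embedding space, and how per-cell embeddings could be pooled into donor-level representations.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical stand-ins: each row is a cell embedding in a shared space.
    # In practice these would come from a pretrained foundation model applied
    # to two different single-cell datasets.
    ref_cells = rng.normal(size=(200, 64))      # reference atlas cells
    ref_labels = rng.integers(0, 5, size=200)   # known cell-type labels
    query_cells = rng.normal(size=(50, 64))     # new, unannotated dataset

    def transfer_labels(query, reference, labels):
        # Zero-shot label transfer: assign each query cell the label of its
        # most similar reference cell (cosine similarity), with no retraining.
        q = query / np.linalg.norm(query, axis=1, keepdims=True)
        r = reference / np.linalg.norm(reference, axis=1, keepdims=True)
        nearest = (q @ r.T).argmax(axis=1)
        return labels[nearest]

    query_labels = transfer_labels(query_cells, ref_cells, ref_labels)

    # Multi-scale aggregation: pool per-cell embeddings into one vector per
    # donor, which a downstream model could use for disease classification
    # or biomarker prediction.
    donor_ids = rng.integers(0, 5, size=50)     # donor of origin for each cell
    donor_repr = np.stack([query_cells[donor_ids == d].mean(axis=0)
                           for d in np.unique(donor_ids)])

    print(query_labels[:10])      # transferred cell-type labels
    print(donor_repr.shape)       # (n_donors, 64) donor-level representations

The pooling step is the simplest possible choice (a mean); the point is only that cell-level representations can be composed into donor-level ones for downstream prediction.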

Second, I will connect these models to the broader agenda of the AI Virtual Cell: high-fidelity, multi-scale neural simulators of cellular state and dynamics, and the key scientific and engineering priorities needed to make them real and useful for biology and medicine. Finally, I will move from models to agents. Biomni defines a general-purpose biomedical agent environment with a large, structured action space grounded in real biomedical tools, software, and databases—enabling LLM-based agents to do biomedical work, not just talk about it. To ensure that agent-generated claims can be validated rigorously, I will present Popper, an agentic hypothesis-validation framework inspired by falsification, combining LLM-driven experimental design with sequential statistical testing and explicit Type-I error control. Together, these systems suggest a path toward AI that learns universal biological representations, composes them across scales, and supports end-to-end discovery loops grounded in tools, data, and statistical rigor.
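As a rough illustration of the statistical idea behind sequential testing with explicit Type-I error control, the toy sketch below uses likelihood-ratio e-values for a simple Gaussian null; it is an assumed setup for exposition, not Popper's actual procedure. Evidence from successive experiments is multiplied together, and the null is rejected once the running product exceeds 1/alpha, which bounds the Type-I error at alpha by Ville's inequality.

    import numpy as np

    rng = np.random.default_rng(1)

    # Toy setup (not Popper's implementation): each "experiment" returns one
    # measurement. Under the null H0 the measurements are N(0, 1); under the
    # alternative they have mean mu_alt. The per-experiment likelihood ratio
    # is an e-value (expected value 1 under H0), so the running product is a
    # nonnegative martingale and Ville's inequality bounds the probability it
    # ever crosses 1/alpha by alpha.

    alpha = 0.05            # target Type-I error rate
    mu_alt = 0.5            # effect size assumed under the alternative
    threshold = 1.0 / alpha

    def run_sequential_test(true_mean, max_experiments=200):
        evidence = 1.0
        for t in range(1, max_experiments + 1):
            x = rng.normal(loc=true_mean)                  # result of experiment t
            e_value = np.exp(mu_alt * x - mu_alt**2 / 2)   # likelihood ratio vs H0
            evidence *= e_value
            if evidence >= threshold:
                return "reject H0", t, evidence            # stop once evidence suffices
        return "fail to reject H0", max_experiments, evidence

    print(run_sequential_test(true_mean=0.0))   # null true: rejects with prob <= alpha
    print(run_sequential_test(true_mean=0.5))   # real effect: typically rejects quickly

Because the test is valid at every stopping time, an agent can decide after each experiment whether to keep collecting evidence or stop, which is what makes this style of error control natural for iterative, agent-driven hypothesis validation.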