Title: Sequence modeling from molecular to genome scale with Evo
Speaker: Brian Hie, PhD
Affiliation: Stanford University, Chemical Engineering and Data Science
Position: Assistant Professor
Host: Jiening Zhu, PhD – Gerber Lab
Date: Monday, November 18, 2024
Time: 4:00-5:00 PM ET
Zoom: https://partners.zoom.us/j/82163676866
Meeting ID: 821 6367 6866
Abstract: The genome is a sequence that encodes the DNA, RNA, and proteins orchestrating an organism’s function. We present Evo, a long-context genomic foundation model with a frontier architecture trained on millions of prokaryotic and phage genomes, and report the first scaling laws on DNA to complement observations in language and vision. Evo generalizes across DNA, RNA, and proteins, enabling zero-shot function prediction competitive with domain-specific language models and the generation of functional CRISPR-Cas and transposon systems, representing the first examples of protein-RNA and protein-DNA co-design with a language model. Evo also learns how small mutations affect whole-organism fitness and generates megabase-scale sequences with plausible genomic architecture. These prediction and generation capabilities span molecular to genome scales of complexity, advancing our understanding and control of biology.
Brian Hie is an Assistant Professor of Chemical Engineering at Stanford University, the Dieter Schwarz Foundation Stanford Data Science Faculty Fellow, and an Innovation Investigator at Arc Institute, where his group conducts research at the intersection of biology and machine learning.