November ABC Seminar: Brian Hie, PhD – Stanford University – “Sequence modeling from molecular to genome scale with Evo”

The genome is a sequence that encodes the DNA, RNA, and proteins orchestrating an organism’s function. We present Evo, a long-context genomic foundation model with a frontier architecture trained on millions of prokaryotic and phage genomes, and report the first scaling laws on DNA to complement observations in language and vision. Evo generalizes across DNA, RNA, and proteins, enabling zero-shot function prediction competitive with domain-specific language models and the generation of functional CRISPR-Cas and transposon systems, representing the first examples of protein-RNA and protein-DNA co-design with a language model. Evo also learns how small mutations affect whole-organism fitness and generates megabase-scale sequences with plausible genomic architecture. These prediction and generation capabilities span molecular to genome scales of complexity, advancing our understanding and control of biology.

Title: Sequence modeling from molecular to genome scale with Evo
Speaker: Brian Hie, PhD
Affiliation: Stanford University, Chemical Engineering and Data Science
Position: Assistant Professor
Host: Jiening Zhu, PhD – Gerber Lab

Date: Monday, November 18, 2024
Time: 4:00-5:00PM ET
Zoom: https://partners.zoom.us/j/82163676866
Meeting ID: 821 6367 6866

Brian Hie is an Assistant Professor of Chemical Engineering at Stanford University, the Dieter Schwarz Foundation Stanford Data Science Faculty Fellow, and an Innovation Investigator at Arc Institute, where his group conducts research at the intersection of biology and machine learning.

October ABC Seminar: Zhi Huang, PhD – Univ of Pennsylvania – “A pathologist–AI collaboration framework for enhancing diagnostic accuracies and efficiencies”

The integration of Artificial Intelligence (AI) in clinical pathology has faced significant hurdles due to constraints in data collection and challenges associated with model transparency and interpretability. In this talk, we introduce a novel digital pathology AI framework named nuclei.io, which leverages active learning and incorporates real-time feedback from human experts. This innovative approach empowers pathologists to quickly generate diverse datasets and develop models for various clinical applications. To demonstrate the effectiveness of our framework, we conducted two user studies employing this human–AI collaboration strategy. These studies focused on two key areas: the identification of plasma cells in endometrial biopsies and the detection of colorectal cancer metastasis in lymph nodes. The results from these studies showed significant enhancements in sensitivity, accuracy, and diagnostic efficiency with the integration of AI. Our findings underscore the benefits of the human-in-the-loop AI framework, highlighting its potential to transform the field of digital pathology.

Speaker: Zhi Huang, PhD
Affiliation: Perelman School of Medicine, University of Pennsylvania
Position: Assistant Professor (incoming), Dept of Pathology and Laboratory Medicine, Dept of Biostatistics, Epidemiology and Informatics
Research Links: https://www.zhihuang.ai
Host: Andrew Song, PhD – Mahmood Lab

Date: Monday, October 21, 2024
Time: 4:00-5:00PM ET
Zoom: https://partners.zoom.us/j/82163676866
Meeting ID: 821 6367 6866

Zhi Huang is an incoming tenure-track assistant professor at the University of Pennsylvania starting January 2025. He obtained his PhD in Electrical and Computer Engineering (ECE) from Purdue University in August 2021. Since August 2021, he has been a postdoctoral fellow at Stanford University. His research focuses on AI/ML innovation and its application to medicine, with topics including vision-language foundation models for pathology, human-AI collaboration, and neurodegenerative diseases. His research has drawn wide public attention (including the New York Times, Stanford Magazine, and Stanford Scope) and has resulted in translational innovations. In 2022, Zhi Huang co-founded nuclei.io, a human-in-the-loop AI platform for digital pathology. It was selected as one of only 9 Stanford Catalyst 2023 cohort innovations.

ABC Hybrid Seminar: Bokai Zhu, PhD – Ragon Institute – “Integration of spatial-omics and single-cell data across modalities with weakly linked features”

Advancements in single-cell and spatial-omics technologies have created a need for integrating datasets across modalities with limited and weakly correlated features, such as those between spatial proteomics and transcriptomics. Existing tools, usually designed for strongly linked data, often fail in these scenarios. Recently, we have developed a series of methods (Mario and MaxFuse) that improve integration by refining weak correlations between modalities through an iterative smoothing and co-embedding process, and achieve single-cell level matching across these weakly linked modalities, enabling in-depth understanding of tissue micro-environments.

Speaker: Bokai Zhu, PhD
Affiliation: Ragon Institute of MGH, Harvard and MIT, Broad Institute, MIT
Position: Postdoctoral Research Fellow
Research Links: https://bokaizhu.github.io/
Host: Muhammad Shaban, PhD – Mahmood Lab

Date: Monday, September 23, 2024
Time: 1:00-2:00PM ET
In-person: Duncan Reid Conference Room (directions below)
Zoom: https://partners.zoom.us/j/82163676866
Meeting ID: 821 6367 6866

Bokai Zhu is currently a postdoctoral researcher supervised by Prof. Alex Shalek at the Ragon Institute at MGH, MIT, and Harvard. Prior to that, Dr. Zhu obtained his PhD in Microbiology and Immunology from Stanford University, under the supervision of Prof. Garry Nolan. He received a bachelor’s degree in Biology from Cornell University and Zhejiang University. Dr. Zhu’s doctoral research focused on: 1) experimental assay development for multiplex imaging platforms; 2) computational algorithm development for single-cell multi-omic integration; 3) application of the above tools in various biological systems.

Dan MacDonald awarded Banting Fellowship

Daniel MacDonald, a Research Fellow in the Gibson Lab, is a 2024 recipient of the Banting Postdoctoral Fellowship.

The Banting Postdoctoral Fellowship program provides funding to the very best postdoctoral applicants, both nationally and internationally, who will positively contribute to the country’s economic, social and research-based growth. The award is open to Canadian citizens, permanent residents of Canada and foreign citizens.

Dan is a second-year fellow researching machine learning for the gut microbiome. Researchers typically study the microbiome by counting the different species of microbes in a stool sample; there may be hundreds to thousands of species and trillions of individual microbes. Stool samples are a good representation of the microbes found in the colon, but they underrepresent species in the small intestine and species that stick to the intestinal walls, which may play a critical role in maintaining the host’s bodily functions. Previously, researchers in the lab developed uncertainty-aware machine learning (ML) models of the gut microbiome that can predict how the hundreds to thousands of microbial species grow and die in response to changes in diet. These ML models were trained using stool samples, which underrepresent species upstream from the colon, and aren’t designed to accommodate other types of samples, such as microbe samples from tissue in the small intestine. In this research, Dan and other lab members are developing new uncertainty-aware ML models that will be trained not only on stool samples, but also on microbe measurements throughout the intestinal tract. This will provide insight into the as-yet unknown microbial interactions of the entire intestinal tract. They are designing this model to flexibly incorporate new measurement modalities, such as advanced imaging techniques, which will allow them to quickly adapt this model for new experiments in the ever-growing microbiome field, shedding light on the hidden inhabitants of our bodies.

I’m grateful to be a recipient of a 2024 Banting Postdoctoral Fellowship. This is a shared achievement, as it was only possible with the support and countless opportunities provided by Dr. Travis Gibson and our team in the Division of Computational Pathology at BWH. I’d also like to express gratitude to my PhD supervisor, Prof. David Steinman, who laid the foundations for my academic pursuits and shaped my approach to research. It is truly an honour to receive this fellowship from NSERC, and I look forward to making meaningful scientific advancements through the completion of this research. –Dan MacDonald

Post Doctoral Fellow in Deep Learning for Microbiome Spatial Omics – Gerber Lab

The Gerber Lab (http://gerber.bwh.harvard.edu) is a multidisciplinary group at Brigham and Women’s Hospital/Harvard Medical School that develops novel computational models and high-throughput experimental systems to understand the role of the microbiota in human diseases, and applies these findings to develop new diagnostic tests and therapies. A long-standing and continuing focus of the lab is on incorporating principled probabilistic models into machine learning methods. The director of the lab, Dr. Georg Gerber, MD, PhD, MPH, uses his unique expertise, combining deep learning method development, medical microbiology, and human pathology, to leverage cutting-edge technologies to tackle scientifically and clinically important problems.

We are looking for an exceptional researcher who will play a major role in new initiatives in the lab to develop novel deep learning (DL) approaches to further understanding of the spatial organization of the microbiome (the trillions of microbes living on and within us) and its interactions with mammalian cells. The successful candidate will be highly motivated and creative, taking a lead role in developing new deep learning-based methods, analyzing data, and interpreting results. Although experience analyzing data from biological systems is required, microbiome-specific knowledge is not.

Qualifications:

  • PhD in Computer Science, Computational Biology, or other highly quantitative discipline.
  • Outstanding publication track record.
  • Strong mathematical background and skills.
  • Experience developing DL methods.
  • Experience analyzing data from biological systems, including sequencing data.
  • Solid programming skills in Python, including PyTorch.
  • Superior verbal and written communication skills, and ability to work on multidisciplinary teams.

Environment: The Gerber Lab is located in the Brigham and Women’s Hospital Division of Computational Pathology (http://comp-path.bwh.harvard.edu) at Harvard Medical School (HMS). With a recent grant from the Massachusetts Life Sciences Center, the Division has built the Lab for AI/Deep Learning for the Microbiome, which has a state-of-the-art GPU cluster for model development, training and deployment. BWH is part of the greater Longwood Medical Area in Boston, a rich, stimulating environment conducive to intellectual development and research collaborations, which includes HMS, Harvard School of Public Health and Boston Children’s Hospital.

To apply: email a single PDF including cover letter, CV, brief research statement and a list of at least three references to Dr. Georg Gerber (ggerber@bwh.harvard.edu).

We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability status, protected veteran status, gender identity, sexual orientation, pregnancy and pregnancy-related conditions or any other characteristic protected by law.

May ABC Seminar: Jean du Terrail, PhD – Owkin – “Federated Learning in Healthcare in the Real-World: Examples and Practical Challenges”

With the never-ending revolutions in data-driven approaches that began more than a decade ago, it is at first surprising that the pace of ML discoveries in medicine seems substantially slower than in consumer applications such as ChatGPT. The answer to that paradox is relatively simple: in healthcare, data is hard to access because it is expensive and siloed in medical institutions, and thus data-hungry methods trained on limited data from only one center often fail to generalize to another. Federated learning (FL) is a promising approach for breaking those silos while protecting patient data, thereby enabling new medical discoveries via ML. However, in practice, applying FL in real-life contexts presents numerous challenges. This talk will use some of the FL research projects that Owkin has spearheaded to illustrate the realities of FL in healthcare and discuss the remaining milestones on the road to FL technologies and ML becoming the new engine of medical research.

Special Seminar: Anshul Kundaje, PhD – “Using deep learning models to debug regulatory genomics experiments and decode cis-regulatory syntax”

BWH Computational Pathology Special Seminar

Title: Using deep learning models to debug regulatory genomics experiments and decode cis-regulatory syntax
Speaker: Anshul Kundaje, PhD
Affiliation: Stanford University
Position: Associate Professor, Genetics and Computer Science

Date: Monday, April 22, 2024
Time: 4:00PM-5:00PM ET
Zoom: https://partners.zoom.us/j/82163676866
Meeting ID: 821 6367 6866

Anshul Kundaje, PhD, is Associate Professor of Genetics and Computer Science at Stanford University. His primary research area is large-scale computational regulatory genomics. The Kundaje lab develops deep learning models of gene regulation and model interpretation methods to decipher non-coding DNA and genetic variation associated with disease. Dr. Kundaje has led computational efforts to develop widely used resources in collaboration with several NIH consortia including ENCODE, Roadmap Epigenomics and IGVF. Dr. Kundaje is a recipient of the 2016 NIH Director’s New Innovator Award and the 2014 Alfred Sloan Fellowship.

Links: The Encyclopedia of DNA Elements (ENCODE) Project, Stanford University, MIT

Technical Research Assistant I – Gibson and Walt Labs

A position is open for a full-time research assistant (RA) in the Department of Pathology at Brigham and Women’s Hospital. Under the supervision of Dr. David Walt and Dr. Travis Gibson, the selected candidate will provide technical assistance for research pertaining to the human gut microbiome and its interactions with the host immune system. This position will provide the opportunity to work with histologic, immunohistochemical, molecular cytogenetic, and in situ protocols in a state-of-the-art lab environment.

Among other responsibilities, the selected candidate will perform multiplexed error-robust fluorescence in situ hybridization (MERFISH). The RA will perform all stages of the MERFISH protocol, including tissue fixation, permeabilization, hybridization, embedding, and clearing, in addition to imaging using a custom-built semi-automated robotic system built by a team at BWH. The selected candidate will primarily work independently to prepare and process tissue samples while optimizing the respective protocols, and will work closely with the surrounding research team to interpret the resulting data. Previous experience with protocols involving RNA is considered a strong asset. Other responsibilities of the selected candidate include assistance with the execution of the ultrasensitive single molecule array (Simoa) assay. Research conducted by the candidate in collaboration with the research team will provide new insights into the dynamics of the gut microbiome and will address underlying biological questions.

The responsibilities of the selected candidate may include:

  1. Independently performing routine and non-routine experimental protocols of moderate to high complexity. Experimental work will include preparing buffers and reagents without RNase contamination and evaluating the reagents for viability.
  2. Performing literature searches relevant to the execution of the required protocols.
  3. In collaboration with the PIs or Research Manager, modifying existing research techniques and potentially establishing new techniques.
  4. Communicating progress professionally with collaborators and with the scientific community.
  5. Coordinating and scheduling tests and procedures, and documenting experimental work accurately and in detail.
  6. Ordering laboratory supplies related to their assigned tasks.
  7. Following all lab safety protocols.

BA/BS in biological/physical science required. Some prior research experience preferred but not essential.

Skills/Abilities/Competencies Required:

  • Sound analytical and organizational skills.
  • Good oral and written communication skills.
  • Ability to logically and effectively structure tasks and set priorities.
  • Ability to identify potential problems and troubleshoot solutions.

About the lab environment

The Gibson Lab is located in the Division of Computational Pathology at Brigham and Women’s Hospital (BWH), a Harvard Medical School teaching hospital, which is the second largest non-university recipient of NIH research funding. The broad mandate of the Division of Computational Pathology is to develop and apply advanced computational methods for furthering the understanding, diagnosis, and treatment of human diseases. The Division is situated within the BWH Department of Pathology, which houses 40+ established investigators, 50+ postdoctoral research fellows, and 100+ research support staff. In addition, BWH is part of the greater Longwood Medical Area in Boston, a rich, stimulating environment conducive to intellectual development and research collaborations, which includes the Harvard Medical School quad, Harvard School of Public Health, Boston Children’s Hospital, and the Dana-Farber Cancer Institute. Many of our lab members also have appointments at the Massachusetts Institute of Technology and the Broad Institute.

Application Process

Submit (1) a cover letter and (2) a curriculum vitae to: Utkarsh Sharma, usharma1@bwh.harvard.edu

March ABC Seminar: Andrew H. Song – Brigham and Women’s Hospital – “Towards 3D pathology – The opportunities and challenges”

Human tissue, which is inherently three-dimensional (3D), is traditionally examined through standard-of-care histopathology as limited two-dimensional (2D) cross sections that can poorly represent the tissue due to sampling bias. To holistically characterize 3D histomorphology, 3D imaging modalities have been developed, but clinical translation is hampered by the complex and time-consuming requirements for manual evaluation, as well as the current lack of computational platforms to distill clinical insights from these large, high-resolution datasets. We present a deep learning model for processing tissue volumes and predicting patient outcomes with weak supervision. Recurrence risk-stratification models were trained with archived prostate cancer specimens imaged with open-top light-sheet microscopy or microcomputed tomography. By comprehensively capturing 3D morphologies, 3D block-based prognostication achieves superior performance to traditional 2D slice-based approaches, including existing clinical/histopathological baselines. Incorporating larger tissue volumes is shown to improve prognostic accuracy. This framework offers a promising direction for clinical decision support and 3D biomarker discovery, with the potential to further catalyze the growth of 3D spatial biology techniques for clinical applications.

Publication: Song AH et al., “Weakly supervised AI for efficient analysis of 3D pathology samples”, Cell (2024, In Press) Preprint

Speaker: Andrew H. Song, PhD
Affiliation: Brigham and Women’s Hospital, Harvard Medical School
Position: Research Fellow

Date: Monday, March 25, 2024
Time: 4:00PM-5:00PM ET

Zoom: https://partners.zoom.us/j/82163676866
Meeting ID: 821 6367 6866

In Person: VTC-2006b, Hale BTM 2nd Floor
Brigham and Women’s Hospital
60 Fenwood Rd, Boston 02115

Andrew is currently a postdoctoral research fellow at Brigham and Women’s Hospital and Harvard Medical School, working with Prof. Faisal Mahmood in the AI4Pathology group since early 2022. His current research focus is developing DL-based frameworks for 3D computational pathology, using images from diverse 3D imaging modalities. Previously, he received a Ph.D. in Electrical Engineering and Computer Science (EECS) from the Massachusetts Institute of Technology (MIT), co-advised by Prof. Emery N. Brown (MIT) and Prof. Demba Ba (Harvard), working at the intersection of computational neuroscience and statistical signal processing.

February ABC Seminar: Vitalii Kleshchevnikov, PhD – Wellcome Sanger Institute – “Probabilistic models to resolve cell identity and tissue architecture”

Cell identity drives cell-cell communication and tissue architecture and is in turn regulated by cell-extrinsic cues. Cell identity is determined by the combination of intrinsic, developmentally established transcription factor (TF) use and constitutive as well as cell communication-dependent TF activities. The presented work covers two probabilistic models that we developed to advance the understanding of these processes using single-cell and spatial genomic data.

Spatial transcriptomic technologies promise to resolve cellular wiring diagrams of tissues in health and disease, but comprehensive mapping of cell types in situ remains a challenge. Here we present cell2location, a Bayesian model that can resolve fine-grained cell types in spatial transcriptomic data and create comprehensive cellular maps of diverse tissues. Cell2location accounts for technical sources of variation and borrows statistical strength across locations, thereby enabling the integration of single cell and spatial transcriptomics with higher sensitivity and resolution than existing tools. We assess cell2location in three different tissues and demonstrate improved mapping of fine-grained cell types. In the mouse brain, we discover fine regional astrocyte subtypes across the thalamus and hypothalamus. In the human lymph node, we spatially map a rare pre-germinal center B cell population. In the human gut, we resolve fine immune cell populations in lymphoid follicles. Collectively our results present cell2location as a versatile analysis tool for mapping tissue architectures in a comprehensive manner.

The Python package is available at https://github.com/BayraktarLab/cell2location

Cell identity and plasticity are regulated by a combinatorial code mediated by transcription factors and the cell communication environment. Systematically dissecting how the regulatory code robustly defines the vast complexity of cell populations across tissues is a long-standing challenge. Measured using the assay for transposase-accessible chromatin with sequencing (ATAC-seq), DNA accessibility provides a readout of intermediate gene regulation steps at single-cell resolution, with technologies measuring both RNA and ATAC providing the necessary evidence to build mechanistic models of regulation. Existing methods address one or several subproblems of modelling DNA accessibility. For example, DNA sequence-based deep learning models represent combinatorial interactions and in-vivo TF-DNA recognition preferences. In contrast, gene regulatory network (GRN) models use TF abundance profiles across cells and in-vitro-derived TF-DNA recognition preferences, optionally incorporating ATAC-seq data as a filter. All of these models learn cell-type-specific weights and properties and do not generalize to new TF abundance states such as new cell types. Therefore, we are missing an end-to-end mechanistic model that represents all steps of the biological process, generalizes to both new DNA sequences and TF abundance combinations, and can simultaneously characterize hundreds to thousands of cell states observed in single-cell genomics atlases. Here, we formulated cell2state, a mechanistic end-to-end probabilistic model of TF recruitment to a chromatin locus and downstream TF effect on DNA accessibility. Cell2state is designed to achieve the generalization of regulatory predictions to unseen cell types. Cell2state A) estimates TF nuclear protein abundance and models B) how TFs recognize DNA, C) how TF sites in DNA lead to TF recruitment to a chromatin locus, and D) how the activity of DNA-associated TFs affects chromatin accessibility.
To evaluate generalization, we defined the computational problem and developed a workflow for predicting the scATAC-seq readout for previously unseen chromosomes and cell types. We show that cell2state outperforms state-of-the-art deep learning models (ChromDragoNN) at explaining DNA accessibility differences across cells. Finally, to look at cell state plasticity, we developed ways to use cell2state to simulate the possible chromatin states given the TF abundance of source cell types.

Speaker: Vitalii Kleshchevnikov, PhD
Affiliation: Wellcome Sanger Institute
Position: Bioinformatician @ Bayraktar, Stegle, Teichmann group
Host: Daniel MacDonald, Gibson Lab

Date: Monday, February 26, 2024
Time: 10:00AM-11:00AM ET
Zoom: https://partners.zoom.us/j/82163676866
Meeting ID: 821 6367 6866

Vitalii Kleshchevnikov is driven by a deep interest in three key areas: i) understanding the regulatory code which allows a single genome to specify the full diversity of cell populations and their interactions, ii) formalizing the biology of these processes into mechanistic AI/ML models, and iii) accelerating therapy development to address ageing-related alterations in these processes. Vitalii did his PhD jointly supervised by Dr Omer Bayraktar, Dr Oliver Stegle, and Dr Sarah Teichmann at the Wellcome Sanger Institute (2018-2023) and will present published and ongoing work. Prior to his PhD, Vitalii worked on the role of peptide motifs (SLiMs) in intracellular signaling (Dr Evangelia Petsalaki, EMBL-EBI), predicting CRISPR KO mutational outcomes (Dr Leopold Parts, Wellcome Sanger Institute) and profiling protein interactions in accelerated ageing (A*STAR) – while completing his MSc and BSc in Kyiv, Ukraine.
