
Graph neural networks (GNNs) and transformers are two widely used neural network architectures for obtaining contextual embeddings from biomedical data. However, each architecture involves a trade-off between the amount of training data it requires and its representation power. As examples, I will discuss TEA-graph, which employs a GNN to define the contextual pathological features related to cancer patients’ survival, and GRIP, which combines a GNN and a transformer to define the set of immune receptors linked to patients’ survival.

Genomics is another complex biomedical data modality rich in contextual information. Vision and language research leverages transformer-based foundation models: models trained through self-supervised learning on datasets ranging from millions to billions in size, which perform strongly across a wide range of downstream applications. Similarly, we can now train a large model using ~50M single-cell RNA-seq datasets. Some initial efforts have already shown promising results in understanding genetic mechanisms through perturbation prediction and in silico perturbations. With the contextual gene embeddings obtained from such a model, we can even transfer gene embeddings to the analysis of bulk RNA-seq datasets. In line with these efforts, I would like to share recent progress in obtaining meaningful contextual gene embeddings with the transformer architecture and discuss opportunities for multi-modal training that links transcriptomics with images or text.

Research links
https://www.nature.com/articles/s41551-022-00923-0
https://ojs.aaai.org/index.php/AAAI/article/view/25645
https://www.nature.com/articles/s41586-023-06139-9

Yongju Lee is a Postdoctoral Fellow at Genentech Research and Early Development, where he has worked under the mentorship of Aviv Regev since spring 2023. He recently earned his Ph.D. from the Department of Electrical and Computer Engineering at Seoul National University, advised by Sunghoon Kwon. His research focuses on tailoring deep learning models to various biomedical data modalities and on accelerating scientific and medical discovery by interpreting the outputs of those models. He has developed methods for pathology images, immune repertoires, and spatial omics data. His ongoing research involves establishing a single-cell foundation model and expanding its capabilities to include biomedical images and text data.

All Welcome! Note this event will take place on Zoom: https://partners.zoom.us/j/82163676866


For further information about this seminar series, contact tarnoldmages@bwh.harvard.edu

Brigham and Women's Hospital, Division of Computational Pathology

Yongju Lee, PhD, Genentech – “Contextual representation of pathology, immune repertoire by transformer and graph neural network, and transcriptomic contextual embedding via single-cell foundation model”