Graph neural networks (GNNs) and transformers are two well-established neural network architectures for obtaining contextual embeddings from biomedical data. However, each involves a trade-off between the amount of training data it requires and its representational power. As examples, I will discuss TEA-graph, which employs a GNN to identify contextual pathological features related to cancer patients’ survival, and GRIP, which combines a GNN with a transformer to identify the set of immune receptors linked to patients’ survival.
Genomics is another interesting and complex type of biomedical data that is rich in contextual information. Vision and language research leverages transformer-based foundation models: models trained through self-supervised learning on datasets ranging in size from millions to billions, which perform strongly across a wide range of downstream applications. In the same way, we can now train a large model using ~50M single-cell RNA-seq datasets. Some initial efforts have already shown promising results in understanding genetic mechanisms through perturbation prediction and in silico perturbations. With the contextual gene embeddings obtained from such a model, we can even transfer the embeddings to the analysis of bulk RNA-seq datasets. In line with these efforts, I would like to share recent progress in obtaining meaningful contextual gene embeddings with the transformer architecture and discuss opportunities for multi-modal training that links transcriptomics with images or text.
Yongju Lee is a Postdoctoral Fellow at Genentech Research and Early Development, where he has worked under the mentorship of Aviv Regev since spring 2023. He recently earned his Ph.D. from the Department of Electrical and Computer Engineering at Seoul National University, advised by Sunghoon Kwon. His research focuses on tailoring deep learning models to various biomedical data modalities and on accelerating scientific and medical discovery by interpreting the outputs of those models. He has developed methods for pathology images, immune repertoires, and spatial omics data. His ongoing research involves establishing a single-cell foundation model and expanding its capabilities to include biomedical images and text data.
All Welcome! Note this event will take place on Zoom: https://partners.zoom.us/j/82163676866
For further information about this seminar series, contact email@example.com