Mahmood Lab develops deep learning model for transforming tissue images: Nature Biomedical Engineering 2022

Histological artefacts in cryosectioned tissue can hinder rapid diagnostic assessments during surgery. Formalin-fixed and paraffin-embedded (FFPE) tissue provides higher quality slides, but the process for obtaining them is laborious (typically lasting 12–48 h) and hence unsuitable for intra-operative use. Here we report the development and performance of a deep-learning model that improves the quality of cryosectioned whole-slide images by transforming them into the style of whole-slide FFPE tissue within minutes. The model consists of a generative adversarial network incorporating an attention mechanism that rectifies cryosection artefacts and a self-regularization constraint between the cryosectioned and FFPE images for the preservation of clinically relevant features. Transformed FFPE-style images of gliomas and of non-small-cell lung cancers from a dataset independent from that used to train the model improved the rates of accurate tumour subtyping by pathologists.

Ozyoruk, K.B., Can, S., Darbaz, B. et al. A deep-learning model for transforming the style of tissue images from cryosectioned to formalin-fixed and paraffin-embedded. Nat. Biomed. Eng 6, 1407–1419 (2022). https://doi.org/10.1038/s41551-022-00952-9

Mahmood Lab develops self-supervised deep learning algorithm: Nature Biomedical Engineering 2022

The adoption of digital pathology has enabled the curation of large repositories of gigapixel whole-slide images (WSIs). Computationally identifying WSIs with similar morphologic features within large repositories without requiring supervised training can have significant applications. However, the retrieval speeds of algorithms for searching similar WSIs often scale with the repository size, which limits their clinical and research potential. Here we show that self-supervised deep learning can be leveraged to search for and retrieve WSIs at speeds that are independent of repository size. The algorithm, which we named SISH (for self-supervised image search for histology) and provide as an open-source package, requires only slide-level annotations for training, encodes WSIs into meaningful discrete latent representations and leverages a tree data structure for fast searching followed by an uncertainty-based ranking algorithm for WSI retrieval. We evaluated SISH on multiple tasks (including retrieval tasks based on tissue-patch queries) and on datasets spanning over 22,000 patient cases and 56 disease subtypes. SISH can also be used to aid the diagnosis of rare cancer types for which the number of available WSIs is often insufficient to train supervised deep-learning models.

Chen, C., Lu, M.Y., Williamson, D.F.K. et al. Fast and scalable search of whole-slide images via self-supervised deep learning. Nat. Biomed. Eng 6, 1420–1434 (2022). https://doi.org/10.1038/s41551-022-00929-8
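As a rough illustration of the search idea described in the abstract (slides encoded as discrete codes, then looked up in an ordered structure), the sketch below stores integer codes in a sorted list and finds the closest stored code by binary search. This is not the SISH implementation: the `CodeIndex` class, the integer codes, and the slide names are hypothetical, and a sorted list with binary search merely stands in for the tree data structure. It also does not reproduce the repository-size-independent speed reported in the paper.

```python
import bisect


class CodeIndex:
    """Toy sorted index from integer codes to slide identifiers (hypothetical)."""

    def __init__(self):
        self._codes = []   # integer codes, kept in ascending order
        self._slides = []  # slide id stored at the same position as its code

    def add(self, code, slide_id):
        # Insert while keeping both lists aligned and sorted by code.
        pos = bisect.bisect_left(self._codes, code)
        self._codes.insert(pos, code)
        self._slides.insert(pos, slide_id)

    def nearest(self, query_code):
        # Return (slide_id, code) for the stored code closest to query_code.
        if not self._codes:
            raise LookupError("index is empty")
        pos = bisect.bisect_left(self._codes, query_code)
        # Only the neighbours on either side of the insertion point can be closest.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(self._codes)]
        best = min(candidates, key=lambda i: abs(self._codes[i] - query_code))
        return self._slides[best], self._codes[best]


# Made-up codes standing in for discrete latent representations.
index = CodeIndex()
index.add(1200, "slide_A")
index.add(450, "slide_B")
index.add(9800, "slide_C")
match = index.nearest(1000)
```

In this toy setting, similarity is just the numeric distance between codes; a real system would also need an encoder that maps similar tissue to nearby codes and a ranking step over the retrieved candidates.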

Gerber Lab’s “MDITRE: Scalable and Interpretable Machine Learning for Predicting Host Status from Temporal Microbiome Dynamics” is mSystems Editor’s Pick

Longitudinal microbiome data sets are being generated with increasing regularity, and there is broad recognition that these studies are critical for unlocking the mechanisms through which the microbiome impacts human health and disease. However, there is a dearth of computational tools for analyzing microbiome time-series data. To address this gap, we developed an open-source software package, Microbiome Differentiable Interpretable Temporal Rule Engine (MDITRE), which implements a new highly efficient method leveraging deep-learning technologies to derive human-interpretable rules that predict host status from longitudinal microbiome data. Using semi-synthetic data and a large compendium of publicly available 16S rRNA amplicon and metagenomics sequencing data sets, we demonstrate that in almost all cases, MDITRE performs on par with or better than popular uninterpretable machine learning methods, and orders of magnitude faster than the prior interpretable technique. MDITRE also provides a graphical user interface, which we show through case studies can be used to derive biologically meaningful interpretations linking patterns of microbiome changes over time with host phenotypes.

Mahmood Lab’s Pan-cancer integrative histology-genomic analysis is featured on cover of Cancer Cell

The rapidly emerging field of computational pathology has demonstrated promise in developing objective prognostic models from histology images. However, most prognostic models are either based on histology or genomics alone and do not address how these data sources can be integrated to develop joint image-omic prognostic models. Additionally, identifying explainable morphological and molecular descriptors from these models that govern such prognosis is of interest. We use multimodal deep learning to jointly examine pathology whole-slide images and molecular profile data from 14 cancer types. Our weakly supervised, multimodal deep-learning algorithm is able to fuse these heterogeneous modalities to predict outcomes and discover prognostic features that correlate with poor and favorable outcomes. We present all analyses for morphological and molecular correlates of patient prognosis across the 14 cancer types at both a disease and a patient level in an interactive open-access database to allow for further exploration, biomarker discovery, and feature assessment.

Chen RJ, Lu MY, Williamson DFK, Chen TY, Lipkova J, Noor Z, Shaban M, Shady M, Williams M, Joo B, Mahmood F. Pan-cancer integrative histology-genomic analysis via multimodal deep learning. Cancer Cell. 2022 Aug 8;40(8):865-878.e6. doi: 10.1016/j.ccell.2022.07.004. PMID: 35944502.

The Massachusetts Lab for Artificial Intelligence/Deep Learning for the Microbiome

Through a $3.3M grant from the Massachusetts Life Sciences Center and in-kind support from Brigham and Women’s Hospital and Mass General Brigham, the BWH Massachusetts Host-Microbiome Center (MHMC) and Division of Computational Pathology will establish a new lab to develop and apply advanced AI/deep learning technologies to microbiome research. Dr. Georg Gerber, Chief of BWH Computational Pathology and co-director of the MHMC, will head the new lab.

The microbiome is inherently complex and dynamic. Multi-omic data characterizing microbes in culture systems, animal models, and human populations can provide unique and complementary insights into these rich host-microbial ecosystems. However, to fully realize the potential of these data, sophisticated computational approaches are needed.

Artificial Intelligence (AI), and in particular Deep Learning (DL), are revolutionizing many fields, such as speech and image recognition. These technologies are also increasingly impacting the biomedical sciences.

The Lab aims to unleash the power of AI and DL technologies for the microbiome field.

Anchored by a dedicated large GPU cluster with Tesla A100 nodes, together with CPU compute clusters, the Lab will develop custom AI/DL applications for the microbiome, deploy existing software in a managed and easy-to-use environment, and provide outreach and education to the microbiome community. The Lab will be staffed by principal investigators in the Division of Computational Pathology, as well as an application scientist and network engineers.

A joint initiative between the Brigham and Women’s Hospital (BWH) Division of Computational Pathology and the Massachusetts Host-Microbiome Center (MHMC), the Lab is funded by the Massachusetts Life Sciences Center and Brigham and Women’s Hospital/Mass General Brigham. Industry and academic users will be able to access the Lab through the MHMC’s existing core services model and through collaborations.

Gerber lab study showing gut metabolites predict C. diff recurrence

Clostridioides difficile infection (CDI) is the most common hospital-acquired infection in the USA, with recurrence rates > 15%. Although primary CDI has been extensively linked to gut microbial dysbiosis, less is known about the factors that promote or mitigate recurrence. Using broad metabolomics data together with statistical and machine learning models, Jen Dawkins, an HST PhD student and member of the Gerber lab, showed that metabolites in the gut can accurately predict C. difficile recurrence. These findings have implications for the development of diagnostic tests and treatments that could ultimately short-circuit the cycle of CDI recurrence, by providing candidate metabolic biomarkers for diagnostics development, as well as offering insights into the complex microbial and metabolic alterations that are protective or permissive for recurrence.

Dawkins JJ, Allegretti JR, Gibson TE, McClure E, Delaney M, Bry L, Gerber GK. Gut metabolites predict Clostridioides difficile recurrence. Microbiome. 2022 Jun 9;10(1):87. doi: 10.1186/s40168-022-01284-1. PMID: 35681218; PMCID: PMC9178838.

Mahmood Lab’s deep learning-enabled assessment of cardiac transplant rejection study is published in Nature Medicine

Endomyocardial biopsy (EMB) screening represents the standard of care for detecting allograft rejections after heart transplant. Manual interpretation of EMBs is affected by substantial interobserver and intraobserver variability, which often leads to inappropriate treatment with immunosuppressive drugs, unnecessary follow-up biopsies and poor transplant outcomes. Here we present a deep learning-based artificial intelligence (AI) system for automated assessment of gigapixel whole-slide images obtained from EMBs, which simultaneously addresses detection, subtyping and grading of allograft rejection. To assess model performance, we curated a large dataset from the United States, as well as independent test cohorts from Turkey and Switzerland, which includes large-scale variability across populations, sample preparations and slide scanning instrumentation. The model detects allograft rejection with an area under the receiver operating characteristic curve (AUC) of 0.962; assesses the cellular and antibody-mediated rejection type with AUCs of 0.958 and 0.874, respectively; detects Quilty B lesions, benign mimics of rejection, with an AUC of 0.939; and differentiates between low-grade and high-grade rejections with an AUC of 0.833. In a human reader study, the AI system showed non-inferior performance to conventional assessment and reduced interobserver variability and assessment time. This robust evaluation of cardiac allograft rejection paves the way for clinical trials to establish the efficacy of AI-assisted EMB assessment and its potential for improving heart transplant outcomes.

Lipkova, J., Chen, T.Y., Lu, M.Y. et al. Deep learning-enabled assessment of cardiac allograft rejection from endomyocardial biopsies. Nat Med 28, 575–582 (2022). https://doi.org/10.1038/s41591-022-01709-2
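For readers less familiar with the AUC figures quoted above, the following minimal sketch computes an area under the ROC curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The scores and labels are made up for illustration and are unrelated to the study's data; the function name `roc_auc` is likewise an assumption.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive and negative scores.

    labels: 1 for a positive case (e.g. rejection), 0 for a negative case.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up model scores for five cases.
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)  # 5 of 6 positive/negative pairs are ordered correctly
```

An AUC of 1.0 means every positive case outranks every negative one, while 0.5 corresponds to chance-level ranking.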

Gibson Lab receives $2.2 Million NIH R35 grant “Machine Learning and Control Principles for Computational Biology”

Grant Abstract: With our increasing ability to measure biological data at scale and the digitalization of health records, computational thinking is becoming ever more important in the biological sciences and healthcare. The research directions proposed in this grant look to build robust machine learning models and tools for computational biology by including principles and analysis from other engineering fields, like control, that have a proven record of incorporating robustness into the systems they have automated. This increased robustness will save resources during the development of these machine learning models. It will also lead to more reliable diagnostics, clinical tools, and machine learning-based biological discoveries. We have proposed three future research directions at the intersection of machine learning, control, and computational biology: (a) modeling dynamical systems, (b) robust optimization schemes, and (c) control principles for in vivo modeling of microbial communities. The first proposed research area involves the development of flexible models for performing inference on dynamical systems models with time-series data. Dynamical systems models are able to learn mathematically causal relationships between variables, compared to other models whose parameters may only have correlative relationships. Our flexible models will be differentiable, allowing them to be trained using the same efficient algorithms and hardware that have propelled deep learning models into the spotlight. These differentiable methods will allow us to more easily integrate the uncertainty associated with biological measurements into our models. The second research area looks to develop more robust gradient optimization algorithms, the workhorse for training deep neural networks. Many of the popular algorithms used to train deep neural networks were not explicitly designed to be robust. By developing more robust optimization techniques, machine learning models trained on disparate data sets at different hospitals or labs will be more reproducible and will require less time for tuning parameters, ultimately saving resources as well. These robust optimization techniques will also aid in the certification of machine learning-based tools that will ultimately be deployed in the clinic. The third research area we propose is an approach for the discovery and design of robust microbial communities. Communities of commensal, or engineered, bacteria have long been proposed as alternative therapies for the treatment of gut-related illness (“bugs as drugs”). We propose a top-down approach to identifying putative microbial consortia members from time-series experiments with germ-free mice colonized by complex flora. By beginning the consortia design process in vivo, we hope to overcome the challenge that many other attempts at consortia construction have encountered, where in vitro designed communities do not reproduce their intended properties once transferred into living host organisms. The tools from this work will be built using open access software and all data will be made easily accessible and explorable to the public.

Gibson Lab receives $450K NIH R21 grant “Tracking the microbiome: purpose-built machine learning tools for tracking microbial strains over time”

Grant Abstract: Approximately 150 million people annually experience urinary tract infections (UTI), the most common cause of which is uropathogenic Escherichia coli (UPEC). The gut is a known reservoir of UPEC, which typically reside at low abundance, but can transcend the periurethral area to invade the bladder. While the E. coli population within the gut can be diverse, it has been suggested that certain strains have a greater propensity to migrate and cause infection. This may be one driving factor to explain why half of those with an acute infection have a recurrence even after taking antibiotics that clear the first infection from the urinary tract. Being able to detect and track E. coli strains over time would have direct clinical applications for those patients who have frequent recurrences due to gut UPEC carriage. One such clinical application would be early detection and intervention before the onset of infection. Unfortunately, current metagenomic algorithms are not capable of performing strain tracking accurately enough for clinical relevance, especially for low abundance species such as E. coli. A major factor for this lack of accuracy is that all current state-of-the-art metagenomic tools completely ignore temporal dependence between samples. Even if it is known that multiple samples are from the same patient, current tools analyze those samples as if they were independent. Furthermore, many metagenomic tools ignore the sequence quality information that is provided for every nucleobase in every read. We propose to develop a more precise strain tracking algorithm that does take this additional information into account, making the tool host-time-quality aware. Finally, we will pilot and validate our algorithm on a clinically relevant gnotobiotic colonization model. Specifically, humanized germ-free mice will undergo two rounds of E. coli challenges with therapeutic perturbations from antibiotics or mannosides, a small molecule precision antibiotic-sparing therapeutic. We propose the following specific aims: (1) develop the first purpose-built computational method for tracking bacterial strains in the microbiome over time, and (2) validate it in a gnotobiotic mouse model undergoing UPEC challenges and a therapeutic perturbation. These aims would advance the microbiome field, allowing for the future development of therapeutics and clinical diagnostics.

Mahmood Lab’s study on AI-based cancer origin prediction using conventional histology is published in Nature

Cancer of unknown primary (CUP) origin is an enigmatic group of diagnoses in which the primary anatomical site of tumour origin cannot be determined. This poses a considerable challenge, as modern therapeutics are predominantly specific to the primary tumour. Recent research has focused on using genomics and transcriptomics to identify the origin of a tumour. However, genomic testing is not always performed and lacks clinical penetration in low-resource settings. Here, to overcome these challenges, we present a deep-learning-based algorithm, Tumour Origin Assessment via Deep Learning (TOAD), that can provide a differential diagnosis for the origin of the primary tumour using routinely acquired histology slides. We used whole-slide images of tumours with known primary origins to train a model that simultaneously identifies the tumour as primary or metastatic and predicts its site of origin. On our held-out test set of tumours with known primary origins, the model achieved a top-1 accuracy of 0.83 and a top-3 accuracy of 0.96, whereas on our external test set it achieved top-1 and top-3 accuracies of 0.80 and 0.93, respectively. We further curated a dataset of 317 cases of CUP for which a differential diagnosis was assigned. Our model predictions resulted in concordance for 61% of cases and a top-3 agreement of 82%. TOAD can be used as an assistive tool to assign a differential diagnosis to complicated cases of metastatic tumours and CUPs and could be used in conjunction with or in lieu of ancillary tests and extensive diagnostic work-ups to reduce the occurrence of CUP.

Lu, M.Y., Chen, T.Y., Williamson, D.F.K. et al. AI-based pathology predicts origins for cancers of unknown primary. Nature 594, 106–110 (2021). https://doi.org/10.1038/s41586-021-03512-4
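The top-1 and top-3 accuracies reported in the abstract can be read as follows: a prediction counts as correct at top-k if the true site of origin is among the k classes the model scores highest. The sketch below shows that calculation on made-up scores for three hypothetical classes; the function name `top_k_accuracy` and the data are illustrative assumptions, not taken from TOAD.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of cases whose true label is among the k highest-scoring classes."""
    hits = 0
    for case_scores, true_label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(case_scores)),
                        key=lambda c: case_scores[c], reverse=True)
        if true_label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Made-up per-class scores for four cases over three hypothetical sites.
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
    [0.2, 0.5, 0.3],
]
labels = [0, 2, 1, 0]  # true class index for each case

top1 = top_k_accuracy(scores, labels, 1)
top2 = top_k_accuracy(scores, labels, 2)
```

Top-k accuracy can only stay the same or rise as k grows, which is why a top-3 figure is always at least as high as the corresponding top-1 figure.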