Gibson Lab receives $450K NIH R21 grant “Tracking the microbiome: purpose-built machine learning tools for tracking microbial strains over time”

Grant Abstract: Approximately 150 million people annually experience urinary tract infections (UTI), the most common cause of which is uropathogenic Escherichia coli (UPEC). The gut is a known reservoir of UPEC, which typically reside at low abundance, but can transcend the periurethral area to invade the bladder. While the E. coli population within the gut can be diverse, it has been suggested that certain strains have a greater propensity to migrate and cause infection. This may be one driving factor to explain why half of those with an acute infection have a recurrence even after taking antibiotics that clear the first infection from the urinary tract. Being able to detect and track E. coli strains over time would have direct clinical applications for those patients who have frequent recurrences due to gut UPEC carriage. One such clinical application would be early detection and intervention before the onset of infection. Unfortunately, current metagenomic algorithms are not capable of performing strain tracking accurately enough for clinical relevance, especially for low abundance species such as E. coli. A major factor for this lack of accuracy is that all current state-of-the-art metagenomic tools completely ignore temporal dependence between samples. Even if it is known that multiple samples are from the same patient, current tools analyze those samples as if they were independent. Furthermore, many metagenomic tools ignore the sequence quality information that is provided for every nucleobase in every read. We propose to develop a more precise strain tracking algorithm that does take this additional information into account, making the tool host-time-quality aware. Finally, we will pilot and validate our algorithm on a clinically relevant gnotobiotic colonization model. Specifically, humanized germ-free mice will be undergoing two rounds of E. coli challenges with therapeutic perturbations from antibiotics or mannosides, a small molecule precision antibiotic-sparing therapeutic. We propose the following specific aims: (1) Develop the first purpose-built computational method for tracking bacterial strains in the microbiome over time, (2) Gnotobiotic mouse model undergoing UPEC challenges and a therapeutic perturbation. These aims would advance the microbiome field forward allowing for the future development of therapeutics and clinical diagnostics.

Link to NIH Award | Gibson Lab Website

Mahmood Lab’s study on AI-based cancer origin prediction using conventional histology is published in Nature

Cancer of unknown primary (CUP) origin is an enigmatic group of diagnoses in which the primary anatomical site of tumour origin cannot be determined [1,2]. This poses a considerable challenge, as modern therapeutics are predominantly specific to the primary tumour [3]. Recent research has focused on using genomics and transcriptomics to identify the origin of a tumour [4–9]. However, genomic testing is not always performed and lacks clinical penetration in low-resource settings. Here, to overcome these challenges, we present a deep-learning-based algorithm, Tumour Origin Assessment via Deep Learning (TOAD), that can provide a differential diagnosis for the origin of the primary tumour using routinely acquired histology slides. We used whole-slide images of tumours with known primary origins to train a model that simultaneously identifies the tumour as primary or metastatic and predicts its site of origin. On our held-out test set of tumours with known primary origins, the model achieved a top-1 accuracy of 0.83 and a top-3 accuracy of 0.96, whereas on our external test set it achieved top-1 and top-3 accuracies of 0.80 and 0.93, respectively. We further curated a dataset of 317 cases of CUP for which a differential diagnosis was assigned. Our model predictions resulted in concordance for 61% of cases and a top-3 agreement of 82%. TOAD can be used as an assistive tool to assign a differential diagnosis to complicated cases of metastatic tumours and CUPs and could be used in conjunction with or in lieu of ancillary tests and extensive diagnostic work-ups to reduce the occurrence of CUP.

Lu, M.Y., Chen, T.Y., Williamson, D.F.K. et al. AI-based pathology predicts origins for cancers of unknown primary. Nature 594, 106–110 (2021). https://doi.org/10.1038/s41586-021-03512-4
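
For readers unfamiliar with the top-1 and top-3 accuracies quoted in the abstract above, the following is a minimal sketch of how a top-k accuracy is generally computed. It is not taken from the TOAD code base: the function name `top_k_accuracy`, the toy scores, and the three site labels are all hypothetical and chosen only for illustration.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of cases whose true label is among the k highest-scoring classes.

    scores: list of dicts mapping a class name to a predicted score (one dict per case)
    labels: list of true class names, aligned with scores
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and the same length")
    hits = 0
    for case_scores, true_label in zip(scores, labels):
        # Rank the class names from highest to lowest predicted score.
        ranked = sorted(case_scores, key=case_scores.get, reverse=True)
        if true_label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical toy predictions for four cases over three candidate sites.
toy_scores = [
    {"lung": 0.70, "breast": 0.20, "colorectal": 0.10},
    {"lung": 0.50, "breast": 0.30, "colorectal": 0.20},
    {"lung": 0.10, "breast": 0.30, "colorectal": 0.60},
    {"lung": 0.40, "breast": 0.35, "colorectal": 0.25},
]
toy_labels = ["lung", "breast", "colorectal", "colorectal"]

for k in (1, 2, 3):
    print(f"top-{k} accuracy:", top_k_accuracy(toy_scores, toy_labels, k))
```

A top-3 accuracy counts a case as correct whenever the true site appears anywhere among the three highest-ranked predictions, which is why it is always at least as high as the top-1 figure.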

Mahmood Lab’s CLAM method, A Deep-Learning-based Pipeline for Data Efficient and Weakly Supervised Whole-Slide-level Analysis, published in Nature Biomedical Engineering

Deep-learning methods for computational pathology require either manual annotation of gigapixel whole-slide images (WSIs) or large datasets of WSIs with slide-level labels and typically suffer from poor domain adaptation and interpretability. Here we report an interpretable weakly supervised deep-learning method for data-efficient WSI processing and learning that only requires slide-level labels. The method, which we named clustering-constrained-attention multiple-instance learning (CLAM), uses attention-based learning to identify subregions of high diagnostic value to accurately classify whole slides and instance-level clustering over the identified representative regions to constrain and refine the feature space. By applying CLAM to the subtyping of renal cell carcinoma and non-small-cell lung cancer as well as the detection of lymph node metastasis, we show that it can be used to localize well-known morphological features on WSIs without the need for spatial labels, that it outperforms standard weakly supervised classification algorithms and that it is adaptable to independent test cohorts, smartphone microscopy and varying tissue content.

Lu, M.Y., Williamson, D.F.K., Chen, T.Y. et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat Biomed Eng 5, 555–570 (2021). https://doi.org/10.1038/s41551-020-00682-w
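
The attention-based learning mentioned in the abstract above can be illustrated with a generic attention pooling step for multiple-instance learning, in which each patch receives a weight and the slide is summarized as the weighted sum of its patch features. This is a minimal sketch of that general idea only, not the CLAM implementation: it omits the instance-level clustering, and the function `attention_pool`, the parameters `V` and `w`, and the toy patch features are assumptions made for illustration.

```python
import math


def attention_pool(instances, V, w):
    """Generic attention pooling over a bag of instance feature vectors.

    instances: list of feature vectors (e.g. one per image patch)
    V: hidden-by-feature weight matrix (list of rows), assumed already learned
    w: hidden-sized weight vector, assumed already learned
    Returns (attention weights, pooled bag-level feature vector).
    """
    scores = []
    for h in instances:
        # Score each instance as w . tanh(V h).
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * zi for wi, zi in zip(w, hidden)))
    # Numerically stable softmax turns scores into weights that sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The bag representation is the attention-weighted sum of instance features.
    dim = len(instances[0])
    pooled = [sum(a * h[j] for a, h in zip(weights, instances)) for j in range(dim)]
    return weights, pooled


# Hypothetical toy bag: three patches with two features each.
patches = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
V = [[1.0, 0.0], [0.0, 1.0]]  # illustrative values, not trained parameters
w = [1.0, 1.0]

weights, pooled = attention_pool(patches, V, w)
print("attention weights:", weights)
print("pooled feature:", pooled)
```

In this toy bag the second patch receives the largest weight, which mirrors how attention weights can be read as a rough indication of which regions contributed most to a slide-level prediction.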

$2.9M grant from the National Science Foundation “The rules of microbiota colonization of the mammalian gut”

The Gerber Lab, in collaboration with the Wang Lab at Columbia and the Gibson Lab at BWH, has received a $2.9M grant from the National Science Foundation to develop and apply novel computational and experimental methods to elucidate fundamental rules governing the formation and maintenance of complex microbial ecosystems in the mammalian gut.

Abstract: Microbiomes, or the collections of trillions of bacteria and other micro-organisms living on, within and around us, have enormous impact on human life. For example, they help people digest food, promote the growth of farm animals and crops, and degrade pollutants in the environment. Despite the importance of microbiomes, the processes governing their formation and maintenance remain poorly understood. The mammalian gut is a particularly intriguing system for microbiome studies, since a diverse collection of microbes has evolved that specifically colonizes and functions in that environment. The goal of the project is to derive fundamental rules that describe and predict the dynamic process of microbial colonization of the mammalian gut. To achieve this goal, the team of investigators will develop new computer-based methods to automatically extract predictive and explanatory rules from large microbiome datasets. The team will also develop new experimental tools and generate datasets in mice measuring how microbiomes change over time and across space in the mammalian gut. Overall, the project will further the understanding of the formation of microbiomes in mammals and can provide broader insights into the emergence of other microbial ecosystems, such as those in soil and marine environments. These insights could ultimately help scientists to rationally alter or maintain microbiomes in different environments to benefit human activities. The project will also generate practical resources for the scientific community (computer-based tools and datasets) and provide education on the microbiome to college and elementary school students through courses and hands-on labs.

A wealth of genomic data provides information as to which microbes are present in environments, but little insight into underlying factors that explain or predict complex assemblages of microbial consortia. This project aims to elucidate mechanistic factors that drive the dynamic process of microbial colonization of the mammalian gut. These determinants will be investigated at multiple systems scales, from the level of microbial communities down to the level of individual genes. The project will leverage high-throughput experimental methods developed by the investigators, to generate data characterizing functional genetic selection and spatial organization of microbiota in the mammalian gut. From the Computer Science perspective, the project will develop new computational methods to infer human-interpretable rules and other structured outputs from complex and noisy high-throughput microbiome datasets, using Bayesian and neural-style approaches that incorporate prior biological knowledge while scaling to massive datasets. This project has three main thrusts: 1) Learn microbial community-level rules that quantitatively predict population dynamics of mouse gut colonization and assess these rules across differing ranges of microbial diversity and composition, 2) Elucidate microbial gene-level mechanisms that predict mouse gut colonization dynamics, and 3) Profile microbial spatiotemporal organization and dynamics during gut colonization at the species and gene level to predict microbial community dynamics. The project is expected to establish a set of new computational and experimental tools and principles for understanding the rules of microbial colonization of the gut, with potential applications to other ecosystems including gut microbiota of non-mammalian species as well as complex environmental microbiota.