The Massachusetts Lab for Artificial Intelligence/Deep Learning for the Microbiome

Through a $3.3M grant from the Massachusetts Life Sciences Center and in-kind support from Brigham and Women’s Hospital and Mass General Brigham, the BWH Massachusetts Host-Microbiome Center (MHMC) and Division of Computational Pathology will establish a new lab to develop and apply advanced AI/deep learning technologies to microbiome research. Dr. Georg Gerber, Chief of BWH Computational Pathology and co-director of the MHMC, will head the new lab.

The microbiome is inherently complex and dynamic. Multi-omic data characterizing microbes in culture systems, animal models, and human populations can provide unique and complementary insights into these rich host-microbial ecosystems. However, to fully realize the potential of these data, sophisticated computational approaches are needed.

Artificial Intelligence (AI), and in particular Deep Learning (DL), is revolutionizing many fields, such as speech and image recognition. These technologies are also increasingly impacting the biomedical sciences.

The Lab aims to unleash the power of AI and DL technologies for the microbiome field.

Anchored by dedicated large GPU (Tesla A100 nodes) and CPU compute clusters, the Lab will develop custom AI/DL applications for the microbiome, deploy existing software in a managed and easy-to-use environment, and provide outreach and education to the microbiome community. The Lab will be staffed by principal investigators in the Division of Computational Pathology, as well as an application scientist and network engineers.

A joint initiative between the Brigham and Women’s Hospital (BWH) Division of Computational Pathology and the Massachusetts Host-Microbiome Center (MHMC), the Lab is funded by the Massachusetts Life Sciences Center and Brigham and Women’s Hospital/Mass General Brigham. Industry and academic users will be able to access the Lab through the MHMC’s existing core services model and through collaborations.

Gerber lab study showing gut metabolites predict C. diff recurrence

Clostridioides difficile infection (CDI) is the most common hospital-acquired infection in the USA, with recurrence rates > 15%. Although primary CDI has been extensively linked to gut microbial dysbiosis, less is known about the factors that promote or mitigate recurrence. Using broad metabolomics data together with statistical and machine learning models, Jen Dawkins, an HST PhD student and member of the Gerber lab, showed that metabolites in the gut can accurately predict C. difficile recurrence. These findings have implications for the development of diagnostic tests and treatments that could ultimately short-circuit the cycle of CDI recurrence: they provide candidate metabolic biomarkers for diagnostics development and offer insights into the complex microbial and metabolic alterations that are protective or permissive for recurrence.

Dawkins JJ, Allegretti JR, Gibson TE, McClure E, Delaney M, Bry L, Gerber GK. Gut metabolites predict Clostridioides difficile recurrence. Microbiome. 2022 Jun 9;10(1):87. doi: 10.1186/s40168-022-01284-1. PMID: 35681218; PMCID: PMC9178838.

Gibson Lab receives $2.2 Million NIH R35 grant “Machine Learning and Control Principles for Computational Biology”

Grant Abstract: With our increasing ability to measure biological data at scale and the digitalization of health records, computational thinking is becoming ever more important in the biological sciences and healthcare. The research directions proposed in this grant look to build robust machine learning models and tools for computational biology by including principles and analysis from other engineering fields, like control, that have a proven record of incorporating robustness into the systems they have automated. This increased robustness will save resources during the development of these machine learning models. It will also lead to more reliable diagnostics, clinical tools, and machine learning based biological discoveries. We have proposed three future research directions at the intersection of machine learning, control, and computational biology: (a) modeling dynamical systems, (b) robust optimization schemes, and (c) control principles for in vivo modeling of microbial communities. The first proposed research area involves the development of flexible models for performing inference on dynamical systems models with time-series data. Dynamical systems models are able to learn mathematically causal relationships between variables, compared to other models whose parameters may only have correlative relationships. Our flexible models will be differentiable, allowing them to be trained using the same efficient algorithms and hardware that have propelled deep learning models into the spotlight. These differentiable methods will allow us to more easily integrate the uncertainty associated with biological measurements into our models. The second research area looks to develop more robust gradient optimization algorithms, the workhorse for training deep neural networks. Many of the popular algorithms used to train deep neural networks were not explicitly designed to be robust. With more robust optimization techniques, machine learning models trained on disparate data sets at different hospitals or labs will be more reproducible and will require less time for tuning parameters, ultimately saving resources as well. These robust optimization techniques will also aid in the certification of machine learning based tools that will ultimately be deployed in the clinic. The third research area we propose is an approach for the discovery and design of robust microbial communities. Communities of commensal, or engineered, bacteria have long been proposed as alternative therapies for the treatment of gut-related illness (“bugs as drugs”). We propose a top-down approach to identifying putative microbial consortia members from time-series experiments with germ-free mice colonized by complex flora. By beginning the consortia design process in vivo, we hope to overcome a challenge that many other attempts at consortia construction have encountered: in vitro designed communities do not reproduce their intended properties once transferred into living host organisms. The tools from this work will be built using open access software and all data will be made easily accessible and explorable to the public.
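The first research area, fitting dynamical systems models to time-series data with gradient-based training, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the grant: the single-species logistic growth model dx/dt = x(r + a·x), the Euler discretization, the synthetic noise-free data, and the hand-derived gradients that stand in for the automatic differentiation a deep learning framework would provide.

```python
def simulate(r, a, x0, dt, steps):
    """Euler rollout of the (assumed) model dx/dt = x * (r + a * x)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + dt * x * (r + a * x))
    return xs


def loss_and_grad(r, a, xs, dt):
    """Mean squared one-step prediction error and its gradient w.r.t. (r, a)."""
    n = len(xs) - 1
    loss = grad_r = grad_a = 0.0
    for x, x_next in zip(xs[:-1], xs[1:]):
        err = x_next - (x + dt * x * (r + a * x))
        loss += err * err / n
        # Derivatives of err**2 with respect to r and a (chain rule).
        grad_r += -2.0 * err * dt * x / n
        grad_a += -2.0 * err * dt * x * x / n
    return loss, grad_r, grad_a


def fit(xs, dt, lr=20.0, iters=5000):
    """Plain gradient descent from (r, a) = (0, 0); records the loss per step."""
    r, a = 0.0, 0.0
    history = []
    for _ in range(iters):
        loss, g_r, g_a = loss_and_grad(r, a, xs, dt)
        history.append(loss)
        r -= lr * g_r
        a -= lr * g_a
    return r, a, history


# Hypothetical "observed" abundances generated from known parameters.
TRUE_R, TRUE_A, DT = 1.0, -1.0, 0.1
observed = simulate(TRUE_R, TRUE_A, x0=0.1, dt=DT, steps=60)
r_hat, a_hat, history = fit(observed, DT)
print(f"loss: {history[0]:.3e} -> {history[-1]:.3e}, r_hat={r_hat:.3f}, a_hat={a_hat:.3f}")
```

The step size is chosen for this particular data scale, where abundances stay below 1. Because the two parameters act on strongly correlated terms (x and x²), plain gradient descent recovers them slowly, which is one small illustration of why the choice of optimizer matters.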

Link to NIH Award
Gibson Lab Website

Gibson Lab receives $450K NIH R21 grant “Tracking the microbiome: purpose-built machine learning tools for tracking microbial strains over time”

Grant Abstract: Approximately 150 million people annually experience urinary tract infections (UTI), the most common cause of which is uropathogenic Escherichia coli (UPEC). The gut is a known reservoir of UPEC, which typically reside at low abundance, but can traverse the periurethral area to invade the bladder. While the E. coli population within the gut can be diverse, it has been suggested that certain strains have a greater propensity to migrate and cause infection. This may be one driving factor to explain why half of those with an acute infection have a recurrence even after taking antibiotics that clear the first infection from the urinary tract. Being able to detect and track E. coli strains over time would have direct clinical applications for those patients who have frequent recurrences due to gut UPEC carriage. One such clinical application would be early detection and intervention before the onset of infection. Unfortunately, current metagenomic algorithms are not capable of performing strain tracking accurately enough for clinical relevance, especially for low abundance species such as E. coli. A major factor for this lack of accuracy is that all current state-of-the-art metagenomic tools completely ignore temporal dependence between samples. Even if it is known that multiple samples are from the same patient, current tools analyze those samples as if they were independent. Furthermore, many metagenomic tools ignore the sequence quality information that is provided for every nucleobase in every read. We propose to develop a more precise strain tracking algorithm that does take this additional information into account, making the tool host-time-quality aware. Finally, we will pilot and validate our algorithm on a clinically relevant gnotobiotic colonization model. Specifically, humanized germ-free mice will undergo two rounds of E. coli challenges with therapeutic perturbations from antibiotics or mannosides, a small-molecule, precision, antibiotic-sparing therapeutic. We propose the following specific aims: (1) develop the first purpose-built computational method for tracking bacterial strains in the microbiome over time, and (2) pilot and validate the method in a gnotobiotic mouse model undergoing UPEC challenges and a therapeutic perturbation. These aims would advance the microbiome field, allowing for the future development of therapeutics and clinical diagnostics.
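To make the per-nucleobase quality information mentioned in the abstract concrete, here is a minimal sketch of one common way such scores can enter a read likelihood. It is not the algorithm proposed in the grant. It assumes FASTQ-style Phred+33 quality strings, where a score Q corresponds to an error probability of 10^(-Q/10). The function names, the substitution-only error model, and the clamp at 0.75 are all hypothetical choices made for illustration.

```python
import math


def phred_to_error_prob(quality_string, offset=33):
    """Convert a Phred+33 quality string into per-base error probabilities."""
    return [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]


def read_log_likelihood(read, quality_string, reference):
    """Log-likelihood of a read given a candidate reference sequence.

    Assumed error model: substitutions only, with an erroneous base equally
    likely to be any of the three other nucleotides.
    """
    total = 0.0
    for base, ref_base, p_err in zip(read, reference, phred_to_error_prob(quality_string)):
        # Clamp so that Q = 0 (p_err = 1) does not produce log(0);
        # 0.75 is the error rate of a uniformly random base call.
        p_err = min(p_err, 0.75)
        if base == ref_base:
            total += math.log(1.0 - p_err)
        else:
            total += math.log(p_err / 3.0)
    return total


# The same mismatch at the last position, with a confident vs. an
# unreliable base call ('I' is Q40, '+' is Q10 under Phred+33).
confident = read_log_likelihood("ACGT", "IIII", "ACGA")
unreliable = read_log_likelihood("ACGT", "III+", "ACGA")
print(confident, unreliable)
```

Under this model a mismatch on a low-quality base is penalized far less than a mismatch on a high-quality base, so unreliable base calls carry less weight when deciding which strain a read supports.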

Link to NIH Award
Gibson Lab Website

$2.9M grant from the National Science Foundation “The rules of microbiota colonization of the mammalian gut”

The Gerber lab, in collaboration with the Wang lab at Columbia and the Gibson Lab at BWH, has received a $2.9M grant from the National Science Foundation to develop and apply novel computational and experimental methods to elucidate fundamental rules governing the formation and maintenance of complex microbial ecosystems in the mammalian gut.

Abstract: Microbiomes, or the collections of trillions of bacteria and other micro-organisms living on, within and around us, have enormous impact on human life. For example, they help people digest food, promote the growth of farm animals and crops, and degrade pollutants in the environment. Despite the importance of microbiomes, the processes governing their formation and maintenance remain poorly understood. The mammalian gut is a particularly intriguing system for microbiome studies, since a diverse collection of microbes has evolved that specifically colonizes and functions in that environment. The goal of the project is to derive fundamental rules that describe and predict the dynamic process of microbial colonization of the mammalian gut. To achieve this goal, the team of investigators will develop new computer-based methods to automatically extract predictive and explanatory rules from large microbiome datasets. The team will also develop new experimental tools and generate datasets in mice measuring how microbiomes change over time and across space in the mammalian gut. Overall, the project will further the understanding of the formation of microbiomes in mammals and can provide broader insights into the emergence of other microbial ecosystems, such as those in soil and marine environments. These insights could ultimately help scientists to rationally alter or maintain microbiomes in different environments to benefit human activities. The project will also generate practical resources for the scientific community (computer-based tools and datasets) and provide education on the microbiome to college and elementary school students through courses and hands-on labs.

A wealth of genomic data provides information as to which microbes are present in environments, but little insight into underlying factors that explain or predict complex assemblages of microbial consortia. This project aims to elucidate mechanistic factors that drive the dynamic process of microbial colonization of the mammalian gut. These determinants will be investigated at multiple systems scales, from the level of microbial communities down to the level of individual genes. The project will leverage high-throughput experimental methods developed by the investigators to generate data characterizing functional genetic selection and spatial organization of microbiota in the mammalian gut. From the computer science perspective, the project will develop new computational methods to infer human-interpretable rules and other structured outputs from complex and noisy high-throughput microbiome datasets, using Bayesian and neural-style approaches that incorporate prior biological knowledge while scaling to massive datasets. This project has three main thrusts: 1) Learn microbial community-level rules that quantitatively predict population dynamics of mouse gut colonization and assess these rules across differing ranges of microbial diversity and composition, 2) Elucidate microbial gene-level mechanisms that predict mouse gut colonization dynamics, and 3) Profile microbial spatiotemporal organization and dynamics during gut colonization at the species and gene level to predict microbial community dynamics. The project is expected to establish a set of new computational and experimental tools and principles for understanding the rules of microbial colonization of the gut, with potential applications to other ecosystems including gut microbiota of non-mammalian species as well as complex environmental microbiota.