Data-driven approaches for microbial enzyme and pathway discovery

30 Aug 2023
9:00-9:30

The rapidly evolving field of synthetic biology requires new enzymes in its toolbox of biological parts, e.g., for the biosynthesis and biodegradation of specialized molecules. A significant portion of microbial diversity, known as the “uncultivated majority,” remains difficult to study in the laboratory, making it a potential gold mine for new enzymes and functions. In this talk, I will describe how we have accessed this untapped resource through metagenomic sequencing and recombinant protein production methods. A major bottleneck in this process is the identification of promising enzyme candidates among the vast number of uncharacterized protein sequences – a task akin to searching for a needle in a haystack. I will cover our research on narrowing the metagenomic sequence search space in order to mine protein-coding sequences involved in the biosynthesis and biodegradation of organic molecules. Specifically, I will show how we have incorporated statistical learning, secondary structural features, and genomic context into targeted metagenomic enzyme discovery workflows. We use design-build-test-learn cycles to construct new enzyme libraries, screen their functions, and learn patterns of enzyme-substrate specificity for iterative library design. Our overall goal is to advance automated methods for discovery from (meta)genomes in order to build and optimize metabolic pathways.