Quantitative Biology Colloquium

Advancing Standards for Machine Learning in the Biosciences

When

4 p.m., Feb. 21, 2023

Where

The data revolution in the life sciences has brought new challenges and opportunities. Large-scale, complex data from novel sensing mechanisms challenge existing infrastructure and existing methods for modeling and inference. Increasingly, researchers turn to machine learning (ML) methods to detect patterns in these datasets and draw inferences from them. While these methods have brought new insights and new capabilities for processing these data, new pitfalls have also emerged. In many cases, these studies are difficult to reproduce as standards continue to evolve. Further, ML methods are being developed at a rapid rate, making it difficult for researchers to assess the utility and limitations of new methods. New standards are needed to improve the reproducibility, reusability, and comparability of ML and other computational methods. In this talk, I will discuss my efforts to advance standards for ML in the biosciences. In particular, I will present a detailed evaluation and comparison of computational tools for detecting bacteriophages in metagenomes, and discuss a case study in which we provide an example deep learning workflow that meets a high level of reproducibility.