Abstract: Algorithms for fast, accurate, and interpretable sequence annotation
Annotation of genomic and protein sequences depends on fast, accurate, and interpretable methods for classification. In this talk, I will introduce the concepts of sequence annotation and the sequence alignment algorithms that underlie it. I will then share our recent efforts to develop algorithmic approaches that increase the speed of alignment without sacrificing hard-won accuracy. Finally, I will describe a new method for adjudicating between competing sequence annotations, which enables uncertainty quantification, overlap arbitration, and identification of nested insertions and recombination events.