I presented at RECOMB Comparative Genomics 2023 in Istanbul (see my slides). Our paper with Eleonora Rachtman and Siavash Mirarab, "CONSULT-II: Taxonomic Identification Using Locality Sensitive Hashing", won the best paper award.
The abstract is given below.
Metagenomics is widely used to study the microbiome using environmental samples, and taxonomic classification of reads is a precursor to many analyses of such data. Taxonomic classification requires comparing sample reads against a reference dataset of known organisms. Crucially, the genomes represented in a sample may be phylogenetically distant from their closest match in the reference set. Thus, simply mapping reads to genomes is insufficient; we need to find inexact matches to species with substantial distance. While k-mer-based methods, such as Kraken, have proved popular, they have limited ability to match against distant taxa. In this paper, we use locality-sensitive hashing to design a k-mer-based method that can match reads to genomes with higher distance than existing methods. We build on an earlier contamination detection method, CONSULT, to add taxonomic classification abilities. We show in a series of experiments that our method, CONSULT-II, has higher recall than alternatives when precision is about the same. Its results can also be summarized to obtain a taxonomic profile, which we show outperforms leading methods with respect to some measurement criteria. CONSULT-II is available on GitHub at https://github.com/bo1929/CONSULT-II.

A shot taken by Can Alkan during the presentation, İTÜ Maçka Oteli, Istanbul.
I successfully defended my MSc thesis
Our paper has been published in Natural Language Engineering