As our group often sequences a lot of bacteriophages, we want to know as quickly as possible whether they represent a new genus and/or species. After several incarnations, we have developed taxMyPhage to do this as efficiently as possible. It includes contributions from several undergraduate research projects (Maria Lestido, Moi Thomas and Deven Webster), further brainstorming with Thomas and code development, work by Remi, who made it run a lot quicker and made all the conda packages work, and taxonomy guidance and testing from Dann. We now have a standalone tool, a web version and a preprint.
What it will do
- Classify dsDNA phages at the genus and/or species level only
The webserver will provide taxonomy for a predicted phage and allow download of an upper-right matrix of similarity against other phages classified by ICTV. It will NOT compare against other phages in NCBI, only against species that are classified by ICTV.
The standalone tool offers the same, with additional options. It has no restriction on the number of input sequences and can be run on thousands of sequences. If the provided genomes are not complete, inaccurate results may be obtained, because we are implementing the algorithm developed here, which normalises sequence similarity over total genome length. Additionally, the standalone tool will produce similarity matrices for multi-fasta input, allowing calculation of intergenomic similarity. One use case for this might be to identify representative sequences from a large dataset.
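To see why incomplete genomes can skew the result, here is a minimal sketch of a similarity score normalised over total genome length. The function name, the inputs and the exact formula are assumptions for illustration, not taken from the taxMyPhage code; the tool's own calculation may differ in detail.

```python
def intergenomic_similarity(ident_ab, ident_ba, len_a, len_b):
    """Percent similarity normalised over the combined genome length.

    ident_ab / ident_ba: identical bases found when aligning A to B and B to A.
    len_a / len_b: genome lengths in bases.
    All names and the formula are illustrative assumptions.
    """
    return 100.0 * (ident_ab + ident_ba) / (len_a + len_b)


# Two complete 100 kb genomes sharing 90 kb of identical bases in each direction.
print(intergenomic_similarity(90_000, 90_000, 100_000, 100_000))  # 90.0

# Same comparison, but genome B was supplied as a 50 kb fragment,
# so only 45 kb of identical bases can be found in each direction.
print(intergenomic_similarity(45_000, 45_000, 100_000, 50_000))  # 60.0
```

In this made-up example the fragment drops the score from 90% to 60%, which would move the query from "same genus" to "new genus" under a 70% cutoff even though the underlying phage is unchanged.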
Interpretation
The output of taxMyPhage is an upper-right matrix (example below) and a TSV file that provides the assigned taxonomy.
Example of new genus and new species
The matrix below shows that the query has <70% ANI to any currently classified phage, so it will be a new genus based on the ICTV criterion of >70% for membership of the same genus. As it is <70% ANI, it is also <95% ANI, and is therefore also a new species. The closest genomes identified in searching were all members of the genus Phapecoctavirus. Further phylogenetic analysis would be required to confirm that this is the closest genus of phages, as this result is based on similarity cutoffs alone.
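The cutoff logic described above can be sketched in a few lines of Python. This is an illustration only: the function name, the input dictionary and the reference names are hypothetical, and how a value exactly on a threshold is treated is an assumption.

```python
GENUS_THRESHOLD = 70.0    # % similarity; above this, same genus (from the text)
SPECIES_THRESHOLD = 95.0  # % similarity; above this, same species (from the text)


def classify_query(similarities):
    """Return (genus_status, species_status) for one query.

    `similarities` maps a reference phage name to the query's percent
    similarity against it. Values exactly on a threshold are treated as
    below it here, which is an assumption.
    """
    best = max(similarities.values(), default=0.0)
    genus_status = "existing genus" if best > GENUS_THRESHOLD else "new genus"
    species_status = "existing species" if best > SPECIES_THRESHOLD else "new species"
    return genus_status, species_status


# Hypothetical row of a similarity matrix: every value is below 70%.
example = {"ref_phage_1": 62.1, "ref_phage_2": 58.4, "ref_phage_3": 65.0}
print(classify_query(example))  # ('new genus', 'new species')
```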
Example of an existing species and conflicts in taxonomy
Below, the query has >95% ANI to an existing phage, so it would be the same species (Muminvirus mumin). However, there are other things to note. The query has greater than 70% ANI with two distinct genera (other already classified phages do too). Taxonomy is not perfect. taxMyPhage cannot solve these issues but will report them, leaving the user to decide what to do.
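As a rough sketch of how such a conflict could be detected from similarity values, the snippet below collects every genus with a hit above the 70% cutoff. The function name, the tuple layout, the genus names and the printed message are all hypothetical placeholders; this is not taxMyPhage's own code or its actual warning text.

```python
def genera_above_threshold(hits, threshold=70.0):
    """Return the sorted genera that have at least one hit above `threshold`.

    `hits` is a list of (reference_name, genus, percent_similarity) tuples;
    this layout is an assumption made for the example.
    """
    return sorted({genus for _name, genus, sim in hits if sim > threshold})


# Hypothetical hits for one query.
hits = [
    ("ref_1", "Genus_A", 96.2),
    ("ref_2", "Genus_A", 81.0),
    ("ref_3", "Genus_B", 72.5),
    ("ref_4", "Genus_C", 40.0),
]

genera = genera_above_threshold(hits)
if len(genera) > 1:
    # More than one genus passes the cutoff, so the assignment is ambiguous.
    print("Query exceeds the genus cutoff for several genera:", ", ".join(genera))
```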
Warnings will look like this