With the influx of phage genomic data, there have been several changes to bacteriophage taxonomy. See the paper “A Roadmap for Genome-Based Phage Taxonomy” by Evelien Adriaenssens and Dann Turner, who have led these efforts through their work with the ICTV. As a result, the classical phage families Podoviridae, Siphoviridae and Myoviridae are kaput, have shuffled off their mortal coil, given up the ghost, or any other preferred phrase… in short, they are no more. Some will be upset by this, I am sure. But personally, I am more than happy to see them go. While they were useful for a time, the ability to rapidly sequence most phage genomes and put phages “in boxes” based on their genomic content, rather than what they look like, offers far more resolution for describing differences. For Podoviridae, Siphoviridae and Myoviridae aficionados, the morphotypes myovirus, podovirus and siphovirus live on.
As a result of these updates to phage taxonomy, hundreds of new genera have been created. For our own work, we are interested in rapidly identifying which genus a newly isolated phage falls within. Consequently, we have collected the genomes of all (dsDNA) phages classified by ICTV and organised them into directories by genus. We have also extracted the common marker gene terL from each phage, which can be used as input for alignments and phylogenies to determine how closely related a new phage may be. Additionally, we have run VIRIDIC on each genus.
We are hoping to make an automated process available that will rapidly identify whether a new phage genome falls within a known phage genus, based on current ICTV guidelines. As others are probably also comparing phage genomes to known taxa, and the data may well be useful to them, we have made it all available as a single file called ICTV_genera.tar.gz, which can be downloaded here.
It contains 1653 directories (Genera)
Each directory contains:
- *gff – GFF files of every ICTV classified phage genome of that genus.
- *fsa – individual FASTA files of every ICTV classified phage genome of that genus.
- *_terL.ffn – automated extract of the terL gene of every ICTV classified phage genome of that genus.
- 04_VIRIDIC_out folder, containing:
  - *pdf heatmap for that genus
  - *clusters.csv file from VIRIDIC for that genus
  - *MA_genCol.csv from VIRIDIC for that genus
- If a genus has only 1 species, we don’t run VIRIDIC, for obvious reasons.
- If the directory is empty, it is a genus of an RNA phage (still sorting this; see above about dsDNA).
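Once the archive has been unpacked, the layout described above can be checked with a few lines of shell. This is only a minimal sketch, not part of the released data: `summarise_genera` is a hypothetical helper name, and it assumes the genus directories sit directly under whatever path you pass in (the top-level layout inside the archive is not stated here, so adjust the path to match what you see after extraction).

```shell
# Hypothetical helper: tabulate what each genus directory contains.
# Assumption: genus directories sit directly under the path given as $1.
summarise_genera() (
    root="$1"
    printf 'genus\tgenomes\tterL_files\tviridic\n'
    for dir in "$root"/*/; do
        # Skip the unexpanded pattern if there are no subdirectories.
        [ -d "$dir" ] || continue
        genus="$(basename "$dir")"
        # Count genome FASTA files and terL extracts at the top level of the genus.
        n_genomes="$(find "$dir" -maxdepth 1 -type f -name '*fsa' | wc -l | tr -d ' ')"
        n_terl="$(find "$dir" -maxdepth 1 -type f -name '*_terL.ffn' | wc -l | tr -d ' ')"
        # Record whether a VIRIDIC output folder is present for this genus.
        if [ -d "${dir}04_VIRIDIC_out" ]; then viridic=yes; else viridic=no; fi
        printf '%s\t%s\t%s\t%s\n' "$genus" "$n_genomes" "$n_terl" "$viridic"
    done
)

# Example call (the path is a placeholder for wherever you unpacked the data):
# summarise_genera path/to/unpacked/genera > genus_summary.tsv
```

Rows with zero genomes should correspond to the empty RNA phage directories mentioned above, and rows without a VIRIDIC folder are probably the single-species genera, although that is worth confirming by eye.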
To extract the data, use:
tar -xvf ICTV_genera.tar.gz
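With the data unpacked, the terL extracts for a genus can be pooled with the terL sequence of a new phage to give a single multi-FASTA file for whichever aligner you prefer. The sketch below is one possible way to do that, not our automated pipeline: `pool_terl` is a hypothetical helper, and the query file and directory paths in the example call are placeholders.

```shell
# Hypothetical helper: pool a query terL sequence with all terL extracts of one genus.
# Arguments: genus directory, query FASTA file, output FASTA file.
pool_terl() (
    genus_dir="$1"
    query="$2"
    out="$3"
    # Collect the terL extracts for this genus into the positional parameters.
    set -- "$genus_dir"/*_terL.ffn
    if [ ! -e "$1" ]; then
        echo "no *_terL.ffn files found in $genus_dir" >&2
        exit 1
    fi
    # 'awk 1' prints every line and adds a final newline where a file lacks one,
    # so that records from consecutive files are never joined on one line.
    awk 1 "$query" "$@" > "$out"
)

# Example call (all three paths are placeholders):
# pool_terl path/to/unpacked/genera/SomeGenus my_phage_terL.ffn SomeGenus_plus_query.ffn
```

The pooled file can then be aligned and used to build a tree to see where the new phage sits relative to the existing members of that genus.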
If you find this useful, please consider citing Cook et al. 2021. INfrastructure for a PHAge REference Database: Identification of Large-Scale Biases in the Current Collection of Cultured Phage Genomes. PHAGE. https://doi.org/10.1089/phage.2021.0007