Phage Genomes Dec 2022

See our publication in PHAGE to read about how this dataset is produced and some of our analyses of it. Please consider citing this paper if you use the database or the information on this webpage. You can also generate an up-to-date version of the database, along with useful files for vConTACT2, MASH, and iTOL, using our Perl script available on GitHub. Updates to the script this month include a new column in the TSV outputs that contains anything identified as “host” or “lab_host” in the original GenBank files. However, these values may be inconsistent or downright bizarre, so please use them with caution.

We also recently added annotations using PHROGs (more details available here), and you can download the updated annotations from here. Please note that we won’t be re-uploading the updated annotations on a monthly basis; this file is the most recent and corresponds to the December 2022 data. We recommend using this data if you are running the script for the first time.

If you don’t want to run the script yourself, you can download all of the ready-made files below:

1Dec2022_data.tsv

1Dec2022_data_excluding_refseq.tsv

1Dec2022_genomes.db

1Dec2022_genomes.fa

1Dec2022_genomes.fa.msh

1Dec2022_genomes_excluding_refseq.fa

1Dec2022_itol_family_annotations.txt

1Dec2022_itol_genus_annotations.txt

1Dec2022_itol_host_annotations.txt

1Dec2022_itol_length_annotations.txt

1Dec2022_itol_lowest_taxa_annotations.txt

1Dec2022_itol_node_label_annotations.txt

1Dec2022_itol_subfamily_annotations.txt

1Dec2022_millardlab_website_table.txt

1Dec2022_phages_downloaded_from_genbank.gb

1Dec2022_refseq_genomes.fa

1Dec2022_unique_names.txt

1Dec2022_vConTACT2_family_annotations.tsv

1Dec2022_vConTACT2_gene_to_genome.csv

1Dec2022_vConTACT2_genus_annotations.tsv

1Dec2022_vConTACT2_host_annotations.tsv

1Dec2022_vConTACT2_lowest_taxa_annotations.tsv

1Dec2022_vConTACT2_proteins.faa

1Dec2022_vConTACT2_subfamily_annotations.tsv
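
Because the host values taken from GenBank can be inconsistent, it may help to look at what they actually contain before relying on them. The following is a minimal Python sketch, not part of our Perl script. It assumes the TSV has a header row and that the relevant column names contain the word "host"; the exact column names are not stated on this page, so the function searches for them rather than hard-coding them. The function name `host_value_counts` is hypothetical.

```python
import csv
from collections import Counter


def host_value_counts(tsv_path, top=10):
    """Count the most common values in every column whose header mentions 'host'.

    Assumption: the file is tab-separated with a header row. The real column
    names are not documented here, so they are matched case-insensitively.
    """
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE keeps stray quote characters in free-text fields literal.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        host_columns = [
            name for name in (reader.fieldnames or []) if "host" in name.lower()
        ]
        counts = {name: Counter() for name in host_columns}
        for row in reader:
            for name in host_columns:
                value = (row.get(name) or "").strip()
                # Group missing values under one label so they are easy to spot.
                counts[name][value or "<empty>"] += 1
    # For each host-like column, return the `top` most frequent values.
    return {name: counter.most_common(top) for name, counter in counts.items()}
```

Calling `host_value_counts("1Dec2022_data.tsv")` after downloading the file returns, for each host-like column, a list of `(value, count)` pairs. Scanning the rare values is a quick way to find the odd entries mentioned above.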

