Phage Genomes Feb 2023

See our publication in PHAGE to read about how this dataset is produced and some of our analyses of it. Please consider citing this paper if you use this database or the information on this webpage. You can also generate an up-to-date version of the database, along with useful files for vConTACT2, MASH, and iTOL, using our Perl script available on GitHub. Updates to the script this month include a new column in the TSV outputs, which contains anything identified as “host” or “lab_host” in the original GenBank files. These values may be inconsistent or downright bizarre, so please use them with caution.
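Because the host values are free text copied from GenBank, it can help to tabulate them before relying on them. The following is a minimal sketch, not part of our script: `count_column_values` is a hypothetical helper, and it assumes the gzipped TSV has a header row. The name of the new column is not stated here, so check the header and pass the real name.

```python
import csv
import gzip
from collections import Counter


def count_column_values(tsv_gz_path, column):
    """Count how often each value occurs in one column of a gzipped TSV.

    Values are stripped of surrounding whitespace. Empty cells are counted
    under the key "" so that missing entries stay visible.
    """
    counts = Counter()
    with gzip.open(tsv_gz_path, "rt", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE: treat quote characters as ordinary text in free-text fields.
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(
                f"column {column!r} not found; header is {reader.fieldnames}"
            )
        for row in reader:
            # Short rows give None for missing cells; treat them as empty.
            counts[(row[column] or "").strip()] += 1
    return counts
```

For example, `count_column_values("01Feb2023_data.tsv.gz", "Host").most_common(20)` would list the twenty most frequent values, where `"Host"` is an assumed column name. Near-duplicates and unexpected entries tend to stand out in that list.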

We also recently added annotations using PHROGs (more details available here), and you can download the updated annotations from here. Please note that we won’t be re-uploading the updated annotations on a monthly basis: this file is the most recent and is for the December 2022 data. We recommend using it if you are running the script for the first time.

If you don’t want to run the script yourself, you can download all of the ready-made files below:

01Feb2023_data.tsv.gz

01Feb2023_data_excluding_refseq.tsv.gz

01Feb2023_genomes.db.gz

01Feb2023_genomes.fa

01Feb2023_genomes.fa.msh

01Feb2023_genomes_excluding_refseq.fa.gz

01Feb2023_itol_family_annotations.txt.gz

01Feb2023_itol_genus_annotations.txt

01Feb2023_itol_host_annotations.txt

01Feb2023_itol_length_annotations.txt

01Feb2023_itol_lowest_taxa_annotations.txt

01Feb2023_itol_node_label_annotations.txt

01Feb2023_itol_subfamily_annotations.txt

01Feb2023_millardlab_website_table.txt

01Feb2023_phages_downloaded_from_genbank.gb

01Feb2023_refseq_genomes.fa

01Feb2023_unique_names.txt

01Feb2023_vConTACT2_family_annotations.tsv

01Feb2023_vConTACT2_gene_to_genome.csv

01Feb2023_vConTACT2_genus_annotations.tsv

01Feb2023_vConTACT2_host_annotations.tsv

01Feb2023_vConTACT2_lowest_taxa_annotations.tsv

01Feb2023_vConTACT2_proteins.faa

01Feb2023_vConTACT2_subfamily_annotations.tsv

Feb2023_data
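The files above are a mix of gzip-compressed and uncompressed formats. As a quick sanity check after downloading, the sketch below reads a FASTA file in either form and reports each record's ID and sequence length. The helper names are hypothetical, and it assumes standard FASTA with `>` header lines.

```python
import gzip


def open_maybe_gzip(path):
    """Open a text file, decompressing on the fly if the name ends in .gz."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def fasta_lengths(path):
    """Yield (record_id, sequence_length) for each record in a FASTA file."""
    record_id, length = None, 0
    with open_maybe_gzip(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Emit the previous record before starting a new one.
                if record_id is not None:
                    yield record_id, length
                # The ID is the first whitespace-delimited token after ">".
                parts = line[1:].split()
                record_id, length = (parts[0] if parts else ""), 0
            elif record_id is not None:
                length += len(line)
    if record_id is not None:
        yield record_id, length
```

For example, `sum(1 for _ in fasta_lengths("01Feb2023_genomes_excluding_refseq.fa.gz"))` counts the genomes in that file, which can then be compared with the number of rows in the matching TSV.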