Recently, PHROGs was released by Terzian et al. (https://doi.org/10.1093/nargab/lqab067). Full details are provided on their webpages and in the publication. Briefly, their curated dataset provides tens of thousands of PHROGs, with a standardised annotation attributed to each PHROG. All of this is available through their searchable website and can also be downloaded.
For first-pass phage genome annotation this seems like a great resource. We routinely use Prokka for annotation of phage genomes, and it allows custom HMM databases to be used for annotation. Unfortunately, because of differences in format, the HMMs provided directly by the PHROGs team don't slot neatly into Prokka in a way that lets the annotation linked to each PHROG appear in the final annotation.
However, as they provided all their data in an easily downloadable form, we have taken this and reformatted it to produce HMMs with the annotations included, so that it plays nicely with HMMER3 as part of Prokka. We have produced a single file that can be put in the /opt/prokka/db/hmm directory of Prokka. Thanks to Thomas Sicheritz-Pontén for helping with getting the correct annotation into the 38,000 HMMs …
A single file containing all HMMs, which can be added directly to Prokka, can be downloaded here. Warning: it's 3 GB when unzipped. Thanks to Terzian et al., who did all the hard work of producing the original PHROGs and curated annotation and making it available; we have just reformatted it for our own use and for anybody else who might want to use it with Prokka.
To get it running within Prokka, first locate the Prokka installation and its database directory, for example by running $ prokka --listdb
In my case this results in output of /usr/local/bioinf/prokka/db
and [08:43:23] * HMMs: all_VOG HAMAP
telling us there are already some HMM databases called all_VOG and HAMAP
Within /usr/local/bioinf/prokka/db there is a directory called hmm
Thus, the full path is /usr/local/bioinf/prokka/db/hmm
The downloaded database needs to be copied into /usr/local/bioinf/prokka/db/hmm
Then run $ prokka --setupdb
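The copy and re-index steps can be sketched as a small shell function. This is only a sketch under stated assumptions: the function name install_hmm_db is made up for illustration, the downloaded file is assumed to be called all_phrogs.hmm (the real name of the download may differ), and the paths are the ones from my installation.

```shell
# Sketch: copy a downloaded HMM file into Prokka's hmm directory, then
# re-index the databases. Function and argument names are illustrative.
install_hmm_db() {
    src=$1        # downloaded HMM file (name assumed, e.g. all_phrogs.hmm)
    hmm_dir=$2    # e.g. /usr/local/bioinf/prokka/db/hmm

    if [ ! -f "$src" ]; then
        echo "HMM file not found: $src" >&2
        return 1
    fi
    if [ ! -d "$hmm_dir" ]; then
        echo "hmm directory not found: $hmm_dir" >&2
        return 1
    fi

    # Copy the database into place; stop if the copy fails.
    cp -- "$src" "$hmm_dir"/ || return 1

    # Re-index so Prokka picks up the new file.
    if command -v prokka >/dev/null 2>&1; then
        prokka --setupdb
    else
        echo "prokka not found on PATH; run 'prokka --setupdb' manually" >&2
    fi
}

# Example call (not executed here; the file name is an assumption):
# install_hmm_db ./all_phrogs.hmm /usr/local/bioinf/prokka/db/hmm
```

Depending on how Prokka was installed, writing into the db directory may need elevated permissions.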
Running the command $ prokka --listdb should now show the new database in the HMMs line:
[08:43:23] * HMMs: all_phrogs all_VOG HAMAP
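If you would rather script this check than read the log by eye, something like the following could work. It is a sketch that assumes the listing contains an "HMMs:" line in the format shown above; the helper name has_hmm_db is made up for illustration.

```shell
# Sketch: succeed if a named HMM database appears on the "HMMs:" line of
# the text read from stdin (expected to be the output of 'prokka --listdb').
has_hmm_db() {
    awk -v want="$1" '
        /HMMs:/ {
            sub(/.*HMMs:[ \t]*/, "")            # drop everything up to the names
            n = split($0, names, /[ \t]+/)      # one array entry per database name
            for (i = 1; i <= n; i++)
                if (names[i] == want) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Example call (not executed here). 2>&1 is included in case the listing
# is written to stderr rather than stdout:
# prokka --listdb 2>&1 | has_hmm_db all_phrogs && echo "all_phrogs is registered"
```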
all_phrogs will now be used by Prokka. If you want to point Prokka at the PHROGs database specifically, consider using the Prokka flag --hmms (described in Prokka's help as a trusted HMM to annotate from first) and specify /usr/local/bioinf/prokka/db/hmm/all_phrogs
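A run with the --hmms flag might be wrapped up like this. Again, this is a sketch and not a tested recipe: the function name annotate_with_phrogs, the genome file name and the output directory are placeholders, and the database path is simply the one from my installation.

```shell
# Sketch: run Prokka on one phage genome with an explicit HMM database.
annotate_with_phrogs() {
    genome=$1    # phage genome in FASTA format
    outdir=$2    # directory for Prokka's output
    hmm=$3       # path to the PHROGs HMM database

    if [ ! -f "$genome" ]; then
        echo "Genome file not found: $genome" >&2
        return 1
    fi

    prokka --hmms "$hmm" --outdir "$outdir" "$genome"
}

# Example call (not executed here; file and directory names are placeholders):
# annotate_with_phrogs my_phage.fasta my_phage_annotation \
#     /usr/local/bioinf/prokka/db/hmm/all_phrogs
```

Note that Prokka will stop if the output directory already exists unless --force is given.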
Full details on adding databases are explained on the Prokka GitHub page.