
The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

Welcome to the forums at seaphages.org. Please feel free to ask any questions related to the SEA-PHAGES program. Any logged-in user may post new topics and reply to existing topics. If you'd like to see a new forum created, please contact us using our form or email us at info@seaphages.org.

All posts created by cdshaffer

| posted 15 Nov, 2019 21:24
This is to announce the release of Starterator version 1.2

This release has some graphical updates and incorporates cluster information into the text reporting. The cluster information can be particularly helpful for large, complex phams but has little or no value for small phams made up of a single cluster. The new version will be used to create all online pham reports starting with database version 233, which should be posted to the web-based pham reports on the afternoon of Nov 15th. For anyone still using the virtual machine, the new version will be made available for download and update soon (details to follow).

The main new features in this update:

1. Tracks now display only that region of the genes with starts, and, if the tracks are judged to be highly complex, the image is further zoomed in to show only the region surrounding called start sites
2. Tracks are now grouped by subcluster and the color of the track changes with each subcluster (Thanks for the idea go to Sally Molloy)
3. Cluster information of each phage is now added to the "Summary by Start" section (Thanks for the feature request go to Claire Rinehart)
4. A new "Summary by cluster" section has been added to give info on manual annotations specifically for each cluster found in the pham
5. Details on the number of manual annotations (MA's) for each start in a gene are now given in the "Gene information" section.

Also added to the whole phage report (available if you still run Starterator in a VM):
6. A new summary table on the analysis of manual annotations of pham members and the degree of consensus with the annotated start.
7. Simplified reporting focusing on the relevant results for the phage in the whole phage report
8. Support for parsing the "CDS function" file exported from PECAAN when using the whole unphamerated phage analysis
Posted in: Starterator > Release of Starterator version 1.2
| posted 03 Oct, 2019 16:10
Down for me as well since last Monday.
Posted in: tRNAs > Aragorn Issue
| posted 01 Oct, 2019 14:53
You can get the translation of any gene using /api/genes/{GeneID}/, but the API returns everything in JSON format, not FASTA format, and you would have to request the genes one at a time.
As an example, if you want the info on gene 1 of phage Dori, call this URL:

https://phagesdb.org/api/genes/Dori_CDS_1/

In the results, the amino acid sequence is in the "translation" field.
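
A minimal Python sketch of the one-gene-at-a-time approach described above, converting each JSON result to FASTA. The endpoint and the "translation" field come from this post; the function names, the 60-character line width, and the use of the gene name as the FASTA header are illustrative assumptions, not part of the PhagesDB API.

```python
import json
import textwrap
import urllib.request

# Endpoint pattern quoted in the post: /api/genes/{GeneID}/
API_ROOT = "https://phagesdb.org/api/genes/"


def gene_url(gene_id):
    """Build the per-gene API URL, e.g. for 'Dori_CDS_1'."""
    return f"{API_ROOT}{gene_id}/"


def fetch_gene(gene_id, timeout=30):
    """Request one gene record and decode the JSON body into a dict."""
    with urllib.request.urlopen(gene_url(gene_id), timeout=timeout) as response:
        return json.load(response)


def to_fasta(header, sequence, width=60):
    """Format one sequence as a FASTA record wrapped at `width` characters."""
    return f">{header}\n" + "\n".join(textwrap.wrap(sequence, width)) + "\n"


def translations_as_fasta(gene_ids):
    """Fetch each gene in turn (one request per gene) and join the records."""
    return "".join(
        to_fasta(gene_id, fetch_gene(gene_id)["translation"])
        for gene_id in gene_ids
    )
```

Calling `translations_as_fasta(["Dori_CDS_1"])` would make one request to the URL above and return a single FASTA record. For a long gene list it would be polite to pause between requests.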

Another alternative, although more manual, would be to use the "Download all sequences" button on the pham page of phagesdb. This will give you the protein sequences of all members of the pham in FASTA format. Depending on your exact goal, this might be good enough; you just need to figure out which phams you want and decide whether all pham members are appropriate for your purpose.

Finally, if you have a specific gene list and you want all the sequences programmatically, you could consider using the phamerator database and use mysql to query the database based on gene names; the data you want is in the gene table in the translation field. You could get to that data by either using a graphical user interface or the command line.
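
A rough sketch of what such a query could look like when built from Python. The `gene` table and `translation` field are named in this post; the `Name` column used for matching is an assumption about the schema, so check it against your copy of the database. The demo runs against a made-up in-memory SQLite table as a stand-in for the real MySQL database. With a MySQL driver you would keep the default `%s` placeholder and pass the same SQL and parameters to its cursor.

```python
import sqlite3


def translation_query(gene_names, placeholder="%s"):
    """Return (sql, params) selecting translations for the given gene names.

    `Name` is an assumed column name; `gene` and `translation` are from the post.
    """
    if not gene_names:
        raise ValueError("gene_names must not be empty")
    marks = ", ".join([placeholder] * len(gene_names))
    sql = f"SELECT Name, translation FROM gene WHERE Name IN ({marks})"
    return sql, list(gene_names)


def demo_with_sqlite():
    """Run the query against a toy table standing in for the real database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE gene (Name TEXT, translation TEXT)")
    conn.executemany(
        "INSERT INTO gene VALUES (?, ?)",
        [("toy_1", "MKT"), ("toy_2", "MSE"), ("toy_3", "MAA")],  # made-up rows
    )
    # SQLite uses "?" placeholders; common MySQL drivers use "%s".
    sql, params = translation_query(["toy_1", "toy_3"], placeholder="?")
    rows = conn.execute(sql + " ORDER BY Name", params).fetchall()
    conn.close()
    return rows
```

Passing the gene names as parameters rather than pasting them into the SQL string keeps quoting problems out of the query.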

Edit, December 2021: the name of the gene has changed from Dori_1 to Dori_CDS_1, and the URL above has been updated. Current gene names are available from the "genesbyphage" request, e.g. https://phagesdb.org/api/genesbyphage/Dori
Edited 30 Dec, 2021 00:18
Posted in: General Message Board > PhagesDB Data API Question
| posted 07 Aug, 2019 19:10
I believe the "major" and "minor" terms relate back to very early papers where they purified phage proteins and ran them out on protein gels. So a "major" protein would be a very intense band on the gel and would presumably be high copy number. There would also be weak or even very faint bands on the gel, and these would be the "minor" proteins. The terms then had less to do with size and more to do with abundance.
Posted in: Cluster EA Annotation Tips > Minor tail Protein on EA genomes
| posted 05 Aug, 2019 22:30
I always think that an HHPRED of 100% probability is an indication that it is very confident that the protein you are searching with belongs to the same family of proteins it is matching. Chitin is a polymer of β-(1→4)-linkages of N-acetylglucosamine (NAG) while peptidoglycan is alternating residues of β-(1,4) linked N-acetylglucosamine and N-acetylmuramic acid.

I always try to remind my students that functional annotation down to one specific substrate is very risky. In my mind it is easy to change just one or two amino acids at exactly the right place to radically change the binding properties of an enzyme to any one specific substrate. And one should be even more careful when there are known to be many different but related substrates [like polysaccharides].

So annotation of this type needs a "sanity check". Does it make sense that a phage would have an enzyme that cleaves chitin? No, not really, so I would not annotate this as a chitinase. On the other hand, it does make sense that some ancient protein that does something with N-acetylglucosamine could easily evolve into a chitinase along one branch and evolve into some other protein that somehow interacts with peptidoglycan in another branch. I would come away pretty confident that this phage protein either binds to NAG or cleaves β-(1→4)-linkages of NAG.

Since there is no approved term for "protein either binds to NAG or cleaves β-(1→4)-linkages of NAG" I would stick with "minor tail protein" but I would also be pretty confident of that annotation given the structural similarity of chitin and peptidoglycan.
Posted in: Cluster EA Annotation Tips > Minor tail Protein on EA genomes
| posted 19 Jul, 2019 18:05
Ribosome Modulation Factors (RMF) were originally described as a set of proteins that appear to be involved in ribosome dimerization activity during the stationary phase of bacterial growth. See this JBchem article. According to this uniprot entry they have also been shown to hinder translation initiation.

Phage Gilgamesh (see phage in PECAAN for all the details) gene 51 (46895-47104) encodes a 70 aa long protein which aligns by HHPRED to the v chain (lowercase v, not capital V) of crystal 6H4N and to the A chain of crystal 2JRM. Both of these chains were annotated as "Ribosome Modulation Factor". Both HHPRED alignments have coverage in the high 80-90% range, 99% probability, and e-values of around e-11.

HHpred also aligns to the Pfam entry for ribosome modulation factor, with about the same coverage and e values.

I see no reason to modify the current term and would stick with "Ribosome Modulation Factor". It is generic enough to cover the possibility that phage use these proteins differently than the host does, while the function is still likely to have something to do with modulating ribosome activity.
Edited 19 Jul, 2019 19:57
Posted in: Request a new function on the SEA-PHAGES official list > Ribosomal Modulation Factor
| posted 15 Jul, 2019 20:07
For phage Saftant gene 68 (45610-45200) we have found good evidence for an anti-restriction protein, but the matches are not to ArdA-like proteins but rather to a second type of antirestriction protein called OCR. The OCR-type anti-restriction protein is found in phage T7 and is one of the first protein products expressed following infection (see this NAR paper). OCR-type proteins are typically shorter (~120 AA) than ArdA (170-180 AA) and have sufficiently different crystal structures that I think we should avoid using the ArdA-like approved term for this protein. Here is the image of the two structures side by side.

For details, see phage Saftant in PECAAN, but briefly: Saftant 68 hits by HHpred with 94.4% probability to the crystal of gene 0.3 protein from phage T7 (D chain of this crystal) with ~90% coverage of both subject and query. I would propose either adding this second type as an approved function, "antirestriction protein, OCR-like", or simply adding another term without the ArdA-like qualifier (i.e. "antirestriction protein") to avoid over-annotation.
Edited 15 Jul, 2019 20:10
Posted in: Request a new function on the SEA-PHAGES official list > new antirestriction protein type
| posted 28 Jun, 2019 19:11
Christian and I are working on a version of the VM with the experimental version of Starterator in the faculty account, as well as k-phamerate for validation and phamnexus for splitstree prep. Unfortunately, there is no easy way to get the newer version of Aragorn, the one that trims the genes properly, installed on that old Ubuntu 14.04 VM.
Posted in: tRNAs > Aragorn Issue
| posted 25 Jun, 2019 18:23
OK, since we are working on a few phage here, I have installed command line Aragorn ver 1.2.38. If anyone is in a hurry and does not want to wait until web-based Aragorn is back up, I can easily run command line Aragorn for you. Just send me an email with the sequence as a FASTA file attachment and I will email back the Aragorn results.
Posted in: tRNAs > Aragorn Issue
| posted 25 Jun, 2019 16:33
A quick Google search suggests that the "ERR_CONTENT_LENGTH_MISMATCH" error is probably an issue with nginx (the web server software), so yes, it is time for an email.

I did look for anywhere else running a publicly available Aragorn but only found a Galaxy server running the older 1.2.36, which does not appear to trim correctly (based on a single prediction in phage Saftant that we ran as a test).

If anyone is desperate, one can easily install the command line version with conda on Mac, or apt-get on Ubuntu version 18 or 19. Still looking for a viable web-based solution.
Edited 25 Jun, 2019 17:44
Posted in: tRNAs > Aragorn Issue