The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

All posts created by cdshaffer

| posted 26 Jul, 2021 23:40
I have run across a very clean HHPRED hit to the crystal structure 2OST in phage Jada (CDS 81782 - 82135) and Bartholomune (CDS 105264 - 105620). There are two more genes in the pham group: Braelyn_205, which is annotated NKF, and Racecar_217, which is currently annotated "Holliday junction resolvase" and is quite a bit larger than the other three.

For Jada the protein has an HHPRED hit with 99.9% probability and 99.1% coverage to the 2OST crystal. There are secondary hits to Holliday junction resolvases, but these alignments have only about 45% coverage. This Jada protein also aligns in HHPRED to Pfam PF11645 with 98.9% probability and 98% coverage.
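
The length and coverage arithmetic behind numbers like these can be sketched in Python. This is only an illustration, not how HHPRED or PECAAN computes anything: it assumes the CDS coordinates are 1-based, inclusive, and include the stop codon, and it assumes coverage means aligned residues divided by total sequence length. The function names are made up for this example.

```python
def cds_protein_length(start: int, stop: int) -> int:
    """Amino acids encoded by a CDS, excluding the stop codon.

    Assumes 1-based inclusive coordinates that include the stop codon.
    """
    nucleotides = abs(stop - start) + 1
    if nucleotides % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return nucleotides // 3 - 1  # drop the stop codon


def percent_coverage(align_start: int, align_end: int, total_length: int) -> float:
    """Percent of a sequence spanned by an alignment (1-based inclusive)."""
    aligned = align_end - align_start + 1
    return 100.0 * aligned / total_length


# Coordinates quoted in the post above.
jada_aa = cds_protein_length(81782, 82135)
bartholomune_aa = cds_protein_length(105264, 105620)
print(jada_aa, bartholomune_aa)
```

Under those assumptions, an alignment that spans only about half of the sequence, like the secondary resolvase hits, would come out near 50% from `percent_coverage`.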

The paper associated with this crystal is here: "The restriction fold turns to the dark side: a bacterial homing endonuclease with a PD-(D/E)-XK motif."

This paper describes a different type of endonuclease, with a fold and critical amino acids that are distinct from those of the more typical HNH endonucleases. Following the naming pattern of our other endonucleases, I would propose either "PD-(D/E)-XK endonuclease", based on the term used in the above EMBO paper, the same without the dashes (i.e. PD(D/E)XK), or the variant used by Pfam, "PD-(D/E)XK endonuclease".

Note that the Jada protein does not align to any of the other types of endonucleases (HNH, RusA, GIY-YIG, or LAGLIDADG).
Edited 26 Jul, 2021 23:50
Posted in: Request a new function on the SEA-PHAGES official list > PD-(D/E)XK endonuclease
| posted 21 Jul, 2021 17:28
I have seen this behavior before, but for me it is almost always when I submit a sequence for a phage that is not yet in the phamerator database. So it never surprised me that there were no links, as there was nothing in the phamerator database to link to yet. The workaround is to use the pham links in the phagesdb BLAST results. Those links in the right-hand column go to the phagesdb page for the pham report, which has links to the Starterator page.

One thing you might also want to try (with which I have had variable success) is to go to the admin page, search for your phage name in the Phage table and click the edit button. Then see which phage it is matching in the "Phamerator Phage Match" menu. For phages of mine that are not in the phamerator database, I try to pick the phage most similar to my phage by BLAST, but you might also want to see if your phage is on the list and is selected. You might have to be the "owner" of the phage for that to work. Not sure.
Posted in: PECAAN > PECAAN not showing Starterator and Pham info
| posted 07 Jul, 2021 18:18
Claire and (presumably) Dex and any other coders,
thanks for fixing this bug and all the work you do to keep updating and improving PECAAN.
Posted in: PECAAN > PECAAN and tRNA notes problem?
| posted 21 Jun, 2021 22:01
Thanks for being patient; I have been away visiting my sister.

There have been some issues with phagesdb pham assignments lately. So phamerator.org, Starterator and PECAAN all agree with each other but phagesdb has those same genes in a different pham, thus the link from phagesdb to starterator will not work. I just checked a few genes in Tarkin and could not find any broken links, so the most recent update to version 414 may have fixed things.

If not, the best thing to do is double-check the pham number on phamerator.org; that should give you the correct pham number for the Starterator report. I do appreciate the desire to delay some curriculum until students are somewhat comfortable with the basics; I do the same with my class. So if you have not introduced phamerator.org or PECAAN, you can use the "whole phage" Starterator report. When I use these reports I just use the search feature in my PDF viewer to search for the stop coordinate; this will usually only appear in the top table and on the last page of the report for the gene. I then scroll up to find all the other bits. The "whole phage" Starterator report for phage Tarkin is available here:

https://wustl.box.com/s/7xsd2c9zvh44iob96avgdqa3urh3uwzy

P.S. For anyone else, feel free to post a request here for a whole phage report or send a request by email. They take about 3 minutes of my time to start the process and about 2 minutes to post once the analysis is done, so happy to post reports for any requests.
Posted in: Starterator > Pham not found in Starterator
| posted 09 Jun, 2021 18:06
As a follow-up, here is a PDF I created with examples from Starterator and how to interpret them. It goes from simple/obvious interpretations to more subtle ones. Unfortunately it is not an exhaustive set of examples, but I think it did help my students.

Starterator_examples.pdf
Edited 18 Jan, 2022 04:33
Posted in: Starterator > Starterator intro lecture
| posted 04 Jun, 2021 20:59
I am posting here the short 14-slide deck I use for the introduction to Starterator in my phage class for 2021. I usually introduce Starterator late in the training, after the students have already had a chance to work with Glimmer, GeneMark, SD, gap score, and BLAST results. Feel free to use or edit.

intro_starterator.pptx

Edit to update link 1/18/2022
Edited 18 Jan, 2022 04:34
Posted in: Starterator > Starterator intro lecture
| posted 03 Jun, 2021 18:07
Sometimes we just cannot see the simplest problems because we are concentrating so hard on the task at hand.
You should try installing DNA Master on a Windows VM, not on an Ubuntu VM. There is a way to get Windows for free. I think the most recent instructions are here:

https://phagesdb.org/media/docs/InstallingWindowsOnMac.pdf

But if anyone knows of a more recent set of instructions please post.
Posted in: DNA Master > DNA master server down?
| posted 14 May, 2021 18:20
Just a follow-up. When I had two tandem start codons I always picked the longer gene model (based on the "All other things being equal, a longer call is usually preferable" rule), but recent mass spec work on phage proteins suggests otherwise. I am quoting now from the online guide (this page on revising your annotations), with the somewhat obscure rule that came out of that mass spec work [note: I have added the underline for emphasis]:

Can the start site of the downstream gene be extended so that the gene covers more of the gap? Carefully consider all possible start sites for the downstream gene. If a longer one is available, compare it to the current start site to see if it is a similar or better choice. All other things being equal, a longer call is usually preferable, but do not extend genes just to fill a gap. The exception to this are genes with two start codons in tandem, in these cases all of our wet bench experiments support the second of the two codons as the correct start.
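
As a rough sketch of the tandem-start exception in Python: given the position of the upstream start codon, check whether the very next codon is also a start codon and, if so, take the second one. The start codon set (ATG, GTG, TTG), the 0-based indexing, and the function name are assumptions made for illustration; this is not part of any SEA-PHAGES tool, and it only looks at the forward strand.

```python
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed set of bacterial start codons


def pick_tandem_start(seq: str, upstream_start: int) -> int:
    """Return the preferred start index (0-based) on the forward strand.

    If the codon immediately after the upstream start codon is also a
    start codon (two starts in tandem), return the second one; otherwise
    return the upstream start unchanged.
    """
    seq = seq.upper()
    first = seq[upstream_start:upstream_start + 3]
    if first not in START_CODONS:
        raise ValueError("upstream_start does not point at a start codon")
    second = seq[upstream_start + 3:upstream_start + 6]
    if second in START_CODONS:
        return upstream_start + 3
    return upstream_start


# Tandem ATG ATG: the second codon is preferred.
print(pick_tandem_start("CCATGATGAAATAG", 2))
# Single ATG: the start is left alone.
print(pick_tandem_start("CCATGAAATAG", 2))
```

This only encodes the tandem exception; the other considerations in the guide (gaps, coding potential, Starterator, and so on) still have to be weighed by hand.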
Posted in: Choosing Start Sites > F1 gene needs help on start site
| posted 04 May, 2021 20:09
Wow, so cool. Never heard of an SGNH domain. It appears to be a specific subtype of acyltransferase. I have not had time to take a suitably in-depth look at this, but here is the paper:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7448272/

That paper on the crystal is only 10 months old, so it is not surprising that it has not come up before. My brief survey suggests a good general annotation here might be "carbohydrate acetyltransferase", but more work would be needed to provide enough evidence to have the term added to the approved list. Since the two domains in this protein both appear to be related to acetyltransferases, of two different types, I would think "acetyltransferase" is a good match from the options on the current approved terms list.
Posted in: Functional Annotation > acyltransferase and SGNH domains
| posted 20 Apr, 2021 15:32
I think Deb is right in that you should check for alignments to domains. I can see quite a few HHPRED matches that start in the middle of the subject but align to amino acid 1 or 2 when start 797 is selected.

When I get situations like this, I have my students take the amino acid sequence of the longer form and do an HHPRED search. Then we look at the results and ask: do those "extra" amino acids at the beginning (42 amino acids in this case) also align to the subject? If those amino acids do align, we take it as pretty good evidence that those first amino acids are in the protein and we pick the longer form; if the amino acids do not align, we pick the shorter form.
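
That check can be written down as a small Python sketch. The `Hit` fields, the function names, and both cutoffs (`min_probability`, `min_covered`) are hypothetical choices for illustration, not values from HHPRED or the SEA-PHAGES guide; query positions are assumed to be 1-based on the longer protein form.

```python
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    probability: float  # HHPRED probability, in percent
    query_start: int    # first aligned residue of the longer form (1-based)
    query_end: int      # last aligned residue of the longer form (1-based)


def extension_residues_covered(hit: Hit, extra_aa: int) -> int:
    """Count how many of the first `extra_aa` residues fall inside the alignment."""
    if hit.query_start > extra_aa:
        return 0
    return min(hit.query_end, extra_aa) - hit.query_start + 1


def suggest_form(hits: Iterable[Hit], extra_aa: int,
                 min_probability: float = 90.0, min_covered: int = 10) -> str:
    """Return "longer" if any confident hit aligns into the extension, else "shorter".

    Both cutoffs are assumed values, chosen only to make the example concrete.
    """
    for hit in hits:
        confident = hit.probability >= min_probability
        if confident and extension_residues_covered(hit, extra_aa) >= min_covered:
            return "longer"
    return "shorter"


# A confident hit starting at residue 3 covers most of a 42-residue extension.
print(suggest_form([Hit(99.0, 3, 120)], extra_aa=42))
# A hit starting at residue 43 does not touch the extension at all.
print(suggest_form([Hit(99.0, 43, 120)], extra_aa=42))
```

The result is only a suggestion to weigh alongside the other start-site evidence, in the same spirit as the rule of thumb above.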
Posted in: Choosing Start Sites > Second opinion Cluster F1 Gene Start Site