The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

GG cluster DNA primase/helicase

| posted 06 Apr, 2023 22:49
I could use some thoughts on an interesting gene in Huwbert, the one whose stop codon is at 58,402. It is syntenic to a pham called as a RecA-like DNA recombinase in Triscuit, but I am not so sure. Nucleotides 335-568/9 find homology to RecA-like recombinases, but only to the AAA-ATPase domain (HHPred). Likewise, HHPred picks up other proteins that have an AAA-ATPase in the same stretch. From nt 1-312, there is strong predicted structural homology to Replication Protein B Primase, a function not in the approved function list. I am not completely sure where to go with this thing. I could call it a DNA primase/helicase, but do I actually have evidence for the helicase function, or just for an AAA-ATPase? A primase would not need an ATPase, so I rather suspect the helicase is there. Anyway, I'd love some feedback. Thanks.
| posted 07 Apr, 2023 17:35
TL;DR: I see on Pecaan that your current annotation is "DNA primase/helicase", and I would annotate it the same way.

Long story:
I ran a full HHPred search and looked at the overall structure of this protein. See the attached annotated image. I noted 3 regions in the results, each with a unique set of hits.

In the N-terminal region, the consensus of the 4 hits in the green box strongly suggests some kind of primase or polymerase. Since primases are a specialized type of polymerase, I cannot really tell from just those 4 hits what the best annotation would be. The middle section, however, has over 100 hits, and I cut off the image at the top 15 or so. The general consensus of all the hits in the blue box is (as you said) some kind of helicase, which is also supported by the AAA-ATPase hits.

The really curious part is the set of hits in the region of the orange box, whose general consensus is a DNA binding domain. I don't ever remember seeing a DNA binding domain attached to a helicase domain. I did look through the Pfam architectures to see if this combination has been seen before, but my 10 minutes of poking did not reveal any examples like this.

So taken together we have a DNA binding domain, a helicase domain, and a primase/polymerase domain. It looks to me like maybe a polymerase specifically designed to start replication at a single location, "maybe?". It would take a really deep dive to know for sure whether this combination of domains is truly novel, but in my 30 minutes of poking around I could not find anything quite like it. Hence I have no good annotation, because I could find nothing like this in the published literature to link to. So the practical solution, in my mind, is to pick the best of the approved terms, which in this case means highlighting the presence of both the primase domain and the helicase domain. Hence I would annotate as you did: DNA primase/helicase

As with many annotations, this is not ideal, one might say "not even good", but in the end it is the best term we have.
| posted 09 Apr, 2023 14:57
Thanks, that is really quite helpful. We'll leave it as a primase/helicase and move along, but I do think that this gene warrants a deep dive. I'll put it on my list of potential student projects.
| posted 28 May, 2024 13:02
As a follow-up to this, I have found more genes with the same architecture. My gene of interest is the Andre gene that has its stop at 49133. It also has the pattern 1) DNA primase/polymerase, 2) Helicase, 3) DNA binding domain. In the process of looking for a way to separate this from the typical DNA primase/helicase architecture, I noticed that the example gene given in the approved function list for "DNA primase/polymerase/helicase" (GreenHearts_47) also has this same architecture. (The approved function list specifies that it is supposed to have three different domains.)

So far as I can tell, the difference between the approved function list's example "DNA primase/helicase" and its example "DNA primase/polymerase/helicase" does not seem to have anything to do with the presence of an additional polymerase domain. Rather, the second gene includes a C-terminal DNA binding domain. I attached annotated screenshots of these HHPred hits in the same format that Chris used above. The genes in the pham containing the Andre gene (with stop 49133) are called with a mix of "DNA primase/helicase" and "DNA primase/polymerase/helicase." I assume the genes in the pham generally have a similar architecture to the Andre gene (HHPred screenshot in attachment). The approved function list does not have a call along the lines of "DNA primase/helicase/DNA binding," but I'm wondering if that is the architecture that most of the genes currently called "DNA primase/polymerase/helicase" actually have?

Is there a way to search PhagesDB genes by the functional annotations they were assigned? I'd be interested in looking at more genes with the "DNA primase/polymerase/helicase" call in other phams.
Edited 28 May, 2024 13:39
| posted 28 May, 2024 17:57
As far as I know, there is no way to get the list of all proteins with a certain functional call using just the graphical interface. You can do this easily with the command line. Just to get you started, I created a few files; this is really easy to do if you know how to search the Actino_draft database.

So first I checked the variation in the terms that you might be interested in. Here are all the various functional terms in use that include "primase". That file is here: http://phages.wustl.edu/primase_terms.txt

Next, here is a list of all phams where at least one member is annotated with a term that starts with "DNA primase". There are 70 of those phams that you could look at in more detail: http://phages.wustl.edu/phams_with_primase.txt

Finally, I created a long tab-delimited list that reports the phage, the gene, the pham, and the function, where the function starts with "DNA primase…". This list has just over 3600 entries, so you would probably want to download it, open it in Excel or similar, and filter. You can download it here: http://phages.wustl.edu/primase_phams.txt
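
If you would rather filter that list with a script than in Excel, here is a minimal Python sketch. It assumes each line holds four tab-separated fields in the order phage, gene, pham, function, with no header row; that layout is a guess based on the description above, so check the real file first. The function name `phams_by_function` and the sample rows are made up for illustration.

```python
import csv
from collections import defaultdict


def phams_by_function(lines, target="DNA primase/polymerase/helicase"):
    """Group (phage, gene) pairs by pham for rows whose function equals target.

    Assumes four tab-separated columns: phage, gene, pham, function.
    That column order is an assumption, not a documented format.
    """
    hits = defaultdict(list)
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 4:
            continue  # skip blank or malformed lines
        phage, gene, pham, function = (field.strip() for field in row[:4])
        if function.lower() == target.lower():
            hits[pham].append((phage, gene))
    return dict(hits)


# Made-up rows, only to show the expected shape of the input.
sample = [
    "PhageA\t47\t1001\tDNA primase/polymerase/helicase",
    "PhageB\t52\t1001\tDNA primase/helicase",
    "PhageC\t60\t2002\tDNA primase/polymerase/helicase",
]

for pham, members in phams_by_function(sample).items():
    print(pham, members)
```

To run it on the downloaded file, open the file with `open("primase_phams.txt", newline="")` and pass the file object as `lines`. The match is exact apart from letter case, so "DNA primase/helicase" rows are excluded when you search for the three-domain call.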
| posted 28 May, 2024 18:08
If you click the "phages" dropdown on PhagesDB, scroll to the bottom of the list, and click "phams", you will be brought to the 'find a pham' search page, where the last option is to Search gene notes. That will let you search for "DNA primase/polymerase/helicase". Be aware that some of the DNA primase/polymerase/helicase annotations only have two domains, where the primase/polymerase domain hits (CDD) to the bi-functional Prim-pol domain (this should show up as an N-terminal domain, with the helicase as a C-terminal domain).