The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

All posts created by kcornely

| posted 05 Apr, 2023 19:23
We are annotating a cluster A3 phage, LBerry, and I found this discussion really helpful, because I didn't expect to find minor tail proteins on the left-hand side of the genome. We have three genes, gp3, gp4, and gp5, that are candidate minor tail proteins. The gp4 and gp5 proteins seem promising because we get HHPred hits to collagen-type proteins and BLAST hits to phage tail fiber proteins. The sizes of gp4 and gp5 also seem to be about right (945 bp and 636 bp, respectively). We're not sure about gp3. It is smaller (321 bp), but we get some nice HHPred hits to tail fibers for L5 and the E. coli phage T5 (for example, the L subunit of phage T4; PDB code = 7QG9). Is gp3 still too short to be a minor tail protein? Or could we still call the function as a minor tail protein because gp3 might be a subunit that is part of a larger assembly?
Posted in: Cluster A Annotation Tips > minor tail proteins
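To put the three gene sizes quoted in the post above on the same scale as protein lengths, here is a minimal sketch that converts base pairs to amino acids. It assumes the reported gene lengths include the stop codon, which is an assumption and is not stated in the post; the function name is hypothetical.

```python
def residues_from_gene_length(length_bp: int) -> int:
    """Amino acids encoded by a gene, assuming length_bp includes the stop codon."""
    if length_bp % 3 != 0:
        raise ValueError(f"{length_bp} bp is not a whole number of codons")
    return length_bp // 3 - 1  # subtract the stop codon


# Gene lengths as quoted in the post.
gene_lengths_bp = {"gp3": 321, "gp4": 945, "gp5": 636}

protein_lengths_aa = {
    gene: residues_from_gene_length(bp) for gene, bp in gene_lengths_bp.items()
}

for gene, aa in protein_lengths_aa.items():
    print(f"{gene}: {gene_lengths_bp[gene]} bp -> {aa} aa")
```

Under that assumption gp3 encodes 106 residues, against 314 for gp4 and 211 for gp5. The numbers only frame the question; they do not say whether 106 residues is too short for a minor tail protein.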
| posted 08 Aug, 2022 22:54
Thanks, Debbie and Karen! I met with my students today, and I also consulted with a biochemistry colleague, and we investigated some of the other genes in pham 37230. There are 266 members of this pham, and according to phagesdb these genes are all about 800-1000 bp in length. I looked at some of the other pham members using the domain function in Phamerator (and I also did a BLASTp on NCBI to get the domain information for Langerak_46 and BiteSize_54), and from what I can see, these proteins all have hits to exonuclease domains that cover the length of the protein, and no alignments with helicases. The encoded proteins also don't appear to be large enough to have both exonuclease and helicase functionalities. Is it possible that the earlier annotations of the P1 cluster phages of this pham as RecB-like exonuclease/helicase were incorrect?

I'm also curious about RedRock_72 as an example of a RecB-like exonuclease/helicase. I looked on phagesdb and found that gp72 in RedRock belongs to pham 24. Virtually all members of pham 24 are assigned a Cas4 exonuclease function (as Karen says in her post), and again, these genes are about the same size (800-900 base pairs), indicating that the proteins aren't large enough to have both helicase and nuclease functionalities. Maybe we don't have an example of a phage protein that has both functionalities because one doesn't exist, at least as far as we know?

Thanks to both of you for responding to my posts. My next task is to look over a QC'd genome that we recently received, in which we got quite a few of the functional assignments wrong. As a biochemist, I ought to be good at this, and it's important for me to get this right so that I can properly mentor my students.

Thanks again,
Kathleen
Posted in: Cluster P Annotation Tips > RecB-like exonuclease/helicase or Cas4 family exonuclease?
| posted 05 Aug, 2022 14:56
Karen, this is very helpful, thank you. An example of a RecB-like protein would be very helpful. I understand your thought process and your logic in identifying this protein as a Cas4 exonuclease. But why do so many other P1 cluster phages assign the function as RecB-like exonuclease/helicase? Shouldn't proteins in the same pham all have the same function?
Thanks again for your reply,
Kathleen
Posted in: Cluster P Annotation Tips > RecB-like exonuclease/helicase or Cas4 family exonuclease?
| posted 03 Aug, 2022 18:48
Greetings, all! I'd like to come back to my previous post regarding Starterator. We are encountering quite a few genes in Frankenweenie in which the suggested Starterator start is not the same as the most annotated start. I give an example above; here's another example. For gene 30, the original Glimmer call @bp 15143 has strength 3.13 ** not called by GeneMark; ST: the start @15143 is the suggested start, but the start @15146 is the most annotated, called in five phages. If we choose the start at 15146, which we believe is correct, we obtain q1:s1 data for all five BM cluster phages, with an alignment of 98.8%. In addition, the 15146 start produces a -1 bp overlap and gives a gene length of 249 bp. The genes of this pham (7493) in the other five BM cluster phages are also all 249 bp. The Glimmer/Starterator start at 15143 provides a -4 bp overlap. So this gene has both -4 and -1 bp overlap options, and the advice we got from the forum is to go with the -1 bp overlap.

So we believe that the most annotated start, 15146, is correct. But we don't understand why Starterator didn't call the most annotated start.

As always, thanks for helping me to understand Starterator better!
Best wishes,
Kathleen
Posted in: Cluster BM Annotation Tips > Interpreting Starterator data in BM cluster phages
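The arithmetic behind the two candidate starts for gene 30 in the post above can be sketched as follows. This is only a sketch under stated assumptions: the gene is taken to be on the forward strand, the gap is counted as start minus upstream stop minus 1, and the upstream gene's last base (15146) is not given in the post but inferred as the one value that reproduces both reported overlaps. All names are hypothetical.

```python
def gap_bp(upstream_stop: int, start: int) -> int:
    """Gap between genes on the forward strand; negative values are overlaps."""
    return start - upstream_stop - 1


def gene_length_bp(start: int, stop: int) -> int:
    """Inclusive length of a forward-strand gene."""
    return stop - start + 1


# ASSUMPTION: last base of the upstream gene, inferred from the reported
# -4 bp and -1 bp overlaps; the post does not state this coordinate.
UPSTREAM_STOP = 15146

# Last base of gene 30, derived from the post: start 15146 gives 249 bp.
GENE_STOP = 15146 + 249 - 1

candidate_starts = {"Glimmer/suggested": 15143, "most annotated": 15146}

summary = {
    label: (gap_bp(UPSTREAM_STOP, start), gene_length_bp(start, GENE_STOP))
    for label, start in candidate_starts.items()
}

for label, (gap, length) in summary.items():
    print(f"{label}: gap {gap} bp, length {length} bp")
```

Under these assumptions the 15143 start gives a -4 bp overlap and a 252 bp gene, while the 15146 start gives a -1 bp overlap and the 249 bp length shared by the other pham members.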
| posted 03 Aug, 2022 18:32
Greetings all!

We generated our R2I for Langerak and were pleased to see that we had made all of our functional assignments correctly except for one: Gene 46 (31633-32562). This gene belongs to pham 37230. On Phamerator, the most closely related P1 cluster phages (Donovan, FirstPlacePfu, HUHilltop, Jebeks and Techage) all have this same pham, and the assigned function is RecB-like exonuclease/helicase. On the SEA-PHAGES function list, we are told the following:

RecB-like exonuclease/helicase: If both a helicase and nuclease domain are present, the RecB label should be used.

Cas4-family exonuclease: This family of exonucleases is similar to the exonuclease domain of RecB. The Cas4 label should be used if the gene includes only the exonuclease region. If the gene also includes a helicase domain, the RecB label should be used. Cas4 family nucleases tend to have alignments to the crystal structures 4R5Q_A and 4IC1_A and to the PD-(D/E)XK nuclease superfamily (PF12705.7, among others).

On HHpred for Langerak, we did get hits to the Cas4 family nucleases mentioned in the above paragraph. But we also got hits to helicases: 3U4Q_B, 4CEI_A, 6PPU_B, 1W36_E, and others. We decided to assign the function as the RecB-like exonuclease/helicase because (1) we thought we had fulfilled the requirement stated on the official SEA-PHAGES function list that we should have hits to both helicases and exonucleases and (2) all of the other P1 cluster phages with the same pham used this functional assignment on Phamerator.

But we were marked wrong.

I'd like to get this right going forward, so could someone please help me out?

As always, thanks for helping me "review to improve"!

Best wishes,
Kathleen
Posted in: Cluster P Annotation Tips > RecB-like exonuclease/helicase or Cas4 family exonuclease?
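The labeling rule quoted from the function list in the post above can be written as a small decision function. This is a hedged sketch of the rule as quoted, not an official SEA-PHAGES tool; the function name and boolean inputs are hypothetical, and deciding whether an HHpred hit really indicates a domain in the gene is the judgment call that the code does not capture.

```python
from typing import Optional


def choose_label(has_exonuclease_domain: bool, has_helicase_domain: bool) -> Optional[str]:
    """Apply the quoted rule: both domains -> RecB, exonuclease only -> Cas4."""
    if has_exonuclease_domain and has_helicase_domain:
        return "RecB-like exonuclease/helicase"
    if has_exonuclease_domain:
        return "Cas4-family exonuclease"
    return None  # the quoted rule does not cover genes without a nuclease domain


# Gene 46 coordinates from the post; length assumes inclusive coordinates
# and a stop codon counted in the gene length (both assumptions).
gene46_bp = 32562 - 31633 + 1
gene46_aa = gene46_bp // 3 - 1

print(f"Langerak gene 46: {gene46_bp} bp, about {gene46_aa} aa")
print(choose_label(has_exonuclease_domain=True, has_helicase_domain=False))
```

Keeping the two inputs as explicit booleans makes the disputed step visible: the label follows mechanically once it is settled whether the helicase hits count as a helicase domain in a protein of this size.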
| posted 11 Jul, 2022 20:36
Greetings, all! We are annotating Frankenweenie, a BM cluster phage, and had some questions regarding the interpretation of Starterator data. Here's an example: The start of gene 144 is called by Glimmer and GeneMark at 89678. This is also the suggested start on Starterator. But the most annotated start on Starterator is 89666 for all of the other five BM cluster phages. It also results in q1:s1 BLAST data for all of the other five BM cluster phages and, as an added bonus, produces a -4 bp overlap and a gene length more in line with the other five BM cluster phages. So our question is: why are we seeing so many "suggested starts" on Phamerator for this phage that are not the same as the "most annotated" starts? And why, when we do choose the most annotated start (rather than the SS), does it seem to be the better fit? Thanks in advance for helping me to understand Starterator better!
Posted in: Cluster BM Annotation Tips > Interpreting Starterator data in BM cluster phages
| posted 11 Jul, 2022 18:38
Greetings, all! I have another comment about the 4 bp overlap. We are annotating Frankenweenie, a BM cluster phage; the cluster has only six members in total. We are occasionally encountering situations where it doesn't make sense to call the start at the position that provides a 4 bp overlap. For example, for gene 95, Glimmer and GeneMark call the start at 65479, which provides a 4 bp overlap. But to us, it makes sense to call the start at 65470. The 65470 start (1) is not the SS but is the MA on Starterator, (2) provides us with q1:s1 BLAST hits to the other five BM cluster phages, and (3) gives a product similar in size to those of the other five BM cluster phages. I'd love to hear your insights on this observation, and thanks in advance!
Posted in: Cluster BM Annotation Tips > 4 bp overlap in Streptomyces phages?
| posted 17 Jun, 2022 02:00
Thanks Chris and Debbie! This is incredibly helpful.
Posted in: Cluster BM Annotation Tips > Frameshift in BM cluster phages?
| posted 16 Jun, 2022 15:46
Thank you Debbie! This is very helpful.
Posted in: Cluster BM Annotation Tips > 4 bp overlap in Streptomyces phages?
| posted 16 Jun, 2022 15:45
Greetings, all! We are annotating Frankenweenie, a Streptomyces BM cluster phage. We haven't annotated a BM cluster phage before. We were wondering if gp65 and gp66 were tail assembly chaperones and if gp66 was translated as a frameshift. There are six BM cluster phages, and the only phage in which the frameshift is called is Satis. It is not called for the other BM cluster phages. A frameshift seems likely for Frankenweenie because there is considerable overlap between gp65 and gp66. Thanks in advance for your help!
Kathleen
Posted in: Cluster BM Annotation Tips > Frameshift in BM cluster phages?