The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

JulietS (C1): Holliday junction resolvase?

| posted 24 Jan, 2023 00:09
Posting on behalf of a phage hunter at UCLA:

Hey Y’all,

I am a student at UCLA and we are annotating phage JulietS (C1). One of the genes I am annotating is stop @48994 F and I am having some problems with it. Attached are screenshots of the results.

There is coding potential in both the host-trained (start site around 48600) and self-trained (start site around 48500) GeneMark output. Glimmer and GeneMark call the start site at 48518, but this leaves a gap of 297 bp. The smallest gap is 84 bp, with a start site of 48305, but that start isn't covered by the coding potential. Moreover, there are only 2 non-draft genomes in this pham (pham 54797 as of 1/23/23). One of these phages, Rabinovish, calls the function Holliday junction resolvase. That phage has a downstream gene that isn't found in JulietS, and has an overlap of -91 bp. The other phage, ScottMcG, does not call a function and has a gap of 328 bp.

Where this gets extra complicated is the Phagesdb BLAST results in PECAAN. Several non-draft phages (EmmaElysia C1, Melpomini C1, Napoleon13 C1) call the function Holliday junction resolvase, but their genes are in pham 3757. Sadly, changing the start site does not change the pham.

I was hoping to get some help on figuring out this tricky gene. All help is appreciated and thank you for your time!

Tejas Bouklas
Lecturer, UCLA
| posted 24 Jan, 2023 01:34
Hi phage biologist!

I'm not completely sure what your question is, but I see why this gene was a tricky one. The fact that it is in a pham with only two other non-draft genes is indeed suspicious. However, as you said, there are hits to genes in other phams that do have functions. Extending the start site of your gene will not automatically reassign the pham, because phameration is done only once, when we auto-annotate the genome. However, the sequence will still change, and it might result in a sequence that does match up well with the genes from those other phams.

First I looked at the pham pages for the two phams that came up a lot as evidence:
https://phagesdb.org/phams/3757/ (this pham contained the hits in HHpred with the best e-values)
https://phagesdb.org/phams/1046/

Since 3757 had the best hits, I looked at the pham members. Most of them have a length of 690 bp, which happens to be the length of your LORF in PECAAN.
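If you want to check the coordinate arithmetic yourself, here is a minimal Python sketch using only the numbers quoted in this thread (stop 48994, candidate starts 48518 and 48305, gaps 297 and 84). The helper name `gene_length_bp` is made up for illustration, and the gap convention (gap = start - upstream end - 1) is an assumption on my part; the upstream gene's end coordinate is not given in the thread, so it is only inferred here from the stated gaps.

```python
STOP = 48994  # stop coordinate of the forward gene, from the original post


def gene_length_bp(start, stop):
    """Length of a forward-strand gene, counting both end coordinates."""
    return stop - start + 1


# Candidate start -> gap to the upstream gene, as reported in the original post.
candidates = {48518: 297, 48305: 84}

results = {}
for start, gap in candidates.items():
    length = gene_length_bp(start, STOP)
    # A complete ORF should be a whole number of codons.
    assert length % 3 == 0, "ORF length is not a multiple of 3"
    protein_aa = length // 3 - 1  # subtract the stop codon
    # ASSUMPTION: gap = start - upstream_end - 1, so the upstream end is inferred.
    implied_upstream_end = start - gap - 1
    results[start] = (length, protein_aa, implied_upstream_end)
    print(f"start {start}: {length} bp, {protein_aa} aa, "
          f"implied upstream end {implied_upstream_end}")
```

Under that assumption both candidate starts imply the same upstream end coordinate, which is a quick sanity check that the two reported gaps are consistent with each other, and the 48305 start gives the 690 bp length seen in pham 3757.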

I selected the start site of 48305 in PECAAN, and then went and pulled the sequence that resulted from that start site. I also pulled one sequence from each of the two phams above so I could align and compare them in NCBI BLASTp.

>JulietS_88
MSNKAYHRTHCKVCGAKITDGTQFSTCSQHRRGWNKREVATCPCGQPAENIRSKYCGEACRKKWGKKPPARMVTKICIGCGNEFSRPHYYPGKMKYCSNACSHRQVKKVRDKFIADLPEGAIVFHSGWEIRFWAACLRFDIPIRSYDGPDIETSQGVYRPDFIIGKPNEERVVDVKGWLRPESEVKCREAGVHLVTKQELLRLESGDSLDAHRALLWNSGMNTHTAPLY

>Ava3_101
MSNKAYHRTHCKVCGAKITDGTQFSTCSQHRRGWNKREVATCPCGQPAENIRSKYCGEACRKKWGKKPPARMVTKICIGCGNEFSRPHYYPGKMKYCSNACSHRQVKKVRDKFIADLPEGAIVFHSGWEIRFVAACERFDIPWRSYDGPDIETSQGVYRPDFIIGKPNEERVVDVKGWLRPESEVKCREAGVHLVTKQELLRLESGDSLDAHRALLWNSGMNTHTAPLY

>BananaFence_99
MSICANPECGKEFEQPNKYRTAKTCSKECRYAVSASTTKASSGRWETKVCPCGVEFQSAVNKPKTYHDWDCMMKHRQEDARASRTCENPECGKEFTYFKRQNQRTCSPECRNKVTAMKRENNYPECQTCGVSTGSYNRIYCDEHRPNRPGRKPAPRITATCLCCGEEFTRPENYPGKMKYCSNACSHRQVKKVRDKFIADLPEGAIVFHSMWEVRFVAACERFDIPWRSYDGPDIETSQGVYRPDFIIGKPNEERVVDVKGWLRPESEVKCREAGVHLVTKQELLRLESGDSLDAHRALLWNSGMNTHTAPLY

When I aligned these sequences in BLAST (try it yourself! Use the "align two or more sequences" checkbox), I saw a nearly identical match between JulietS_88 and Ava3_101 (the one from pham 3757); as pasted above, the two sequences differ at only three positions. Although the start site we chose doesn't have the best RBS score, it does close the gap quite nicely, and it produces a gene that matches the other pham members.
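For anyone who wants to double-check the comparison without BLAST, here is a minimal Python sketch that compares JulietS_88 and Ava3_101 position by position. The sequences are copied from this post; the function name `mismatches` is just illustrative, and the check assumes equal-length, ungapped sequences (a real aligner handles gaps and substitutions properly). Run on the text as pasted here, it reports three differing positions, so the pasted sequences are very close but not letter-for-letter the same.

```python
# Sequences copied verbatim from the post above.
JULIETS_88 = "MSNKAYHRTHCKVCGAKITDGTQFSTCSQHRRGWNKREVATCPCGQPAENIRSKYCGEACRKKWGKKPPARMVTKICIGCGNEFSRPHYYPGKMKYCSNACSHRQVKKVRDKFIADLPEGAIVFHSGWEIRFWAACLRFDIPIRSYDGPDIETSQGVYRPDFIIGKPNEERVVDVKGWLRPESEVKCREAGVHLVTKQELLRLESGDSLDAHRALLWNSGMNTHTAPLY"
AVA3_101 = "MSNKAYHRTHCKVCGAKITDGTQFSTCSQHRRGWNKREVATCPCGQPAENIRSKYCGEACRKKWGKKPPARMVTKICIGCGNEFSRPHYYPGKMKYCSNACSHRQVKKVRDKFIADLPEGAIVFHSGWEIRFVAACERFDIPWRSYDGPDIETSQGVYRPDFIIGKPNEERVVDVKGWLRPESEVKCREAGVHLVTKQELLRLESGDSLDAHRALLWNSGMNTHTAPLY"


def mismatches(a, b):
    """Return (1-based position, residue in a, residue in b) for each difference."""
    if len(a) != len(b):
        raise ValueError("this simple check needs sequences of equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


diffs = mismatches(JULIETS_88, AVA3_101)
identity = 100 * (len(JULIETS_88) - len(diffs)) / len(JULIETS_88)
print(f"length {len(JULIETS_88)} aa, {len(diffs)} mismatches, {identity:.1f}% identity")
for pos, a, b in diffs:
    print(f"  position {pos}: JulietS_88 has {a}, Ava3_101 has {b}")
```

If the mismatches are unexpected, it is worth re-pulling the JulietS sequence from PECAAN and re-running the alignment before relying on the comparison.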

Using this new sequence, the HHpred hit for 1M0D_B (Endodeoxyribonuclease I; Holliday junction resolvase) has a probability of 99%, coverage of 36% (just over our threshold), and an e-value of 4e-10, and it aligns with residues 2-95 of this PDB entry (https://www.rcsb.org/structure/1M0D). The entire entry is only 140 residues long, so we have a match to most of the protein. This gives me confidence that this is a good hit for this gene, and that Holliday junction resolvase can be called as the function.
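To put a number on "most of the protein", here is a small Python sketch of the arithmetic, using only the figures quoted above (aligned residues 2-95 of a 140-residue entry). The function name `aligned_fraction` is hypothetical, and this is the fraction of the PDB entry covered by the alignment, which is a different quantity from the 36% coverage figure.

```python
def aligned_fraction(first, last, total):
    """Return (number of aligned residues, fraction of the target covered)."""
    aligned = last - first + 1  # both end residues are included
    return aligned, aligned / total


aligned, fraction = aligned_fraction(2, 95, 140)
print(f"{aligned} of 140 residues aligned ({fraction:.0%})")
# prints "94 of 140 residues aligned (67%)"
```

So the alignment spans roughly two thirds of the resolvase entry.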
 