The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.


No coding potential where others call a gene

| posted 10 Mar, 2023 19:47
We are annotating Olgasclover. Our three most closely related phages, Jodelie19, Aflac, and Figliar, all call gene 4 in the second frame, and their GeneMark maps show GeneMark calling a gene in that frame. For us, Glimmer called a reverse gene there. We have no coding potential in frame 2 despite high homology in that area. Frame 3, where we do have some coding potential, would cause a 69 bp overlap. See the attachment for GeneMark screenshots of Olgasclover and Aflac and the Phamerator map of the closely related phages. Any tips?
| posted 13 Mar, 2023 15:58
Hi Erin!

It looks like inserting a gene here, as your close relatives did (and deleting the reverse gene that was called), will probably give you nice 1 or 4 bp overlaps with both the upstream and downstream genes. Check to see; if so, this is really good evidence that you have a gene there. And if you BLAST this ORF, do you get 1:1 alignments with your close relatives? That is also good evidence.
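If it helps to check the overlaps by arithmetic rather than by eye, here is a minimal Python sketch. It assumes both genes are on the same strand and that coordinates are 1-based and inclusive; the function names and the coordinates in the example are made up for illustration and are not from Olgasclover.

```python
def overlap_bp(upstream_stop: int, downstream_start: int) -> int:
    """Number of bases shared by two adjacent same-strand genes.

    Coordinates are assumed to be 1-based and inclusive.
    A positive result is the overlap in bp; zero or a negative
    result means no overlap (the gap is the absolute value).
    """
    return upstream_stop - downstream_start + 1


def is_compact_overlap(n: int) -> bool:
    """True for the 1 bp and 4 bp overlaps discussed above.

    For example, ATGA holds a start (ATG) and a stop (TGA) sharing 4 bases.
    """
    return n in (1, 4)


# Hypothetical coordinates, for illustration only.
upstream_stop = 1500
candidate_start_a = 1497   # candidate start giving a short overlap
candidate_start_b = 1432   # candidate start giving a long overlap

for start in (candidate_start_a, candidate_start_b):
    n = overlap_bp(upstream_stop, start)
    print(f"start {start}: overlap {n} bp, compact={is_compact_overlap(n)}")
```

Run the same check between the candidate gene's stop and the downstream gene's start to cover both neighbors.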

It looks like the coding potential in the frame where you DO have it (the one that would create the 69 bp overlap) is also present in your relatives. I think of this non-specific coding potential as "bleed-over" from other frames. Sometimes it's real, sometimes it's not, and sometimes it's "misplaced" in the incorrect frame (this is not particularly scientific, just my interpretation when I see these things). But it does seem to be consistent between the genomes you shared, so there's probably something in that region, though maybe not in that frame. Remember also that the GeneMark outputs are generated anew each time they're run, so it's conceivable that you would see coding potential in that region, in the appropriate frame, in another GeneMark run. You may also want to check what the GeneMark output looks like when it is trained on the host (I assume the one you're showing is trained on Self). Sometimes you'll get different information from each one.

Generally, even if there's no coding potential (CP) for your phage, if close relatives have called a gene there and I get 1:1 BLAST hits, I'd insert the gene. CP and/or 1:1 BLAST hits are good enough for me. I'd rather call a gene that isn't there than NOT call one that is!
| posted 13 Mar, 2023 18:34
I looked at this too and I agree with Nikki. There is not strong evidence that there is a gene here at all, but I'm inclined to call the gene with the 4 bp overlap. It is interesting that in some genomes the sequence, though not identical, does have coding potential in that frame. It is a clear demonstration of how the math of coding potential prediction breaks down with small genes.