The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

gene at end of some EK1s?

| posted 09 Dec, 2020 00:02

We're curious to hear what others think about this situation. We're working on CrunchyBoi and PineapplePluto and here's what we see.

CrunchyBoi, PineapplePluto, and Pabst (EK1) share a ~100 bp insertion at the right end of the genome relative to other EKs. Glimmer calls a gene in this region in all three, and self-trained GeneMark shows coding potential across the ORF. The ORF fills the gap created by the insertion. Its overlaps are a bit large (7 bp with the upstream gene and 14 bp with the downstream gene). SD z=2.5. Nucleotide substitutions in the ORF among the three genomes change 14 codons: 10 of the 14 changes are synonymous and do not alter the AA, and 3 of the 4 nonsynonymous changes are chemically conservative (V->A, N->D, E->K; the non-conservative one is W->G). This is the pattern we would expect to see if natural selection is keeping this ORF open and producing a protein. (I have attached views of the data.)
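For anyone who wants to repeat the codon tally on their own genomes, here is a minimal Python sketch of how it could be done. It is not the method we used for the numbers above. It assumes the two sequences are already aligned without gaps, in frame, and given on the coding strand (so a reverse ORF would need to be reverse-complemented first). The function name `classify_codon_changes` and the toy sequences are made up for illustration; the codon table is the standard one, whose amino acid assignments also apply to bacterial and phage genes.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_changes(ref, other):
    """Compare two in-frame, equal-length coding sequences codon by codon.

    Returns a list of tuples
    (codon_number, ref_codon, other_codon, ref_aa, other_aa, kind)
    for every codon that differs, where kind is "synonymous" or
    "nonsynonymous". Codon numbers are 1-based.
    """
    if len(ref) != len(other) or len(ref) % 3 != 0:
        raise ValueError("sequences must be the same length and a multiple of 3")
    changes = []
    for i in range(0, len(ref), 3):
        a = ref[i:i + 3].upper()
        b = other[i:i + 3].upper()
        if a == b:
            continue  # identical codon, nothing to report
        aa_a, aa_b = CODON_TABLE[a], CODON_TABLE[b]
        kind = "synonymous" if aa_a == aa_b else "nonsynonymous"
        changes.append((i // 3 + 1, a, b, aa_a, aa_b, kind))
    return changes


if __name__ == "__main__":
    # Toy sequences, not real phage data.
    for change in classify_codon_changes("ATGGTTAATTGGTAA", "ATGGCTAACGGGTAA"):
        print(change)
```

Deciding whether a nonsynonymous change is chemically conservative still needs a judgment call (or a substitution matrix), so the sketch leaves that step out.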

If the ORF were in the middle of a genome, we would probably call it as a small gene. But the EKs are circularly permuted, and this reverse ORF sits across the boundary at the genome ends, so we are proceeding with more caution.

Should we call it a gene?

Edited 09 Dec, 2020 01:06
| posted 09 Dec, 2020 01:35
I'll weigh in. I like the way you are thinking, and I would call this a gene: easily if it were in the middle somewhere, but with trepidation because it is at the end of the genome. None of our commonly used programs 'like' genes that wrap around the end of the genome. GenBank consistently asks us if that is really a thing, because we report a 'linear' genome with a gene that wraps around the ends. It is difficult to grasp that a phage genome circularizes, go figure! Phamerator is also not able to easily display a wrap-around gene. If you look carefully at a genome with an end wrap-around gene (like some of the Cluster C1 genomes), you will find what looks like a random vertical number displayed on the Phamerator map at the midpoint of the genome. So while it is a great find, it is very unsatisfying to see it displayed. I want to link this to the cluster-specific tips forums, because that may be the only place where this gene is celebrated.
As a footnote, when identifying base 1 of a circularly permuted genome, I try very hard to find a gap so that I don't do this. Darn if the biology didn't catch me.
Curious what others might think about it all.
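To make the wrap-around idea concrete, here is a small Python sketch of pulling out the sequence of a gene that spans the ends of a genome reported as linear. This is only an illustration, not how DNA Master, GenBank, or Phamerator handle it. The function name `extract_feature` and the coordinate convention are assumptions: coordinates are 1-based and inclusive on the forward strand, and a start greater than the stop means the feature runs off the right end and continues at base 1.

```python
# Translation table used to complement DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_feature(genome, start, stop, strand="+"):
    """Extract a feature from a circular genome stored as a linear string.

    start and stop are 1-based, inclusive, forward-strand coordinates.
    If start > stop, the feature wraps around the end of the genome.
    For strand "-", the reverse complement of the span is returned.
    """
    if not (1 <= start <= len(genome) and 1 <= stop <= len(genome)):
        raise ValueError("coordinates must lie within the genome")
    if start <= stop:
        span = genome[start - 1:stop]
    else:
        # Wrap-around: right end of the genome followed by the left end.
        span = genome[start - 1:] + genome[:stop]
    return reverse_complement(span) if strand == "-" else span


if __name__ == "__main__":
    toy_genome = "ATGCCCGGGTAC"  # toy sequence, not real phage data
    print(extract_feature(toy_genome, 11, 2, "-"))
```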