The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

Mid-gene deletion causing frameshift and orphams

| posted 12 Jul, 2019 01:59
I am annotating Fireball (DC) and appear to have a mid-gene deletion resulting in two orphams. The region has sequence conservation with gene 58 in Mutzi (pham 45614). The deletion causes a frameshift, so parts of the original gene now sit in two different frames, and both are orphams. How do I annotate this? The first gene would be truncated, so that makes sense, but should I include the second gene? The possible start sites result in either a giant overlap or cutting off coding potential. Or do I leave the second gene off altogether, since it wouldn't be likely to be expressed?
| posted 12 Jul, 2019 15:29
I had something similar in Mashley (EG). Genes 42 and 43 were orphams, the result of a frameshift that split a gene in approximately half. Interestingly, they split a "membrane protein" into two "membrane protein" proteins! Both orphams retained several TM domains! So they both got called that way.

In your case, I think calling them both NKF genes is probably appropriate (unless there is a function you can attribute to either/both combined). Without bench data to suggest otherwise, as long as it's got coding potential, it's probably safer to assume it does get made, even if it's non-functional.
| posted 12 Jul, 2019 22:01
Hi Jordan,
I would call both genes, and I would call the second one to include all of the coding potential, even though it is a huge overlap. Then I would carefully look to see if the two pieces have functional domains that I could call separately. If you get stuck, ask again. Also, what does the sequence that disrupts this gene look like? You called it a deletion; is it a single base? We should probably ask for confirmation of the sequence.
| posted 12 Jul, 2019 22:26
Thanks for the advice! The deletion is just under 300 bp, so I don't think it is a sequencing error (see attached Phamerator screenshot). In Phamerator, Fireball draft genes 61 and 62 align to Mutzi gene 58. This is also supported by blastn: this region of Fireball gives two separate hits to Mutzi_58. Neither the original pham nor the two orphams have any evidence of function, so I'm just trying to decide on starts.
| posted 13 Jul, 2019 14:01
Here is some more advice (and only advice!)
I would call these genes this way:
45007 - 45616 NKF
45477 - 45776 NKF
45773 - 46057 HTH
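
For anyone who wants to sanity-check calls like these, here is a minimal Python sketch that reports the length of each gene, the overlap between neighbors, and whether the starts fall in different frames. It assumes the coordinates above are 1-based and inclusive and that all three genes are on the forward strand (the thread does not say). The `Gene` type and the helper names are made up for illustration and are not part of any SEA-PHAGES tool.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    start: int  # 1-based, inclusive (assumed)
    stop: int   # 1-based, inclusive (assumed)
    label: str


def length(g: Gene) -> int:
    """Number of bases covered by the call, both ends included."""
    return g.stop - g.start + 1


def overlap(a: Gene, b: Gene) -> int:
    """Number of bases shared by two calls (0 if they do not overlap)."""
    return max(0, min(a.stop, b.stop) - max(a.start, b.start) + 1)


def start_frame(g: Gene) -> int:
    """Start position modulo 3; forward-strand genes with different values are in different frames."""
    return g.start % 3


# Coordinates copied from the suggestion above.
genes = [
    Gene(45007, 45616, "NKF"),
    Gene(45477, 45776, "NKF"),
    Gene(45773, 46057, "HTH"),
]

for g in genes:
    print(f"{g.start}-{g.stop} {g.label}: {length(g)} bp, start mod 3 = {start_frame(g)}")

for a, b in zip(genes, genes[1:]):
    print(f"{a.start}-{a.stop} vs {b.start}-{b.stop}: {overlap(a, b)} bp overlap")
```

Run on these coordinates, the first two calls overlap by 140 bp and the second and third by 4 bp, and the three starts fall in three different frames.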

I wouldn't be surprised to learn that the first two genes are some sort of methyltransferases. This MO (two overlapping genes) can be found in some other methyltransferases. But there is not enough evidence to call them.

My 2 cents. This is the first of its kind, so don't sweat it. And you have looked at this set of genes in context of the entire genome and I have not.

Good luck,
| posted 03 Jun, 2021 14:19
We have found in Phage Leperchaun (cluster F1) a split methyltransferase with very good HHpred evidence (gp68/gp69). In other cluster F1 phages there is one large methylase gene in this region, but in Leperchuan the first, smaller ORF ends in a stop codon, and a switch to frame 2 picks up the sequence with an 80 bp overlap and a large second gene. In several of the HHpred hits, alignments to the same PDB entry show continuous coverage across the 2 Leperchuan genes. There is no consensus slippery sequence, but it will be interesting to see if this is the case. See attachment.
RS Pollenz