The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.


How to choose the start of the first gene for a circularly permuted genome

| posted 26 Jan, 2018 21:27
One of our phages, Riparian, has circularly permuted genome ends, and there is a gene that straddles the genome ends (see attached dnam5 file). Can we proceed with annotation normally, or is this going to be a problem later on?

Reply from Deborah Jacobs-Sera:
When you have a circularly permuted genome (that is, a genome without defined ends), we cut it in a gap upstream of the terminase. I cut Riparian according to those guidelines. Sometimes small genes are called that we cannot support with sufficient data. That wrap-around gene is one such gene. Also note that I cut the genome so that gene 1 starts on bp 1. So the start of the first gene should be bp 1!

I double-checked my decision of where best to cut this genome and am sticking with it. I do not believe that wrap-around gene is real when you check out the overlap with gene 1.
| posted 04 Apr, 2018 21:16

As a follow-up question, I was wondering what the best way is to annotate the first gene of a circularly permuted genome, in particular with respect to the gap. To be more specific, the phage we are currently annotating (KaiHaiDragon) has a 52992 bp genome, but the stop site of the last gene is 52944. Moreover, the start of the first gene is bp 1, meaning that, in theory, if the genome is circularly permuted as announced on PhagesDB, there is a 48 bp gap between the last and the first gene. Now, should we treat our genome as circularly permuted or as linear as we annotate it? In other words, for the first gene, should we indicate that there is a 48 bp gap or no gap at all?

Thank you,

| posted 05 Apr, 2018 12:19
Hi Arturo,
You are right that in a circularly permuted genome you should pay attention to the gap between gene 1 and its upstream gene (in this case, the last gene).
However, we ask you to write down the gap to highlight the space or overlap between genes and to get you to think about whether or not you've chosen the correct start. Since you know where gene 1 starts, noting the gap becomes somewhat irrelevant. So gene 1 in every genome has always received a pass on the "gap". You can just write "n/a" for gene 1.
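For anyone who wants to check the 48 bp figure, here is a minimal sketch of the arithmetic, assuming 1-based inclusive coordinates and a genome treated as circular. The function name `wraparound_gap` is made up for illustration and is not part of DNA Master or any SEA-PHAGES tool.

```python
def wraparound_gap(last_stop: int, first_start: int, genome_length: int) -> int:
    """Bases between the stop of the last gene and the start of gene 1,
    counted across the genome end (1-based, inclusive coordinates)."""
    bases_after_last_gene = genome_length - last_stop  # e.g. 52945..52992
    bases_before_first_gene = first_start - 1          # 0 when gene 1 starts at bp 1
    return bases_after_last_gene + bases_before_first_gene


# KaiHaiDragon numbers from the question above.
print(wraparound_gap(last_stop=52944, first_start=1, genome_length=52992))
```

This only confirms the number quoted in the question. As the reply says, the notes entry for gene 1 can still simply be "n/a".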
| posted 15 May, 2018 13:21
Related to this.

We also have a wrap-around gene in a C1 phage. The online documentation explains how to annotate this situation, but in the online example the wrap-around gene is listed as the first gene in the list, even though it's gene 252.

Ours is gene 258 and is listed as gene 258. Is this OK?
| posted 15 May, 2018 14:42
We have a similar but slightly more complicated case: a C1 genome that ends with one gene starting at 155863 and ending at 155955, and a second gene starting at 155596 and ending at position 18. However, gene 1 starts at position 15.
Any ideas how to deal with that?
| posted 15 May, 2018 14:59
We have that too. I think it's OK. The 3 bp overlap between the end of the wrap around gene and gene 1 is pretty common.
The end of the wrap-around gene is ATGA; the TGA is a stop here.

The beginning of gene 1 is ATGA; the ATG is a start here.

I guess this is a 4 bp overlap - again, pretty common
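A quick sketch of why this counts as 4 bp, assuming 1-based inclusive coordinates and using the positions quoted earlier in the thread (wrap-around gene ending at position 18, gene 1 starting at position 15). The helper name `overlap_bp` is hypothetical.

```python
def overlap_bp(upstream_stop: int, downstream_start: int) -> int:
    """Number of bases shared by two adjacent genes on the same strand
    (1-based, inclusive coordinates); 0 if they do not overlap."""
    return max(0, upstream_stop - downstream_start + 1)


# Positions 15, 16, 17 and 18 belong to both genes.
print(overlap_bp(upstream_stop=18, downstream_start=15))

# The shared bases read ATGA: gene 1 starts with ATG,
# and the wrap-around gene stops with TGA.
shared = "ATGA"
start_codon = shared[:3]  # positions 15-17
stop_codon = shared[1:]   # positions 16-18
print(start_codon, stop_codon)
```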
| posted 15 May, 2018 15:11
Hi All,
We consider "gene 1" to be the gene with the lowest start coordinate.

Sometimes the computer programs will reorder the list of genes to put the gene that contains the lowest coordinate at the top of the list.
Thus the last gene, the wrap-around gene 258 (or whatever its number is), appears at the top because it contains bp 1. Gene 1 starts at bp 15.

Hope that helps.
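To illustrate the two orderings, here is a small sketch. It is not how DNA Master or any other program actually sorts its gene list; the `Gene` type, the sort keys, and most of the coordinates are invented for the example, and it assumes forward-strand genes only, so that a start greater than the stop means the gene wraps around the genome end.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int
    stop: int


def lowest_coordinate(gene: Gene) -> int:
    """Lowest base position the gene contains. A forward-strand gene whose
    start is greater than its stop wraps around, so it contains bp 1."""
    return 1 if gene.start > gene.stop else gene.start


# Illustrative values only: gene 1 starts at bp 15, gene 258 wraps around.
genes = [
    Gene("258", 155596, 18),
    Gene("1", 15, 900),
    Gene("2", 950, 1800),
]

# Numbering rule: gene 1 is the gene with the lowest start coordinate.
by_start = [g.name for g in sorted(genes, key=lambda g: g.start)]

# Display order some programs use: the gene containing the lowest coordinate first.
by_lowest = [g.name for g in sorted(genes, key=lowest_coordinate)]

print(by_start)
print(by_lowest)
```

Under these assumptions the first ordering puts gene 1 at the top, and the second puts the wrap-around gene at the top, which matches what people are seeing in their gene lists.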
