The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.


Frameshift in DY phage Tarzan (and others)

| posted 16 Mar, 2023 18:27
I am QC-ing the frameshift in DY cluster phage Tarzan. It is a GN to GK, -1 frameshift with a slippery sequence of GGG.GGA.AAC, where it appears that the first A gets counted twice. This is most similar to the GN to GK slip seen as #15 in the table within the Bioinformatics guide (GGGAAAT). However, 2 of the 3 annotated phages (Jojo24 and Reyja) annotate the slip at the third G instead, making the slip actually a GG to GG.

I lean towards the GN to GK, since it's more similar to others where it slips at the first A, but perhaps there's something I don't know? Which one is it?

Screenshot from Tarzan attached.
| posted 17 Mar, 2023 01:00
Hi. Both choices produce the same protein sequence, and unless an experiment is done to label the amino acids in the sequence, I don't think one is better than the other. There is no GN or GK decision; the question is whether the second G in the final sequence of AGGK is read in the first frame or the second. Does that make sense?
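To make the point above concrete, here is a minimal Python sketch (not part of any SEA-PHAGES tool). It uses only the nine bases quoted in the first post, GGG.GGA.AAC, and translates them with a -1 slip at either of the two positions under discussion. The function names and the `reread_index` parameter are made up for this illustration, and the model is simply "read one base twice, then carry on in the new frame".

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate complete codons from the start of seq; ignore leftover bases."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def translate_with_minus1_slip(seq, reread_index):
    """Translate seq with a -1 slip (hypothetical helper for illustration).

    reread_index is the 0-based position of the base that is read twice.
    It must be the last base of a 0-frame codon.
    """
    if (reread_index + 1) % 3 != 0:
        raise ValueError("reread_index must be the last base of a codon")
    upstream = seq[:reread_index + 1]   # ends with the re-read base
    downstream = seq[reread_index:]     # starts with the same base again
    return translate(upstream) + translate(downstream)


# The nine bases from the first post: GGG.GGA.AAC
window = "GGGGGAAAC"

no_slip = translate(window)                            # GGG GGA AAC
slip_at_first_a = translate_with_minus1_slip(window, 5)  # GGG GGA | AAA
slip_at_third_g = translate_with_minus1_slip(window, 2)  # GGG | GGG AAA

print(no_slip)          # GGN
print(slip_at_first_a)  # GGK
print(slip_at_third_g)  # GGK
```

Both slip positions give GGK for this window, so the protein sequence alone cannot tell the two annotations apart; only the coordinate of the repeated base differs.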
| posted 17 Mar, 2023 12:49
Thanks for that, Debbie! I often wonder what exactly makes the sequence "slippery" other than the obvious repeat of bases, but it's nice to know that as long as the amino acid sequence comes out right, we don't have to worry too much about which coordinate/base we annotate as being repeated.
| posted 17 Mar, 2023 14:51
That is not entirely correct. Some folks capture all of the slip in one region, where it is to be slipped between the 2 (meaning the annotation of the shared nucleotide occurs in the middle). Because the slippage happens around a lot of glycines, the sequence can look right but actually be annotated incorrectly.

In this particular case, there are 2 predicted overlapping slippery sequences: the run of Gs and the canonical XXXYYYZ. I can't tell which one is used without some bench work.

Edited 17 Mar, 2023 16:07
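For anyone who wants to see the two overlapping candidates described above in the nine bases quoted in the first post, here is a rough Python sketch using the standard `re` module. It treats XXXYYYZ purely as a pattern (three identical bases, three identical bases, any base) with no extra rules about which bases are allowed. The minimum G-run length of 4 is an arbitrary cutoff chosen for illustration, not a value from the Bioinformatics guide, and the function names are hypothetical.

```python
import re

# XXXYYYZ as a bare pattern: three identical bases, three identical bases,
# then any base. The lookahead lets overlapping hits be reported.
HEPTAMER = re.compile(r"(?=(([ACGT])\2\2([ACGT])\3\3[ACGT]))")

# Assumed cutoff for a "run of Gs"; 4 is an illustrative choice only.
G_RUN = re.compile(r"G{4,}")


def find_heptamers(seq):
    """Return (start, heptamer) pairs, using 0-based start positions."""
    return [(m.start(), m.group(1)) for m in HEPTAMER.finditer(seq)]


def find_g_runs(seq):
    """Return (start, end) spans of G runs, 0-based with end exclusive."""
    return [m.span() for m in G_RUN.finditer(seq)]


# The nine bases from the first post: GGG.GGA.AAC
window = "GGGGGAAAC"

print(find_heptamers(window))  # [(2, 'GGGAAAC')]
print(find_g_runs(window))     # [(0, 5)]
```

The heptamer covers positions 2 to 8 and the G run covers positions 0 to 4, so the two candidates share the bases at positions 2 to 4. A pattern search like this only flags candidates; it cannot say which one the ribosome actually uses.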