The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

Nucleotide Length On Files Different - How? How to fix?

| posted 27 Feb, 2020 23:06
We just attempted to merge the DNAM5 files from 4 different students. Apparently, two of the students somehow changed the DNA sequence length: their genomes are 2 bp longer than the original. They have completed the annotation of their sections of the genome, but because of the different lengths, we cannot merge the files.


1. Is there a way to align the two DNA sequence files to determine where the nucleotides have been added or deleted?

2. If not, what is the easiest solution?


The students can see where the genome length actually changed among their various saved file iterations. They have gone back to the file with the correct genome length and are copying notes from the version with the wrong length. They are also readjusting start codons for each gene.

Is there an easier way to do this? Obviously, had we noticed the 2 bp length change when it happened, we could have eliminated a lot of work by going one file iteration back instead of about 20…. :(
Edited 27 Feb, 2020 23:07
| posted 28 Feb, 2020 02:28
Here are my best answers:
Is there a way to align the two DNA sequence files to determine where the nucleotides have been added or deleted?
Yes! Use the "align two sequences" option in NCBI's blastn.

But my best guess is that there are 2 Ns at the end of the longer sequence. I would check there first.
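
If both genomes can be exported as plain FASTA text, a quick check along these lines can be scripted. This is only a minimal sketch: the function names and the toy sequences are made up for illustration, and it assumes the sequences are available as FASTA text rather than read directly from the DNAM5 files. Note that `first_difference` is a simple position-by-position comparison, not a true alignment, so it reports only where the two sequences first diverge.

```python
def read_fasta(text):
    """Return the sequence from FASTA-formatted text, skipping header lines."""
    return "".join(
        line.strip() for line in text.splitlines() if line and not line.startswith(">")
    ).upper()


def trailing_n_count(seq):
    """Count the N characters at the very end of the sequence."""
    return len(seq) - len(seq.rstrip("N"))


def first_difference(a, b):
    """Return the 1-based position where a and b first differ, or None if identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i + 1
    if len(a) != len(b):
        # One sequence is a prefix of the other; they diverge just past the shorter one.
        return min(len(a), len(b)) + 1
    return None


# Toy stand-ins for the original genome and a student's longer version.
original = read_fasta(">original\nACGTACGT\nACGT\n")
edited = read_fasta(">edited\nACGTACGT\nACGTNN\n")

print("length difference:", len(edited) - len(original))
print("trailing Ns in edited:", trailing_n_count(edited))
print("first differing position:", first_difference(original, edited))
```

If the trailing N count equals the length difference and the first differing position is one past the end of the original, the extra bases are just Ns tacked onto the end.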
| posted 28 Feb, 2020 15:51
OK "Smarty Pants". You were right. There are indeed two "Ns" at the end of the sequence.

However, we do not know how to remove them (or how they got inserted).

For completeness' sake on this thread, could you please describe the process for removing them?

(My student has deleted them, changed the feature list, posted, and saved. But the "Ns" return in the saved .DNAM5 file once it is reopened. So a functional protocol would be very helpful and might help others down the road. By the way, the student thinks you must be a wizard to have known about the Ns! :) )
Edited 28 Feb, 2020 15:52
| posted 01 Mar, 2020 13:01
Hi Greg and all,
Delete the 2 N's in the sequence pane of the file. Then click the "Raw" button (top right corner). I could try to tell you what "Raw" means, but I do not know the origin. What this button does is some form of reformatting of the sequence (to ensure there are no 'hidden' characters) that allows the change to be posted. (Also make sure that the little 'lock' icon in the bottom right corner is 'open' (unlocked).)