The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

Starterator Documentation

| posted 13 Feb, 2020 02:25
I am looking for a document that explains this:

In the PDF for Pham 96380 at

Gene: Magic8_Draft_1 Start: 1, Stop: 567, Start Num: 23
Candidate Starts for Magic8_Draft_1:
(Start: 23 @1 has 190 MA's), (Start: 25 @10 has 5 MA's), (31, 70), (32, 73), (68, 280), (72, 331), (81, 460), (98, 535), (102, 547), (106, 559),

This is all the documentation I can find on the Bioinformatics Guide:

And the material on the PDF seems out of date:
"Annotation and bioinformatic analysis of bacteriophage genomes - user guide to DNA Master", dated 2017. I can no longer locate that PDF online.
| posted 17 Feb, 2020 03:38
If I understand your question, you want to know what MA stands for! MA = manual annotation (an annotation that has been reviewed by a human being as opposed to an auto-annotation).
So for this gene, the auto-annotation picked the start at base 1, which is start number 23 in this pham. When genes in this pham were reviewed, start 23 (base 1 in this gene) was chosen in 190 manual annotations, while start 25 (base 10) was chosen in only 5. You may still want to look at coding potential (CP) data to check whether this choice is the best one. Hope this helps!
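
For anyone who wants to tabulate these numbers rather than read them off the PDF, here is a minimal Python sketch that parses a "Candidate Starts" line like the one quoted in the first post. It is not part of Starterator; the names `CandidateStart` and `parse_candidate_starts` are made up for illustration. It assumes the short form `(31, 70)` means start number 31 at base 70, and that such entries have no manual annotations recorded, which is an inference from the report layout rather than something the documentation states.

```python
import re
from typing import NamedTuple


class CandidateStart(NamedTuple):
    start_num: int           # start number within the pham
    position: int            # base coordinate of that start in this gene
    manual_annotations: int  # number of MAs that chose this start


# Long form, e.g. (Start: 23 @1 has 190 MA's)
LONG_FORM = re.compile(r"\(Start:\s*(\d+)\s*@\s*(\d+)\s+has\s+(\d+)\s+MA's\)")
# Short form, e.g. (31, 70); ASSUMED to mean (start number, base) with 0 MAs
SHORT_FORM = re.compile(r"\((\d+),\s*(\d+)\)")


def parse_candidate_starts(text: str) -> list[CandidateStart]:
    """Parse a 'Candidate Starts' line copied from a Starterator report."""
    # Collapse line breaks so a tuple wrapped across two lines still matches.
    text = " ".join(text.split())
    starts = [CandidateStart(int(n), int(p), int(m))
              for n, p, m in LONG_FORM.findall(text)]
    starts += [CandidateStart(int(n), int(p), 0)
               for n, p in SHORT_FORM.findall(text)]
    return sorted(starts, key=lambda s: s.start_num)


line = """(Start: 23 @1 has 190 MA's), (Start: 25 @10 has 5 MA's), (31, 70), (32, 73),
(68, 280), (72, 331), (81, 460), (98, 535), (102, 547), (106, 559),"""

starts = parse_candidate_starts(line)
most_called = max(starts, key=lambda s: s.manual_annotations)
print(most_called)
# CandidateStart(start_num=23, position=1, manual_annotations=190)
```

The most-called start is only a tally of previous manual annotations, so it is a starting point for review, not a replacement for checking coding potential.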
| posted 17 Feb, 2020 12:36
Thank you so much. Yes, that is what I wanted to know.

I don't see that explanation anywhere else on the Bioinformatics Guide. The old format is here:

Also, there is this instruction:
Results of the Starterator analysis. If the start is conserved throughout the members of the pham and yields the longest genes, write SS for “suggested start.” If it is not applicable, (orpham, only one possible start choice, etc) record NA. If it was run and was not informative (for example, the full length nucleotide sequences of the pham members are identical and all starts are present in all pham members, or pham members are too diverse and no real consensus is reached), write NI.

I think that instruction could add the point about MA where it mentions "conserved throughout". Would that be "conserved throughout MOST"? I am not sure about the "longest genes" part.
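
The recording rule quoted above (SS, NA, NI) can be written out as a small decision function. This is only a sketch of the quoted wording: the function name `starterator_note` and its boolean parameters are hypothetical, each flag stands for a judgment the annotator makes by eye, and the quoted instruction does not say what to record when Starterator is informative but the start is not both conserved and longest, so that case returns `None`.

```python
from typing import Optional


def starterator_note(applicable: bool, informative: bool,
                     conserved: bool, longest: bool) -> Optional[str]:
    """Return the notebook code suggested by the quoted instruction.

    applicable:  False for an orpham, a single possible start choice, etc.
    informative: False if the pham members are identical or too diverse
    conserved:   the start is conserved throughout the pham members
    longest:     the start yields the longest genes
    """
    if not applicable:
        return "NA"
    if not informative:
        return "NI"
    if conserved and longest:
        return "SS"
    # Not covered by the quoted instruction; left to the annotator.
    return None


print(starterator_note(applicable=True, informative=True,
                       conserved=True, longest=True))
# SS
```

Whether "conserved" should mean all members or most members is exactly the open question in this post, so the flag is left for the annotator to decide.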