The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

Interpreting Starterator data in BM cluster phages

| posted 11 Jul, 2022 20:36
Greetings, all! We are annotating Frankenweenie, a BM cluster phage, and have some questions about interpreting Starterator data. Here's an example: the start of gene 144 is called by both Glimmer and GeneMark at 89678, which is also the suggested start on Starterator. But the most annotated start on Starterator, used in all five of the other BM cluster phages, is 89666. That start also gives q1:s1 BLAST data for all five of the other BM cluster phages and, as an added bonus, produces a -4 bp overlap and a gene length more in line with theirs. So our questions are: why are we seeing so many "suggested starts" on Phamerator for this phage that are not the same as the "most annotated" starts? And why, when we do choose the most annotated start rather than the suggested start (SS), does it seem to be the better fit? Thanks in advance for helping me to understand Starterator better!
| posted 03 Aug, 2022 18:48
Greetings, all! I'd like to come back to my previous post regarding Starterator. We are encountering quite a few genes in Frankenweenie in which the suggested Starterator start is not the same as the most annotated start. I gave an example above; here's another. For gene 30, the original Glimmer call at bp 15143 has strength 3.13, and the gene is not called by GeneMark. On Starterator, the start at 15143 is the suggested start, but the start at 15146 is the most annotated, with five annotations. If we choose the start at 15146, which we believe is correct, we obtain q1:s1 data for all five BM cluster phages, with an alignment of 98.8%. In addition, the 15146 start produces a -1 bp overlap and gives a gene length of 249 bp. The genes of this pham (7493) in all five of the other BM cluster phages are also 249 bp. The Glimmer/Starterator start at 15143 produces a -4 bp overlap. So this gene has both -4 and -1 bp overlap options, and the advice we got from the forum is to go with the -1 bp overlap.
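For anyone who wants to check the arithmetic for gene 30, here is a minimal sketch. It assumes the gene is on the forward strand and uses the common convention gap = start - upstream stop - 1. The upstream stop (15146) and this gene's stop (15394) are not stated in the post; they are inferred from the reported overlaps and the 249 bp length, so treat them as assumptions.

```python
# Sketch of the gap/length arithmetic for gene 30 (forward strand assumed).
# UPSTREAM_STOP and GENE_STOP are inferred values, not numbers from the post.

def gap(start, upstream_stop):
    """Bases between the upstream stop and this start; negative means overlap."""
    return start - upstream_stop - 1

def gene_length(start, stop):
    """Length in bp of a forward-strand gene, both ends inclusive."""
    return stop - start + 1

UPSTREAM_STOP = 15146            # inferred: gives -4 at 15143 and -1 at 15146
GENE_STOP = 15146 + 249 - 1      # inferred from the 249 bp length at start 15146

for start in (15143, 15146):
    print(start, gap(start, UPSTREAM_STOP), gene_length(start, GENE_STOP))
# prints:
# 15143 -4 252
# 15146 -1 249
```

Under these assumptions, the 15143 start would make the gene 252 bp, one codon longer than the 249 bp seen in the other five BM cluster phages.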

So we believe that the most annotated start, 15146, is correct. But we don't understand why Starterator didn't call the most annotated start.

As always, thanks for helping me to understand Starterator better!
Best wishes,
Kathleen
| posted 07 Jun, 2023 16:53
Kathleen,

Did you ever get an answer to this question? We just got sequence back on a new BM and will be annotating it in the near future, so I am interested in what you found out.

If you haven't gotten an answer yet, it might be worth emailing Chris Shaffer for more information.

Lee
| posted 07 Jun, 2023 20:23
Lee,
The confusing part of Kathleen's post is the sentence that states "But we don't understand why Starterator didn't call the most annotated start." But she is here today so we are able to discuss.
The reason that Starterator doesn't make the right call is because it only reports what is in the files submitted to Phamerator. In this particular case, that means Frankenweenie's auto-annotated start is not what others identified as the best start to call. In fact, Kathleen and her students went to Starterator to justify making a change. That sentence in Starterator is to be coupled with the yellow line in the Frankenweenie graph and recognized as being in DRAFT form. What Kathleen and her students wanted was for Starterator to make the right call. Technically, it makes NO calls and just reports what was called (showing human calls in green and auto calls in yellow).
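To make the "it only reports" point concrete, here is a rough sketch of that kind of tally. This is not Starterator's actual code; the record layout, the placeholder phage names (BM_phage_1 and so on), and the start numbers 1 and 2 are all hypothetical and only mirror the situation described above: one draft auto-annotation and five manual annotations.

```python
from collections import Counter

# Hypothetical records: one per pham member. "start" is a placeholder label
# for an aligned candidate start; "source" is "manual" or "auto" (draft).
calls = [
    {"phage": "Frankenweenie", "start": 1, "source": "auto"},
    {"phage": "BM_phage_1", "start": 2, "source": "manual"},
    {"phage": "BM_phage_2", "start": 2, "source": "manual"},
    {"phage": "BM_phage_3", "start": 2, "source": "manual"},
    {"phage": "BM_phage_4", "start": 2, "source": "manual"},
    {"phage": "BM_phage_5", "start": 2, "source": "manual"},
]

def summarize(calls):
    """Tally manual calls and list draft calls; no start is predicted."""
    manual = Counter(c["start"] for c in calls if c["source"] == "manual")
    most_annotated = manual.most_common(1)[0][0] if manual else None
    drafts = {c["phage"]: c["start"] for c in calls if c["source"] == "auto"}
    return most_annotated, drafts

most_annotated, drafts = summarize(calls)
for phage, start in drafts.items():
    status = "matches" if start == most_annotated else "differs from"
    print(f"{phage}: draft start {start} {status} most annotated start {most_annotated}")
```

The draft call is only echoed back next to the tally of manual calls. Nothing in the tally changes the draft; that change is left to the annotator.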
 