The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

All posts created by debbie

| posted 25 Sep, 2019 18:40
If one looks at the immunity repressor of Cluster F1 phages such as Mycobacteriophage Strokeseat, the coding potential data are confusing. The Starterator data clearly show only one start that is found in all pham members.
In Strokeseat, that start is at position 34251 and the stop is at 33682, for a gene length of 570 bp.

This provides adequate (just barely) space for the promoters of this gene (CRO) and the next one, because they are transcribed in opposite directions. The start of this gene is at 34358 (stop is at 34651).
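For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes 1-based, inclusive coordinates (which is what the 570 bp figure above implies); the function names are made up for illustration and are not part of DNA Master, Phamerator, or Starterator.

```python
def gene_length(start: int, stop: int) -> int:
    """Length in bp from 1-based inclusive coordinates; works on either strand."""
    return abs(stop - start) + 1


def bases_between(coord_a: int, coord_b: int) -> int:
    """Number of bases lying strictly between two genome coordinates."""
    return abs(coord_b - coord_a) - 1


# Coordinates quoted in the post for Strokeseat.
repressor_start, repressor_stop = 34251, 33682  # start > stop: reverse strand
cro_start, cro_stop = 34358, 34651              # start < stop: forward strand

print(gene_length(repressor_start, repressor_stop))  # 570
print(gene_length(cro_start, cro_stop))              # 294
# Space between the two divergent starts, where both promoters must fit.
print(bases_between(repressor_start, cro_start))     # 106
```

With these starts, the two divergently transcribed genes are separated by 106 bp.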
Hope you find this helpful!

Unfortunately, as is the nature of Cluster F1s, not all F1s have these phams.
Edited 08 Aug, 2023 13:20
Posted in: Cluster F Annotation Tips > Immunity Repressor and CRO starts
| posted 31 Jul, 2019 20:07
Jordan (and anyone else who reads this terrific post),
I asked Chris Shaffer if I was correct in thinking that starts 1 and 2 in the Phamerator report 19534 were basically the same. He explained that they ARE the same. Check it out! Chris just showed me that I am smarter than a computer! lol!

Deb,
What an interesting case.

TL;DR: no, starts 1 and 2 are not really different, even though Starterator says they are.

Long version if interested:
This is a weird corner case where the Starterator program gives the literally correct answer, which is really "wrong". So no, starts 1 and 2 should not be considered different. Looking very closely at the diagram, you can see that all the 1 starts have a tiny white sliver on the right edge while all the 2 starts do not. Here is the actual multiple sequence alignment (MSA), where you can see the single A base insertion in some sequences and not others (the white sliver represents the gap). I highlighted the start codon in yellow (attached):

When you ask a computer whether those starts are "the same", the answer is no, so Starterator considers the top 3 sequences as start 2 and the bottom 4 as start 1. What we have run across is one of the fundamental problems of MSAs: multiple sequence aligners like Clustal just don't have any way to know whether they should align as -A or A-. This is why people "hand tweak" alignments when they want the best possible alignments (for publishing conserved domains, for example): it just isn't possible to code in the external evidence needed to decide between "A-" and "-A". There might be a way to fix this in Starterator; I'm not sure, and I will have to think on it.
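To make the "-A" versus "A-" problem concrete, here is a small sketch. The sequences are hypothetical toy strings (not the real pham alignment), the match count is a stand-in for a real aligner's scoring, and none of this is Starterator's or Clustal's actual code. It only shows that two gap placements in a run of A's are equally good by score while putting the start codon's A in different alignment columns.

```python
def column_of(aligned: str, base_index: int) -> int:
    """0-based alignment column holding base number base_index (0-based) of the ungapped row."""
    seen = -1
    for col, ch in enumerate(aligned):
        if ch != "-":
            seen += 1
            if seen == base_index:
                return col
    raise ValueError("base_index is past the end of the sequence")


def matches(row_a: str, row_b: str) -> int:
    """Count alignment columns where both rows carry the same base."""
    return sum(1 for x, y in zip(row_a, row_b) if x == y and x != "-")


# Toy rows: the reference has one extra A in the run just before the ATG start.
reference = "GCAAATGC"  # ATG begins at ungapped base 4
gap_left = "GC-AATGC"   # same short sequence, gap placed early in the A run
gap_right = "GCAA-TGC"  # same short sequence, gap placed late in the A run

# Both rows are the same sequence once the gap is removed.
assert gap_left.replace("-", "") == gap_right.replace("-", "") == "GCAATGC"

# Both placements score identically, so an aligner has no basis to prefer one.
print(matches(reference, gap_left), matches(reference, gap_right))  # 7 7

# The start codon's A lands in different columns depending on the placement.
print(column_of(reference, 4), column_of(gap_left, 3), column_of(gap_right, 3))  # 4 4 3
```

With the gap on the left, the short sequence's start shares a column with the reference start; with the gap on the right, it sits one column over. A program comparing columns would then number them as different starts even though the biology is the same.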

As for a recommendation, I typically consider starts that are clearly that close on a Starterator diagram as "the same", even if the computer gives them different numbers.

Chris

Edited 31 Jul, 2019 20:11
Posted in: Cluster DJ Annotation Tips > new minor tail?
| posted 31 Jul, 2019 11:52
Jordan,
I would call this a minor tail protein. I would also call the longest ORF (start 17699). There is too much coding potential upstream of the 4th start, and the difference between starts 1 and 2 is negligible.
debbie
Posted in: Cluster DJ Annotation Tips > new minor tail?
| posted 30 Jul, 2019 18:50
Jordan,
A good question; thanks for making it easy to check. I would call both membrane proteins.
debbie
Posted in: Cluster D Annotation Tips > Holin in D1
| posted 30 Jul, 2019 17:51
Jordan - I have a quick question (without looking): are there any other membrane proteins in the region? If not, then it is acceptable to call this a holin.
If there are other proteins adjacent to these 2 genes that could be the holins, then I would leave it at membrane protein.
If it is more confusing than this and you want me to take a longer look, let me know.
debbie
Posted in: Cluster D Annotation Tips > Holin in D1
| posted 13 Jul, 2019 14:01
Jordan,
Here is some more advice (and only advice!)
I would call these genes this way:
45007 - 45616 NKF
45477 - 45776 NKF
45773 - 46057 HTH

I wouldn't be surprised to learn that the first two genes are some sort of methyltransferases. This MO (two overlapping genes) can be found in some other methyltransferases. But there is not enough evidence to call them.
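As a quick check of how much the three calls above overlap, here is a short sketch. It assumes 1-based, inclusive coordinates; the `Gene` tuple and `overlap_bp` helper are hypothetical names for illustration only.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    start: int
    stop: int
    call: str


def overlap_bp(a: Gene, b: Gene) -> int:
    """Bases shared by two genes (1-based inclusive coordinates, either strand)."""
    left = max(min(a.start, a.stop), min(b.start, b.stop))
    right = min(max(a.start, a.stop), max(b.start, b.stop))
    return max(0, right - left + 1)


# The three calls suggested in the post.
genes = [
    Gene(45007, 45616, "NKF"),
    Gene(45477, 45776, "NKF"),
    Gene(45773, 46057, "HTH"),
]

# Compare each gene with the one that follows it.
for upstream, downstream in zip(genes, genes[1:]):
    print(upstream.start, downstream.start, overlap_bp(upstream, downstream))
# 45007 45477 140
# 45477 45773 4
```

The 140 bp overlap between the first two calls is the "two overlapping genes" pattern described above; the 4 bp overlap with the HTH gene is far smaller.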

My 2 cents. This is the first of its kind, so don't sweat it. And you have looked at this set of genes in the context of the entire genome, and I have not.

Good luck,
debbie
Posted in: Annotation > Mid-gene deletion causing frameshift and orphams
| posted 12 Jul, 2019 22:01
Hi Jordan,
I would call both genes, and I would call the second one to include all of the coding potential, even though it is a huge overlap. Then I would carefully look to see if the two pieces have functional domains that I could call separately. If you get stuck, ask again. Also, what does the sequence that causes this gene to be disrupted look like? You called it a deletion; is it a single base? We should probably ask for confirmation of the sequence.
Posted in: Annotation > Mid-gene deletion causing frameshift and orphams
| posted 09 Jul, 2019 19:58
I just annotated Undlulmathi (Zulu for giraffe!) and am only willing to annotate the "G" of the tail assembly chaperone. It is bigger than most "G" genes, so I am betting that is significant. I am not calling the next gene a TAC and there is NO obvious slippery sequence to call. I will modify the TAC calls of Cuke and FowlMouth soon (but not now).
debbie
7-9-19
Posted in: Cluster AC Annotation Tips > Tail Assembly Chaperone
| posted 25 Jun, 2019 15:13
Steve,
I am late to this game, but I have not had any trouble accessing Aragorn.
debbie
Posted in: tRNAs > Aragorn Issue
| posted 19 Jun, 2019 01:43
Alison,
We can do that! We will need to complete an MTA and then we can send it to you (from the NRRL collection).
Be sure that the legal contact and shipping address are correct on your institution page.
Thanks,
debbie
Posted in: Arthrobacter > A. globiformis aquisition