Welcome to the forums at seaphages.org. Please feel free to ask any questions related to the SEA-PHAGES program. Any logged-in user may post new topics and reply to existing topics. If you'd like to see a new forum created, please contact us using our form or email us at info@seaphages.org.
Recent Activity
All posts created by cdshaffer
Link to this post | posted 14 May, 2021 18:20 | |
---|---|
|
Just a follow-up. When I had two tandem start codons, I always picked the longer gene model (based on the "All other things being equal, a longer call is usually preferable" rule), but recent mass spec work on phage proteins suggests otherwise. I am quoting now from the online guide (this page on revising your annotations), with the somewhat obscure rule that came out of that mass spec work [note: I have added the underline for emphasis]: "Can the start site of the downstream gene be extended so that the gene covers more of the gap? Carefully consider all possible start sites for the downstream gene. If a longer one is available, compare it to the current start site to see if it is a similar or better choice. All other things being equal, a longer call is usually preferable, but do not extend genes just to fill a gap. The exception to this are genes with two start codons in tandem, in these cases all of our wet bench experiments support the second of the two codons as the correct start." |
Posted in: Choosing Start Sites → F1 gene needs help on start site
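The tandem-start exception quoted above can be written down as a small check. This is only a sketch: the set of start codons (ATG, GTG, TTG) and the function name `preferred_tandem_start` are assumptions made for illustration, not something taken from the guide, and the rule is applied only when the two start codons are directly adjacent.

```python
# Assumption: ATG, GTG and TTG are the start codons under consideration.
START_CODONS = {"ATG", "GTG", "TTG"}


def preferred_tandem_start(orf: str) -> int:
    """Return the 0-based offset of the preferred start codon within `orf`.

    `orf` is the coding-strand DNA beginning at the longest candidate start.
    If the very next codon is also a start codon (two starts in tandem),
    the second one is preferred and 3 is returned; otherwise 0 is returned.
    """
    seq = orf.upper()
    first, second = seq[0:3], seq[3:6]
    if first not in START_CODONS:
        raise ValueError("sequence does not begin with a start codon")
    return 3 if second in START_CODONS else 0


# Hypothetical sequences, for illustration only.
print(preferred_tandem_start("ATGGTGAAACCC"))  # tandem starts: pick the second
print(preferred_tandem_start("ATGAAAGTGCCC"))  # no tandem start: keep the first
```

The check covers only the exception. Choosing among non-adjacent candidate starts still needs the other lines of evidence described in the guide.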
Link to this post | posted 04 May, 2021 20:09 | |
---|---|
|
Wow, so cool. I had never heard of an SGNH domain. It appears to be a specific subtype of acyltransferase. I have not had time to take a suitably in-depth look at this, but here is the paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7448272/ That paper on the crystal structure is only 10 months old, so it is not surprising that it has not come up before. My brief survey suggests a good general annotation here might be "carbohydrate acetyltransferase", but more work would be needed to provide enough evidence to have the term added to the approved list. Since both domains in this protein appear to be related to acetyltransferases of two different types, I would think "acetyltransferase" is a good match from the options on the current approved terms list. |
Posted in: Functional Annotation → acyltransferase and SGNH domains
Link to this post | posted 20 Apr, 2021 15:32 | |
---|---|
|
I think Deb is right that you should check for alignments to domains. I can see quite a few HHpred matches that start in the middle of the subject but align to amino acid 1 or 2 when start 797 is selected. When I get situations like this, I have my students take the amino acid sequence of the longer form and do an HHpred search. Then they look at the results and ask: do those "extra" amino acids at the beginning (42 amino acids in this case) also align to the subject? If those amino acids do align, we take it as pretty good evidence that those first amino acids are in the protein and we pick the longer form; if the amino acids do not align, we pick the shorter form. |
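The "do the extra amino acids align?" question can be made concrete with a little arithmetic on the query coordinates of each hit. The sketch below assumes the query start and end of each alignment have been copied by hand from the HHpred results for the longer form. The function name and the example coordinates are hypothetical, and no HHpred API is involved.

```python
def extension_coverage(hits, extension_len):
    """Return the largest number of extension residues covered by one hit.

    hits: iterable of (query_start, query_end) pairs, 1-based and inclusive,
          measured on the LONGER form of the protein.
    extension_len: number of extra N-terminal residues in the longer form
          (42 in the case discussed above).
    """
    best = 0
    for query_start, query_end in hits:
        # Residues of this alignment that fall inside positions 1..extension_len.
        covered = min(query_end, extension_len) - query_start + 1
        best = max(best, covered)
    return best


# Hypothetical coordinates, for illustration only.
hits = [(3, 310), (45, 300)]
print(extension_coverage(hits, 42))  # first hit covers residues 3..42
```

A result near `extension_len` argues for the longer form, and a result of zero argues for the shorter one. Where to draw the line in between is a judgment call that this sketch does not make.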
Link to this post | posted 17 Apr, 2021 03:39 | |
---|---|
|
OK, just an update. In the most recent version of the database (403), vine_74 is back in pham 58426. It still appears as pham 57934 on phagesdb, so the phagesdb links are out of date and will not work, but the first link above will. At some point phagesdb will update and should then report that vine_74 is back in pham 58426. |
Posted in: Starterator → Pham not found in Starterator
Link to this post | posted 16 Apr, 2021 16:13 | |
---|---|
|
Yes, this is a database sync issue. The new database should appear by the end of today. In the meantime, the results for vine_74 are still available using the older number you mentioned: http://phages.wustl.edu/starterator/Pham58426Report.pdf The URL has the exact same pattern for all phams, so if you get a link that does not work and you see the pham number has changed, you can always manually change the URL back to the old number and see if that works. In this case, that 58426 number does work. Sometime later this link will stop working and the newer link will work instead: http://phages.wustl.edu/starterator/Pham57943Report.pdf |
Posted in: Starterator → Pham not found in Starterator
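Because the report URL follows the same pattern for every pham, the manual "swap the number" workaround can be sketched in a few lines. The URL template is taken from the two links in the post above. The function names are made up for illustration, and nothing here checks whether a report actually exists on the server.

```python
# URL pattern copied from the Starterator links in the post above.
REPORT_URL = "http://phages.wustl.edu/starterator/Pham{pham}Report.pdf"


def starterator_report_url(pham: int) -> str:
    """Build the Starterator report URL for a given pham number."""
    return REPORT_URL.format(pham=pham)


def candidate_report_urls(new_pham: int, old_pham: int) -> list:
    """Return the URLs to try, newest pham number first, then the old one."""
    return [starterator_report_url(new_pham), starterator_report_url(old_pham)]


for url in candidate_report_urls(57943, 58426):
    print(url)
```

During a sync gap only one of the two links is expected to resolve, so trying them in order mirrors the manual workaround.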
Link to this post | posted 28 Mar, 2021 20:06 | |
---|---|
|
Just a heads-up. Christian in the Hatfull lab has been working on optimizing the parameters for the clustering of phage proteins into phams. The most recent version of the database (ver 400) shows a much larger than average shift in both the number and makeup of phams. We don't know how these changes will affect Starterator analysis. It may help overall, in that more genes will be grouped, resulting in fewer genes ending up as orphams with no Starterator report. It may also not help, in that the added genes may be so divergent that they provide little evidence to interpret within the reports. All users should be on the lookout for changes that affect the usefulness of the Starterator reports. If anything appears "off" or "confusing" in the Starterator results, let us know. If things seem to be working better for you, let us know that too. You can use this forum or send me an email. |
Posted in: Starterator → phameration tweeks and effects on starterator
Link to this post | posted 19 Mar, 2021 16:43 | |
---|---|
|
I am still not convinced it is not one amino acid back (i.e. the slip is D/P instead of K/P). Supporting the former is base conservation; supporting the latter is the "observed pattern" for many slippery sequences. I know of no evidence to tell me which is more informative in this situation. I will certainly say that either annotation has enough support to qualify as "less worse" than going with the up-to-now policy of "annotate T as a separate gene and pick the longest ORF". So, we have several BK1 phages and will annotate them using the CCCAAAT pattern accordingly. |
Posted in: Frameshifts and Introns → No frameshift in cluster BK1?
Link to this post | posted 18 Mar, 2021 16:49 | |
---|---|
|
1st: Yup, stop codons are a no-go as far as I am concerned. I was just looking at conservation in the MSA, which is why I mentioned backing up; but you are correct, I would not think it a good gene model to add in "stop codon read-through" (I know these exist in eukaryotes; do they even exist in prokaryotes?). 2nd: I am fine if Joyce or anyone else wants to include this data in a poster. It's kind of a pain to create all those DNA sequences if you don't have Starterator running in a VM, so I am happy to send anyone the sequences or the alignment for any pham; just send me an email. As I said above, I think the only evidence that could be relatively easily collected that would help me make up my mind is to get a sense of how often the slippery sequence changes in other phams. If we NEVER see it change in other phams, then that would make me pause here on the side of caution and stick with the "least worst" model. On the other hand, if we do see it happening in other phams, then I could see calling it here too. So, just like in all my wet bench experiments: if you are not sure of your conclusions, run another experiment. |
Posted in: Frameshifts and Introns → No frameshift in cluster BK1?
Link to this post | posted 17 Mar, 2021 22:38 | |
---|---|
|
What an interesting and cool question! Here is an update with some more evidence: I checked 5 BK1 phages by hand and all have that CCCAAAT sequence. I then realized we should just look at all the sequences in the pham. So I looked at the multiple sequence alignment for pham 5495, which includes the G gene for BE and BK phages. The CCCAAAT is found in all the BK1 G genes (they all have gene numbers in the 30's), but that sequence is not found in any of the BE phages (genes in the 50's-60's). So if this is the slippery sequence, you have to argue that it changed to CCCGGAA and yet is still slippery -or- that the location of the slip has moved since the BE and BK genes diverged. This is fruitful ground for reasonable, well-trained annotators to disagree, since it is all based on individual estimations of the likelihoods of certain events occurring over evolution. Do we have any evidence of the frequency of slippery sequence turnover in the mycobacteriophages? That is, a much more comprehensive set might be informative. Alternatively, if you back up a few bases there is a sequence which is conserved for 7 of 8 residues across all phage sequences, and the one degenerate position is always a pyrimidine, i.e. AA(C/T)GACCC. This may not fit any pattern seen among the bench-validated slippery sequences, but the sample size there is low enough that I am not sure how much confidence we should put in those observed patterns.
|
Posted in: Frameshifts and Introns → No frameshift in cluster BK1?
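The conserved 8-mer with one pyrimidine-degenerate position, AA(C/T)GACCC, translates directly into a regular expression, which makes it easy to check a set of gene sequences for it. The sketch below uses Python's standard `re` module. The function name and the test sequences are made up for illustration and are not real phage sequences.

```python
import re

# AA(C/T)GACCC from the post above; the lookahead lets overlapping matches be found.
CONSERVED_MOTIF = re.compile(r"(?=(AA[CT]GACCC))")


def motif_positions(seq: str) -> list:
    """Return (0-based position, matched 8-mer) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in CONSERVED_MOTIF.finditer(seq.upper())]


# Toy sequences, for illustration only.
print(motif_positions("GGAACGACCCAAAT"))
print(motif_positions("ttaatgacccgg"))
print(motif_positions("AAAGACCC"))  # purine at the degenerate position: no match
```

Running this over every gene in a pham would show whether the motif is present, and at what offset, in each member. It says nothing about whether the motif is actually slippery.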
Link to this post | posted 17 Mar, 2021 21:41 | |
---|---|
|
I would guess there is a slippery sequence here, but there is no way to find it, as it has yet to be discovered in the lab. As an annotator I would never intentionally "make up" a slippery sequence. So even though there is likely a slippery sequence somewhere in that genome, I have no way to find it. This means I know I cannot get the "right" answer. Then, if I cannot get the "right" answer, the best I can do is try to find the "least worst" answer. For me, the "least worst" is to annotate as much of the T region as I can as a gene. I know this is very likely wrong, but it is "less wrong" than the alternatives of either picking a slippery sequence with no support or having no gene annotated for that region at all. And yes, for many of the BK1's the "longest" form is really, really short, so we just annotate that tiny gene, give it the tail assembly chaperone function, and hope that anyone who runs across the annotation will know enough (or will go to the literature to find out) what is really going on here. But there is really no way to annotate these regions that works well for a naive reader. But I agree with Deb: if we can come up with a hypothesis that makes sense based on the published properties of slippery sequences, then that is better than the current solution. I will look for the XXXYYYZ pattern in our BK1's. P.S. For those unfamiliar with the G/T nomenclature, see this page: https://seaphagesbioinformatics.helpdocsonline.com/article-6 |
Posted in: Frameshifts and Introns → No frameshift in cluster BK1?
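Looking for the XXXYYYZ pattern by eye across several genomes is tedious, so a short scan can help. The sketch below reads XXXYYYZ literally: three identical bases, then three identical bases, then any base. It ignores reading frame and does not require X and Y to differ. Both are simplifying assumptions, and the function name and test sequences are made up for illustration.

```python
def slippery_heptamers(seq: str) -> list:
    """Return (0-based position, heptamer) for every XXXYYYZ match in `seq`.

    Assumptions: X and Y may be the same base, Z may be any base,
    and reading frame is not taken into account.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 6):
        h = seq[i:i + 7]
        if not set(h) <= set("ACGT"):
            continue  # skip windows containing ambiguous bases such as N
        if h[0] == h[1] == h[2] and h[3] == h[4] == h[5]:
            hits.append((i, h))
    return hits


# Toy sequences, for illustration only.
print(slippery_heptamers("GGCCCAAATGG"))  # contains CCCAAAT
print(slippery_heptamers("CCCGGAA"))      # does not fit XXXYYYZ
```

Any hit is only a candidate. Whether it sits in the right frame and position relative to the G and T genes still has to be checked by hand.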