The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

All posts created by ivanerill

| posted 29 Jul, 2019 16:44
Hi Welkin,

Just double-checking before submission. If we are naming the protein product, shouldn't it be "tellurium resistance D family protein", rather than "tellurium resistance protein D family"?

Ivan
Posted in: Request a new function on the SEA-PHAGES official list > TerD, tellurium resistance protein
| posted 07 Aug, 2018 16:42
OK. Microdon has not yet been submitted. I believe there are a couple more in the BH cluster that are still drafts. The rest were just released by GenBank this July… :)
Posted in: Functional Annotation > tail chaperones in cluster BH
| posted 06 Aug, 2018 18:38
Hi Veronique,

Thanks. Yes, our idea was not so much to make the functional annotation (we agree the evidence is relatively weak, although the HHpred hit for the non-called region is over the p=0.9 threshold and synteny points toward these being tail chaperones), but to make the longer gene call. In all BH phages the option to make the longer call on Gp23 homologs is available, and it results in a similar overlap with the previous gene (Gp22).

Ivan
Posted in: Functional Annotation > tail chaperones in cluster BH
| posted 06 Aug, 2018 14:59
Hi Lee,

I agree that neither call would be completely correct. The main point of our argument is that it is hard to explain why a non-called region would have a good HHpred hit with a DnaJ chaperone domain (that is, why would a non-coding region match a protein domain). Given that synteny suggests these should be tail chaperones, the HHpred hit is harder to disregard. Hence my take would be to make the longer call. As you point out, this is likely to be wrong, since it seems logical to assume that if these two are indeed tail chaperones, one would strongly suspect that they would also be using translational frameshifting. I, however, could not find conclusive evidence of this.

Regarding DnaJ, I am not aware of any formal link between them and TACs. I was able to find a reference that indicates that DnaJ is used by phages during tail assembly (http://www.jmb.or.kr/journal/download.php?Filedir=../submission/Journal/014/&num=764).
My overall take on this is that these are tail chaperones (given their syntenic arrangement, coding potential evidence for an overlap and the presence of a conserved chaperone domain in the region that would be left uncalled using GeneMark), but that they are divergent enough to not get proper hits in HHpred (beyond the DnaJ chaperone hit in the non-called region).

Ivan
Posted in: Functional Annotation > tail chaperones in cluster BH
| posted 04 Aug, 2018 04:42
We suspect that the canonical arrangement of tail chaperones preceding the tapemeasure gene may be conserved in cluster BH, and possibly contain a programmed frameshift. We would like feedback on whether to annotate overlapping genes and/or help in detecting putative frameshift points. Please see attachment for details.
Posted in: Functional Annotation > tail chaperones in cluster BH
| posted 12 Feb, 2018 16:06
Hi Joseph,

The typical and atypical models in GeneMark are described in:
https://www.ncbi.nlm.nih.gov/pubmed/9847079

Essentially, GeneMark uses a heuristic to figure out the coding potential in a genome. It detects the longest possible ORFs and assumes they are real ORFs. Based on these, it starts the iterative training of Hidden Markov Models to predict coding and non-coding regions. Once the HMM is set, GeneMark then performs clustering of predicted protein-coding genes. The most common clustering uses two clusters, assuming that the majority of genes follow a "coherent" codon usage pattern, and that there is a minority, likely the result of lateral gene transfer (LGT), that do not stick to those rules.

Final models are trained on these two clusters, resulting in the "Typical" model for "coherent" genes, and the "Atypical" model for the weirder ones.
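
To make the two-cluster idea a bit more concrete, here is a toy sketch in Python. It is not GeneMark's actual algorithm: it assumes that GC content at third codon positions (GC3) is a good enough stand-in for "codon usage pattern", splits the genes into two groups with a plain 1-D 2-means, and calls the larger group "typical". The function names and the example sequences are made up for illustration.

```python
def gc3(seq):
    """Fraction of G/C at third codon positions of an in-frame coding sequence."""
    thirds = seq.upper()[2::3]
    return sum(base in "GC" for base in thirds) / len(thirds)


def two_means(values, iterations=20):
    """Plain 1-D 2-means. Returns one label (0 or 1) per value."""
    low, high = min(values), max(values)  # initial cluster centres
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to the nearest centre.
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        groups = [[v for v, lab in zip(values, labels) if lab == k] for k in (0, 1)]
        if not groups[0] or not groups[1]:
            break  # everything fell into one cluster; nothing to refine
        low, high = (sum(g) / len(g) for g in groups)
    return labels


def label_genes(genes):
    """Map gene name -> 'typical' (majority cluster) or 'atypical' (minority)."""
    labels = two_means([gc3(seq) for seq in genes.values()])
    majority = 0 if labels.count(0) >= labels.count(1) else 1
    return {
        name: "typical" if lab == majority else "atypical"
        for name, lab in zip(genes, labels)
    }


# Hypothetical mini "genome": three genes with GC-rich third positions, one without.
example_genes = {
    "g1": "ATGGCCGCGTAA",
    "g2": "ATGGGCACCTGA",
    "g3": "ATGCTGGAGTAG",
    "g4": "ATGAAAGATTTTTAA",
}
example_labels = label_genes(example_genes)
```

In this made-up example the single gene with AT-rich third positions ends up alone in the minority cluster, which is the same logic by which a laterally transferred gene would land in the "Atypical" group. The real program trains full models on each cluster rather than using a single number per gene.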

Hope this helps! :)

Ivan
Posted in: DNA Master > Typical vs atypical GeneMarkS coding potential