
The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

Welcome to the forums at seaphages.org. Please feel free to ask any questions related to the SEA-PHAGES program. Any logged-in user may post new topics and reply to existing topics. If you'd like to see a new forum created, please contact us using our form or email us at info@seaphages.org.

All posts created by MSMC

| posted 15 Mar, 2019 10:06
Thanks Debbie!
Posted in: Gene or not a Gene > Different coding potential for same sequence?
| posted 15 Mar, 2019 01:25
Gotcha. So in this situation, since the sequences are highly conserved, we would put more weight on the comparative BLAST results than on the coding potential (i.e., include the gene rather than exclude it)?
Posted in: Gene or not a Gene > Different coding potential for same sequence?
| posted 15 Mar, 2019 01:20
I guess I'm a little confused then. Isn't the coding potential dependent on the sequence of that part of the genome? All of the regions in my previous post have the exact same amino acid sequence. Even if the sequences are exactly the same, would we expect different coding potentials?
Posted in: Gene or not a Gene > Different coding potential for same sequence?
| posted 14 Mar, 2019 15:24
Debbie Jacobs-Sera
Evan,
Have you read this?
https://seaphagesbioinformatics.helpdocsonline.com/article-4

Both Glimmer and GeneMark use a sample of the target genome, so if you run them 10 times you MAY get some differences. Those differences occur most commonly when predicting 'small' genes. It is the primary reason we hand curate the genomes.
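
As a rough illustration of why a model built from a sample of each target genome can score the same sequence differently, here is a minimal Python sketch. It is a toy codon-usage log-odds score, not the algorithm GeneMark or Glimmer actually uses, and the gene and training sequences are invented for the example.

```python
import math
from collections import Counter

def codon_counts(seq):
    """Count in-frame codons of a DNA string (frame 0 only)."""
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))

def train(coding_sample):
    """Codon probabilities from a genome-specific sample, with add-one smoothing."""
    counts = codon_counts(coding_sample)
    total = sum(counts.values()) + 64  # 64 possible codons
    return lambda codon: (counts[codon] + 1) / total

def score(seq, model):
    """Mean log-odds per codon of the model versus a uniform background (1/64)."""
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    return sum(math.log(model(c) * 64) for c in codons) / len(codons)

# Hypothetical data: one gene, two different "genome samples" used for training.
gene = "ATGGCCGCGGACGGCTGA"
genome_a_sample = "ATGGCCGCGGACGGCGCCGACGCGGGC"  # codon usage similar to the gene
genome_b_sample = "ATGAAATTTGATAAAGAATTTAAAGAT"  # very different codon usage

score_a = score(gene, train(genome_a_sample))
score_b = score(gene, train(genome_b_sample))
print(f"same gene, model A: {score_a:.3f}; model B: {score_b:.3f}")
```

The gene string is identical in both calls; only the training sample changes, and the score changes with it. Real gene finders use far richer models, so treat this only as a picture of the general idea.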

As for your question, please provide specifics: what gene, what genome, and any other pertinent data.

Thanks,
debbie

Thanks Debbie. Three examples of the gene are below with screenshots of Stanktossa (no coding potential), TinSulphur (some coding potential) and Superfresh (significant coding potential). All three align 100% (alignments shown are to Stanktossa). All three are on the reverse strand as well.

I figured there would be some differences between runs of GeneMark, but since so much weight is put on coding potential, I can see how a gene could be excluded because of those differences.

Evan
Edited 14 Mar, 2019 15:24
Posted in: Gene or not a Gene > Different coding potential for same sequence?
| posted 14 Mar, 2019 10:37
I am curious how coding potential (here trained on M. foliorum) is calculated. We have a called gene that shows no coding potential whatsoever. However, this gene is called in many other phages, and the AA sequence is 100% conserved. In the phages with identical AA sequences, the amount of coding potential varies (from very little to quite a bit). The RBS values don't differ dramatically among the phages.

Is this an issue where different runs will give different outputs (I believe coding potential uses an HMM)? Or is GeneMark taking into account non-coding regions adjacent to the gene that may vary?
Posted in: Gene or not a Gene > Different coding potential for same sequence?
| posted 10 Dec, 2018 00:11
DrCatalase
Trying to set up DNA Master on Wine for the upcoming Bioinformatics workshop and after I hit auto annotate I only get one sequence in the features tab of the Extracted from FASTA Library Shubert.FASTA window. Please help.

Have you done the update to access GeneMark and Glimmer (attached)?
Posted in: DNA Master > Running DNA Master on a Mac using Wine
| posted 03 Dec, 2018 13:46
Welkin Pope
Hi Evan,
So here's some questions:
How big are these types of enzymes in general? Does the phage gene encompass the entire enzyme or just the N-terminal domain? Is the N-terminal domain where the catalytic activity is, or is it just a binding region?

Best,
Welkin

The median size of these proteins is 512 a.a. (with a SD of 173 a.a.) according to OrthoDB (2287 genes in 1175 species). This putative phage gene consists of 309 amino acids. Therefore it appears that the phage gene aligns with the N-terminus of this protein but not the C-terminus. HHPRED indicates positions 27-181 of the phage gene align with positions 4-177 of the target gene.
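
For anyone who wants to check the arithmetic, here is a small Python sketch using only the numbers quoted above. The variable names are just for illustration, and the "SD units" figure is a rough guide rather than a formal z-score, since OrthoDB reports a median rather than a mean.

```python
# Values quoted in the post; variable names are illustrative only.
median_len = 512                   # OrthoDB median length (a.a.)
sd_len = 173                       # OrthoDB standard deviation (a.a.)
phage_len = 309                    # length of the putative phage gene product (a.a.)
query_start, query_end = 27, 181   # HHPRED-aligned positions in the phage gene

# Distance of the phage protein from the median, in SD units.
sd_units = (phage_len - median_len) / sd_len

# Aligned span, counting both end positions, as a fraction of the phage protein.
aligned_len = query_end - query_start + 1
aligned_fraction = aligned_len / phage_len

print(f"{sd_units:.2f} SD from the median")
print(f"{aligned_len} aligned residues, {aligned_fraction:.1%} of the phage protein")
```

So the phage protein is a little more than one SD shorter than the median, and the HHPRED alignment covers roughly half of it.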

Alpha-L-arabinofuranosidase B comprises two domains: a catalytic domain and an arabinose-binding domain (ABD). The catalytic domain appears to be in the N-terminus of the protein, while the ABD appears to be in the C-terminus. The alignment in this case is with the N-terminal catalytic domain region. I am currently looking for better resolution of how much of this domain is likely aligned in this instance.
Posted in: Functional Annotation > Proposed function alpha-L-arabinofuranosidase
| posted 03 Dec, 2018 03:22
This gene (stop site 27488 in Gordonia phage Bradissa) was a point of discussion at the December 1st Hackathon at Nyack College. Below is the information supporting this gene's function as alpha-L-arabinofuranosidase. Any thoughts greatly appreciated!

—————————

Gene 30 – Proposed function alpha-L-arabinofuranosidase.

The top HHPRED result is PF09206.10 ArabFuran-catal ; Alpha-L-arabinofuranosidase B, catalytic
- This has a 100% probability (4.9 e-35 E-value) with 49.8% coverage.

The CDD hit has 10.7% identity to pfam09206 (E-value 1.14 e-7). The following is the description for this pfam:
Alpha-L-arabinofuranosidase B, catalytic
Members of this family, which are present in fungal alpha-L-arabinofuranosidase B, adopt a beta-sandwich fold similar to that of Concanavalin A-like lectins/glucanase. The beta-sandwich fold consists of two anti-parallel beta-sheets with seven and six strands, respectively. In addition, there are four helices outside of the beta-strands. The beta-sandwich strands are closely packed and curved with a jelly roll topology, creating a small catalytic pocket. The domain catalyzes the hydrolysis of alpha-1,2-, alpha-1,3- and alpha-1,5-L-arabinofuranosidic bonds in L-arabinose-containing hemicelluloses such as arabinoxylan and L-arabinan.

When the amino acid sequence was run on Phyre2, the following results were provided:
Fold: Concanavalin A-like lectins/glucanases
Superfamily: Concanavalin A-like lectins/glucanases
Family: Alpha-L-arabinofuranosidase B, N-terminal domain
103 residues (33% of your sequence) have been modelled with 100.0% confidence by the single highest scoring template

Three annotated phages have a member of this pham (14992). Lucky10 assigns a function of alpha-L-arabinofuranosidase. The other two phages (Pollux and Hedwig) with this pham do not assign a function. HHPRED analysis of this pham in these two phages gives identical results to those seen with Bradissa (>99% probability of Alpha-L-arabinofuranosidase).

The role of alpha-L-arabinofuranosidase is hydrolysis of terminal non-reducing alpha-L-arabinofuranoside residues in alpha-L-arabinosides. It has been annotated in Microbacterium testaceum (https://www.uniprot.org/uniprot/E8NDZ9).
Posted in: Functional Annotation > Proposed function alpha-L-arabinofuranosidase
| posted 14 Sep, 2018 13:33
Quick question about setting up plaque assays from an enriched sample for the first round of purification: If you have a positive spot test, and you are now going back to the enriched sample for dilutions, do you use the original filtered sample, or pull more from the enriched culture and filter a new sample? Both the original filtered sample and the enriched culture have been in the fridge for a week.
Posted in: Phage Discovery/Isolation > Plaque assay from enriched - refilter?
| posted 10 Sep, 2018 00:09
Thanks Vic!
Posted in: Phage Discovery/Isolation > DNA Isolation from M. foliorum - any updates?