
The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

Welcome to the forums at seaphages.org. Please feel free to ask any questions related to the SEA-PHAGES program. Any logged-in user may post new topics and reply to existing topics. If you'd like to see a new forum created, please contact us using our form or email us at info@seaphages.org.

All posts created by fbaliraine

| posted 03 Mar, 2024 21:12
Thank you, Debbie, for helping put this to rest!
Fred
Posted in: Cluster F Annotation Tips > 4 bp overlaps
| posted 02 Mar, 2024 03:30
We need further clarification about these small genes in subcluster F1:
“The right arms of cluster F genes are characterized by TONS of tiny genes. These genes are sometimes so small that the gene prediction programs have a really difficult time predicting them. You can usually identify them easily though, as their start and stop codons will overlap with the flanking genes in a 4bp overlap.”

Does this principle of calling them in F1 even without coding potential apply only to those “with 4bp overlaps at both ends,” or does it also apply to those with a 4 bp overlap at only one end? For example, FreddyB position 33967-34041 bp (reverse), in the -2 frame, has a potential small gene (75 bp) with a 4 bp ATGA overlap with the start of the upstream gene, but there is a gap between it and the downstream gene. This potential gene has a strong RBS score (3.152), though its start is not part of an operon (only its stop is; see attached). Its amino acid sequence is MRMSRSDNASARKLATNTERPTTRZ. It currently has no significant BLAST hits.
Thanks!
Fred
Posted in: Cluster F Annotation Tips > 4 bp overlaps
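The length arithmetic in the 02 Mar post above can be double-checked with a short sketch. It assumes inclusive 1-based coordinates and that the trailing Z in the posted peptide marks the stop codon; the function name `orf_summary` is made up for illustration and is not part of any SEA-PHAGES tool.

```python
def orf_summary(start, stop, peptide):
    """Summarise a candidate ORF from inclusive 1-based coordinates.

    Assumption: a trailing 'Z' in the peptide string marks the stop codon.
    """
    length_bp = abs(stop - start) + 1
    codons, remainder = divmod(length_bp, 3)
    residues = len(peptide.rstrip("Z"))
    return {
        "length_bp": length_bp,
        "codons": codons,
        "whole_codons": remainder == 0,
        "residues": residues,
        # Residues plus one stop codon should account for every codon.
        "consistent": remainder == 0 and residues + 1 == codons,
    }


# Coordinates and peptide as quoted in the post (FreddyB, reverse strand).
summary = orf_summary(33967, 34041, "MRMSRSDNASARKLATNTERPTTRZ")
print(summary)
```

A check like this only confirms that the coordinates and the translation agree with each other; it says nothing about whether the ORF is a real gene.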
| posted 16 Feb, 2024 22:09
Hi Debbie,
Thank you for your insights. I still do not have a strong justification for keeping this gene, so I will go ahead and delete it. There is no coding potential, the region is left unfilled in D29 and L5, and I note that although the nucleotide sequence is conserved across many genomes, only the few phages mentioned above call it. If a researcher investigates the “gene” that has been called in those phages, it would be interesting to see whether they obtain a gene product and possibly a function, in which case we would have stronger support for calling a gene in this region in future annotations.
Thanks again!
Fred
Posted in: Gene or not a Gene > Is the current subcluster A4 pham 4236 (as of 2/14/2024) really a gene?
| posted 15 Feb, 2024 05:05
We have not seen any coding potential whatsoever for subcluster A4 pham 4236 in GeneMark_S, smeg or TB for the very last Glimmer-autocalled reverse “gene” in Alberto7 (51374 to 51204 bp) and Lunsford (51030 to 50860 bp). We note that this “gene” (MTSLTASCSVVLDNSTADIANRFGNPHASGFETAQRTGTLRTSRINPALDIAQSISZ) comes after a very large gap (710 bp, with no coding potential in the gap) from the preceding gene. We do not see any significant HHPred hits (probabilities range from 20.07% to 63.7%).

We do, however, see significant BLASTp hits in phagesDB with 17 previously annotated genes, with 100% homology (see attached). To look for some possible justification for this pham, in case at least one member of the pham has significant coding potential, we decided to check for coding potential in all the draft and non-draft phage “genes” that match the above “genes” in phagesDB. We have found no evidence of coding potential for this gene in any of the 21 current pham members:
Tinybot_89 [51068 to 50898 (Reverse)]; Taquarus_89 [51040 to 50870 (Reverse)]; Scamp_90 [51039 to 50869 (Reverse)]; Ohfah_91 [51078 to 50908 (Reverse)]; NotAPhaseMom_89 [51042 to 50872 (Reverse)]; Maxo_92 [51044 to 50874 (Reverse)]; Maverick_91 [51038 to 50868 (Reverse)]; Lunsford_draft_86 [51030 to 50860 (Reverse)]; LochMonster_91 [51274 to 51104 (Reverse)]; Lemur_89 [51034 to 50864 (Reverse)]; Koreni_90 [50721 to 50551 (Reverse)]; Jaykayelowell_86 [51031 to 50861 (Reverse)]; Houdini22_89 [51042 to 50872 (Reverse)]; Deano_86 [51045 to 50875 (Reverse)]; ChampagnePapi_92 [51477 to 51307 (Reverse)]; Cerulean_92 [51475 to 51305 (Reverse)]; CentreCat_90 [51078 to 50908 (Reverse)]; Bombshell_91 [51038 to 50868 (Reverse)]; Alberto7_Draft_85 [51374 to 51204 (Reverse)]; LappelDuVide_Draft_85 [51037 to 50867 (Reverse)].

It now appears that the only reason this gene was called is that Glimmer autocalled it. I am strongly of the view that this gene should be deleted, but because it has previously been called in a number of phages, I do not want to kill it without giving it a chance to appeal. Is there any strong defense in favor of keeping it?
Thanks!

Fred
Edited 15 Feb, 2024 05:06
Posted in: Gene or not a Gene > Is the current subcluster A4 pham 4236 (as of 2/14/2024) really a gene?
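The coordinate list in the 15 Feb post above lends itself to a quick consistency check. The sketch below parses a few of the entries exactly as they were posted and confirms that each spans the same number of bases. The regular expression and the helper name `span_lengths` are illustrative assumptions about the text format, not a phagesDB export format.

```python
import re

# A few entries copied from the post: "Name_N [start to stop (Reverse)]".
members = (
    "Tinybot_89 [51068 to 50898 (Reverse)]; "
    "Koreni_90 [50721 to 50551 (Reverse)]; "
    "Alberto7_Draft_85 [51374 to 51204 (Reverse)]; "
    "Lunsford_draft_86 [51030 to 50860 (Reverse)]"
)

ENTRY = re.compile(r"(\S+) \[(\d+) to (\d+) \((Forward|Reverse)\)\]")


def span_lengths(text):
    """Return {gene_name: length_in_bp} for every entry found in text."""
    return {
        name: abs(int(a) - int(b)) + 1  # inclusive coordinates
        for name, a, b, _strand in ENTRY.findall(text)
    }


lengths = span_lengths(members)
print(lengths)
print(set(lengths.values()))
```

For these entries every span is 171 bp, i.e. 57 codons, which matches the 56-residue peptide plus stop quoted in the post.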
| posted 30 Jan, 2024 19:30
Thank you, Lee.
Indeed, we pointed out the differences to our students and instructed them on what to do, though some still get confused by the mismatches. Since the forward frames in DNA Master perfectly match the forward frames in the GeneMark coding potential outputs, we were just hoping that there was some way to tweak the GeneMark software so that the reverse frames match those in DNA Master as well. For now, we will just keep pointing out this difference between the DNA Master frames and the GeneMark outputs with regard to the reverse frames.
Fred
Edited 30 Jan, 2024 19:31
Posted in: Annotation > Glitch in phagesDB GeneMark? Reverse frames being flipped!
| posted 26 Jan, 2024 02:19
As of Jan 25, 2024, the forward frames are OK, but we have noticed a mismatch between the reverse frames in DNA Master and those in the GeneMark outputs. The coding potential (CP) for DNA Master frame -1 is in frame -2, the plots/data for frame -2 are in the -1 frame, and the plots for frame -3 are in the -2 frame of the GeneMark_S, GeneMark_smeg, and GeneMark_TB plots obtained from phagesDB. Is there a glitch, or are we missing something? See attached. Thanks! Fred
Edited 26 Jan, 2024 02:20
Posted in: Annotation > Glitch in phagesDB GeneMark? Reverse frames being flipped!
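A label mismatch that affects only the reverse frames can arise whenever two programs number reverse frames from different reference points. The sketch below does not describe how DNA Master or GeneMark actually assign frames. Both conventions and all function names are hypothetical, chosen only to show that such a relabeling depends on the genome length modulo 3.

```python
def frame_from_reverse_end(start, genome_length):
    """Hypothetical convention A: number reverse frames from position 1 of
    the reverse complement (the 3' end of the forward strand)."""
    return -(((genome_length - start) % 3) + 1)


def frame_from_forward_coords(start):
    """Hypothetical convention B: number reverse frames from the
    forward-strand coordinate of the first base of the start codon."""
    return -(((start - 1) % 3) + 1)


def frame_mapping(genome_length):
    """Map convention-A frame labels to convention-B labels for one genome."""
    mapping = {}
    for start in range(1, genome_length + 1):
        a = frame_from_reverse_end(start, genome_length)
        b = frame_from_forward_coords(start)
        mapping[a] = b
    return mapping


# Toy genome lengths covering each remainder modulo 3.
for length in (6, 7, 8):
    print(length % 3, frame_mapping(length))
```

Under these two assumed conventions the forward frames are unaffected, while the reverse labels are permuted differently for each remainder of the genome length modulo 3. This is consistent with the workaround of pointing out the mapping to students genome by genome.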
| posted 21 Jan, 2024 07:58
Thank you, Debbie!
Case closed!
Fred
Posted in: Gene or not a Gene > A very short, 15 amino acid (45 nucleotides) long gene: FreddyB (44340-44384 bp)?
| posted 20 Jan, 2024 08:29
I don’t like the fact that I see no coding potential at FreddyB (subcluster F1) position 44340-44384 bp (MNTYRIPNPVEATQZ), or the fact that it would be the shortest gene I have ever seen. However, this potential gene would be part of an operon, with 4 bp (ATGA) overlaps with both the upstream and downstream genes. According to Guiding Principle #12a, the ribosome prefers a 1 bp or 4 bp overlap, irrespective of RBS score. I wonder how a ribosome would skip over these two 4 bp ATGA overlaps to exclude this potential short gene that would form part of an operon. BLASTp shows more than 20 hits, with q2:S108, 100% identity, e-value = 0.43. Thoughts? See attached:
Posted in: Gene or not a Gene > A very short, 15 amino acid (45 nucleotides) long gene: FreddyB (44340-44384 bp)?
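The 4 bp ATGA overlap discussed in the post above can be expressed as a coordinate check plus a sequence check. This is a minimal sketch for two forward-strand genes with inclusive 1-based coordinates. The function names and the toy sequence are illustrative assumptions, not FreddyB data.

```python
def overlap_bp(upstream_stop, downstream_start):
    """Number of bases shared by two adjacent forward-strand genes.

    upstream_stop is the last base of the upstream gene's stop codon;
    downstream_start is the first base of the downstream start codon.
    A result of 0 or less means the genes do not overlap.
    """
    return upstream_stop - downstream_start + 1


def is_atga_junction(genome, downstream_start):
    """True if the 4 bases beginning at downstream_start read ATGA, so the
    start codon ATG and the stop codon TGA are offset by a single base."""
    i = downstream_start - 1  # convert 1-based coordinate to 0-based index
    return genome[i:i + 4].upper() == "ATGA"


# Toy sequence (not phage data): start ATG at 3-5, stop TGA at 4-6.
toy = "GGATGAGG"
print(overlap_bp(upstream_stop=6, downstream_start=3))
print(is_atga_junction(toy, 3))
```

For reverse-strand genes the same test would be applied to the reverse complement, or the coordinates mirrored, before comparing.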
| posted 21 Dec, 2023 04:52
Thank you!
In fact, a look at the six-frame translation shows 3 stop codons between 37305-37388 bp (see attached file). I will go ahead and NOT insert a gene for phage Sonah at position 37305-37388 bp.
Fred
Edited 20 Jan, 2024 04:45
Posted in: Gene or not a Gene > Gene or no gene at Subcluster P1 phage Sonah position 37305-37388 bp?
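The stop-codon observation in the post above is the kind of check that is easy to script. The sketch below lists in-frame stop codons that occur before the final codon of a candidate ORF. It assumes the input is the coding-strand sequence read from the proposed start; the toy sequence is invented and is not from Sonah.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def internal_stops(orf):
    """Return 0-based codon indices of stop codons before the final codon.

    orf is the coding-strand sequence of the candidate gene, read in frame
    from its proposed start codon. Trailing bases that do not fill a whole
    codon are ignored.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


# Toy sequence (invented): ATG AAA TAG CCC TGA
toy_orf = "ATGAAATAGCCCTGA"
print(internal_stops(toy_orf))
```

Any non-empty result means the proposed reading frame is interrupted, which supports the decision not to insert a gene there.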
| posted 18 Dec, 2023 06:42
Thank you Debbie!
Fred
Edited 18 Dec, 2023 18:57
Posted in: Cluster P Annotation Tips > Assignment of gp28 as holin