
The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.


Is the current subcluster A4 pham 4236 (as of 2/14/2024) really a gene?

| posted 15 Feb, 2024 05:05
We have not seen any coding potential whatsoever for subcluster A4 pham 4236 in GeneMark_S, smeg or TB for the very last Glimmer-autocalled reverse “gene” in Alberto7 (51374 to 51204 bp) and Lunsford (51030 to 50860 bp). We note that this “gene” (MTSLTASCSVVLDNSTADIANRFGNPHASGFETAQRTGTLRTSRINPALDIAQSISZ) comes after a very large gap from the preceding gene (710 bp, with no coding potential in the gap). We do not see any significant HHPred hits (probabilities range from 20.07% to 63.7%).

We do, however, see significant BLASTp hits in phagesDB with 17 previously annotated genes, with 100% homology (see attached). To look for a possible justification for this pham, in case at least one member of the pham has significant coding potential, we decided to check for coding potential in all the draft and non-draft phage “genes” that match the above “genes” in phagesDB. We have found no evidence of coding potential for this gene in any of the 21 current pham members:
Tinybot_89 [51068 to 50898 (Reverse)]; Taquarus_89 [51040 to 50870 (Reverse)]; Scamp_90 [51039 to 50869 (Reverse)]; Ohfah_91 [51078 to 50908 (Reverse)]; NotAPhaseMom_89 [51042 to 50872 (Reverse)]; Maxo_92 [51044 to 50874 (Reverse)]; Maverick_91 [51038 to 50868 (Reverse)]; Lunsford_draft_86 [51030 to 50860 (Reverse)]; LochMonster_91 [51274 to 51104 (Reverse)]; Lemur_89 [51034 to 50864 (Reverse)]; Koreni_90 [50721 to 50551 (Reverse)]; Jaykayelowell_86 [51031 to 50861 (Reverse)]; Houdini22_89 [51042 to 50872 (Reverse)]; Deano_86 [51045 to 50875 (Reverse)]; ChampagnePapi_92 [51477 to 51307 (Reverse)]; Cerulean_92 [51475 to 51305 (Reverse)]; CentreCat_90 [51078 to 50908 (Reverse)]; Bombshell_91 [51038 to 50868 (Reverse)]; Alberto7_Draft_85 [51374 to 51204 (Reverse)]; LappelDuVide_Draft_85 [51037 to 50867 (Reverse)].
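As a quick sanity check on the coordinates listed above, here is a minimal Python sketch that parses entries in that format and reports each ORF's length. It uses only a subset of the entries, copied verbatim from the list; the function name `orf_lengths`, the regular expression, and the handling of a "Forward" strand label are assumptions made for illustration and are not part of any SEA-PHAGES tool.

```python
import re

# A subset of the entries above, copied verbatim (shortened for brevity).
PHAM_MEMBERS = """
Tinybot_89 [51068 to 50898 (Reverse)]; Koreni_90 [50721 to 50551 (Reverse)];
Lunsford_draft_86 [51030 to 50860 (Reverse)]; Alberto7_Draft_85 [51374 to 51204 (Reverse)]
"""

# Assumed entry format: "<name> [<start> to <stop> (<strand>)]".
ENTRY = re.compile(r"(\S+) \[(\d+) to (\d+) \((Reverse|Forward)\)\]")


def orf_lengths(text):
    """Return {gene name: length in bp}, counting both end coordinates."""
    lengths = {}
    for name, start, stop, _strand in ENTRY.findall(text):
        lengths[name] = abs(int(start) - int(stop)) + 1
    return lengths


lengths = orf_lengths(PHAM_MEMBERS)
for name, bp in lengths.items():
    codons, remainder = divmod(bp, 3)
    print(f"{name}: {bp} bp, {codons} codons, remainder {remainder}")
```

Each of these entries comes out at 171 bp, a whole number of codons (57), so the calls are at least internally consistent with one another across genomes. That says nothing about coding potential, which is the actual question here.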

We are now of the view that the only reason this gene was called is that it was autocalled by Glimmer. I am strongly of the view that this gene should be deleted, but because it has previously been called in a number of phages, I do not want to kill it without giving it an opportunity to appeal. Any strong defense in favor of keeping it?
Thanks!

Fred
Edited 15 Feb, 2024 05:06
| posted 15 Feb, 2024 15:17
Hi Fred,
The right arms of some of our well-studied mycobacteriophages defy common annotation practice.
Here is how I look at it. We have published papers where the right arm has no coding regions because there are RNAs expressed. The paper about mycobacteriophage Giles has the best display of bench data that we have.
I am always cautious not to overcall the right end of a genome. I think these are some of the stats to consider:
This ORF is only found in Cluster A4s.
Phamerator has it as 17/134 genes annotated. Are they all drafts? How many of them have the nucleotide sequence? How conserved is it? Could it be something other than a protein coding sequence?
You mention no coding potential. Nor is it easy to see how this gene would get made.
Check out other cluster A phages like L5 or D29 and note they are not over-calling this region.

Also, what is the risk if you delete it? Do you think a researcher who wants to investigate lemur_85 isn't going to find this segment of DNA in Alberto7?

Most important is that you can support whichever call you make. Looking at all of the data, and not at what others have done, will get you to the best answer.

Good luck as you make this decision!
debbie
| posted 16 Feb, 2024 22:09
Hi Debbie,
Thank you for your insights. I still do not have a strong justification for keeping this gene, so I will go ahead and delete it. There is no coding potential, the region is left unfilled in D29 & L5, and I note that whereas the nucleotide sequence is conserved across many genomes, only the few above-mentioned phages call it. If a researcher wants to investigate the “gene” that has been called in those phages, it will be interesting to see whether they get a gene product and possibly a function, in which case we would have stronger support for calling a gene in this region in future annotations.
Thanks again!
Fred
| posted 17 Feb, 2024 13:54
Fred,
I think you have made a good choice!
Best,
debbie
 