Gene or no gene at Subcluster P1 phage Sonah position 37305-37388 bp?

| posted 18 Dec, 2023 06:32
I am asking this clarification question for subcluster F1 and P1 phages.
I notice that phage Sonah (subcluster P1) has a potential small gene at position 37305-37388 bp (sequence MTDFLGATIRIVAQIGFPTVNPIEVMRZ). When inserted and blasted, it gives significant hits (q1:s1, 100%) with just a few phages {Zilizebeth gp 64 (P1), HUHilltop gp 56 (P1) and Royals2015 gp 70 (F1)}, along with insignificant hits with Malithi gp 54 (P1), Camster gp 55 (P1) & Techage gp 59 (P1). I also note that all of the above genes are significantly longer than this potential gene, which would only correspond to the first few aa of those longer genes. The vast majority of previously annotated phages skipped calling this gene. There is no coding potential whatsoever in GeneMarkS, Smeg or TB among the above P1 phages that gave significant hits. The only exception is the subcluster F1 phage Royals2015, which also has no coding potential in Smeg or TB but has weak/insignificant CP (below 50%) in GeneMarkS (see attached). I note, though, that the RBS score for this start in phage Sonah is strong (Z = 2.098; spacer distance = 10; final score = -4.661, with a TTG start codon). Moreover, it has an 8 bp overlap with the upstream gene and would form part of an operon with a 1 bp overlap (TAATG) with the downstream gene. Because this potential gene has only weak CP, and only in GeneMarkS for F1 phage Royals2015 among all the genomes with significant hits that I have checked, and because it has not been called in many previously published genomes, I would like a second opinion about it going forward.
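For anyone who wants to sanity-check the size of this candidate, here is a minimal Python sketch of the coordinate arithmetic. The helper name `orf_stats` is made up for illustration, and it assumes the coordinates are 1-based and inclusive and that the last codon is the stop.

```python
def orf_stats(start, stop):
    """Summarize an ORF from 1-based, inclusive genome coordinates.

    Hypothetical helper for illustration; assumes the final codon is a stop.
    """
    length_bp = stop - start + 1              # inclusive coordinates
    codons, remainder = divmod(length_bp, 3)  # whole codons and leftover bases
    return {
        "length_bp": length_bp,
        "codons": codons,
        "in_frame": remainder == 0,           # length is a multiple of 3
        "residues": codons - 1,               # exclude the stop codon
    }


stats = orf_stats(37305, 37388)
print(stats)
```

Under those assumptions the candidate is 84 bp, that is 28 codons or 27 residues plus a stop, which is very short next to the 262 aa Malithi gp 54.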
Edited 20 Jan, 2024 05:37
| posted 19 Dec, 2023 19:13
Hi Fred,

This seems like it could go either way. The lack of coding potential and the fact that a decent number of other genomes have not called it are compelling reasons to be conservative. The operon overlaps are intriguing, and I usually weigh them pretty heavily when making decisions. It does seem like there is something interesting going on with respect to DNA metabolism in this region of the P1 cluster. Without some more definitive support, I'd leave it out.
| posted 19 Dec, 2023 19:43
It also looks like the first 18 aa correspond to the beginning of Malithi gp 54, a 262 aa sequence. So maybe some HGT in an ancestor of Sonah interrupted/broke this gene?
| posted 21 Dec, 2023 04:52
Thank you!
In fact, a look at the six-frame translation shows 3 stop codons between 37305 and 37388 bp (see attached file). I will go ahead and NOT insert a gene for phage Sonah at position 37305-37388 bp.
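The kind of six-frame stop-codon check described above can be sketched in plain Python as follows. This is only an illustration: the function names are made up, the input is a toy sequence (not Sonah's), and it assumes the standard stop codons TAA, TAG and TGA.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def stops_per_frame(seq):
    """Map each of the six frames to the 0-based positions of its stop codons.

    Positions on the "-" frames are relative to the reverse-complemented string.
    """
    seq = seq.upper()
    result = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            positions = [
                i for i in range(offset, len(s) - 2, 3) if s[i:i + 3] in STOPS
            ]
            result[f"{strand}{offset + 1}"] = positions
    return result


# Toy sequence for illustration only.
toy = "ATGAAATAGCCC"
frames = stops_per_frame(toy)
for frame, positions in frames.items():
    print(frame, positions)
```

To apply it to a real region, you would paste in the bases exported from your annotation software in place of the toy sequence.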
Edited 20 Jan, 2024 04:45
