When Glimmer and GeneMark call genes on different strands
Posted yesterday, 18:15
Our section of SEA-PHAGES is annotating PrairieDogTown in Cluster FO. We noticed that PrairieDogTown has 82 predicted features, while the rest of the phages in FO have 52-54 features. When we dug into this a bit, we realized that the discrepancy comes from the number of genes called on the reverse strand. PrairieDogTown has ~27 reverse genes, while JanetJ and Aoka have 3-5. The rest of the phages in the cluster are still draft genomes. Does anyone have any ideas about why we would be seeing such a large discrepancy in the number of reverse genes? Is there a technical reason why more reverse genes would be detected? Thanks! Note that there's no Cluster FO specific topic, so I'm posting in F.
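For anyone who wants to make the same comparison across several genomes, here is a minimal sketch of the forward/reverse tally described above. It assumes the gene calls have already been exported as (start, stop, strand) tuples; the function name and the example coordinates are hypothetical and are not real PrairieDogTown data.

```python
from collections import Counter


def strand_counts(features):
    """Count features per strand.

    Each feature is a (start, stop, strand) tuple, with strand given
    as "+" (forward) or "-" (reverse). This tuple layout is an assumption.
    """
    return Counter(strand for _start, _stop, strand in features)


# Hypothetical gene calls for illustration only.
example_features = [
    (100, 400, "+"),
    (450, 900, "+"),
    (880, 1300, "-"),
    (1350, 2000, "+"),
]

counts = strand_counts(example_features)
print(f"forward: {counts['+']}, reverse: {counts['-']}")
```

Running the same tally on each genome in the cluster gives a quick side-by-side view of which ones have an unusual number of reverse calls.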
Posted yesterday, 20:46
It is not common, but it is definitely seen that Glimmer and GeneMark find a big open reading frame on the opposite strand and then call most genes in that orientation. These genomes cannot hold that many genes. Use caution and check where the other supporting data aligns. Look for the ORFs that have known phage functions to build your case for where to call genes.
Posted yesterday, 20:50
Hi Lisa, In the regions where both forward and reverse genes are called, the previous annotators decided that the evidence for the forward genes in that region was more compelling. Please be sure to make your decisions based on the evidence that you have. Note that gene prediction models use 4-nucleotide sequences to find the most abundant patterns in the big open reading frames, then apply those patterns to the whole genome. The math will break down for smaller genes. My guess is that the regions where genes are called simultaneously on both strands are not very GC rich, so the patterns of A, T, C, and G are somewhat equivalent on both strands, and they are all getting called. Check whether one of the programs (Glimmer or GeneMark) calls one strand more than the other; you may see a bias due to the program's algorithm. debbie
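The checks suggested above can be sketched roughly as follows: tally strand calls per program, list the places where calls on opposite strands overlap, and compute the GC fraction of a region. This is only an illustration under stated assumptions. The (program, start, stop, strand) tuple layout, the function names, and the example calls are all hypothetical, and coordinates are assumed to satisfy start < stop.

```python
from collections import Counter


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def strand_bias(calls):
    """Tally strand calls per program.

    calls: iterable of (program, start, stop, strand) tuples.
    Returns a dict mapping program name to a Counter of strands.
    """
    tally = {}
    for program, _start, _stop, strand in calls:
        tally.setdefault(program, Counter())[strand] += 1
    return tally


def opposite_strand_overlaps(calls):
    """Return pairs of calls on opposite strands whose coordinates overlap."""
    ordered = sorted(calls, key=lambda c: c[1])  # sort by start coordinate
    pairs = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b[1] > a[2]:
                break  # sorted by start, so no later call can overlap a
            if a[3] != b[3]:
                pairs.append((a, b))
    return pairs


# Hypothetical calls for illustration only.
calls = [
    ("Glimmer", 100, 400, "+"),
    ("GeneMark", 100, 400, "+"),
    ("Glimmer", 350, 700, "-"),
    ("GeneMark", 800, 1200, "+"),
    ("Glimmer", 1100, 1500, "-"),
]

for program, strands in strand_bias(calls).items():
    print(program, dict(strands))

for a, b in opposite_strand_overlaps(calls):
    print("overlap:", a, b)
```

Passing the sequence of each overlapping region to `gc_fraction` would show whether those regions really are less GC rich than the rest of the genome. The result is only a pointer to where to look; the decision on which strand to call still rests on the supporting evidence.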