
The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

Welcome to the forums at seaphages.org. Please feel free to ask any questions related to the SEA-PHAGES program. Any logged-in user may post new topics and reply to existing topics. If you'd like to see a new forum created, please contact us using our form or email us at info@seaphages.org.

All posts created by debbie

| posted 26 Mar, 2019 08:40
Philippos,
Hi. No, I would not call auto-annotated gene 123, but instead call the longer gene you describe (it is a DNA methylase, so that should convince you). In addition, don't miss the next forward gene, because it too is a DNA methylase. See my attached frames window. Check out the differences in the auto-annotation….
Posted in: Frameshifts and Introns > Possible translational frameshift in Cluster J
| posted 26 Mar, 2019 08:27
Philippos,
Hi. I took a look at this area and I would not call that or the next reverse gene (see Phamerator). Instead I would add a small ORF in the forward direction between 103 and 105. If you look at your auto-annotation, my auto-annotation, and Phamerator, I think there are differences in this region across all 3, which basically means there is poor coding potential all around.
Note: I see no strong coding potential in the reverse frames. The red dotted lines on a GeneMark graph show an order of 2 (patterns of 2 nucleotides at a time), which most of the time amounts to noise. So if you look at your example, the black line (order of 4) shows poor coding potential. Make sense?
debbie
Edited 26 Mar, 2019 08:28
Posted in: Gene or not a Gene > Forward or reverse gene?
| posted 21 Mar, 2019 11:53
Sarah,
After preliminary work with host range experiments, I am finding that the best way forward is to find phages on the hosts that you would like to test. Until you are familiar with a host, testing additional hosts with known phages can be problematic. By that I mean, if you don't have a phage that infects each host, how do you know that a phage you are testing can infect that host (or that the infection is detectable)? If it does, great! But if it doesn't, is your test valid (without a positive control)? So my recommendation going forward is to add some new hosts, go phage hunting on each host, then begin the host range testing. Does that make sense?
Posted in: Host-Range Project > Basic Host Range Project Information
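The positive-control reasoning in the host range reply above can be written out as a small decision function. This is only a sketch of the logic; the function name, arguments, and result strings are hypothetical and not part of any SEA-PHAGES tool or protocol.

```python
def interpret_host_range_result(plaques_observed, host_has_known_phage):
    """Interpret one phage/host test (hypothetical helper, for illustration only).

    plaques_observed: True if the test phage produced detectable infection.
    host_has_known_phage: True if some phage is already known to infect this
        host under the same conditions (the positive control).
    """
    if plaques_observed:
        # A positive result stands on its own.
        return "infects host"
    if host_has_known_phage:
        # The assay is known to detect infection on this host,
        # so a negative result is meaningful.
        return "no infection detected"
    # No positive control: we cannot tell "does not infect"
    # apart from "infection not detectable on this host".
    return "inconclusive"
```

The third branch is the case the reply warns about: without a phage isolated on that host, a negative result cannot be interpreted.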
| posted 18 Mar, 2019 18:45
Veronique,
I agree, I see no function to call for the second gene in question, and I would call the first one the "capsid maturation protease and MuF-like fusion protein".
Posted in: Functional Annotation > capsid maturation protease
| posted 15 Mar, 2019 02:54
Erin,
I don't think it has to have 2 domains, it just has to have 1 convincing one. The problem is that one program doesn't have adequate sensitivity. So if there is only one, we want it verified by 2 sources. Without the option to 'verify' a transmembrane domain with a second program, let's not add it as a function.
Posted in: Functional Annotation > Membrane protein
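The rule in the membrane protein reply above (a single transmembrane domain should be verified by 2 sources) might be sketched as follows. The function name and the program labels are hypothetical. The reply does not say how to treat several domains predicted by only one program, so this sketch makes the conservative assumption that two agreeing programs are always required.

```python
def supports_membrane_protein_call(tm_domains_by_program):
    """Return True if at least two programs each predict >= 1 TM domain.

    tm_domains_by_program: dict mapping a prediction program's name to the
        number of transmembrane domains it predicted for the protein.
    Assumption (not stated in the post): two agreeing programs are required
    even when one program predicts several domains.
    """
    agreeing = [name for name, count in tm_domains_by_program.items() if count >= 1]
    return len(agreeing) >= 2
```

For example, `supports_membrane_protein_call({"program_a": 1, "program_b": 0})` returns `False`, which matches the advice not to add the function when a second program cannot verify the domain.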
| posted 15 Mar, 2019 02:41
Evan,
In this case you can't ignore the comparisons, nor can you ignore that there is a gap. If we could say that coding potential is always weighted in a particular way for every gene, we could write the program and stop doing this manually. BUT there are too many factors and they are not weighted the same in each instance (gene), so it requires human evaluation. At least for now….
Do not exclude this gene (until you have evidence to do so).
Posted in: Gene or not a Gene > Different coding potential for same sequence?
| posted 15 Mar, 2019 01:22
No. It is dependent on the random 'sample' that it used. Therefore, the patterns picked up work well for big genes, but 'fall apart' for small ones. Yes, the sequence of the gene is the same, but the random samples used to measure against are different. They are even different for the same genome, because they are based on a random sample of the genome.
Posted in: Gene or not a Gene > Different coding potential for same sequence?
| posted 15 Mar, 2019 01:12
Evan,
Keep in mind that the coding potential is based on a sample of the sequence from each genome, so why would you expect them to be the same? The use of comparative genomics is helpful and necessary here.
To prove the point, I just ran GeneMark on Stanktossa and got a different result than you did. Yes, coding potential is very important, but this shows you that it has to be thought of in context. The comparative genomics are impressive.
Edited 15 Mar, 2019 01:19
Posted in: Gene or not a Gene > Different coding potential for same sequence?
| posted 14 Mar, 2019 14:36
Evan,
Have you read this?
https://seaphagesbioinformatics.helpdocsonline.com/article-4

Both Glimmer and GeneMark use a sample of the target genome, so if you run them 10 times you MAY get some differences. Those differences occur most commonly when predicting 'small' genes. It is the primary reason we hand curate the genomes.

As for your question, please provide specifics: what gene, what genome, and any other pertinent data.

Thanks,
debbie
Posted in: Gene or not a Gene > Different coding potential for same sequence?
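As a rough illustration of the sampling point made in this thread, the toy below trains a simple trinucleotide frequency model on randomly chosen windows of a genome and scores the same candidate sequence several times. This is NOT the GeneMark or Glimmer algorithm; the model, the function names, and all parameters are assumptions chosen only to show that a score trained on a random sample can change from run to run while the candidate sequence stays the same.

```python
import itertools
import math
import random
from collections import Counter

# All 64 possible trinucleotides.
TRIMERS = ["".join(p) for p in itertools.product("ACGT", repeat=3)]


def train_on_sample(genome, n_windows, window, rng):
    """Estimate trinucleotide frequencies from randomly chosen genome windows."""
    counts = Counter()
    for _ in range(n_windows):
        start = rng.randrange(len(genome) - window + 1)
        chunk = genome[start:start + window]
        for i in range(len(chunk) - 2):
            counts[chunk[i:i + 3]] += 1
    total = sum(counts.values())
    # Add-one smoothing so every trinucleotide has a non-zero frequency.
    return {t: (counts[t] + 1) / (total + len(TRIMERS)) for t in TRIMERS}


def score(candidate, model):
    """Mean log-ratio of model frequency to a uniform background."""
    trimers = [candidate[i:i + 3] for i in range(len(candidate) - 2)]
    return sum(math.log(model[t] * len(TRIMERS)) for t in trimers) / len(trimers)


def score_runs(genome, candidate, runs=10, n_windows=20, window=200, seed=0):
    """Score one candidate against several independently sampled models."""
    rng = random.Random(seed)
    return [
        score(candidate, train_on_sample(genome, n_windows, window, rng))
        for _ in range(runs)
    ]


if __name__ == "__main__":
    # A made-up random "genome"; real genomes are not random sequence.
    genome = "".join(random.Random(42).choices("ACGT", k=5000))
    short_scores = score_runs(genome, genome[100:160])    # 60 bp candidate
    long_scores = score_runs(genome, genome[100:1600])    # 1500 bp candidate
    print("short candidate spread:", max(short_scores) - min(short_scores))
    print("long candidate spread:", max(long_scores) - min(long_scores))
```

Comparing the two printed spreads gives a feel for how much the score of a fixed sequence moves between samples. Whatever the toy shows, the practical advice from the thread stands: weigh coding potential together with comparative genomics, not on its own.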
| posted 14 Mar, 2019 01:39
Hari,
Basically yes.
However, when you add gene "11" in HarperAnne, add the gene only in that frame. If you add a gene that starts at 10 and ends at 11 it will be difficult to see where to slip it. (But if you know the coordinates of the slippage, it won't matter.) Then go back and change the start and add regions when you have it figured out.
debbie
Posted in: Annotation > Annotations of Microbacterium foliorum