The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

All posts created by lblumer

| posted 27 Mar, 2017 18:23
Debbie,

This helps, thanks. When a Pham has very few members, it is difficult to decide which members are outliers, as in Nairb ORF 32. When there are many members, a few ORFs with deletions should not change our sense of the conserved start.

I am still curious about your call on ORF 21. The 19663 start had a worse SD score (-4.569) than 19657 (-3.104), but they have the same Z values (I wrote "lower SD score" in my annotation notes, which may be confusing). 19663 is a TTG start while 19657 is an ATG start. Neither call changes the inclusion of coding potential, but 19657 has a 1bp overlap with the prior ORF while 19663 has a gap of 5bp from the prior ORF. What indicates that 19663 is the better call?
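
For readers checking the overlap and gap figures quoted above, here is a minimal sketch of the arithmetic. It assumes 1-based inclusive coordinates, that both genes are on the forward strand, and that the prior ORF's last base sits at 19657; that last value is inferred from the stated 1bp overlap and 5bp gap and is not given in the post. The function name is hypothetical.

```python
def gap_to_previous(start: int, prev_stop: int) -> int:
    """Bases separating the previous ORF's last base from this start.

    A positive result is a gap of that many bases; a negative result is
    an overlap of that many bases. Coordinates are 1-based and inclusive,
    and both genes are assumed to be on the forward strand.
    """
    return start - prev_stop - 1


# Assumption: last base of the prior ORF, inferred from the figures in the
# post (1bp overlap at 19657, 5bp gap at 19663). Not stated directly.
PREV_STOP = 19657

for candidate in (19657, 19663):
    print(candidate, gap_to_previous(candidate, PREV_STOP))
```

Under that assumption the 19657 start gives -1 (a 1bp overlap) and the 19663 start gives 5 (a 5bp gap), which matches the numbers in the question.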
Posted in: Choosing Start Sites > Relative importance of criteria when annotating an uncommon phage
| posted 25 Mar, 2017 13:47
Chris,

I know you have more than enough things on your plate, but I have been thinking about how the Starterator Reports, the newest pdf pham reports that we are pulling from <http://phages.wustl.edu/starterator/PhamXXXXReport.pdf>, could be made more useful for calling the most conserved starts.

When a given pham member does not contain the most commonly annotated start in that pham, I then look at the bands for other pham members that also are missing the suggested start. The current printed report gives the percentage of calls for each of the other possible starts in annotations and autoannotations. At that point, I am not very interested in the start calls, but in the available starts.

If the sequential start list provided the percentage of all pham members that contain that possible start, I could quickly evaluate the most conserved start(s), regardless of what had been called in past annotations or in autoannotations. I would look for the first start that is found in the greatest percentage of pham members. Could those data be added to the Pham Reports?
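
The calculation requested here can be sketched roughly as follows. This is not how Starterator works internally; it is a minimal illustration that assumes the available starts for each pham member are already known as sets of start numbers, and that lower start numbers lie further upstream. The phage names, the data, and the function names are all hypothetical.

```python
from collections import Counter


def start_conservation(available_starts):
    """Return (start, percent of members containing it), ordered by start number.

    available_starts maps a pham member's name to the set of start numbers
    that are present (available) in that member, whether or not they were called.
    """
    n_members = len(available_starts)
    counts = Counter(
        start for starts in available_starts.values() for start in set(starts)
    )
    return [(start, 100.0 * counts[start] / n_members) for start in sorted(counts)]


def first_most_conserved(available_starts):
    """Return the lowest-numbered start found in the greatest share of members."""
    table = start_conservation(available_starts)
    best_pct = max(pct for _, pct in table)
    for start, pct in table:
        if pct == best_pct:
            return start


# Hypothetical data: five members, one of which lacks start 10.
example = {
    "phage_A": {10, 14, 15},
    "phage_B": {10, 14, 15},
    "phage_C": {10, 14, 15},
    "phage_D": {10, 14, 15},
    "phage_E": {14, 15},
}

for start, pct in start_conservation(example):
    print(f"start {start}: present in {pct:.0f}% of members")
print("first most conserved start:", first_most_conserved(example))
```

With this made-up data, start 10 is present in 80% of members and starts 14 and 15 in 100%, so the sketch picks start 14 as the first start found in the greatest percentage of members.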

Posted in: Starterator > Suggestions for Starterator Report Upgrades
| posted 21 Mar, 2017 13:49
New Functions found in our annotation of Nairb (Cluster T)

ORF 1 Start 89 Stop 325 FWD Phosphomannomutase Blastp NCBI, e=2e-10, Paenibacillus phage Vegas gp80

This protein is found in some Mycobacterium species but the function had not previously been identified.

ORF 48 Start 35954 Stop 36469 FWD Replisome organizer and helicase loader/inhibitor, HHPred, e=2e-08, p=98.74, Bacillus phage SPP1

Our DNAMaster file, with all notes, is attached.

Are these acceptable additions to the Master Function List?
Posted in: Request a new function on the SEA-PHAGES official list > phage replisome organizer found in Cluster T; add to Master List?
| posted 21 Mar, 2017 13:42
We made very careful use of the Starterator Pham reports from WUSTL in this annotation, and I think it really helped us make better start calls, but it has raised new questions about calling starts.

We are working on Nairb, a Cluster T phage, one of only five. Only two annotations (RonRayGun and Bernal13) have been previously published. In a number of cases, Starterator data are available only for Cluster T phages, or the most common Starterator start is not present in Cluster T phams. When the calls previously made in RonRayGun and Bernal13 do not agree, and there is no special reason to choose a start, such as the magic 4bp overlap, we have been making calling decisions based on the following criteria (in this order):
1. Start is conserved in all Pham members from Cluster T
2. Captures more coding potential
3. The longest possible ORF length called
4. The SD score is the best of the choices available
5. The start was called by Glimmer and/or GeneMark
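
One way to read "in this order" is as a strict tie-breaking sequence: a later criterion only matters when every earlier one is tied. Here is a minimal sketch of that reading. The class, field names, and candidate values are hypothetical, coding potential is reduced to a single fraction for illustration, and a less negative SD score is treated as better.

```python
from dataclasses import dataclass


@dataclass
class StartCandidate:
    position: int
    conserved_in_cluster: bool       # criterion 1
    coding_potential_covered: float  # criterion 2, fraction from 0 to 1 (assumed measure)
    orf_length: int                  # criterion 3, in bp
    sd_score: float                  # criterion 4, less negative is better
    called_by_predictor: bool        # criterion 5, Glimmer and/or GeneMark


def priority_key(c: StartCandidate):
    # Tuples compare element by element, so earlier criteria dominate later ones.
    return (
        c.conserved_in_cluster,
        c.coding_potential_covered,
        c.orf_length,
        c.sd_score,
        c.called_by_predictor,
    )


def rank_starts(candidates):
    """Return the candidates ordered from most to least preferred."""
    return sorted(candidates, key=priority_key, reverse=True)


# Hypothetical candidates: tied on criteria 1 and 2, so ORF length decides.
upstream = StartCandidate(100, True, 1.0, 300, -4.5, False)
downstream = StartCandidate(130, True, 1.0, 270, -3.1, True)

best = rank_starts([downstream, upstream])[0]
print("preferred start:", best.position)
```

In this made-up case the upstream start wins on length even though the downstream start has the better SD score and the predictor call, which is exactly the kind of outcome the strict ordering produces. A weighted score would behave differently, so the choice between the two readings matters.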

Our problem Phams are ORFs 21, 26, 30, 32, 33, 34, 39, 55.

Are my criteria in the order you would choose? Are there criteria for using the start that is conserved in all but one member of a Cluster? One of the draft annotations in Cluster T, Mendoksei, has genes that are regularly shorter than those of all the other Pham members, which suggests a different conserved start than if just the other four members are considered.

I am asking for any words of wisdom on making the hard choices given that calling conserved starts has become a relatively high priority.

For example, in ORF 34 REV (Pham 4779) the start we called was previously called in only one of the two previous Cluster T annotations. One member of this Pham is not from Cluster T, but its starts are completely different from those of the Cluster T members, so we ignored it. We called the most conserved start (30), which also had the best SD score and was the Glimmer call. However, this start did not capture all the coding potential and did not give the longest ORF. An earlier start that would have captured all the coding potential was available in all but one of the Cluster T phages in this Pham. Did we make the call appropriately?

ORF 55 (Pham 17577) has 14 members, but the most frequently called start is not present in the Cluster T members of this Pham. The two annotated Cluster T members called different starts, 10 and 15. Yet start 14 is conserved in all five of the Cluster T phages. Glimmer called start 15; GeneMark called start 13. I am inclined to call the start at 14. It has a better Z value but a worse SD score than start 15, which is also found in all five Cluster T phages. Just to make things more interesting, start 10 is present in all but one of the Cluster T phages and it would be the most conserved start in all but that one. Start 10 is present in both of the annotated phages, was the called start in one of them, and would include all the coding potential. Thoughts?
Posted in: Choosing Start Sites > Relative importance of criteria when annotating an uncommon phage