
The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.


All posts created by cdshaffer

| posted 10 Apr, 2017 23:53
Purgamenstris is now an N cluster phage; see:

http://phagesdb.org/phages/Purgamenstris/
Posted in: Phamerator > Missing phage in Phamerator database
| posted 29 Mar, 2017 17:18
OK, I have updated this issue for Starterator on the issue tracker. You can see the issue posted here. There are still some unresolved bugs/crashes, as well as some ideas for other changes, on the issue tracker.

I am not sure how easy or difficult this change will be to implement, so I am not sure if/when it will be added. It should be fairly easy in theory, but some of the data constructs are a little tricky to work with, especially the ones in the section of the code that generates the PDF, so it will take some time to dig into the code and see what can be done.
Posted in: Starterator > Suggestions for Starterator Report Upgrades
| posted 29 Mar, 2017 01:59
name: root
password: phage
Posted in: Phamerator > Force A Database Update? How?
| posted 29 Mar, 2017 01:58
Having dealt with these new reports in class, I agree that the current set of calculated numbers is not the best for higher-variance phams. In our discussions last fall we wanted to get away from the "Suggested Start" because it was counting human and computational annotations equally and thus putting too much emphasis on Glimmer/GeneMark.

I wonder if a good number to report is the fraction of times a start is annotated as the start of the gene, considering only the manually annotated genes that actually have that start present.

I can see two places to put that kind of info that might help: in the "Summary by start number" section and/or in the "Gene information" section. Here are examples; the details can easily be changed, but they give you the idea and something to comment on:

Summary by start number
• Start number 18 is called in: Spectropatronm_Draft_2, Rima_2, Namo_2,
Percent of genes with start 18 present: 37.5% ( 3 of 8 )
Start 18 was manually annotated as the start 100% (2 of 2) of the time when present.

• Start number 19 is called in: Scap1_2,
Percent of genes with start 19 present: 25.0% ( 2 of 8 )
Start 19 was manually annotated as the start 50% (1 of 2) of the time when present.

So in the above example I imagine there are 8 members of the pham. Start 18 is present in 3 of the 8 members, so starterator reports "present in 37.5%". Of those three members that have start 18, two have been manually annotated (Rima and Namo), and in both cases start 18 was the annotated start of the gene. Thus starterator reports 2 out of 2, or 100%. Start 19 is present in 2 phages (Scap1 and one other); both are manually annotated, but in only one of them was 19 the annotated start, so starterator reports 50%. Thus we have a % presence, which examines all genes and gives a sense of the overall level of conservation, and a % manually annotated, which gives the fraction of the time the start is picked when present.
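For anyone who wants to play with these two numbers, here is a minimal Python sketch of the calculation described above. It is not taken from the Starterator code: the data layout, the function names, and the member names MemberE_2 through MemberH_2 are all made up for illustration, and the starts assigned to each gene are chosen only so that the counts match the example.

```python
def start_summary(genes, start_number):
    """Return (present, total, called, manual) counts for one start number.

    present: genes in the pham that contain this start
    total:   all genes in the pham
    manual:  manually annotated genes that contain this start
    called:  of those manual genes, how many use it as the annotated start
    """
    with_start = [g for g in genes if start_number in g["starts_present"]]
    manual = [g for g in with_start if g["manual"]]
    called = [g for g in manual if g["annotated_start"] == start_number]
    return len(with_start), len(genes), len(called), len(manual)


def percent(part, whole):
    """Percentage of part in whole, or None when there is nothing to count."""
    return None if whole == 0 else 100.0 * part / whole


# Hypothetical 8-member pham (placeholder data, not real annotations).
genes = [
    {"name": "Spectropatronm_Draft_2", "manual": False, "annotated_start": 18, "starts_present": {5, 18}},
    {"name": "Rima_2", "manual": True, "annotated_start": 18, "starts_present": {5, 18}},
    {"name": "Namo_2", "manual": True, "annotated_start": 18, "starts_present": {18}},
    {"name": "Scap1_2", "manual": True, "annotated_start": 19, "starts_present": {5, 19}},
    {"name": "MemberE_2", "manual": True, "annotated_start": 5, "starts_present": {5, 19}},
    {"name": "MemberF_2", "manual": True, "annotated_start": 5, "starts_present": {5}},
    {"name": "MemberG_2", "manual": False, "annotated_start": 5, "starts_present": {5}},
    {"name": "MemberH_2", "manual": True, "annotated_start": 5, "starts_present": {5}},
]

for number in (18, 19):
    present, total, called, manual = start_summary(genes, number)
    print(f"Start {number}: present in {percent(present, total):.1f}% ({present} of {total}), "
          f"manually annotated as the start {called} of {manual} times when present")
```

With this placeholder data, start 18 comes out as 37.5% (3 of 8) present and 2 of 2 manually called, and start 19 as 25.0% (2 of 8) present and 1 of 2 manually called, which matches the example. Draft genes count toward presence but are left out of the manual fraction, which is the whole point of the second number.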

A different way to encode the same would be to put those in the Gene info like this:
Gene information
• Gene: Spectropatronm_Draft_2 Start: 485, Stop: 892
Candidate Starts for Spectropatronm_Draft_2:
[(5, 395, 0%), (18, 485, 100%), (19, 563, 50%)]
I guess a third option is to do both.
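To make the Gene information idea concrete, here is a small Python sketch that builds a candidate starts line in the format shown above, with the percent manually annotated as the third item in each tuple. The function name and the input layout are assumptions for illustration only; the real report code may organize this differently.

```python
def format_candidate_starts(candidates, manual_percent):
    """Build a '[(number, coordinate, percent), ...]' line for one gene.

    candidates:     list of (start_number, coordinate) pairs for the gene
    manual_percent: dict mapping start_number to the percent of the time it
                    was the manually annotated start when present
                    (a missing entry is shown as 'n/a')
    """
    parts = []
    for number, coordinate in candidates:
        pct = manual_percent.get(number)
        label = "n/a" if pct is None else f"{pct:.0f}%"
        parts.append(f"({number}, {coordinate}, {label})")
    return "[" + ", ".join(parts) + "]"


# Values copied from the example for Spectropatronm_Draft_2.
candidates = [(5, 395), (18, 485), (19, 563)]
manual_percent = {5: 0.0, 18: 100.0, 19: 50.0}

line = format_candidate_starts(candidates, manual_percent)
print("Candidate Starts for Spectropatronm_Draft_2:")
print(line)
```

A start with no manually annotated genes has no meaningful percentage, so the sketch prints "n/a" for it rather than 0%; that choice is mine and is open to debate.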

I am a little reluctant to put lots of details into the Gene information section, since it already gets quite long for phams with lots of members and long genes.

It would be great to hear feedback on any of the above. I agree that more would be needed; I am just not sure if there is something better. Are there tweaks to the above that make better sense to you? Are there other numbers that could be calculated and reported that would be more informative? Thoughtful feedback is much appreciated.
Posted in: Starterator > Suggestions for Starterator Report Upgrades
| posted 22 Mar, 2017 18:49
The important thing with respect to "hypothetical protein" or known function is to be sure to have the correct entries in the notes of your "minimalistic" file when you have a "reportable" function, and to leave the notes field empty when you do not. The "minimalistic" file is described in the section that follows Fig 12.2.

Someone from the SMART QC team can correct me if I am wrong, but for the fully noted final file I do not believe you need to add any entries to the product field. Just follow the protocol as described in section "9.3.3 Renumbering & formatting annotated features" to auto-fill the product field.
Posted in: Notes and Final Files > Filling in the Product field
| posted 10 Mar, 2017 17:43
During Feb and March, when there are lots of new phage sequences coming out, the database gets updated once or twice a week. The only way to know there has been a new update is to manually check this page and see if the version number has changed. For the most part pham numbers remain the same from one database version to another, so following database versions is probably a waste of time for most users.

This is a good teaching moment for your students: they should understand that pretty much all online biological databases are being updated, and when they find inconsistencies it's always a good idea to consider whether they are due to recent changes to the underlying database. This is especially true when comparing info from one website to another, since the two websites are likely to be on different update cycles.
Posted in: Starterator > Pham not found in Starterator
| posted 09 Mar, 2017 16:17
There are two main reasons (other than the simple typo) that would cause a missing starterator report.

The first is that the pham number has changed with a recent database update and all the support databases have not caught up and are still on the old number; the second is that your gene is an orpham and there is no Starterator report since there is nothing to compare.

I cannot see pham 24884 assigned to any gene in Kalah2 for the most recent database update, so your issue is the former. To get the current, up-to-date pham assignment for any gene, check phagesdb (all the other web pages lag behind phagesdb by some amount of time). On phagesdb there is an info page for each gene which includes the current pham number. The easiest way to get to a gene page is to go to the gene list pull-down menu on the Kalah2 phage page.
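The reasoning above can be written as a short decision procedure. This is only a sketch in Python: the two lookup tables stand in for whatever you read off the phagesdb gene and pham pages, and the function name, gene names, and pham numbers are all hypothetical placeholders, not real database values.

```python
def diagnose_missing_report(gene, pham, gene_to_pham, pham_members):
    """Suggest why no Starterator report was found for (gene, pham).

    gene_to_pham: dict of gene name -> current pham number
    pham_members: dict of pham number -> list of member gene names
    """
    current = gene_to_pham.get(gene)
    if current is None:
        return "gene not found; check for a typo"
    if current != pham:
        return f"pham renumbered; current pham is {current}"
    if len(pham_members.get(current, [])) <= 1:
        return "orpham; nothing to compare"
    return "no obvious cause; worth reporting"


# Placeholder snapshot of the current database (made-up names and numbers).
gene_to_pham = {"PhageA_1": 2002, "PhageA_2": 3003, "PhageB_1": 2002}
pham_members = {2002: ["PhageA_1", "PhageB_1"], 3003: ["PhageA_2"]}

print(diagnose_missing_report("PhageA_1", 1001, gene_to_pham, pham_members))
print(diagnose_missing_report("PhageA_2", 3003, gene_to_pham, pham_members))
```

The first call models a gene whose old pham number no longer matches the current assignment, and the second models an orpham. The order of the checks follows the two reasons given above, with the simple typo case handled first.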
Posted in: Starterator > Pham not found in Starterator
| posted 27 Feb, 2017 05:41
I would need specific examples to know for sure, but the pham numbers do change sometimes as new proteins are added. Right now, with all the new phages, Travis is posting 2 updates to the database each week. So it depends on where you get your map, how old the map is, and when the starterator report was generated. It is quite possible the two are out of sync. This is the reason that we decided to add the date on the starterator report: the older the report, the more likely the discrepancy is due to the numbers changing with the new phages being added.

I am trying to keep the online starterator reports here as up to date as I can. So far I have been able to get everything posted within 1 working day of when the new database is posted. So the online starterator reports should be (mostly) in sync with the info at phagesdb.org.

If you want to double check a discrepancy you can always check the pham at phagesdb.org; the page on the pham will list all the current proteins based on the most recent database release. You can then compare that to the most recent map: using the SEA VM, start phamerator and make sure it completes a full update before creating the map (this can take quite a while depending on your setup). I am not sure how often the online version of phamerator gets updated, but it does tend to lag behind a bit.

Hope that helps. If you are still not sure, post an example or two and I would be happy to double check that it is not a more serious coding error.
Edited 01 Mar, 2017 19:45
Posted in: Starterator > phage that crash starterator
| posted 22 Feb, 2017 17:19
I am guessing mabel is locked. This is a setting on the admin -> phages page, leftmost column.
Posted in: PECAAN > Read-only Genome for User?
| posted 16 Feb, 2017 22:36
Most of your questions are more about the policy for good notes, so I will leave those to the more informed, but I can answer the question about the numbers in phamerator that hover above the genes. If you are on phamerator.org, the number is the phamily number. If you are using the maps from the stand-alone version of phamerator from the SEA VM, there is a second number inside parentheses, which is the number of proteins in the phamily.
Posted in: Notes and Final Files > Conserved Domain ID in notes