The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

All posts created by cdshaffer

| posted 22 Mar, 2017 18:49
The important thing with respect to "hypothetical protein" or known function is to be sure to have the correct entries in the notes of your "minimalistic" file when you have a "reportable" function and have the notes field empty when you do not. The "minimalistic" file is described in the section that follows Fig 12.2.

Someone from the SMART QC team can correct me if I am wrong, but for the fully noted final file I do not believe you need to add any entries to the product field. Just follow the protocol described in section "9.3.3 Renumbering & formatting annotated features" to auto-fill the product field.
Posted in: Notes and Final Files > Filling in the Product field
| posted 10 Mar, 2017 17:43
During Feb and March, when there are lots of new phage sequences coming out, the database gets updated once or twice a week. The only way to know there has been a new update is to manually check this page and see if the version number has changed. For the most part, pham numbers remain the same from one database version to another, so following database versions is probably a waste of time for most users.

This is a good teaching moment for your students: they should understand that pretty much all online biological databases are being updated, and when they find inconsistencies it's always a good idea to consider whether they are due to recent changes to the underlying database. This is especially true when comparing info from one website to another, since the two websites are likely to be on different update cycles.
Posted in: Starterator > Pham not found in Starterator
| posted 09 Mar, 2017 16:17
There are two main reasons (other than the simple typo) that would cause a missing starterator report.

The first is that the pham number has changed with a recent database update and all the support databases have not caught up and are still on the old number; the second is that your gene is an orpham and there is no Starterator report since there is nothing to compare.

I cannot see pham 24884 assigned to any gene in Kalah2 for the most recent database update, so your issue is the former. To get the current, up-to-date pham assignment for any gene, check phagesdb (all the other web pages lag behind phagesdb by some amount of time). On phagesdb there is an info page for each gene which includes the current pham number. The easiest way to get to a gene page is to go to the gene list pull-down menu on the Kalah2 phage page.
Posted in: Starterator > Pham not found in Starterator
| posted 27 Feb, 2017 05:41
I would need specific examples to know for sure, but the pham numbers do change sometimes as new proteins are added. Right now, with all the new phages, Travis is posting 2 updates to the database each week. So it depends on where you get your map, how old the map is, and when the starterator report was generated. It is quite possible the two are out of sync. This is the reason we decided to add the date to the starterator report: the older the report, the more likely the discrepancy is due to the numbers changing as new phages are added.

I am trying to keep the online starterator reports here as up to date as I can. So far I have been able to get everything posted within 1 working day of when the new database is posted. So the online starterator reports should be (mostly) in sync with the info at phagesdb.org.

If you want to double check a discrepancy you can always check the pham at phagesdb.org; the page on the pham will list all the current proteins based on the most recent database release. You can then compare that to the most recent map: using the SEA VM, start phamerator and make sure it completes a full update before creating the map (this can take quite a while depending on your setup). I am not sure how often the online version of phamerator gets updated, but it does tend to lag behind a bit.

Hope that helps. If you are still not sure, post an example or two and I would be happy to double check that it is not a more serious coding error.
Edited 01 Mar, 2017 19:45
Posted in: Starterator > phage that crash starterator
| posted 22 Feb, 2017 17:19
I am guessing mabel is locked. This is a setting on the admin -> phages page, in the leftmost column.
Posted in: PECAAN > Read-only Genome for User?
| posted 16 Feb, 2017 22:36
Most of your questions are more about the policy for good notes, so I will leave those to the more informed, but I can answer the question about the numbers in phamerator that hover above the genes. If you are on phamerator.org, the number is the phamily number. If you are using the maps from the stand-alone version of phamerator from the SEA VM, there is a second number, inside parentheses, which is the number of proteins in the phamily.
Posted in: Notes and Final Files > Conserved Domain ID in notes
| posted 16 Feb, 2017 17:48
The server is reporting "service unavailable" for me, so it's not anything at your end. Unfortunately, I don't think there is anything to do until the server comes back up, since all the ways to install that I know of require running the auto-updater, which is also failing for me.
Posted in: DNA Master > Downloading DNA Master
| posted 14 Feb, 2017 17:50
I realized that, even though I don't have time right now to write code that creates the specific starterator text for whole phage reports, something is better than nothing. So I just cloned the code I used for creating the text for the individual pham reports into the code for whole phage reports. I ran a test with phage Amgine and you can download the results of that full page report here.

This solution is not ideal: the new individual pham reports give more details on each and every gene in the pham, so individual reports on large phams can get quite large. When you put many of those together to make a whole phage report, the file can get very big. For example, the whole phage Amgine report is over 1100 pages long. However, this may work better than going to the numerous pham reports on the web. I would suggest using a PDF viewer with good search tools to quickly find things.

If anyone wants the updated code you just need to pull and run the most current version of the byphagewithbase branch from my github repo.
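
For anyone who already has a clone of that repo, pulling the branch might look roughly like the sketch below. The function name `update_starterator`, the `run` helper with its `DRY_RUN` switch, and the default folder (borrowed from the install path used in other posts on this page) are assumptions for illustration, not part of Starterator. It also assumes the remote is named `origin`, which is what `git clone` sets up by default.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: bring an existing clone up to date on one branch.
# $1 = path to the clone (assumed default below), $2 = branch name.
update_starterator() {
  local repo_dir="${1:-$HOME/Applications/Starterator/starterator}"
  local branch="${2:-byphagewithbase}"

  run git -C "$repo_dir" fetch origin || return
  run git -C "$repo_dir" checkout "$branch" || return
  # --ff-only refuses to merge if the local branch has diverged.
  run git -C "$repo_dir" pull --ff-only origin "$branch"
}
```

Calling `DRY_RUN=1 update_starterator` prints the three git commands without touching anything, which is a cheap way to check the paths first. After a real update, Starterator is started from inside the clone with `bash starterator.sh`.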
Posted in: Starterator > New Version of Starterator for 2017?
| posted 13 Feb, 2017 17:05
Aaron,

If you are doing whole phage then yes, that is the current output for that branch. That branch is stuck in the middle of being updated: I have removed the old code for whole phage reports but have not had time to put in the new code that creates output similar to the pre-computed pham reports, with extra text based on which whole phage is being analyzed. Welcome to the cutting edge where you often get cut.

I did create a "whole phage" report manually by extracting all the pham numbers for the phage using command line mysql, creating all the pham PDFs using command line starterator, and finally concatenating all the PDFs together using command line ghostscript. This creates a single PDF with all the pham reports for that phage, but it is still missing the first few header pages with the map and the list of suggested starts. If you want more details on the exact commands, let me know.
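
As a rough illustration of the last step only, merging PDFs with Ghostscript's `pdfwrite` device can be sketched as below. This is a generic Ghostscript invocation, not necessarily the exact command used for the report above. The helper names `run` and `concat_pdfs`, the `DRY_RUN` switch, and the file names in the usage note are hypothetical. The mysql and starterator steps are left out because their exact commands are not given here.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: concatenate PDFs, in the order given, into one file.
# $1 = output file, remaining arguments = input PDFs.
concat_pdfs() {
  if [ "$#" -lt 2 ]; then
    echo "usage: concat_pdfs OUTPUT.pdf INPUT.pdf [INPUT.pdf ...]" >&2
    return 1
  fi
  local out="$1"
  shift
  # -dBATCH and -dNOPAUSE make gs exit when done instead of prompting.
  run gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile="$out" "$@"
}
```

For example, `concat_pdfs whole_phage.pdf pham_reports/*.pdf` would merge every PDF in a (hypothetical) `pham_reports` folder in shell glob order.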

I like your idea of adding the track No. to the text on the track. I have added this as an issue on the starterator github page here:

https://github.com/SEA-PHAGES/starterator/issues/26

Not sure when I will have time to get back to coding on starterator, hopefully some over spring break.
Edited 13 Feb, 2017 17:50
Posted in: Starterator > New Version of Starterator for 2017?
| posted 10 Feb, 2017 16:51
The answer to your question depends on which version of starterator you want. There is currently version 1.1, which has many bug fixes (some bugs are fixed, not all) and other updates that I did mostly over the summer. It is the master branch at

https://github.com/SEA-PHAGES/starterator.git

Based on discussions with Welkin and Deb, plus extra feedback from the in silico workshop, I was able over winter to update a lot of the text output. I am using this code to run the pre-computed PDFs, but that version has not been fully tested to make sure it is "release ready" (i.e. it runs on my machine with lots of extra modifications, but I have not tested it with the default SEA 2017 VM). The version that is doing all the pre-computing is the byphagewithbase branch in my personal repo at:

https://github.com/cdshaffer/starterator.git

Having said that, anyone can install and run any version of the code, but it takes a good understanding of Unix administration and git:

1. Ensure you have the dependencies (if you are not using the SEA 2017 VM):

sudo apt-get install python-pip ncbi-blast+ git
sudo pip install PyPDF2
sudo pip install beautifulsoup4
sudo pip install requests

2. Remove all old Starterator files:

cd $HOME/Applications
rm -rfv $HOME/.starterator
rm -rfv $HOME/Applications/Starterator

3. Create new Starterator folder:

mkdir $HOME/Applications/Starterator
cd $HOME/Applications/Starterator
4. Clone the version you want:

git clone <insert URL here for the repo you want to clone>

5. Checkout the branch you want:

cd $HOME/Applications/Starterator/starterator
git checkout <insert branch name here>

6. Run your checked out version of Starterator:

bash starterator.sh

Please remember, there are no guarantees that anything in my repo will work on the 2017 VM. However, if you try it and it does work, please do let me know.
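
Steps 2 through 5 above can be gathered into one function, sketched below with the repo URL and branch passed in as arguments. The names `run` and `install_starterator` and the `DRY_RUN` switch are hypothetical, and the sketch assumes the dependencies from step 1 are already installed. Like step 2, it deletes any existing Starterator folders, so a dry run first is a good idea.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: fresh install of one Starterator branch.
# $1 = repo URL, $2 = branch name.
install_starterator() {
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "usage: install_starterator REPO_URL BRANCH" >&2
    return 1
  fi
  local repo_url="$1"
  local branch="$2"
  local app_dir="$HOME/Applications/Starterator"

  # Step 2: remove all old Starterator files (destructive).
  run rm -rfv "$HOME/.starterator" "$app_dir" || return
  # Step 3: create the new Starterator folder.
  run mkdir -p "$app_dir" || return
  # Step 4: clone into the "starterator" subfolder used in step 5.
  run git clone "$repo_url" "$app_dir/starterator" || return
  # Step 5: check out the requested branch.
  run git -C "$app_dir/starterator" checkout "$branch"
}
```

For the pre-computing version mentioned above, the call would be `install_starterator https://github.com/cdshaffer/starterator.git byphagewithbase`. Prefixing it with `DRY_RUN=1` only prints the commands. Step 6 is unchanged: `cd $HOME/Applications/Starterator/starterator` and then `bash starterator.sh`.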
Edited 13 Feb, 2017 17:04
Posted in: Starterator > New Version of Starterator for 2017?