Welcome to the forums at seaphages.org. Please feel free to ask any questions related to the SEA-PHAGES program. Any logged-in user may post new topics and reply to existing topics. If you'd like to see a new forum created, please contact us using our form or email us at info@seaphages.org.
Recent Activity
All posts created by cdshaffer
Link to this post | posted 09 Feb, 2016 03:00 | |
---|---|
|
Yes. Open Phamerator, select Preferences from the Edit menu, and click the "Force database update" button. I always start these updates first thing in class, before I start any lecturing, to give them as much time as possible. I also have at most 5 student computers updating at any one time so the wireless network does not get saturated. |
Posted in: Phamerator → Force A Database Update? How?
Link to this post | posted 07 Feb, 2016 01:43 | |
---|---|
|
As for the total failure, hard to diagnose without specifics. If it is failing on everything then your assumption that it cannot connect to the database is probably correct. First thing to test then is to see if you can connect to the Actino_draft database with Phamerator. That will help separate database problems from starterator problems. Open phamerator and look in the preferences to make sure Actino_draft is the selected database and then see if phamerator can find recently added phages and make phage maps. Post the results of that test. |
Posted in: Starterator → phage that crash starterator
Link to this post | posted 07 Feb, 2016 01:36 | |
---|---|
|
I ran all 4 of the phages above on my system, where I have updated my own local copy of Starterator to fix a few bugs that were posted last fall. Results for the 4 phages:

Picard ran fine, so it likely hit an issue that I have already corrected. You can get the full report from this link.

bubbles123 crashed for me on pham 68 of 104; pham number 5447. This pham contains gene 107, the right-most gene, which is on the negative strand. The gene starts 2 bases in from the end of the phage; this is a corner case that needs fixing. The exact error was reported at line 220 of find_most_common_start in phams.py: ValueError: 0 is not in list. This is a new bug that I have not seen before in Starterator. Fixing and double-checking the code will take time, so for now I have added temporary code to my copy of Starterator to skip that gene so that I could create a full report of all the other genes. That full report is available here.

Roosevelt crashed for me on pham 31 of 89; pham number 8871. Pham 8871 corresponds to gene 35. The error was reported at line 98 of add_aligment in pham.py: KeyError "TWAMP_Draft_33". This might be a naming error: the geneid in the pham table is SEA_TWAMP_33, not TWAMP_Draft_33. There is a cluster of gene names from several phages that start with "SEA_". I am not sure why, but genes with names like this appear to be mishandled. This is a new bug and, as above, fixing it will take time. I have had Starterator run and just skip gene 35. The full report without gene 35 is available here.

Mojorita ran fine, so its crash was likely also caused by a bug I have already squashed. You can get the full report from this link.

I have posted bug reports for the bubbles123 and Roosevelt problems to my GitHub issue tracker, where I am keeping a running list of known bugs and possible improvements. I am not sure when I will be able to get to these for permanent fixes. Thanks for all the crash reports; the program cannot get better without reports like these. |
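Both new crashes are ordinary Python lookup failures, so the "skip that gene and keep going" workaround can be pictured with a small sketch. This is not Starterator's code: the function names `safe_index` and `resolve_gene_id`, and the idea of retrying a lookup under a "SEA_" spelling, are assumptions made only for illustration.

```python
def safe_index(values, target):
    """Return the position of target in values, or None if it is absent.

    A bare values.index(0) raises "ValueError: 0 is not in list" when 0 is
    missing, which is the same message as the one reported above.
    """
    try:
        return values.index(target)
    except ValueError:
        return None


def resolve_gene_id(records, gene_id):
    """Look up gene_id in records (a dict keyed by gene name).

    If the exact name is missing, try one hypothetical alternative spelling,
    e.g. TWAMP_Draft_33 -> SEA_TWAMP_33. This rule is an illustrative guess,
    not a documented naming convention.
    """
    candidates = [gene_id]
    parts = gene_id.split("_")
    if len(parts) == 3 and parts[1].lower() == "draft":
        candidates.append("SEA_{}_{}".format(parts[0], parts[2]))
    for name in candidates:
        if name in records:
            return records[name]
    # Returning None lets the caller skip this gene and finish the report.
    return None
```

A caller would treat a `None` result as "leave this gene out of the report" rather than letting the exception stop the whole run.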
Posted in: Starterator → phage that crash starterator
Link to this post | posted 06 Feb, 2016 22:27 | |
---|---|
|
When I look at ShiaLaBeouf in Phamerator, I see that it is labeled "ShiaLaBeouf_draft" and that there are 231 genes. The "_draft" means that the phage was run through DNA Master auto-annotation and those auto-annotated genes were incorporated into the database. That database is used by both Starterator and Phamerator, so I always look in Phamerator first when debugging Starterator.

I am not sure why your DNA Master has a different number. It could be something as simple as the DNA Master total including the tRNA genes (the Phamerator database only counts protein-coding genes), or someone having added genes to the DNA Master file. Alternatively, it could be something complicated based on the settings or default configuration of your copy of DNA Master compared to the copy that was run to create the auto-annotation that ended up in the database. Another possibility is that a glitch caused an error in the database.

Starterator was designed to deal with this situation (i.e. you want to analyze a gene that is not in the Phamerator database) by allowing you to enter coordinates that define a gene (it is the routine listed in the start window as "One unphamerated gene"). You are supposed to be able to enter the relevant data (phage, phage sequence, gene coordinates, and strand) and get a result, but I have not had good luck with that routine. It certainly needs work under the hood.

Anyway, if you still want to track down this discrepancy, the first step would be a careful comparison of the gene list in DNA Master with the one in the Phamerator database. I extracted the gene list from the Phamerator database to help with the comparison. You can get the file from this link. |
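The comparison of the two gene lists can also be done mechanically. The sketch below assumes both lists have already been typed or exported into (start, stop, strand) tuples; the function name `compare_gene_lists` and that tuple layout are assumptions for illustration, not a format produced by either program.

```python
def compare_gene_lists(dnamaster_genes, phamerator_genes):
    """Return the genes found in only one of the two lists.

    Each gene is a (start, stop, strand) tuple. Two genes count as the same
    only if all three values match exactly.
    """
    dnamaster_set = set(dnamaster_genes)
    phamerator_set = set(phamerator_genes)
    only_in_dnamaster = sorted(dnamaster_set - phamerator_set)
    only_in_phamerator = sorted(phamerator_set - dnamaster_set)
    return only_in_dnamaster, only_in_phamerator
```

Genes that show up in only one list (an extra tRNA, a manually added gene, or a gene with a different start) are the ones that explain a difference in the totals.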
Posted in: Starterator → Empty Track
Link to this post | posted 05 Feb, 2016 20:11 | |
---|---|
|
Easy way (which I use in class when time is an issue): I tell students to open a browser, connect to email, and send the files to themselves as attachments. Bonus feature: automatic backup of the files.

If you need to move many files back and forth:
1. Make sure you are using a recent version of VirtualBox and have installed Guest Additions in the Ubuntu machine.
2. With the machine off, go to Settings -> Shared Folders.
3. Click the tiny folder with the plus sign to add a shared folder.
4. In the folder path entry, click the arrowhead and select "Other".
5. Select the Mac Desktop folder to share between guest and host.
6. Select the "Make permanent" setting so you only have to do this once.
7. The shared folder will be in the /media folder in the guest, which is at the top level of folders (i.e. two folders up from your home folder). The name of the folder inside the guest machine will be something like sf_Desktop.
8. Files placed in there will appear on the Desktop of your Mac.

You can also set up drag'n'drop, which is supposed to let you move files back and forth, but I have had less success with that technique. |
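Once the shared folder exists, moving a batch of files from inside the guest can be scripted. This is a minimal sketch that assumes the folder is mounted at /media/sf_Desktop as in step 7; the function name `copy_to_shared` is made up for the example. On many VirtualBox setups the guest user also has to belong to the vboxsf group before the folder is writable.

```python
import shutil
from pathlib import Path


def copy_to_shared(files, shared_dir="/media/sf_Desktop"):
    """Copy each file in files into the shared folder and return the new paths.

    shared_dir defaults to the mount point described in the steps above;
    pass a different path if your shared folder has another name.
    """
    target = Path(shared_dir)
    if not target.is_dir():
        raise FileNotFoundError("shared folder not found: {}".format(target))
    copied = []
    for name in files:
        source = Path(name)
        destination = target / source.name
        shutil.copy2(source, destination)  # copy2 also keeps timestamps
        copied.append(destination)
    return copied
```

For example, `copy_to_shared(Path.home().glob("*.pdf"))` would copy every PDF in the home folder to the Mac Desktop.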
Posted in: SEA-PHAGES Virtual Machine → Shared folders
Link to this post | posted 05 Feb, 2016 18:34 | |
---|---|
|
I will look into this when I can. Unfortunately, my computer's motherboard died last night and I am working from a loaner until it is repaired, so my Starterator virtual machine is unavailable until then. It sounds like a couple of other cases I have run with little or no pink. In both of those cases it was not a bug so much as a weird corner case that Starterator was not built to handle. |
Posted in: Starterator → Empty Track
Link to this post | posted 05 Feb, 2016 18:04 | |
---|---|
|
I have been trying to keep a list of bugs and possible improvements to Starterator (see the issues on my GitHub cdshaffer/starterator repo if you want to see the specific list). I saw a very similar result in the output for phage MitKao, pham 1510, and I was able to do a little sniffing around in that case. The problem was a single unusual gene with a very long ORF upstream of the start codon, which messed up the calculation of the scaling to use for the X axis. Another pham had a different issue but a similar output: there was too much protein sequence divergence among the pham members, so there was no pink simply because there was so little conservation among all members. So in both cases I investigated, the cause was not simply the size of the pham but unusual properties of the specific pham.

This is very typical in bioinformatics. The computer programs will take care of 95-99% of cases, but since biology is not math there are always unusual corner cases that just don't work well. In the MitKao case, one of the assumptions made by Starterator is that there will be an in-frame stop codon not too far upstream of the annotated start codon. In rare cases this assumption is incorrect and the output fails to give meaningful results.

I always use results like this as a teaching moment. This is a great example showing that no computer program is 100% successful, which is why manual annotation is still worthwhile. In this case, the "experiment" (i.e. the automated analysis of a multiple sequence alignment of all genes in a pham using ClustalW) failed to give a result. I would explain to the student that we now have a decision to make: try to do the analysis manually or just move on. This brings up the opportunity to discuss cost/benefit analysis and how it relates to research: there is never enough time to do everything, and a good researcher makes good choices about where to invest time and money to get the best outcome they can afford.

I would then probably say in this case that the manual analysis is not worth the time and effort, and just put in the notes that Starterator was NI (not informative), as suggested in the Annotation Guide (see page 76). |
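The stop-codon assumption can be checked by hand for a suspicious gene. The sketch below is not taken from Starterator; it assumes a forward-strand gene, an uppercase DNA string, and a 0-based start position, and the function name `upstream_orf_length` is invented for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_orf_length(sequence, start):
    """Measure how far the reading frame stays open upstream of a start codon.

    sequence is the forward-strand DNA in uppercase and start is the 0-based
    index of the first base of the annotated start codon. Returns the number
    of bases between the nearest upstream in-frame stop codon and the start,
    or None if the beginning of the sequence is reached without finding one.
    """
    position = start - 3
    while position >= 0:
        if sequence[position:position + 3] in STOP_CODONS:
            return start - (position + 3)
        position -= 3  # step back one codon, staying in frame
    return None
```

A gene whose value is much larger than that of the other pham members, or `None`, is the kind of outlier that could stretch the X axis as described above.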
Posted in: Phamerator → Tutorial on Phamerator and Starterator Use?
Link to this post | posted 25 Jan, 2016 18:02 | |
---|---|
|
I have not tried Windows 10 but the checksums for the virtual disk images are here. |
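To compare a downloaded disk image against a published checksum, something like the following can be used. The post does not say which hash algorithm the checksums use, so the `algorithm` argument (defaulting here to SHA-256) is an assumption to adjust, and `file_checksum` is just a name chosen for this sketch.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at path.

    The file is read in chunks so that a multi-gigabyte virtual disk image
    does not have to fit in memory. algorithm can be any name accepted by
    hashlib.new, such as "md5", "sha1", or "sha256".
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the returned string differs from the published value, the download is incomplete or corrupted and should be repeated before troubleshooting anything else.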
Posted in: DNA Master → DNA Master and Windows 10
Link to this post | posted 07 Dec, 2015 20:47 | |
---|---|
|
I always just type out phagesdb.org/pham.

Speaking of which, Dan: I know that from that page I can search for words in the notes field, but is there a way to see all notes for a given pham? This would be nice. I have run into a problem finding functional annotations when they are relatively recent (i.e. very few phages have that annotation) and the pham is particularly large. The function just doesn't end up in the top 100 proteins listed in the BLAST result, but you can find it if you search all the notes. I have my full-semester students look directly for annotations using a MySQL query of the notes field, but it would be really nice to have a web-based system for our mini-phage class. I looked for this kind of function before but could not find it on PhagesDB. Does such a search exist? |
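The kind of notes query mentioned above might look roughly like this. The table and column names (`gene.GeneID`, `gene.Notes`, `pham.GeneID`, `pham.name`) are a hypothetical schema chosen for the example, and Python's built-in sqlite3 module stands in for a MySQL connection so the sketch needs no server; the real database layout should be checked before reusing the SQL.

```python
import sqlite3


def notes_for_pham(conn, pham_name):
    """Return (GeneID, Notes) for every gene in a pham that has a note.

    conn is an open database connection. The schema used in the query is an
    assumption for illustration: a gene table holding the notes and a pham
    table mapping each gene to its pham.
    """
    query = (
        "SELECT gene.GeneID, gene.Notes "
        "FROM gene JOIN pham ON pham.GeneID = gene.GeneID "
        "WHERE pham.name = ? AND gene.Notes IS NOT NULL AND gene.Notes <> '' "
        "ORDER BY gene.GeneID"
    )
    return conn.execute(query, (pham_name,)).fetchall()
```

Listing every note this way surfaces rare annotations that never reach the top of a BLAST hit list in a very large pham.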
Posted in: Phamerator → Phams
Link to this post | posted 24 Nov, 2015 20:07 | |
---|---|
|
There is a way to get the info, but I am not sure I would call it easy. You can export the complete pham table from Phamerator. Open Phamerator, select "export pham table" in the File menu, select the file location, give the file a name, and hit Save. This will take a while, depending on your machine, but you will eventually end up with a file of all genes cross-referenced for all phages and all phams (it takes my computer about 3-4 minutes to create the file). The file is currently running about 15 MB, but it compresses down quite a bit.

The pham table is essentially a giant Excel spreadsheet with a couple of columns of summary data, then each phage in a single column and each pham as a row. The sheet is then filled with gene names, where each gene is placed in the cell that intersects the proper column (i.e. the phage the gene came from) with the proper row (i.e. which pham it is in).

I have created the current pham table from the most recent Actino_Draft database; you can download it with this link. That file will work for now, but of course it will soon be out of date. For now, just download the file, double-click on it to uncompress it, and import it into Excel. You will need to tell Excel during the import that the data is in a text file, that the data is delimited, and that the delimiter is Tab (at least that is how it works on my Mac version of Excel). This is a lot of data, so be patient with Excel: some commands could take 20 or 30 seconds to complete, and there are something like 14 million cells to process in the whole spreadsheet.

Once you have it open in Excel, I would recommend that you scroll down until you find the pham you are interested in, select the whole row, copy it, move to a new blank spreadsheet, and use "Paste special…" to paste a "transposed" version into your new sheet (that will create a much, much smaller and faster spreadsheet). Transposition will convert the row to a single column, which will be much easier to deal with. You can then use Excel's "sort" or "filter" on the column to bring all the gene names to the top.

Good luck, and post more questions if you get stuck at any point. |
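If Excel struggles with the file, the same row lookup and transposition can be scripted. The sketch below assumes the exported file is tab-delimited, that its first row holds the phage names, that the first column identifies the pham, and that the number of leading summary columns is 2; all of these, along with the function name `genes_in_pham`, are assumptions to check against the real file.

```python
import csv


def genes_in_pham(table_path, pham_id, summary_columns=2):
    """Return (phage, gene) pairs for one pham from a tab-delimited pham table.

    The header row supplies the phage name for each column. Cells that are
    empty (the phage has no gene in this pham) are left out, which is the
    scripted equivalent of transposing the row and filtering out blanks.
    """
    with open(table_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        for row in reader:
            if row and row[0] == str(pham_id):
                cells = zip(header[summary_columns:], row[summary_columns:])
                return [(phage, gene) for phage, gene in cells if gene.strip()]
    return []  # pham not found in the table
```

Because the file is read one row at a time, only the requested pham is ever held in memory, so the size of the full table is not a problem.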
Posted in: Phamerator → Phams