
The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.


All posts created by cdshaffer

| posted 05 Feb, 2016 20:11
The easy way (which I use in class when time is an issue): I tell students to open a browser, connect to their email, and send the files to themselves as attachments. Bonus feature: automatic backup of the files.

If you need to move many files back and forth:
1. Make sure you are using a recent version of VirtualBox and have installed the Guest Additions in the Ubuntu machine.
2. With the machine off, go to Settings -> Shared Folders.
3. Click the tiny folder with the plus sign to add a shared folder.
4. In the folder path entry, click the arrowhead and select "Other".
5. Select the Mac Desktop folder to share between guest and host.
6. Select the "Make permanent" setting so you only have to do this once.
7. The shared folder will be in the /media folder in the guest, which is at the top level of folders (i.e. two folders up from your home folder). The name of the folder inside the guest machine will be something like sf_Desktop.
8. Files placed in there will appear on the Desktop of your Mac.
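
If you would rather script this than click through the settings dialog, the same shared folder can probably be added with the VBoxManage command-line tool on the host. This is only a sketch: the machine name "SEA-PHAGES VM" and the host path are placeholders I made up, so substitute your own. The Python below just builds and prints the command so you can check it before running it in a terminal with the machine powered off.

```python
import shlex

def shared_folder_command(vm_name, host_path, share_name="Desktop"):
    """Build a VBoxManage call that adds an auto-mounted shared folder.

    Leaving out --transient makes the share permanent; --automount asks the
    guest to mount it automatically (it shows up as /media/sf_<share_name>).
    """
    return [
        "VBoxManage", "sharedfolder", "add", vm_name,
        "--name", share_name,
        "--hostpath", host_path,
        "--automount",
    ]

# Hypothetical machine name and host path; replace with your own values.
cmd = shared_folder_command("SEA-PHAGES VM", "/Users/me/Desktop")
print(shlex.join(cmd))
```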

You can also set up drag'n'drop, which is supposed to let you move files back and forth, but I have had less success with that technique.
Posted in: SEA-PHAGES Virtual Machine > Shared folders
| posted 05 Feb, 2016 18:34
I will look into this when I can. Unfortunately, my computer's motherboard died last night and I am working from a loaner until it is repaired, so until then my Starterator virtual machine is unavailable. It sounds like a couple of other cases I have run that showed little or no pink. In both of those cases it was not a bug so much as a weird corner case that Starterator was not built to handle.
Posted in: Starterator > Empty Track
| posted 05 Feb, 2016 18:04
I have been trying to keep a list of bugs and possible improvements to starterator (see the issues on my github cdshaffer/starterator repo if you want to see the specific list). I saw a very similar result in the phage MitKao pham 1510 output. I was able to do a little sniffing around in that case. The problem was a single unusual gene with a very long ORF upstream of the start codon that messed up the calculation of the scaling to use for the X axis. Another pham had a different issue but a similar output: there was just too much protein sequence divergence among the pham members, so there was no pink simply because there was so little conservation among all members.

So in both cases I investigated, the cause was not simply the size of the pham but unusual properties of the specific pham. This is very typical in bioinformatics. The computer programs will take care of 95-99% of cases, but since biology is not math there are always unusual corner cases that just don't work well. In the MitKao case, one of the assumptions made by starterator is that there will be an in-frame stop codon not too far upstream of the annotated start codon. In rare cases this assumption is incorrect and the output fails to give meaningful results.
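
To make that assumption concrete, here is a minimal sketch of how one could measure the open reading frame upstream of an annotated start. This is not Starterator's code; the function name is made up for illustration, and it only handles a forward-strand gene with a 0-based start position.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def upstream_orf_length(genome, start):
    """Count the bases between the nearest in-frame upstream stop codon and
    the annotated start (0-based index of the first base of the start codon).

    Returns None if the sequence runs out before any in-frame stop is found.
    """
    pos = start - 3
    while pos >= 0:
        if genome[pos:pos + 3].upper() in STOP_CODONS:
            return start - (pos + 3)
        pos -= 3
    return None

# Toy sequence: a stop codon, two in-frame codons, then the ATG start.
toy = "TAA" + "GCCGCC" + "ATG" + "AAA"
print(upstream_orf_length(toy, 9))             # 6 bases of upstream ORF
print(upstream_orf_length("GCCGCCATGAAA", 6))  # None: no upstream stop
```

A gene like the one described above would return an unusually large number here (or None), which is the kind of value that can throw off any scaling computed from it.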

I always use results like this as a teaching moment. This is a great example that no computer program is 100% successful, and it is why manual annotation is still worthwhile. So in this case, the "experiment" (i.e. the automated analysis of a multiple sequence alignment of all genes in a pham using ClustalW) failed to give a result. I would explain to the student that we now have a decision to make: try to do the analysis manually or just move on. This brings up the opportunity to discuss cost/benefit analysis and how it relates to research: there is never enough time to do everything, and a good researcher makes good choices about where to invest time and money to get the best outcome they can afford. I would then probably say that in this case the manual analysis is not worth the time and effort, and just put in the notes that starterator was NI (not informative), as suggested in the Annotation Guide (see page 76).
Posted in: Phamerator > Tutorial on Phamerator and Starterator Use?
| posted 25 Jan, 2016 18:02
I have not tried Windows 10 but the checksums for the virtual disk images are here.
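
If you want to check a downloaded disk image against a published checksum, a short script like this will compute one. It is just a sketch: I am not assuming which algorithm the published checksums use, so pass whichever one matches (for example "md5" or "sha256").

```python
import hashlib
import io

def stream_checksum(handle, algorithm="sha256", chunk_size=1 << 20):
    """Hash a binary file object in chunks so large images fit in memory."""
    digest = hashlib.new(algorithm)
    for chunk in iter(lambda: handle.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def file_checksum(path, algorithm="sha256"):
    """Return the hex digest of the file at `path`."""
    with open(path, "rb") as handle:
        return stream_checksum(handle, algorithm)

# Small in-memory demo; for a real image call file_checksum("your_image_file").
print(stream_checksum(io.BytesIO(b"abc"), "md5"))
```

Compare the printed value with the published one; any difference means the download is incomplete or corrupted.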
Posted in: DNA Master > DNA Master and Windows 10
| posted 07 Dec, 2015 20:47
I always just type out phagesdb.org/pham

Speaking of which, Dan, I know that from that page I can search for words in the notes field, but is there a way to see all notes for a given pham? This would be nice. I have run across a problem with finding functional annotations when they are relatively recent (i.e. very few phage have that annotation) and the pham is particularly large. The function just doesn't end up in the top 100 proteins listed in the blast result, but you can find it if you search all the notes. I have my full-semester students look directly for annotations using a mysql query of the notes field, but it would be really nice to have a web-based system for our mini-phage class. I looked for this kind of function before but could not find it on phagesdb. Does such a search exist?
Posted in: Phamerator > Phams
| posted 24 Nov, 2015 20:07
There is a way to get the info, but I am not sure I would call it easy. You can export the complete pham table from phamerator. Open up phamerator, select "export pham table" in the file menu, select the file location, give the file a name and hit save. This will take a bit, depending on your machine, but you will eventually end up with a file of all genes cross-referenced for all phage and all phams (it takes my computer about 3-4 minutes to create the file). The file is currently running about 15 MB but it does compress down quite a bit. The pham table is essentially a giant Excel spreadsheet with a couple of columns of summary data, then each phage in a single column and each pham as a row. The sheet is then filled with gene names, where each gene is placed in the cell that intersects the proper column (i.e. the phage the gene came from) with the proper row (i.e. which pham it is in).

I have created the current pham table from the most recent Actino_Draft database; you can download it with this link. That file will work for now, but of course it will soon be out of date. For now, just download the file, double-click on it to uncompress, and import it into Excel. You will need to tell Excel during the import that the data is in a text file, that the data is delimited, and that the delimiter is Tab (at least that is how it works on my Mac version of Excel). This is a lot of data, so be patient with Excel. Some commands could take 20 or 30 seconds to complete; there are something like 14 million cells to process in the whole spreadsheet.

Once you have it open in Excel, I would recommend you scroll down until you find the pham you are interested in, select the whole row, copy it, move to a new blank spreadsheet and use "Paste special…" to paste a "transposed" version into your new sheet (that will create a much, much smaller and faster spreadsheet). Transposition will convert the row to a single column, which will be much easier to deal with. You can then use Excel's "sort" or "filter" on the column to bring all the gene names to the top.
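
If Excel struggles with a file this size, the same row lookup can be sketched in a few lines of Python. The layout here is an assumption on my part: I am treating the first column as the pham name, the first row as the header with the phage names, and the first two columns as the non-phage columns. Check your exported file and adjust summary_columns to match.

```python
import csv
import io

def genes_in_pham(table_file, pham_name, summary_columns=2):
    """Return {phage: gene names} for one pham row of a tab-delimited table.

    summary_columns is the number of leading non-phage columns, counting
    the pham-name column itself (an assumed layout, not a documented one).
    """
    reader = csv.reader(table_file, delimiter="\t")
    header = next(reader)
    for row in reader:
        if row and row[0] == pham_name:
            phages = header[summary_columns:]
            cells = row[summary_columns:]
            # Keep only the phage that actually have a gene in this pham.
            return {p: c for p, c in zip(phages, cells) if c.strip()}
    return {}

# Tiny made-up table standing in for the real export.
demo = io.StringIO(
    "pham\tcount\tPhageA\tPhageB\tPhageC\n"
    "100\t2\tPhageA_1\t\tPhageC_4\n"
    "101\t1\t\tPhageB_7\t\n"
)
print(genes_in_pham(demo, "100"))
```

For the real file you would open it with open("your_pham_table.txt", newline="") and pass that in place of the demo table.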

Good luck, and post more questions if you get stuck at any point.
Edited 24 Nov, 2015 20:19
Posted in: Phamerator > Phams
| posted 19 Nov, 2015 04:29
Maybe Steve or Dan can confirm, but I believe the database has grown too large for the tasks you want. Looking at Actino_Draft, there are over 1100 phage and almost 12,000 phams. I don't think I have been able to get a pham circle since the phage count went above about 200. I can't imagine how small the font would need to be to fit the names of 1100 phage around a circle.

There are other ways of looking at pham data [but it won't be pretty pictures]. It's fairly easy to generate a list of all genes in a pham using either the mysql command line or by extracting the data from the pham table. What pham are you interested in? Would a list of the gene members for a particular pham be helpful?
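
To give a feel for the query route, here is a rough sketch of the kind of join involved. Everything about the schema is hypothetical: the table and column names are stand-ins I chose for illustration, and an in-memory SQLite database takes the place of the real mysql database so the example runs on its own. Check the actual table layout before adapting the query.

```python
import sqlite3

# In-memory stand-in with a hypothetical two-table layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gene (GeneID TEXT PRIMARY KEY, PhageID TEXT);
    CREATE TABLE pham (GeneID TEXT, name INTEGER);
    INSERT INTO gene VALUES ('PhageA_1', 'PhageA');
    INSERT INTO gene VALUES ('PhageB_7', 'PhageB');
    INSERT INTO gene VALUES ('PhageC_4', 'PhageC');
    INSERT INTO pham VALUES ('PhageA_1', 100);
    INSERT INTO pham VALUES ('PhageC_4', 100);
    INSERT INTO pham VALUES ('PhageB_7', 101);
""")

def pham_members(connection, pham_name):
    """Return (phage, gene) pairs for every gene assigned to one pham."""
    cursor = connection.execute(
        "SELECT gene.PhageID, gene.GeneID FROM gene "
        "JOIN pham ON pham.GeneID = gene.GeneID "
        "WHERE pham.name = ? ORDER BY gene.PhageID",
        (pham_name,),
    )
    return cursor.fetchall()

print(pham_members(conn, 100))
```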
Posted in: Phamerator > Phams
| posted 02 Nov, 2015 16:37
OK I will look into it. Sorry for being late; I am at peak teaching, things should calm considerably. I will try to post updates to this posting as I track down the issue.

OK, I figured out that the 117th pham is pham 14107. This corresponds to gene 1 on the phamerator map. It looks like this is the same as (or at least related to) the issue with Dori 94.

I have now precisely mapped that gene by running auto-annotation in DNA Master. It turns out the problem gene is a wrap-around gene called by Glimmer that goes from base 444 down to base 1, then from the end of the phage down to 68540. Starterator is not able to handle these right now because of the way Phamerator handles genes like this on import. Working on a solution….

OK, I did the quick and dirty fix: I added code to skip any genes that look problematic. In this case, if the length of the gene in the database does not equal the distance between the given start and stop codons, then something is awry, so skip the gene. This fix is not yet submitted to the main Starterator branch, so your version does not have it, only my local copy. I want to do more testing before I submit. In the meantime, I did try my patch on Yara and it went all the way through without crashing. I have posted the final pdf report here. Thanks for the report, another bug squashed! Now if I can just find the time to do all the testing so I can submit all these bug fixes I have found.
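
For anyone curious, the idea behind the check looks roughly like this. It is a sketch of the logic only, not the actual patch: the dictionary keys are made up, and I am assuming 1-based inclusive coordinates, so the span is the difference plus one.

```python
def span_length(start, stop):
    """Bases covered from start to stop, assuming 1-based inclusive coordinates."""
    return abs(stop - start) + 1

def looks_problematic(gene):
    """True when the stored length disagrees with the start/stop span,
    which is what happens for a gene that wraps around the genome end."""
    return gene["length"] != span_length(gene["start"], gene["stop"])

def usable_genes(genes):
    """Drop any gene that fails the length check."""
    return [g for g in genes if not looks_problematic(g)]

# Toy data on an imaginary 1000 bp genome: the second gene wraps around
# the end (950..1000 plus 1..50 = 101 bases), so its span does not match.
genes = [
    {"name": "ordinary", "start": 100, "stop": 399, "length": 300},
    {"name": "wraparound", "start": 950, "stop": 50, "length": 101},
]
print([g["name"] for g in usable_genes(genes)])
```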
Edited 03 Nov, 2015 02:10
Posted in: Starterator > phage that crash starterator
| posted 21 Oct, 2015 03:07
OK, I tweaked the entry for missing phage.
Posted in: Starterator > Read First: Common Starterator Troubleshooting
| posted 19 Oct, 2015 16:23
Dan,
Is there any way to make this post sticky so it stays at the top of the list? It might be helpful for new faculty to read this post first and try the solutions before posting a question. Could we even change the title to add something like READ THIS FIRST or START HERE?
Chris
Posted in: Starterator > Read First: Common Starterator Troubleshooting