The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

Welcome to the forums at seaphages.org. Please feel free to ask any questions related to the SEA-PHAGES program. Any logged-in user may post new topics and reply to existing topics. If you'd like to see a new forum created, please contact us using our form or email us at info@seaphages.org.

All posts created by cdshaffer

| posted 07 Dec, 2015 20:47
I always just type out phagesdb.org/pham

Speaking of which, Dan, I know that from that page I can search for words in the notes field, but is there a way to see all notes for a given pham? This would be nice. I have run into a problem finding functional annotations when they are relatively recent (i.e. very few phage have that annotation) and the pham is particularly large. The function just doesn't end up in the top 100 proteins listed in the BLAST result, but you can find it if you search all the notes. I have my full-semester students look directly for annotations using a MySQL query of the notes field, but it would be really nice to have a web-based system for our mini-phage class. I looked for this kind of function before but could not find it on phagesdb.org. Does such a search exist?
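
As a rough illustration of what "all notes for a given pham" could look like as a query, here is a minimal sketch. It is not the actual query used in class, and the schema is an assumption: the table names (`gene`, `pham`) and column names (`GeneID`, `PhageID`, `Notes`, `name`) are hypothetical stand-ins, and an in-memory SQLite database stands in for the MySQL server so the example runs anywhere. With a real MySQL driver the placeholder would typically be `%s` rather than `?`.

```python
import sqlite3


def notes_for_pham(conn, pham_name):
    """Return (note, count) pairs for every annotated gene in one pham.

    Table and column names are hypothetical; adjust them to the real schema.
    """
    query = """
        SELECT g.Notes, COUNT(*) AS n
        FROM gene AS g
        JOIN pham AS p ON p.GeneID = g.GeneID
        WHERE p.name = ? AND g.Notes IS NOT NULL AND g.Notes <> ''
        GROUP BY g.Notes
        ORDER BY n DESC, g.Notes
    """
    return conn.execute(query, (pham_name,)).fetchall()


def build_demo_db():
    """Create a tiny in-memory database with made-up genes and phams."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE gene (GeneID TEXT PRIMARY KEY, PhageID TEXT, Notes TEXT);
        CREATE TABLE pham (GeneID TEXT, name INTEGER);
    """)
    conn.executemany("INSERT INTO gene VALUES (?, ?, ?)", [
        ("PhageA_1", "PhageA", "terminase"),
        ("PhageB_1", "PhageB", "terminase"),
        ("PhageC_2", "PhageC", "HNH endonuclease"),
        ("PhageD_5", "PhageD", ""),          # no annotation yet
        ("PhageE_9", "PhageE", "lysin A"),   # belongs to a different pham
    ])
    conn.executemany("INSERT INTO pham VALUES (?, ?)", [
        ("PhageA_1", 100), ("PhageB_1", 100), ("PhageC_2", 100),
        ("PhageD_5", 100), ("PhageE_9", 200),
    ])
    return conn


if __name__ == "__main__":
    for note, count in notes_for_pham(build_demo_db(), 100):
        print(count, note)
```

Grouping by the note text means a rare, recently added annotation still shows up as its own line even when the pham has hundreds of members.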
Posted in: Phamerator / Phams
| posted 24 Nov, 2015 20:07
There is a way to get the info, but I am not sure I would call it easy. You can export the complete pham table from Phamerator. Open up Phamerator, select "export pham table" in the File menu, select the file location, give the file a name and hit save. This will take a bit, depending on your machine, but you will eventually end up with a file of all genes cross-referenced for all phage and all phams (it takes my computer about 3-4 minutes to create the file). The file is currently running about 15 MB, but it does compress down quite a bit. The pham table is essentially a giant Excel spreadsheet with a couple of columns of summary data, then each phage in a single column and each pham as a row. The sheet is then filled with gene names, where each gene is placed in the cell that intersects the proper column (i.e. the phage the gene came from) with the proper row (i.e. which pham it is in).

I have created the current pham table from the most recent Actino_Draft database; you can download it with this link. That file will work for now, but of course it will soon be out of date. For now, just download the file, double-click on it to uncompress, and import it into Excel. During the import you will need to tell Excel that the data is in a text file, that the data is delimited, and that the delimiter is Tab (at least that is how it works on my Mac version of Excel). This is a lot of data, so be patient with Excel. Some commands could take 20 or 30 seconds to complete, since there are something like 14 million cells to process in the whole spreadsheet.

Once you have it open in Excel, I would recommend you scroll down until you find the pham you are interested in, select the whole row, copy it, move to a new blank spreadsheet and use "Paste special…" to paste a "transposed" version into your new sheet (that will create a much, much smaller and faster spreadsheet). Transposition will convert the row to a single column, which will be much easier to deal with. You can then use Excel's "sort" or "filter" on the column to bring all the gene names to the top.
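
If the spreadsheet is too slow, the same "pick one row and transpose it" step can be done with a short script. The following is only a sketch under assumptions about the export: that the file is tab-delimited with a header row of phage names, that the first column holds the pham name, and that the number of leading summary columns is 2. The function name `genes_in_pham` and the `n_summary_cols` parameter are hypothetical; check the header of your own file and adjust.

```python
import csv
import io


def genes_in_pham(handle, pham_name, n_summary_cols=2):
    """Return (phage, gene) pairs for one pham row of a tab-delimited pham table.

    Assumes the header row lists the phage names after the summary columns
    and that the first column of each row is the pham name.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    phages = header[n_summary_cols:]
    for row in reader:
        if row and row[0] == pham_name:
            cells = row[n_summary_cols:]
            # Keep only the phage columns that actually contain a gene name.
            return [(phage, gene) for phage, gene in zip(phages, cells) if gene.strip()]
    return []


# A tiny made-up table standing in for the real export.
DEMO_TABLE = (
    "pham\tcount\tPhageA\tPhageB\tPhageC\n"
    "100\t2\tPhageA_1\t\tPhageC_2\n"
    "200\t1\t\tPhageB_7\t\n"
)

if __name__ == "__main__":
    for phage, gene in genes_in_pham(io.StringIO(DEMO_TABLE), "100"):
        print(phage, gene)
```

For a real export, open the file with `open(path, newline="")` and pass the handle in place of the `io.StringIO` demo.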

Good luck, and post more questions if you get stuck at any point.
Edited 24 Nov, 2015 20:19
Posted in: Phamerator / Phams
| posted 19 Nov, 2015 04:29
Maybe Steve or Dan can confirm, but I believe the database has grown too large for the tasks you want. Looking at Actino_Draft, there are over 1100 phage and almost 12,000 phams. I don't think I have been able to get a pham circle since the phage count went above about 200. I can't imagine how small the font would need to be to fit the names of 1100 phage around a circle.

There are other ways of looking at pham data [but it won't be pretty pictures]. It's fairly easy to generate a list of all genes in a pham, either by using the MySQL command line or by extracting the data from the pham table. What pham are you interested in? Would a list of the gene members for a particular pham be helpful?
Posted in: Phamerator / Phams
| posted 02 Nov, 2015 16:37
OK, I will look into it. Sorry for being late; I am at peak teaching, but things should calm down considerably. I will try to post updates to this posting as I track down the issue.

OK, I figured out that the 117th pham is pham 14107. This corresponds to gene 1 on the Phamerator map. It looks like this is the same as (or at least related to) the issue with Dori 94.

I have now precisely mapped that gene by running auto-annotation in DNA Master. It turns out the problem gene is a wrap-around gene called by Glimmer that goes from base 444 down to base 1, then from the end of the phage down to base 68540. Starterator is not able to handle these right now because of the way Phamerator imports genes like this. Working on a solution….

OK, I did the quick and dirty fix: I added code to skip any genes that look problematic. In this case, if the length of the gene in the database does not equal the distance between the given start and stop codons, then something is awry, so skip the gene. This fix is not yet submitted to the main Starterator branch, so your version does not have it; only my local copy does. I want to do more testing before I submit. In the meantime I did try my patch on Yara and it went all the way through without crashing. I have posted the final PDF report here. Thanks for the report, another bug squashed! Now if I can just find the time to do all the testing so I can submit all these bug fixes I have found.
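
The idea behind that check can be sketched in a few lines. This is not the Starterator patch itself, just an illustration under stated assumptions: the function name `gene_looks_problematic` is hypothetical, coordinates are assumed to be 1-based and inclusive (so the span is `stop - start + 1`; pass `inclusive=False` if the database stores half-open coordinates), and the stored length of 905 in the example assumes a made-up genome length of 69000.

```python
def gene_looks_problematic(start, stop, length, inclusive=True):
    """Return True when the stored length disagrees with the start/stop span.

    A wrap-around gene stored with its two outermost coordinates has a span
    that covers nearly the whole genome, so it fails this check.
    """
    span = abs(stop - start) + (1 if inclusive else 0)
    return span != length


def usable_genes(genes):
    """Filter a list of (name, start, stop, length) tuples, skipping bad ones."""
    return [g for g in genes if not gene_looks_problematic(g[1], g[2], g[3])]


if __name__ == "__main__":
    genes = [
        ("ordinary_gene", 100, 399, 300),
        # Wrap-around gene: bases 68540..end plus 1..444. The length 905
        # assumes a hypothetical genome length of 69000.
        ("wraparound_gene", 68540, 444, 905),
    ]
    print([name for name, *_ in usable_genes(genes)])
```

Skipping is the blunt option: the wrap-around gene simply drops out of the analysis instead of crashing it.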
Edited 03 Nov, 2015 02:10
Posted in: Starterator / phage that crash starterator
| posted 21 Oct, 2015 03:07
OK, I tweaked the entry for missing phage.
Posted in: Starterator / Read First: Common Starterator Troubleshooting
| posted 19 Oct, 2015 16:23
Dan,
Is there any way to make this post sticky so it stays at the top of the list? It might be helpful for new faculty to read this post first and try the solutions before posting a question. Could we even change the title to add something like READ THIS FIRST or START HERE?
Chris
Posted in: Starterator / Read First: Common Starterator Troubleshooting
| posted 15 Oct, 2015 21:16
OK, good to know. I will add this issue to the Starterator troubleshooting thread.
Posted in: Starterator / Phage not found in track
| posted 12 Oct, 2015 18:57
It looks like (based on the new 2016SEAVM) the default database for Starterator is Mycobacteriophage_Draft, so even if Phamerator has Actino_Draft, Starterator may not be using it. To check your instance, open Starterator, go to the top magically appearing menu items and select Edit -> Preferences. In the dialog box, check that the database name there is also set to Actino_Draft. It may still be set to Mycobacteriophage_Draft. If so, edit that entry to Actino_Draft and hit OK. Once I had done that with my 2016SEAVM I was able to completely process Bipper (great phage name, by the way).
Also, since I ran the whole phage to check that everything was working with Bipper, I posted the whole-phage Starterator analysis report as a PDF here. Feel free to download and use it if you want.

I will figure out how to update the default database for starterator. I don't have direct access to the code for Starterator but I can post a pull request which should fix the issue once it is accepted. That should fix things for future users. Thanks for being the first and discovering the issue of the old default setting for the starterator database.
Edited 12 Oct, 2015 21:45
Posted in: Starterator / Phage not found in track
| posted 08 Oct, 2015 17:28
I would be happy to look into this. It is pretty unlikely that you are doing something wrong.

To start, I need to make sure I have the same database so I can see if I can replicate the problem. If you open up Phamerator and select Edit -> Preferences, you should see something like the picture I attached. Do you see the same Actino_Draft for the Database, or something else?

If you do see Actino_Draft, can you check that your Ubuntu machine is able to get to the internet to download the most recent version of the database? To do that, just open up a web browser in the Ubuntu machine and see if you can get to sites on the internet like seaphages.org.
Edited 08 Oct, 2015 17:33
Posted in: Starterator / Phage not found in track
| posted 24 Sep, 2015 19:59
The FASTA file from either source will work as input for another program, but it can be a hassle if you want to look at the sequences yourself. In that case you will probably want to reformat it.

On the web I use the EMBOSS tool called seqret to do that. Do a Google search on "emboss seqret ebi" and pick the seqret server at www.ebi.ac.uk.
You can upload your FASTA file and select "FASTA format" as the output format, and all the sequences will be reformatted so you can see the entire protein sequence without having to do so much horizontal scrolling.
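
If you would rather not upload anything, the same kind of re-wrapping can be done locally. The sketch below is not seqret and does none of its format conversion; it only re-wraps each sequence to a fixed line width. The function name `rewrap_fasta` and the default width of 60 characters are assumptions chosen for the example.

```python
def rewrap_fasta(text, width=60):
    """Re-wrap every sequence in FASTA-formatted text to a fixed line width."""
    out = []
    seq_parts = []

    def flush():
        # Join the pieces of the current sequence and cut them into fixed-width lines.
        if seq_parts:
            seq = "".join(seq_parts)
            out.extend(seq[i:i + width] for i in range(0, len(seq), width))
            seq_parts.clear()

    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            flush()              # finish the previous record first
            out.append(line)     # header lines are kept unchanged
        else:
            seq_parts.append(line)
    flush()
    return "\n".join(out) + ("\n" if out else "")


if __name__ == "__main__":
    demo = ">protein_1\nABCDEFGHIJ\n>protein_2\nKLM\nNOP\n"
    print(rewrap_fasta(demo, width=4), end="")
```

To use it on a real file, read the file into a string, pass it through `rewrap_fasta`, and write the result to a new file.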
Posted in: DNA Master / Export all protein sequences from a genome