The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

All posts created by cdshaffer

| posted 06 Mar, 2021 21:29
I would always prefer the HHPRED matches (if I find them) over the BLAST results. This is due, in no small part, to the quality of the different databases being searched and the relative sensitivity of the algorithms. The source of many "discrepancies" like those on your list is that the alignments match only part of your protein or only part of the subject. Since some proteins have multiple functional parts connected together in a single polypeptide chain, this can lead to what I would call a "partial annotation".

Also note that your first two possibilities do not really "disagree"; they are just different levels of specificity. When trying to decide on a level of specificity, I first direct my students to try to understand the differences in what the terms mean. Good sources include the SEA-PHAGES approved terms list, the Expasy enzyme class list, Wikipedia, intro bio textbooks, etc. Once you have a better understanding of the terms, you can then look for evidence to help you decide whether a higher level of specificity is justified.

As for this particular protein, if I scan through the top 15-20 hits from prokaryotes (i.e. I am going to ignore the two human mitochondrial proteins), I see many hits to proteins that are described as having BOTH a helicase activity AND a nuclease activity. This explains the "discrepant" results, so the question becomes: does this protein from crewmate also have those two domains, or just one? This is why you see annotators often talk about the size of the protein and the size of the match. I quickly focus on the length of the alignment and which part of the subject is matching. Most of these alignments cover about 75% of crewmate 28, but you can see that they match only a much shorter part of the subjects that are described as having both a nuclease and a helicase activity (like residues 1005-1232, 790-1014 or 129-368). So crewmate is likely similar to either the nuclease or the helicase part of these larger proteins, but I cannot tell which based simply on the summary data presented in the table. Looking at the other matches, I can see hits to pfam domains and "cd conserved domains" that are all different types of exonucleases. So what we have here is likely an exonuclease that is often found as part of a large helicase/nuclease combo protein. So I am pretty convinced that either of the first two options on your list could be appropriate here.
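
The comparison of alignment length against protein length can be sketched in a few lines of Python. This is only an illustration: the `Hit` class, the 0.5 threshold, and all of the lengths and query coordinates are hypothetical. Only the subject range 1005-1232 and the roughly 75% query coverage are taken from the discussion above.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One alignment between the query protein and a database subject."""
    query_start: int      # 1-based, inclusive
    query_end: int
    query_length: int
    subject_start: int    # 1-based, inclusive
    subject_end: int
    subject_length: int


def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment."""
    return (end - start + 1) / length


def describe(hit: Hit, threshold: float = 0.5) -> str:
    """Classify a hit by how much of the query and subject it covers."""
    q = coverage(hit.query_start, hit.query_end, hit.query_length)
    s = coverage(hit.subject_start, hit.subject_end, hit.subject_length)
    if q >= threshold and s < threshold:
        return "query matches one part of a larger subject"
    if q < threshold and s >= threshold:
        return "subject matches one part of a larger query"
    if q >= threshold and s >= threshold:
        return "near full-length match"
    return "short local match"


# Hypothetical lengths and query coordinates; subject range from the post.
hit = Hit(query_start=20, query_end=244, query_length=300,
          subject_start=1005, subject_end=1232, subject_length=1250)
print(describe(hit))
```

A hit classified as "query matches one part of a larger subject" is the situation described above: the subject's name may describe activities that sit outside the aligned region.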

When talking about this with my students, I would point out that since they are the first author, it is really up to them to read up on the two terms and decide whether "cas4" is better than the generic "exo". I would be willing to put my name as an author on either of those annotations, since one is just a more specific subtype of the other.
Posted in: Functional Annotation > Phage gene annotation has matching phage genes have 4 different proteins - which one is a match?
| posted 27 Feb, 2021 19:03
In the database there are 11 genes annotated as holin in subcluster B1 phages; they all appear in current pham 48591:

| phageid      | subcluster | gene name | notes | phamid |
|--------------|------------|-----------|-------|--------|
| BatteryCK    | B1         | 14        | holin |  48591 |
| Beaglebox    | B1         | 14        | holin |  48591 |
| DoesntMatter | B1         | 15        | holin |  48591 |
| LeeLot       | B1         | 15        | holin |  48591 |
| Magic8       | B1         | 15        | holin |  48591 |
| Megatron     | B1         | 15        | holin |  48591 |
| Mosaic       | B1         | 14        | holin |  48591 |
| OliverWalter | B1         | 15        | holin |  48591 |
| ProfessorX   | B1         | 15        | holin |  48591 |
| Trypo        | B1         | 15        | holin |  48591 |
| Xavier       | B1         | 14        | holin |  48591 |
For the totally geeky, here is the MySQL query I used:
> SELECT gene.phageid, phage.subcluster, gene.name AS "gene name",
    gene.notes, gene.phamid
  FROM gene JOIN phage ON gene.phageid = phage.phageid
  WHERE phage.subcluster = "B1"
    AND gene.notes LIKE "%holin%";
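
For anyone without access to the MySQL database, the same join can be tried on a toy copy of the data. The sketch below uses Python's built-in SQLite module rather than MySQL, and the two-table schema is an assumption based only on the column names in the query. Three rows are copied from the table above; `ExamplePhage` and its pham number are made up. It groups the matching genes by pham to confirm they all land in a single pham.

```python
import sqlite3

# In-memory SQLite stand-in for the real MySQL database (schema assumed).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phage (phageid TEXT PRIMARY KEY, subcluster TEXT);
    CREATE TABLE gene  (phageid TEXT, name TEXT, notes TEXT, phamid INTEGER);
""")

# Three rows from the table above plus one made-up non-B1 phage.
conn.executemany(
    "INSERT INTO phage VALUES (?, ?)",
    [("BatteryCK", "B1"), ("LeeLot", "B1"), ("Xavier", "B1"),
     ("ExamplePhage", "A1")],
)
conn.executemany(
    "INSERT INTO gene VALUES (?, ?, ?, ?)",
    [("BatteryCK", "14", "holin", 48591),
     ("LeeLot", "15", "holin", 48591),
     ("Xavier", "14", "holin", 48591),
     ("ExamplePhage", "10", "holin", 11111)],
)

# Same join and filters as the query above, grouped by pham.
rows = conn.execute(
    """
    SELECT phage.subcluster, gene.phamid, COUNT(*) AS n_genes
    FROM gene JOIN phage ON gene.phageid = phage.phageid
    WHERE phage.subcluster = ? AND gene.notes LIKE ?
    GROUP BY phage.subcluster, gene.phamid
    """,
    ("B1", "%holin%"),
).fetchall()
print(rows)
```

A single row in the result means every matching gene in the subcluster belongs to the same pham.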
Edited 27 Feb, 2021 19:04
Posted in: Cluster B Annotation Tips > Holin
| posted 19 Feb, 2021 18:12
The current versions of both Chrome and Firefox have dropped support for FTP. I started up an old version of the SEA-VM and used the really old version of Firefox therein to confirm that the FTP server is up and running. So to access it directly you will need to use an FTP client. In the past, on Windows machines, I have used FileZilla and Cyberduck to access FTP sites.

If you have a Mac, you can also just connect using the Finder: Go menu -> Connect to Server -> enter the address ftp://cobamide2.bio.pitt.edu/ -> click Connect -> select connect as Guest -> click Connect. This will open a Finder window that is connected to the FTP server. Be patient with file loading, as it can be slow; the file you want (dna master.exe) is in the DNAMas folder. You can use typical drag and drop to copy the file.
The final step, of course, is to transfer the file to your Windows machine by whatever method you find most convenient.
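
If no graphical FTP client is handy, Python's standard `ftplib` module can make the same guest connection. This is a rough sketch, not a tested recipe for this server: the host and the DNAMas folder come from the post above, the helper names are made up, and the exact file name on the server is not confirmed, so the sketch lists the folder first.

```python
from ftplib import FTP

HOST = "cobamide2.bio.pitt.edu"   # from the post above
FOLDER = "DNAMas"                 # from the post above


def fetch_file(ftp, folder: str, filename: str, dest_path: str) -> None:
    """Download one file in binary mode over an already logged-in session."""
    ftp.cwd(folder)
    with open(dest_path, "wb") as out:
        ftp.retrbinary(f"RETR {filename}", out.write)


def list_folder(host: str = HOST, folder: str = FOLDER) -> list:
    """Connect as guest and return the file names in the folder."""
    with FTP(host, timeout=60) as ftp:
        ftp.login()               # no arguments means anonymous (guest) login
        ftp.cwd(folder)
        return ftp.nlst()


def download(filename: str, dest_path: str,
             host: str = HOST, folder: str = FOLDER) -> None:
    """Connect as guest and download the named file."""
    with FTP(host, timeout=60) as ftp:
        ftp.login()
        fetch_file(ftp, folder, filename, dest_path)
```

Calling `list_folder()` first shows the exact name of the installer, which can then be passed to `download()`. Be patient here as well, since the transfer can be slow.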
Posted in: DNA Master > DNA master server down?
| posted 17 Feb, 2021 16:53
The issues you are having are related to the restructuring of the database that was undertaken last fall. You will need to upgrade to a compatible version of PhamNexus. I can see that Christian in the Hatfull lab is still keeping it up to date as he posted code changes last October to make it compatible with the new database format. He will need to tell you how to update to that working version.
Posted in: Bioinformatic Tools and Analyses > PhamNexus on SEA-VM
| posted 09 Feb, 2021 19:33
This was due to a database versioning issue (i.e. starterator was still on version 391 and phagesdb had updated to version 392). The report is now available here. If you want all the details on how and why this happens, and how to check versions, see this thread.

To see which version phagesdb, starterator, pecaan and Phamerator.org are on, use these instructions:
phagesdb see: http://databases.hatfull.org/Actino_Draft/Actino_Draft.version
starterator see: http://phages.wustl.edu/starterator/database.version
pecaan: look on any "pham maps" page just above the map
phamerator.org: open the pull down menu in the top left
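
The first two checks can be scripted. The sketch below is a minimal example under one assumption that has not been verified: that each `.version` URL returns a plain-text body containing just the version number. The function names are made up for illustration.

```python
from urllib.request import urlopen

# URLs as given in the list above.
PHAGESDB_VERSION_URL = "http://databases.hatfull.org/Actino_Draft/Actino_Draft.version"
STARTERATOR_VERSION_URL = "http://phages.wustl.edu/starterator/database.version"


def parse_version(text: str) -> int:
    """Assumes the file body is a single integer, possibly with whitespace."""
    return int(text.strip())


def fetch_version(url: str) -> int:
    """Download a version file and parse it (network call)."""
    with urlopen(url, timeout=30) as response:
        return parse_version(response.read().decode("utf-8"))


def sync_status(phagesdb: int, starterator: int) -> str:
    """Report whether the two servers are on the same database version."""
    if phagesdb == starterator:
        return f"in sync at version {phagesdb}"
    return f"out of sync: phagesdb {phagesdb}, starterator {starterator}"


def check_servers() -> str:
    """Fetch both versions and compare them (network call, run manually)."""
    return sync_status(fetch_version(PHAGESDB_VERSION_URL),
                       fetch_version(STARTERATOR_VERSION_URL))


# The situation described in this post: 392 on phagesdb, 391 on starterator.
print(sync_status(392, 391))
```

`check_servers()` is left uncalled so that the example runs without network access.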
Posted in: Starterator > Pham not found in Starterator
| posted 23 Jan, 2021 19:33
When it comes to enzyme naming issues, I prefer the KEGG and Expasy "Enzyme" databases. For example, here is the info on this enzyme at Expasy:

https://enzyme.expasy.org/EC/2.7.6.1

The preferred name is Ribose-phosphate diphosphokinase, so they agree with your preference for "kinase". I also note that one of the mentioned synonyms also uses kinase. I have always been pretty agnostic on which term we should put on our approved function list but once it is there I think it makes all our work better if we all try to use that particular term as it creates a more consistent data set across all the SEA phages.

Phosphoribosyl transferase appears to be a general term, as it hits about 25 different enzymes in the Expasy database, the vast majority of which are glycosyltransferases.
Posted in: Functional Annotation > Phosporibosylpyrophosphate synthetase
| posted 15 Dec, 2020 21:05
OK, the starterator reports for database version 387 are now available and are tagged as the current version. Links from pecaan and phagesdb should now be working (at least until the next database update). Please report if any links fail at this point.
Posted in: Starterator > Read First: Common Starterator Troubleshooting
| posted 15 Dec, 2020 18:00
OK, this appears to be an out-of-sync error. The current version of the database is 387. The starterator analysis that creates the 14 thousand or so reports is ongoing, so the web pages still report the old version 386. The computations should complete and be posted by the end of the day, at which point the links from pecaan and phagesdb should work.

In the meantime, you can use these links to get the reports for those three genes from the older version 386 database:

Peel 30: old pham number 44662 and link http://phages.wustl.edu/386/Pham44662Report.pdf
Peel 32: old pham number 44868 and link http://phages.wustl.edu/386/Pham44868Report.pdf
Peel 33: old pham number 44899 and link http://phages.wustl.edu/386/Pham44899Report.pdf
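
The three links above share one pattern, which can be captured in a small helper. The pattern is inferred only from these three URLs and is not a documented scheme for the server, so treat the function (a made-up name) as a convenience that may stop working if the layout changes.

```python
def old_report_url(db_version: int, pham: int) -> str:
    """Build a report URL following the pattern of the three links above."""
    return f"http://phages.wustl.edu/{db_version}/Pham{pham}Report.pdf"


for gene, pham in [("Peel 30", 44662), ("Peel 32", 44868), ("Peel 33", 44899)]:
    print(gene, old_report_url(386, pham))
```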

There is a way to check if the "out of sync" error is likely to be the cause of these types of missing phams by comparing the reported database versions. This link will always show the current version number on the Starterator server. Alternatively, you can look on any starterator report where you will see something like this: "This analysis was run 12/09/20 on database version 386" near the top of the first page of text.

In pecaan you can go to the Pham Maps page and look just above the map, where you will see something like this: Phamerator Version: 387. Unfortunately there is no easy way to get the version number of the database at phagesdb, but you can always check this link to see which version is on the database server; all the programs download from there.
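
If you have copied the text from a starterator report and from the pecaan Pham Maps page, the version numbers can be pulled out and compared automatically. This is a small sketch: the two sample strings are the ones quoted above, and the regular expressions assume the wording on those pages stays the same.

```python
import re
from typing import Optional

# Patterns based on the two example strings quoted in this post.
VERSION_PATTERNS = [
    re.compile(r"database version\s+(\d+)", re.IGNORECASE),
    re.compile(r"Phamerator Version:\s*(\d+)", re.IGNORECASE),
]


def extract_version(text: str) -> Optional[int]:
    """Return the first database version number found in the text, if any."""
    for pattern in VERSION_PATTERNS:
        match = pattern.search(text)
        if match:
            return int(match.group(1))
    return None


report_line = "This analysis was run 12/09/20 on database version 386"
pecaan_line = "Phamerator Version: 387"

report_version = extract_version(report_line)
pecaan_version = extract_version(pecaan_line)
status = "in sync" if report_version == pecaan_version else "out of sync"
print(report_version, pecaan_version, status)
```

With the two example strings the versions differ (386 versus 387), which is the out-of-sync case that produces missing pham reports.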
Edited 15 Dec, 2020 20:47
Posted in: Starterator > Read First: Common Starterator Troubleshooting
| posted 15 Dec, 2020 16:52
These are typically due to the database at phagesdb being out of sync with the current database for the starterator reports, but this should not be happening "for weeks".

Are you using pecaan or phagesdb for your links? Could you post the pham numbers you are looking for?
Posted in: Starterator > Read First: Common Starterator Troubleshooting
| posted 10 Dec, 2020 18:58
Ok, those look pretty good. I now suspect our issue might have been with the age/storage of the grids. Thanks
Posted in: Phage Discovery/Isolation > Electron Microscopes