Welcome to the forums at seaphages.org. Please feel free to ask any questions related to the SEA-PHAGES program. Any logged-in user may post new topics and reply to existing topics. If you'd like to see a new forum created, please contact us using our form or email us at info@seaphages.org.
Recent Activity
All posts created by cdshaffer
Link to this post | posted 09 Dec, 2020 22:55 | |
---|---|
|
For those of you using Phamerator and Starterator in the SEA VM: The Hatfull lab has been making some much-needed updates to the underlying Phamerator database and at the same time has moved all the databases to a new server. The combination of these two events means that the SEA VM versions of Phamerator and Starterator will no longer be able to update to new databases as they are released. These programs will continue to work "as is", but the databases they have now will become increasingly out of date as new phage are added and pham assignments invariably change. The production version of Starterator has been updated and the web-based PDFs are up to date with the new database. Users of the web pages should be on the lookout for any discrepancies or possible errors. I have tested many pages but cannot screen all 14,000 or so pham reports, so it is possible that some unusual set of circumstances could result in errors in the Starterator reports. If you find anything of concern, please report it to this forum or send me an email directly. Finally, I am looking for a few volunteer testers to work with me on testing the new beta version of Starterator in their VM. Send me an email if you would like to help in this regard. Thanks. |
Link to this post | posted 07 Dec, 2020 19:10 | |
---|---|
|
Thanks for posting this. I have added a note on the issue tracker for Starterator here: https://github.com/SEA-PHAGES/starterator/issues/42 Hopefully, once the end-of-semester grading bonanza is over, I can update Starterator to handle the new default Genome profile format created by DNA Master. |
Link to this post | posted 25 Nov, 2020 00:26 | |
---|---|
|
These issues of specificity are always difficult to annotate. I agree with Debbie in that I tend to "do no harm"; that is, I would annotate using the more general "protease" unless there is good evidence for the more specific term. To answer the question of function at this level you would have to dig into the published literature on the matching crystals and see if they describe the active site in detail. If you are lucky you can find enough detail as to exactly which bonds are involved in the reaction and which side chains are critical in the active site. If you can get that, then you could see if your protein is likely to create that same active site using 3D modeling and visualization tools. This is well beyond what I ask students to do in general but we ask students to investigate one gene "in depth" and this would be a really good candidate gene for this kind of detailed investigation. The likelihood of success is low, and I tell my students that up front, as they have to be lucky enough that the information they are looking for is actually in the published literature. But as a teacher I am fine with students trying as it is really about the journey not the destination. |
Posted in: Functional Annotation → Metalloprotease or metallopeptidase
Link to this post | posted 18 Nov, 2020 23:42 | |
---|---|
|
In phage Belfort, CDS 134 (87,804-88,487) has a large number of high-quality hits to NAD-dependent deacetylase. There are approximately 50 HHPRED hits with 100% probability and >99% aligned. The vast majority of the top hits include the term "sirtuin", a group of enzymes found in all kingdoms. However, of the 100% probability alignments, more than half include the term "NAD-Dependent Deacetylase". The top prokaryote hit is to crystal 1S5P_A, an enzyme from Escherichia coli (100% probability and 99.5% coverage), which has the description "NAD-dependent deacetylase (E.C.3.5.1.-); protein deacetylase". We propose using either the term "NAD-dependent deacetylase" or "NAD-dependent protein deacetylase" and avoiding the whole "sirtuin" nomenclature. If you want to see all the hits, this phage is in PECAAN (Belfort 134). For detailed alignments, here is the amino acid sequence to rerun the HHPRED search:
|
Link to this post | posted 22 Oct, 2020 00:15 | |
---|---|
|
To me the results of the Starterator reports are quite telling. The two choices you point out are labelled start 12 and start 15 in the current Starterator report here. First, the level of conservation for start 12 is much, much higher than for start 15. In fact, there are only 2 of 56 phage that don't have start 12, and both of those have a start very, very close by position to start 12. On the other hand, start 15 is only seen in two-thirds of these genes, and for 7 of the 30 tracks there are no starts anywhere near start 15. To me it is hard to believe that evolution would continue to choose to keep the bases that code for start 12 in virtually all these genes if start 15 was really the start of translation, so I would have a strong preference for start 12. As for coding potential (CP), if you look carefully you can see examples of other regions in the genome where you know the sequence is coding but the CP signal drops to zero. These are regions that are downstream of a strong CP signal but before the stop codon. See the CP for gene 14: there are easily at least 100 bases with no CP signal. So this is why I have a "rule" that a positive signal in CP is good evidence there IS a gene, but no CP is not quite as good at indicating there IS NOT a gene. Said more formally, the CP algorithm makes more false negative errors than false positive errors. So, in this case, where one start says there is a CP false positive (the start with the 245 gap) and the other choice would say that CP is a false negative, I would say that CP also is slightly more supportive of the big overlap start. Taken together, then, I would annotate this gene to start at 1322. If I were helping a student with this, I would now ask them to go back and double check that gene 2 is real, just because of that super large overlap. But even if gene 2 is real, I would probably still stick with that huge overlap given the strong level of conservation seen in the Starterator report. |
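The conservation comparison in the post above can be put into numbers. The following is only a rough sketch in Python, using the counts quoted in the post (2 of 56 phage lacking the conserved start, and roughly two-thirds of the genes carrying start 15). The function name `conservation` is made up for illustration and is not part of Starterator.

```python
from fractions import Fraction


def conservation(genes_with_start: int, total_genes: int) -> Fraction:
    """Fraction of genes in a pham that contain a given candidate start.

    Hypothetical helper for illustration only; not a Starterator function.
    """
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    return Fraction(genes_with_start, total_genes)


# Counts taken from the post: all but 2 of 56 phage have the conserved start.
start_12 = conservation(56 - 2, 56)
# The post only gives "two-thirds" for start 15, so that value is used directly.
start_15 = Fraction(2, 3)

# Prefer the candidate start that is present in the larger share of genes.
preferred = "start 12" if start_12 > start_15 else "start 15"

print(f"start 12: {float(start_12):.1%}")  # start 12: 96.4%
print(f"start 15: {float(start_15):.1%}")  # start 15: 66.7%
print(f"more conserved: {preferred}")      # more conserved: start 12
```

Conservation alone does not settle the call. As the post explains, it is weighed together with coding potential and the size of the overlap.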
Link to this post | posted 24 Sep, 2020 20:17 | |
---|---|
|
The term "AAA-ATPase" refers to a domain found in a wide variety of proteins and is indicative of a specific fold pattern that creates an ATPase pocket that has been seen in many different proteins. So an annotation with that term says you want people to know that the protein very likely cleaves ATP and uses that specific fold structure. The term "terminase" reflects the biological role the protein supplies for the phage. To me, neither term is inherently better than the other; it all depends on what one is interested in. I am sure there are many enzymologists out there who are much more interested in the presence of an enzymatic domain, while others are more interested in why the phage would have that gene at all. My own perspective is to go with the biological role if I can find sufficient evidence in support, and only mention the presence of domains (as they are often good hints to the biological role) when there is not sufficient evidence to call the role. |
Posted in: Functional Annotation → cluster K terminases
Link to this post | posted 24 Sep, 2020 20:02 | |
---|---|
|
For those of you who have the SEA VM with pdm_utils installed that came out in June 2020, it has version 2.0.5 of tRNAscan-SE already installed. It is the command line version, not the web page, but it is pretty straightforward to run. First copy the genome fasta file into the VM and put the file on the Desktop. Open a terminal and change the working directory to the Desktop with this command:
Now use the following command (I tried to use the parameter settings that would mimic the settings recommended on the web interface). You will need to replace the items bounded by the <> with your specific values:
Here is an example of the command I used recently on phage Jada where the fasta genome file is called Jada.fasta, and I wanted the results file to be called Jada.tRNAscan.txt.
The results will be a text file that you can copy back to your computer; it can be opened with TextEdit on a Mac or WordPad on Windows. Here are the first few lines of the output for phage Jada:
|
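The commands from the post are not reproduced in this copy of the thread. As a stand-in, here is a minimal Python sketch of how such a call could be assembled, using the file names mentioned in the post. The flags are an assumption and not necessarily the settings the post used: `-B` selects the bacterial search model and `-o` names the results file in tRNAscan-SE 2.0. The helper name `build_trnascan_command` is hypothetical.

```python
import shlex


def build_trnascan_command(fasta_path: str, output_path: str) -> list[str]:
    """Assemble a tRNAscan-SE command line as an argument list.

    Assumed flags (not the original post's exact command):
      -B  use the bacterial search model
      -o  write the tabular results to the given file
    """
    return ["tRNAscan-SE", "-B", "-o", output_path, fasta_path]


# File names taken from the post's phage Jada example.
cmd = build_trnascan_command("Jada.fasta", "Jada.tRNAscan.txt")

# Show the command as it would be typed in a terminal.
print(shlex.join(cmd))  # tRNAscan-SE -B -o Jada.tRNAscan.txt Jada.fasta
```

To actually run it inside the VM, the list could be passed to `subprocess.run(cmd, check=True)` from the directory that holds the fasta file.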
Posted in: tRNAs → tRNAScan-SE Down-ish?
Link to this post | posted 02 Sep, 2020 18:52 | |
---|---|
|
Wow! Talk about a spectacularly unhelpful error message. There are just a few very basic things to try: 1. Make sure there is enough room on the hard drive for installation. 2. Make sure any virus protection software is TURNED OFF. VirtualBox does some very deep-level installation that could be blocked. 3. Make sure to be logged into an account with Administrator privileges (to check this, go to System Preferences -> Users & Groups and make sure that there is the "Admin" label under the user name in the left column). That is all I can come up with off the top of my head. Good luck! |
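The first check in the list above (free disk space) can also be done from a script. This is a small sketch using Python's standard `shutil.disk_usage`. The 50 GB threshold is a placeholder assumption, not a documented requirement of VirtualBox or the SEA VM, and the helper name is made up for illustration.

```python
import shutil

BYTES_PER_GB = 1024 ** 3


def has_enough_free_space(path: str, required_gb: float) -> bool:
    """Return True if the drive holding `path` has at least `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * BYTES_PER_GB


# Placeholder threshold; substitute the size your installation actually needs.
REQUIRED_GB = 50

if has_enough_free_space(".", REQUIRED_GB):
    print("Enough free space for the installation.")
else:
    print("Not enough free space; clear some room before installing.")
```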
Posted in: SEA-PHAGES Virtual Machine → Virtual box installation error
Link to this post | posted 04 Aug, 2020 23:42 | |
---|---|
|
Oh, I see, great questions. As far as I know there is no version of the VM that anyone has built that can run both Starterator and pdm_utils. I am not sure it is even possible, but I have never tried, so it could be. The issue is that Starterator requires some pretty old versions of some of the graphical libraries, and pdm_utils requires some of the newer libraries that are only available on the newer versions of Ubuntu. It is possible that there are versions of the graphics libraries that are compatible with both Starterator and the newer versions of Ubuntu, but I have not found them. |
Posted in: Starterator → Release of Starterator version 1.2
Link to this post | posted 04 Aug, 2020 17:01 | |
---|---|
|
It is probably working on the old VM because you have not run Phamerator, so you are on an old (pre-updates) version of the database. It is important to remember that Starterator does not do all the database checking and management; it relies on the old Phamerator to do that. So if you are using a VM you should always run Phamerator first to get the most recent database and then run Starterator. |
Posted in: Starterator → Release of Starterator version 1.2