
The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.


Clarification regarding "SIF"

| posted 10 Apr, 2018 15:54
I'm still a little unclear what's going on here:

SIF-BLAST [NKF / function, database, phage name, gene number, database gene accession number, %alignment, evalue]
SIF-HHPred [NKF / function, database, phage name, gene number, database accession number, %alignment*, probability]
SIF-Syn ("syn" refers to "synteny") [NKF / function, phage(s) used to infer]
I understand that y'all want us to pursue all three lines of evidence, but if the bottom line in one (or all) of them is "NKF", do we still need all the other stuff, i.e. gene number etc.? And where are we going to get the "gene accession number" from BLAST results? I don't remember ever seeing that in there. Likewise, what does "database" refer to in HHPred? We just use the standard set of databases (or as close as we can get) that comes from the drop-down window in HHPred for basically every run. Quite frankly, I wouldn't know how to do otherwise, and I certainly wouldn't know how to explain to students how to do otherwise.
jross
| posted 13 Apr, 2018 13:19
Hi Joe,
for BLAST:
The gene accession number is available when you BLAST on NCBI. The database you are BLASTing against is either NCBI or phagesdb. We have worked hard to sync these two databases with respect to our own data; however, in the phagesdb database we have altered the annotation for some phages that we did not isolate or sequence. For phagesdb, there won't be a gene accession number.

In HHPred, you are running your alignment against four databases at a time. They do not all have equally reliable information. So if your function comes from a crystal structure, you'd write "PDB". If it is a Pfam entry, you'd write "Pfam", and so on.
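
For anyone who wants to script these notes, here is a minimal Python sketch of how the SIF-BLAST and SIF-HHPred fields could be assembled. The function names, the `: ` and `, ` separators, and the placeholder phage name and gene number are assumptions for illustration, not an official SEA-PHAGES format. Any field that is not supplied is simply left out, which covers the phagesdb case where there is no gene accession number. The sketch does not settle which fields are required when the call is NKF.

```python
def _join(prefix, fields):
    """Join the fields that were supplied, skipping any left as None."""
    present = [str(f) for f in fields if f is not None]
    return f"{prefix}: " + ", ".join(present)


def sif_blast(function, database=None, phage=None, gene=None,
              accession=None, pct_alignment=None, evalue=None):
    """Assemble a SIF-BLAST note (hypothetical helper, assumed layout)."""
    pct = f"{pct_alignment}%" if pct_alignment is not None else None
    return _join("SIF-BLAST",
                 [function, database, phage, gene, accession, pct, evalue])


def sif_hhpred(function, database=None, phage=None, gene=None,
               accession=None, pct_alignment=None, probability=None):
    """Assemble a SIF-HHPred note; `database` is the source of the hit,
    e.g. "PDB" for a crystal structure or "Pfam" for a Pfam entry."""
    pct = f"{pct_alignment}%" if pct_alignment is not None else None
    return _join("SIF-HHPred",
                 [function, database, phage, gene, accession, pct, probability])


# Placeholder values only: "ExamplePhage" and gene 12 are made up.
# A phagesdb hit has no gene accession number, so none is passed.
print(sif_blast("capsid protein", database="phagesdb",
                phage="ExamplePhage", gene=12,
                pct_alignment=98, evalue="1e-50"))
print(sif_blast("NKF"))
```

The first call prints `SIF-BLAST: capsid protein, phagesdb, ExamplePhage, 12, 98%, 1e-50` and the second prints `SIF-BLAST: NKF`.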

Regarding the lines of evidence: we are asking you to investigate all three for every gene to make sure that you don't find conflicting answers. You may find that a scaffolding protein doesn't have any sequence similarity to anything via BLAST and has no hits in HHPred, but it is still the only small protein between the protease and the capsid protein. In that case, it is fine that two lines are NKF, and synteny gets you "scaffolding".
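
The logic of checking the three lines against each other can be sketched roughly as below. This is only an illustration: the function name `reconcile`, the exact-string comparison of function names, and the "flag for review" outcome are assumptions, and a real conflict still needs a person to look at the evidence.

```python
from typing import Optional, Tuple

NKF = "NKF"  # "no known function"


def reconcile(blast: str, hhpred: str, synteny: str) -> Tuple[Optional[str], bool]:
    """Compare the three calls and return (function, needs_review).

    Assumes each argument is either "NKF" or a function name, and that
    identical functions are written identically (hypothetical convention).
    """
    # Keep only the lines of evidence that actually suggest a function.
    informative = {call for call in (blast, hhpred, synteny) if call != NKF}
    if not informative:
        return NKF, False                      # all three lines are NKF
    if len(informative) == 1:
        return next(iter(informative)), False  # the informative lines agree
    return None, True                          # conflicting answers: review by hand


# The scaffolding case described above: two lines NKF, synteny gives a call.
print(reconcile(NKF, NKF, "scaffolding protein"))
```

With two NKF lines and a synteny call, the sketch returns the synteny call with no review flag. If BLAST and synteny disagreed, it would return no function and flag the gene for a closer look.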
 