Documenting Gene Calls in DNA Master

| posted 23 Jan, 2018 16:47
Question 1: In Spring 2017, the BLAST notes included both sources, i.e. NCBI GenBank and PhagesDB. This year's summary notes do not include these two. Do we need to include the PhagesDB BLAST, or just go by the NCBI BLAST? Here is what I am reading from the example in the resource guide:

"Blast-Start: [phage name, gene number, database, Q:S, coverage, e value/no significant BLAST alignments]. The best BLAST match for this gene that reflects the alignment at the start of the protein, and the alignment of the gene start with that BLAST match. (For example, “Matches KBG gp32, Query 1 to Subject 1, 100%, 0.0” or “Aligns with Thibault gp45 q3:s45 65% 10^-16”). If the best BLAST alignment has an e value above 10^-4, report “no significant BLAST alignments”."
Which BLAST Database is being assumed in this example: PhagesDB or GenBank?
Question 2: With respect to e-values, supposing you get an e-value of 1.7 x 10^-17, should we just record -17 as in the above example, or do we include the leading values (1.7 in this example)? In the same vein, DNA Master does not allow for writing exponents; what are we expected to do in this case?
Question 3: In the above example, the "Query to Subject" is given two ways: in one case the words are written out in full, while in the other they are abbreviated. Previously we were required to abbreviate as Q:S. For purposes of consistency, which way does the QC Team want us to record this?
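To make the questions above concrete, here is a minimal sketch of how a Blast-Start note could be assembled as plain text, following the field order quoted from the resource guide. The function names, the `gp` prefix on the gene number, and the choice to write e-values in `1.7e-17` style (mantissa kept, no superscript) are assumptions for illustration only; the thread does not say which form the QC Team prefers.

```python
# Threshold taken from the quoted guide text: an e value above 10^-4
# is reported as "no significant BLAST alignments".
SIGNIFICANCE_CUTOFF = 1e-4


def format_evalue(evalue: float) -> str:
    """Render an e-value without superscripts (assumed style, e.g. 1.7e-17)."""
    if evalue == 0:
        return "0.0"
    return f"{evalue:.1e}"


def blast_start_note(phage, gene_number, database, query_start,
                     subject_start, coverage_pct, evalue):
    """Build a Blast-Start note string (hypothetical helper, not part of DNA Master)."""
    if evalue > SIGNIFICANCE_CUTOFF:
        return "Blast-Start: [no significant BLAST alignments]"
    return (
        f"Blast-Start: [{phage} gp{gene_number}, {database}, "
        f"q{query_start}:s{subject_start}, {coverage_pct}%, "
        f"{format_evalue(evalue)}]"
    )


print(blast_start_note("Thibault", 45, "NCBI", 3, 45, 65, 1.7e-17))
```

Recording the database name explicitly in each note sidesteps the ambiguity raised in Question 1, and the abbreviated q:s form matches the earlier convention mentioned in Question 3.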
| posted 26 Jan, 2018 17:13
Related Question: Related to the SIF: notation in the Bioinformatics Guide

QUESTION: Am I correct that if we get a solid function ID from the BlastP analysis, the HHPred analysis is unnecessary? Same for SIF: Synteny?

(Trying to save the QC team a lot of editing and possibly myself if we do the 400 features we are annotating this semester incorrectly.)
Edited 26 Jan, 2018 17:14
| posted 20 Feb, 2018 20:10
I have the same question as Greg, regarding the need to make notes about HHPred findings if BLASTP gives a strong function ID.
| posted 21 Feb, 2018 15:04
Hi All,
You must run all the analyses on all genes. It is no longer acceptable to assign functions simply from one source.

This is because we are getting too many incorrect functional assignments in the annotations. We are hoping that if all three methods and multiple databases are checked for every gene, you will be able to assign the correct function, or at least realize that you have really good data for more than one functional assignment.
At that point, you can ask for help with the functional assignment, and get it addressed in your classrooms while you are still working on the annotation. And then really excellent functional annotations will be submitted!
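As a rough sketch of the workflow described above, the check below flags a gene when any of the three evidence sources is missing or when the sources disagree. The source keys (`blast`, `hhpred`, `synteny`) and the function name are hypothetical, and reading "all three methods" as BLASTP, HHPred, and synteny is an assumption based on the questions earlier in this thread.

```python
REQUIRED_SOURCES = ("blast", "hhpred", "synteny")  # assumed set of methods


def review_function_calls(calls: dict) -> str:
    """Summarize whether the per-source function calls for one gene agree."""
    missing = [s for s in REQUIRED_SOURCES if s not in calls]
    if missing:
        # Every analysis must be run before a function is assigned.
        return "incomplete: run " + ", ".join(missing)
    # Compare calls case-insensitively, ignoring surrounding whitespace.
    distinct = {calls[s].strip().lower() for s in REQUIRED_SOURCES}
    if len(distinct) == 1:
        return "consistent: " + calls["blast"].strip()
    return "conflict: ask for help"


print(review_function_calls(
    {"blast": "minor tail protein", "hhpred": "lysin A", "synteny": "minor tail protein"}
))
```

A "conflict" result corresponds to the case where there is good data for more than one functional assignment, which is the point at which to ask for help.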

