The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

Phage gene annotation: matching phage genes have 4 different proteins - which one is a match?

| posted 06 Mar, 2021 02:01
I am annotating the phage Crewmate (gene 28) in PECAAN and I would like some guidance as to which protein function I should assign to the gene.

- The NCBI BlastP of the gene returns several matching phages and their proteins.
- The first 9 phages returned have high coverage (99.28%-100%), high alignment (>96%), low e-values (0), and high identity (>90%) (see the attached image for details), but they have been annotated with 4 different proteins, and I am unsure which one is a good reference for Crewmate gene 28. The proteins assigned to the 9 phages are:

- Cas4 family exonuclease (4 of 9 phages called this)
- Exonuclease (3 of 9 phages called this)
- RepA-like helicase (1 of 9 phages called this)
- Hypothetical protein (1 of 9 phages called this)

Thank-you kindly for reading and for any help offered,
Kieran

References:
A. Attached to this post is a screenshot of the NCBI BlastP results in PECAAN
B. Protein Sequence:
MTTPKVSTIKRGGARFYVDPDDGKIKVPGVTSIIGMLPKEFLRYWAAKEVAQTAVDSLPTVLQMILNDQSDAAVDFLKKSPDRNTRKAADTGTAAHDLFERMAKGETVGRVHPDLEPFVRHFDEFLTVAKPEYHFLEETVWSDKHAYAGSFDAYATIGGERLWLDNKTTRSGIHEEVGIQLAAYRFADSIIRADGGRVPMPTADGGAVLHVRPEGWKLVPVRCDEELFEVFLHLREVFKYEKEIKSTIVGREVFSGPAEDAPTGPKRRTPRARKAAE
| posted 06 Mar, 2021 21:29
I always prefer HHpred matches (when I find them) over the BLAST results. This is due, in no small part, to the quality of the different databases being searched and the relative sensitivity of the algorithms. The source of many "discrepancies" like the ones in your list is that the alignments match only part of your protein or only part of the subject. Since some proteins have multiple functional parts connected together in a single polypeptide chain, this can lead to what I would call a "partial annotation".

Also note that your first two possibilities do not really "disagree"; they are just different levels of specificity. When trying to decide on a level of specificity, I first direct my students to try to understand the differences in what the terms mean. Good sources include the SEA-PHAGES approved terms list, the ExPASy enzyme class list, Wikipedia, intro bio textbooks, etc. Once you have a better understanding of the terms, you can then look for evidence to help you decide whether a higher level of specificity is justified.
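To make the "levels of specificity" point concrete, here is a minimal Python sketch that tallies the 9 calls from the original post and then collapses the more specific term into its broader parent. The counts come from the post; the `BROADER_TERM` mapping and the `collapse` helper are illustrative assumptions, not part of PECAAN or any SEA-PHAGES tool.

```python
from collections import Counter

# Function calls for the top 9 BlastP hits, as listed in the original post.
calls = Counter({
    "Cas4 family exonuclease": 4,
    "exonuclease": 3,
    "RepA-like helicase": 1,
    "hypothetical protein": 1,
})

# Hypothetical mapping from a specific term to its broader parent term.
# Only the relationship discussed in this thread is encoded here.
BROADER_TERM = {"Cas4 family exonuclease": "exonuclease"}


def collapse(term_counts: Counter, broader: dict) -> Counter:
    """Re-tally the calls after replacing each specific term with its broader parent."""
    collapsed = Counter()
    for term, count in term_counts.items():
        collapsed[broader.get(term, term)] += count
    return collapsed


collapsed_calls = collapse(calls, BROADER_TERM)
for term, count in collapsed_calls.most_common():
    print(f"{term}: {count} of {sum(calls.values())}")
```

Viewed this way, the two exonuclease calls are a single group at the broader level, and the remaining question is only whether the evidence supports the more specific "Cas4 family" label.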

As for this particular protein, if I scan through the top 15-20 hits from prokaryotes (i.e., ignoring the two human mitochondrial proteins), I see many hits to proteins that are described as having BOTH a helicase activity AND a nuclease activity. This explains the "discrepant" results, so the question becomes: does this protein from Crewmate also have those two domains, or just one? This is why you often see annotators talk about the size of the protein and the size of the match. I quickly focus on the length of the alignment and which part of the subject is matching. Most of these alignments cover about 75% of Crewmate 28, but you can see that they match only a much shorter part of the subjects that are described as having both a nuclease and a helicase activity (such as residues 1005-1232, 790-1014, or 129-368). So Crewmate 28 is likely similar to either the nuclease or the helicase part of these larger proteins, but I cannot tell which based simply on the summary data presented in the table. Looking at the other matches, I can see hits to Pfam domains and "cd" conserved domains that are all different types of exonucleases. So what we have here is likely an exonuclease that is often found as part of a large helicase/nuclease combo protein. I am therefore pretty convinced that either of the first two options on your list could be appropriate here.
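The query-versus-subject comparison described above can be sketched in a few lines of Python. This is only an illustration: the `Hit` class, the `is_partial_match` helper, the 0.5 threshold, and both sequence lengths (300 and 1300) are hypothetical placeholders, not values from the actual hit table. Only the subject range 1005-1232 and the roughly 75% query coverage are taken from the discussion.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One alignment, with 1-based inclusive coordinates on query and subject."""
    name: str
    query_start: int
    query_end: int
    query_len: int
    subject_start: int
    subject_end: int
    subject_len: int

    @property
    def query_coverage(self) -> float:
        # Fraction of the query protein (e.g. Crewmate 28) inside the alignment.
        return (self.query_end - self.query_start + 1) / self.query_len

    @property
    def subject_coverage(self) -> float:
        # Fraction of the database (subject) protein inside the alignment.
        return (self.subject_end - self.subject_start + 1) / self.subject_len


def is_partial_match(hit: Hit, min_subject_coverage: float = 0.5) -> bool:
    """Flag hits where the query aligns to only a small part of a larger subject.

    The 0.5 cutoff is an arbitrary illustrative threshold, not a SEA-PHAGES rule.
    """
    return hit.subject_coverage < min_subject_coverage


# Hypothetical example: lengths are placeholders; the subject range is from the thread.
multi_domain_hit = Hit(
    name="helicase/nuclease fusion (hypothetical lengths)",
    query_start=1, query_end=225, query_len=300,
    subject_start=1005, subject_end=1232, subject_len=1300,
)

print(f"query coverage:   {multi_domain_hit.query_coverage:.0%}")
print(f"subject coverage: {multi_domain_hit.subject_coverage:.0%}")
print("partial match" if is_partial_match(multi_domain_hit) else "full-length match")
```

A hit flagged this way suggests that the subject's full annotation (for example, both helicase and nuclease) should not be transferred wholesale; the next step is to check which domain of the subject the aligned region falls in.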

When talking about this with my students, I would point out that since they are the first author, it is really up to them to read up on the two terms and decide whether they think "Cas4" is better than the generic "exo". I would be willing to put my name as an author on either of those annotations, since one is just a more specific subtype of the other.
| posted 06 Mar, 2021 23:11
Hi all,
When is an exonuclease a Cas4 family exonuclease?
debbie
| posted 25 Mar, 2021 21:18
Hi Chris,

Thank-you so much for the explanation. It was very nice of you to write such a nice detailed response. Very helpful indeed. Much appreciated!

Kieran
 