The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

Functions not on the approved list

| posted 01 Mar, 2017 23:48
My apologies for the long list, but I have come across the following functional assignments which are not on the approved list. Should we use the following names?

DNA polymerase I (ShayRa_47 28154-29944, Rev) - in a number of genomes but not on the approved list

MRE11 nuclease (OR MRE11 exonuclease/endonuclease OR MRE11 double strand break nuclease) - has both endonuclease and exonuclease domains. Sometimes the gene has been annotated as endonuclease, sometimes as exonuclease - HHPRED 99.5% (LastHope_35, pham 5465)

Polypeptide N-acetylgalactosaminyl transferase (Kingsley_110, pham 6125, 57955-59391 Forward) HHPRED prob 100%

N-acetylglucosaminyl transferase (or glycosyltransferase) - Kingsley_112, pham 27754, 59597-60208, Forward - HHPRED prob 100%

DNA helicase/methylase - gene has both helicase and methylase domains - Kingsley_63, pham 6161, 42315-44834 Forward - HHPRED prob 100% for both helicase and methylase (also in CD)

Immunity superinfection protein (sometimes called superinfection exclusion protein)- Jabith_72 (pham 25322.) It's in a number of genomes but not listed on the approved list.

| posted 01 Mar, 2017 23:59
I have been using TMHMM to check for transmembrane domains. There is a gene in Findley (K2) which has both a DNA binding domain at the C terminus and 4 transmembrane domains at the N terminus. Should it be called "membrane DNA binding protein"?

| posted 02 Mar, 2017 15:47
Hi Jackie,
1) Graham has approved DNA polymerase I, and it is OK to use for those Gordonia phages.
2) I think we want to stay with glucosyl transferase and galactosyl transferase for those two enzymes in the Fs. For anything more specific, I feel like we should have bench evidence.
3) DNA helicase/methylase is approved as long as you can demonstrate two functional domains.
4) Immunity superinfection protein is too vague based on what we currently know about prophage-mediated resistance, and stays off the approved list. I need to go back and figure out how those functions got into those genomes to begin with.
5) Re the MRE11 (what a neat investigation): the HHPred results are pretty confusing, so I can see why this protein has been labeled as both exo- and endo- or nothing at all. What are you proposing? Call all members of this pham "nuclease"? Only slightly farther down the list are almost equally good alignments to "metallophosphoesterase" and "phage lambda ser/thr kinase". Are you sure that the evidence is good enough to pick one thing out of this list, where everything is 98.7% probable or better to a single domain of the protein? I would have to do some serious digging through the literature to figure out which parts of the proteins were aligning to the phage proteins.
6) Excellent use of TMHMM. Which type of DNA binding domain?
| posted 02 Mar, 2017 18:25
Re 5) I just checked the literature and MRE11 is a Mn dependent nuclease with several highly conserved phosphoesterase domains.
Phyre2 results also support the call as MRE11 with 99.1% certainty. I would propose to call the gene MRE11 double strand break nuclease. I think that it may be even better to include exonuclease/endonuclease (MRE11 double strand break exonuclease/endonuclease instead of just nuclease) so that when students see hits to both endo and exo it will minimize confusion, but I'm concerned that the name may be too long. What do you think?

6) HTH DNA binding - hits to HTH transcriptional activator
Edited 02 Mar, 2017 18:27
| posted 24 Apr, 2017 15:34
5) I like the MRE11 double-strand break endo/exonuclease. We'll add that to the main list.

6) I think we should just stick with HTH DNA binding protein. I know I am wimping out about the membrane part, but I think it is better to stay vague when we don't know what it is doing.

Thanks Jackie!
| posted 01 Jul, 2017 21:17
A lot of the BD1 Streptomyces phages have a "dNMP kinase" that isn't on the official list. HHPRED matches are written as "deoxynucleoside monophosphate kinase", so I thought the written-out form might be better? Can we add it officially?
| posted 07 Jul, 2017 19:16
Yep, we can add officially. Give me a phage name and gene number.
| posted 07 Jul, 2017 21:03
Welkin Pope wrote: "Yep, we can add officially. give me a phage name and gene number."

I already put it down on the "pending" section of the working list. Will we go with the written-out "deoxynucleoside monophosphate kinase" or "dNMP kinase"?
| posted 24 Aug, 2017 00:59
Here is a new one to consider: Sortase

This one is described in the Abt2graduatex2 genome I am QCing, but for a phage already annotated it can be found in BabyGotBac_44. Description of "sortase" below. It looks to me like these gene products have the conserved catalytic triad. They show 98% homology in HHPRED with an E-value of 10^-9.

Sortase is described in CDD as: "Sortases are cysteine transpeptidases, mainly found in Gram-positive bacteria, which either anchor surface proteins to peptidoglycans of the bacterial cell wall envelope or link proteins together to form pili by working alone, or in concert with other enzymes. They do so by catalyzing a transpeptidation reaction in which the surface protein substrate is cleaved at a conserved cell wall sorting signal and covalently linked to peptidoglycan for display on the bacterial surface. Sortases are grouped into different classes based on sequence, membrane topology, genomic positioning, and cleavage site preference. The different classes are called class A to F sortases. Most Gram-positive bacteria contain more than one sortase and it is thought that the different sortases attach different surface protein classes. The typical eight-stranded beta-barrel fold is observed in all known sortases, along with the conserved catalytic triad consisting of cysteine, histidine and arginine residues. Some sortases contain an N-terminal signal peptide only and the C-terminus serves as a membrane anchor, which represents a type I membrane topology, with the N-terminal enzymatic portion projecting towards the bacterial surface and the C-terminal end residing in the cytoplasm. Other sortases adopt a type II membrane topology, with the N-terminal hydrophobic segment inside the cytoplasm and the C-terminal enzymatic portion located across the plasma membrane. The N-terminus either functions as both a signal peptide for secretion and a stop-transfer signal for membrane anchoring. Sortases are also present in some Gram-negative and Archaebacterial species, but the functions of these enzymes are unknown."

So, what do you think? Sortase, or cysteine transpeptidase, or not a function we add to the list?

| posted 27 Aug, 2017 21:21
Another one from Abt2graduatex2: RNase adapter protein RapZ

The genome has this function (not on the approved list) for gp55. I compared the HHPRED data for the equivalent gene in BabyGotBac (gp56, which was annotated as "ATPase") and found there is 100% probability, 90% coverage, and an E-value of 0 to the RNase adapter protein (RapZ in E. coli).

The function in UniProt is given as "Modulates the synthesis of GlmS, by affecting the processing and stability of the regulatory small RNA GlmZ".

Thoughts on this function?
