The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.


All posts created by welkin

| posted 05 May, 2017 15:33
Hi Ann,
Thanks for the data, and sorry about the lengthy wait for a reply!
For 93: There are too many equally likely hits that do different things, so I think the best choice is to pick HTH DNA binding protein. There are antitoxins, transcriptional regulators, and all kinds of these. HHPred is picking up that structural motif, and in this case it could be lots of things.

TA systems are really difficult to untangle: the proteins are tiny, so the alignment programs don't work as well. Without bench data, I think we should be cautious.
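As a rough illustration of the "too many equally likely hits" problem, here is a minimal Python sketch that lists every hit whose probability falls within a small margin of the best one. The hit names, probabilities, and the 1.0-point margin are all hypothetical values chosen for the example; they are not taken from the actual HHPred output for this gene, and the margin is not an approved cutoff.

```python
def ambiguous_top_hits(hits, margin=1.0):
    """Return descriptions of hits within `margin` probability points of the best hit.

    `hits` is a list of (description, probability) pairs.
    The default margin is an illustrative assumption, not an official threshold.
    """
    if not hits:
        return []
    best = max(prob for _, prob in hits)
    return sorted({desc for desc, prob in hits if best - prob <= margin})


# Hypothetical hit list for a small HTH-containing protein.
example_hits = [
    ("antitoxin", 97.9),
    ("HTH DNA binding protein", 97.8),
    ("transcriptional regulator", 97.6),
    ("unrelated protein", 80.0),
]

close_calls = ambiguous_top_hits(example_hits)
if len(close_calls) > 1:
    print("Several functions are about equally supported:", close_calls)
    print("Consider the most general label that covers them.")
```

When more than one distinct function survives this kind of check, the cautious choice described above (the shared structural label) is usually easier to defend than any one specific function.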

Posted in: Functional Annotation / transcriptional regulator or toxin-antitoxin
| posted 05 May, 2017 15:06
Hi Larry, I moved you to a new thread so it would be easier to find for other people annotating B2 phages. In general, each new function question should be a new thread.

For your questions:
ORF 2: queuine tRNA-ribosyltransferase, Blastp Phamerator, e=0.0, Glass gp2
Yes, go ahead. We should put it on the big list.

ORF 6: GTP cyclohydrolase, Blastp Phamerator, e=e-108, LizLemon gp6
Yes, go ahead with this one too.

ORF 8: peroxide stress protein YaaA, HHPred, prob=99.81 e=2.3e-20, E. coli
(HHPred coverage with this match is only 75% of the query sequence.)


ORF 18: membrane protein, Blastp Phamerator, e=1e-18, Attis gp21
Not certain that this is the same as membrane protein Band-7-like that is in the list for Functional Assignments.

The Attis protein was labeled as a membrane protein after it was examined with TMHMM. If you get a similar output, go ahead and call this one a membrane protein too. It is not the same as the Band 7 proteins.

ORF 42: tail fiber protein, Blastp NCBI, e=3e-08, Mediterranean phage uvMED (Should this just be called a minor tail protein?)
This one is kind of scrawny to be a tail fiber. Let's hold off until we have some bench data.

Thanks Larry!
Posted in: Functional Annotation / Functional assignments in B2 phage Rhinoforte
| posted 26 Apr, 2017 16:09
Hi Ann,
Thanks for posting the HHPred text output. However, I would also need to see how much of your protein aligns to the various matches. Probability and e-value alone don't get you there, as you can get some pretty high scores for pretty short motif matches.
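To make the coverage point concrete, here is a minimal Python sketch that computes what fraction of the query protein is spanned by a hit, given the query start and end positions of the alignment. The hit names, probabilities, positions, and the 120-residue query length are hypothetical; you would read the real values off your own HHPred output.

```python
def query_coverage(query_length, aligned_start, aligned_end):
    """Fraction of the query covered by a hit.

    Positions are 1-based and inclusive, as alignment coordinates are usually reported.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if not 1 <= aligned_start <= aligned_end <= query_length:
        raise ValueError("aligned range must lie within the query")
    return (aligned_end - aligned_start + 1) / query_length


# Hypothetical hits: (name, probability %, query start, query end)
hits = [
    ("short motif match", 99.5, 10, 45),
    ("near full-length match", 98.0, 3, 118),
]
query_length = 120  # hypothetical protein length in residues

for name, prob, start, end in hits:
    cov = query_coverage(query_length, start, end)
    print(f"{name}: probability={prob}, coverage={cov:.0%}")
```

In this made-up example the hit with the higher probability covers far less of the protein, which is exactly the situation where probability and e-value alone are misleading.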


Posted in: Functional Annotation / transcriptional regulator or toxin-antitoxin
| posted 24 Apr, 2017 15:34
5) I like the MRE11 double-strand break endo/exonuclease. We'll add that to the main list.

6) I think we should just stick with HTH DNA binding protein. I know I am wimping out about the membrane part, but I think it is better to stay vague when we don't know what it is doing.

Thanks Jackie!
Posted in: Functional Annotation / Functions not on the approved list
| posted 29 Mar, 2017 13:35
Wow, Larry, you sure don't pick easy ones!
I may have to break these up into several posts because it is taking me a while to get through them.
First of all, the integrase. This is a tyrosine integrase, rather than a serine (which is the Bxb1 integrase Debbie was referring to). We have several tyrosine integrase starts characterized at the bench, including L5. If I do an ideal alignment using Smith-Waterman at EBI between the two, it looks like the cluster T integrase should be around 1266 in your Nairb. This start lines up better with the Cluster F integrases that have been called on phagesdb and is start "20" on the Starterator report.
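For anyone curious what a Smith-Waterman (local) alignment does under the hood, here is a minimal Python sketch. It uses simple match/mismatch scores and a linear gap penalty, which are assumptions made to keep the example short; alignment servers typically use a substitution matrix such as BLOSUM62 and affine gap penalties, so scores from this sketch will not match theirs. The two sequences at the bottom are made-up fragments, not L5 or Nairb integrase.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment of strings a and b.

    Returns (score, aligned_a, aligned_b, start_a, start_b), where start_a and
    start_b are the 0-based offsets at which the alignment begins in a and b.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; cells never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b)), i, j


# Hypothetical N-terminal fragments; the second starts further upstream.
reference = "MRKLTWEQVA"
candidate = "VSAGMRKLSWEQVA"
score, aln_ref, aln_cand, ref_start, cand_start = smith_waterman(reference, candidate)
print(score)
print(aln_ref)
print(aln_cand)
print("alignment begins at candidate residue", cand_start + 1)
```

The offset at which the characterized protein begins to align inside the candidate is what points to a likely start, in the same spirit as lining up a bench-verified integrase against a new one.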

I'll post others as I get a chance. I am also going to post your .dnam5 file, so others can follow along if they need to.

Posted in: Choosing Start Sites / Relative importance of criteria when annotating an uncommon phage
| posted 20 Mar, 2017 17:54
Our RNA-seq and mass spec data from various clusters support the guiding principle that the vast majority of phage genes and genomes are transcribed in only one direction for any given region. So while I don't have data for this particular instance (and there are always exceptions to every principle), the data that we do have say that you need to choose one or the other when it comes to two genes occupying the same piece of DNA in different frames.
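If you want to scan a draft annotation for places where this choice has to be made, a small script can list feature pairs that share a stretch of DNA. This is a minimal Python sketch; the feature names, coordinates, and the 30 bp overlap threshold are hypothetical and are not an official SEA-PHAGES rule.

```python
from itertools import combinations
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    start: int   # 1-based, inclusive, start <= stop
    stop: int    # 1-based, inclusive
    strand: str  # "+" or "-"


def overlap_bp(a, b):
    """Number of base pairs shared by two features (0 if they do not overlap)."""
    return max(0, min(a.stop, b.stop) - max(a.start, b.start) + 1)


def conflicting_pairs(features, max_overlap=30):
    """Pairs that overlap by more than max_overlap bp.

    Each result is (name_a, name_b, overlap_bp, on_opposite_strands).
    The 30 bp default is an illustrative assumption only.
    """
    results = []
    for a, b in combinations(features, 2):
        shared = overlap_bp(a, b)
        if shared > max_overlap:
            results.append((a.name, b.name, shared, a.strand != b.strand))
    return results


# Hypothetical draft annotation.
draft = [
    Feature("gene 12", 8000, 8600, "+"),
    Feature("tRNA-1", 8450, 8522, "-"),
    Feature("gene 13", 8610, 9100, "+"),
]

for name_a, name_b, shared, opposite in conflicting_pairs(draft):
    note = "opposite strands" if opposite else "same strand"
    print(f"{name_a} and {name_b} share {shared} bp ({note}): pick one")
```

The script only finds the conflicts; deciding which feature to keep still depends on the evidence and on what cluster relatives have done.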
Posted in: tRNAs / How close can one pack protein and tRNA's genes
| posted 13 Mar, 2017 20:32
Yes, Greg, from this picture I think you need to delete the tRNA. I can't tell what cluster this is from the picture, but I am guessing there are many other cluster relatives that likely have the same prediction, and the tRNA was deleted. With no bench evidence one way or the other, I am inclined to go for the protein encoding gene that fits into the operon nicely, and to stay consistent with other cluster relatives.

Posted in: tRNAs / How close can one pack protein and tRNA's genes
| posted 02 Mar, 2017 15:47
Hi Jackie,
1) Graham has approved DNA polymerase I, and it is OK to use for those Gordonia phages.
2) I think we want to stay with glucosyl transferase and galactosyl transferase for those two enzymes in the Fs. For anything more specific, I feel we should have bench evidence.
3) DNA helicase/methylase is approved as long as you can demonstrate two functional domains.
4) Immunity superinfection protein is too vague based on what we currently know about prophage mediated resistance, and stays off the approved list. I need to go back and figure out how those functions got into those genomes to begin with.
5) Re the MRE11 (what a neat investigation): the HHPred results are pretty confusing, so I can see why this protein has been labeled as both exo- and endo- or nothing at all. What are you proposing? Call all members of this pham "nuclease"? Only slightly farther down the list are almost equally good alignments to "metallophosphoesterase" and "phage lambda ser/thr kinase". Are you sure that the evidence is good enough to pick one thing out of this list, where everything is 98.7% probable or better to a single domain of the protein? I would have to do some serious digging through the literature to figure out which parts of the proteins were aligning to the phage proteins.
6) Excellent use of TMHMM. Which type of DNA binding domain?
Posted in: Functional Annotation / Functions not on the approved list
| posted 01 Mar, 2017 18:56
Hi Beckie, I love the Gordonia phages! These areas flanking the integrase seem particularly plastic, so it doesn't surprise me that the normal guidelines aren't holding up.

Based on your GenemarkS output (which I think is the most compelling evidence you've got here), I want to keep the forward gene 43 that is in the middle of that reverse operon. I also think you are right: make sure that you call the reverse gene to start at 29163, and no longer, to accommodate the forward gene, and delete the other reverse gene that occupies the same space as the forward gene.

I could be persuaded to change my mind though, if you run them all through HHPred and find functions for the reverse gene that you propose to delete.

Posted in: Gene or not a Gene / Forward gene with good evidence among reverse genes
| posted 01 Mar, 2017 17:30
Hi Marisa,
This is a great question. The NifU annotation came from our collaborators in Brazil, so Debbie and I weren't directly involved with the functional assignment in phage Barriga. I had to spend some time digging into "NifU" and what it does.
NifU is a scaffold protein for Fe-S cluster complexes (not to be confused with the phage capsid scaffolding protein). It is frequently (but not always) involved in nitrogen fixation, as it is also found in species that do not fix nitrogen. The Barriga gene appears to align well with NifU N, that is, the N-terminal domain of NifU, but it is too short to contain both the N- and C-terminal domains of the protein. The N-terminal domain alone coordinates the binding of a single Fe, while the C-terminal domain facilitates the binding to the rest of the Fe-S cluster. There are three Cys residues required for N-terminal Fe binding: Cys35, Cys62, and Cys106. A Smith-Waterman alignment places Barriga's cysteines in exactly the right place.
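If you want to repeat this kind of residue check on your own gene, here is a minimal Python sketch that takes a pairwise alignment (two equal-length strings with "-" for gaps) and reports which query residue sits in the same column as each reference position of interest. The toy sequences and positions below are invented to keep the example short; they are not the NifU or Barriga sequences, and for a real check you would paste in your own alignment and ask for positions 35, 62, and 106.

```python
def residues_at_reference_positions(ref_aligned, query_aligned, positions, ref_start=1):
    """Map 1-based reference positions to the query residue in the same column.

    ref_aligned and query_aligned are the two rows of a pairwise alignment.
    ref_start is the reference position of the first aligned reference residue.
    A "-" in the result means the query has a gap at that reference position.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have the same length")
    wanted = set(positions)
    found = {}
    ref_pos = ref_start - 1
    for ref_res, query_res in zip(ref_aligned, query_aligned):
        if ref_res == "-":
            continue  # insertion in the query; reference position does not advance
        ref_pos += 1
        if ref_pos in wanted:
            found[ref_pos] = query_res
    return found


# Toy alignment (hypothetical sequences).
ref_row = "MKC-ACDLC"
query_row = "MRCGA-DLC"
hits = residues_at_reference_positions(ref_row, query_row, [3, 5, 8])
for pos, residue in sorted(hits.items()):
    status = "conserved Cys" if residue == "C" else f"found '{residue}'"
    print(f"reference position {pos}: {status}")
```

A cysteine in every required column supports the domain call; a gap or a different residue at any of them would be a reason to stay more general.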

So I think it is appropriate to add NifU-like protein (N-terminal domain) to our annotation list, and we should probably correct the others on

It would be interesting to see whether your phage has the C-terminal portion somewhere in a different gene.

Here is the PubMed abstract on NifU, if you and your students are interested in further reading.

Edited 01 Mar, 2017 18:45
Posted in: Functional Annotation / NifU-like protein