The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

All posts created by welkin

| posted 21 Feb, 2018 16:17
As for PECAAN, yes, you will have to enter them manually for now. I know Claire was planning on updating PECAAN, so that option may become available.
Posted in: Notes and Final Files > SIF-Blast; SIF-HHPred; SIF-Syn
| posted 21 Feb, 2018 16:16
Yes. For the most part, I doubt you will find much in GenBank that is not one of our phage proteins. But it is always worth looking, or even doing a limited search in which you deliberately exclude our phages to find new things, as new sequences are being added daily.
Posted in: Notes and Final Files > SIF-Blast; SIF-HHPred; SIF-Syn
| posted 21 Feb, 2018 16:14
Hi Veronique,
Excuse me for retitling your thread; we are trying to make thread titles as specific as possible.

Yes, go ahead and use the Gordonia model that they list on the GeneMark website for the host model. It has worked just fine for me.

Posted in: Annotation > Best Gordonia model for GeneMark
| posted 21 Feb, 2018 16:12
Hi Allison,
Immunity repressors have HTH DNA binding domains, which is why both terms show up in the pham. When we became more confident of the immunity repressor assignment, we switched the functional label in the newer annotations. This assignment is supported both by HHPred matches to crystal structures and by synteny: except in Cluster A, the majority of phage immunity repressors are found in this position in the genome relative to the integrase.

Posted in: Functional Annotation > Immunity repressors in Cluster F
| posted 21 Feb, 2018 16:07
OK, these are not the same problem, so Allison, I am moving your post to a new thread. In general, unless you are looking at a phage in the exact same cluster, it is worth starting a new post, even if it seems like the same problem on the surface.

The issue with your phages is that you are looking at a Cluster J phage that appears to contain the immunity repressor of a Cluster A phage (Kugel is in Cluster A). The immunity repressor in Cluster A is not adjacent to the integrase, where it belongs, but way out in the right arm. So the immunity repressor in Squint is probably gene 90 (only found in Cluster J).
We have multiple instances of phages carrying around the Cluster A immunity repressor, in both Cluster C and Cluster J. So the correct name for Squint 188 is Cluster A immunity repressor.
This is such a weird thing that we didn't pick up on it right away when we started annotating all these, and so we labeled it by the HTH DNA binding domain, which all immunity repressors have.
Posted in: Functional Annotation > Cluster J and A repressor/ immunity repressor
| posted 21 Feb, 2018 15:56
Dear All,
I am closing this thread. If you have more specific frameshift questions, please start a new thread and include the subcluster of the phage in the thread title.

Posted in: Frameshifts and Introns > Annotation Advise: Frameshift
| posted 21 Feb, 2018 15:48
The problem wasn't the stop codon in Eviarto; it was a stop codon in a closely related phage, which meant we couldn't choose the sequence we wanted and still make the whole cluster congruent.
Posted in: Frameshifts and Introns > CZ1 Tail assembly proteins
| posted 21 Feb, 2018 15:47
You may document the bacterial protein. Singletons are really challenging—just do your best and don't stress over it. I've done singleton annotations that we had to revise once we isolated another five phages from the cluster. Without the comparative genomics, sometimes you just won't know.

Good luck!
Posted in: Choosing Start Sites > Gordonia singleton annotation help
| posted 21 Feb, 2018 15:45
Hi Greg,
Unfortunately, these data points are not redundant. The problem with assigning functions from only one source is that if an error has been introduced into the database, the error will be propagated in your annotation. The only way to make sure is to confirm your functional assignments through all sources. We are working to make all of the phage-related functional assignments match the entries in GenBank. This should happen shortly, at which point you would only need to look at one of these databases for functions of our phage genes. You should still look in GenBank for functions from sources outside of our database.

And it is OK to relinquish genomes if you don't think you can do them all. I'd rather have fewer annotations done at a high standard than more that we have to fix on the back end.

Posted in: Notes and Final Files > SIF-Blast; SIF-HHPred; SIF-Syn
| posted 21 Feb, 2018 15:04
Hi All,
You must run all the analyses on all genes. It is no longer acceptable to assign functions simply from one source.

This is because we are getting too many incorrect functional assignments in the annotations. We are hoping that if all three methods and multiple databases are checked for every gene, you will be able to assign the correct function, or at least realize that you have really good data for more than one functional assignment.
At that point, you can ask for help with the functional assignment, and get it addressed in your classrooms while you are still working on the annotation. And then really excellent functional annotations will be submitted!
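
For anyone who keeps their notes in a spreadsheet or script, here is a rough sketch of that cross-check in Python. It is only an illustration, not an official SEA-PHAGES tool: the source labels (borrowed from the SIF-Blast, SIF-HHPred and SIF-Syn note names), the "NKF" placeholder and the function name `review_gene` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical evidence labels, loosely named after the SIF-Blast,
# SIF-HHPred and SIF-Syn notes; not an official schema.
SOURCES = ("blast", "hhpred", "synteny")
NO_CALL = "NKF"  # assumed placeholder meaning "no known function"


def review_gene(calls):
    """Classify the functional calls collected for one gene.

    `calls` maps a source label to the function that source suggests,
    or to NO_CALL when it suggests nothing.
    Returns (status, details), where status is one of
    'incomplete', 'no_evidence', 'consistent' or 'conflict'.
    """
    # Every gene must have been checked against every source.
    missing = [s for s in SOURCES if s not in calls]
    if missing:
        return "incomplete", missing

    # Compare only the sources that actually suggested a function,
    # ignoring case and surrounding whitespace.
    informative = Counter(
        f.strip().lower()
        for f in calls.values()
        if f.strip().upper() != NO_CALL
    )
    if not informative:
        return "no_evidence", []
    if len(informative) == 1:
        return "consistent", list(informative)
    # More than one well-supported function: a case to ask for help with.
    return "conflict", sorted(informative)


if __name__ == "__main__":
    # Illustrative input only, echoing the HTH / immunity repressor case.
    example = {
        "blast": "HTH DNA binding protein",
        "hhpred": "immunity repressor",
        "synteny": "immunity repressor",
    }
    print(review_gene(example))
```

A "conflict" result is not an error; it simply marks a gene with good data for more than one function, which is the point at which to ask for help.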

Posted in: Notes and Final Files > Documenting Gene Calls in DNA Master