The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.


All posts created by Pollenz

| posted yesterday, 14:22
Using GC/MS of several EK2 phages, one of the most highly expressed proteins identified is the product of the gene directly downstream of the called portal protein. Although we do not have direct evidence that this is the major capsid protein (MCP), its location, its similarity across all EK phages, and its level of expression in a phage protein lysate suggest that it is a good candidate. We are also using cryo-EM to begin to assess which protein is the MCP in these Podoviridae phages.
RS Pollenz
Posted in: Cluster EK Annotation Tips / major capsid protein
| posted 03 Jun, 2021 14:19
We have found in phage Leperchaun (cluster F1) a split methyltransferase with very good HHpred evidence (gp68/gp69). In other cluster F1 phages there is one large methylase gene in this region, but in Leperchuan the first, smaller ORF ends at a stop codon, and a switch to frame 2 picks up the sequence, with an 80 bp overlap, in a large second gene. In several of the HHpred hits, alignments to the same PDB entry show continuous coverage across the 2 Leperchuan genes. There is no consensus slippery sequence, but it will be interesting to see if this is the case. See attachment.
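A quick way to sanity-check a split like this is to compute the overlap and the frame offset between the two ORFs from their coordinates. This is only a minimal sketch: it assumes both genes are on the forward strand with 1-based inclusive coordinates, and the function names and the numbers in the usage note are hypothetical, not the actual Leperchaun coordinates.

```python
def overlap_bp(stop1, start2):
    """Number of bases shared by an upstream ORF ending at stop1 and a
    downstream ORF beginning at start2 (1-based, inclusive, forward strand).
    Returns 0 when the two ORFs do not overlap."""
    return max(0, stop1 - start2 + 1)


def frame_offset(start1, start2):
    """Reading-frame offset of the second ORF relative to the first.
    0 means same frame; 1 or 2 means the second ORF is read in a
    different frame, as expected for a split gene."""
    return (start2 - start1) % 3
```

With made-up coordinates, `overlap_bp(1299, 1220)` gives an 80 bp overlap and `frame_offset(1000, 1220)` gives 1, i.e. the second ORF is read one base out of frame relative to the first.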
RS Pollenz
Posted in: Annotation / Mid-gene deletion causing frameshift and orphams
| posted 13 May, 2021 19:39
tyrosine integrase #1 = gp 34 (many strong HHpred hits)
immunity repressor = gp 35 (strong HHpred evidence and genes in the Pham)
tyrosine integrase #2 = gp 37 (many strong HHpred hits, as for gp 34)
cro = gp 41 (7CSV_A HTH cro/C1-type domain-containing protein; dimer, ANTITOXIN; 1.71A {Pseudomonas aeruginosa PAO1})
excise = gp 42 (PF06806.1 Putative excisionase; 1Y6U_A Excisionase from transposon Tn916; DNA architectural protein, Tyrosine recombinase, Excisionase, Winged-helix protein, C

So all the required genes can be identified.
RS Pollenz
Posted in: Cluster FF Annotation Tips / Immunity Cluster Genes in FF Phage Popper
| posted 13 May, 2021 18:50
Given that these FF cluster phages have genes in PHAM 54987 (92 members; the majority of calls are immunity repressor) that hit numerous repressors, such as 5FD4_B (ComR; Streptococcus, Competence, Quorum sensing, ComR, TRANSCRIPTION REGULATOR; 2.9A {Streptococcus suis (strain 05ZYH33)}), 6H49_A (Orf20; SaPI, Repressor, STRUCTURAL PROTEIN; HET: SO4; 1.8A {Staphylococcus aureus}) and 5D50_D (Repressor; Repressor, Anti-repressor, complex, DNA BINDING PROTEIN; 2.49A {Salmonella phage SPC32H}), an immunity repressor call is consistent with the data and with the location of the genes within an immunity cluster (gp35 in Popper, gp37 in Nandita and gp37 in Ryan).
RS Pollenz
Edited 13 May, 2021 19:16
Posted in: Cluster FF Annotation Tips / Repressor vs HTH DNA binding proteins
| posted 22 Jan, 2021 20:44
The PHAM you reference is now 44108 and contains ~500 members. The majority of calls are to major capsid even though the HHpred hits have 25% probability, poor E-values and minimal coverage. I have tested many of these proteins and do not see any evidence of hits to major capsid PDB entries. Even when the HHpred query is run with the Pfam database selected, there are no hits to phages. We are annotating a DE3 Gordonia phage (EdmundFerry) where gp19 is in this pham. It would appear that most calls are possibly based on synteny.

BUT: in EdmundFerry there are also very few structural genes identified in the 5' region of the phage as in others we have annotated. So calling genes based on their location relative to other known structural proteins is iffy and is not supported by condition 2 of the synteny rules (2. adjacent to other structural genes of known, verifiable function).

For example, we can find small and large terminases with excellent HHpred evidence (gp1 and gp2), but genes 3-10 are NKF. There is a portal (gp11, good HHpred hits), but that is about it. The tape measure is clearly gp29 based on its size and location, and upstream of it there are two major tail proteins with low-probability HHpred and Pfam hits. The call of a MuF-like minor capsid (gp13) has NO HHpred PDB hits of any consequence, but DOES have low-coverage Pfam hits to MuF and minor capsid. BUT: gp15 (PHAM 46435) is called as a capsid maturation protease or RNA ligase with essentially NO hits to relevant proteins that have anything to do with proteases or capsids... so the sequence of clearly identifiable structural genes does not fit the standard.

Any other thoughts here?
RS Pollenz
Posted in: Functional Annotation / Singleton FuzzBuster Major Capsid
| posted 25 Feb, 2020 12:58
The short duplicated areas in the DJ phages appear to have sequence similarity to the sequences in the Cluster BI1 phages even though they come from different species.

The DJ phages have eight of these directly upstream of start codons:

RS Pollenz
Posted in: Choosing Start Sites / SD scoring matrix
| posted 24 Feb, 2020 15:22
Yes, very interesting. There is one sequence of ~44bp that is replicated PRIOR to 5 consecutive genes…..
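For anyone who wants to look for a repeat like this in another genome, here is a rough sketch of one way to do it: slice a window upstream of each start and ask which k-mers are present in every window. It assumes forward-strand genes, 1-based start coordinates and exact (not approximate) matches; the function names and the toy sequence in the usage note are made up for illustration.

```python
def upstream_windows(genome, starts, width):
    """Return the `width` bases immediately upstream of each start.
    `starts` are 1-based coordinates of forward-strand start codons."""
    return [genome[max(0, s - 1 - width): s - 1] for s in starts]


def shared_kmers(windows, k):
    """Return the set of k-mers that occur in every upstream window.
    Only exact matches are counted; mismatches are not tolerated."""
    kmer_sets = [
        {w[i:i + k] for i in range(len(w) - k + 1)}
        for w in windows
    ]
    return set.intersection(*kmer_sets) if kmer_sets else set()
```

On a toy genome built as `"AAAA" + "GGAGGTTC" + "ATGCCC" + "TTTT" + "GGAGGTTC" + "ATGAAA"`, calling `shared_kmers(upstream_windows(genome, [13, 31], 8), 8)` returns the single shared 8-mer. For a real ~44 bp repeat you would widen the window and probably want an alignment tool that tolerates mismatches rather than exact k-mers.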
RS Pollenz
Posted in: Choosing Start Sites / SD scoring matrix
| posted 24 Feb, 2020 14:03
We are annotating a Gordonia DJ phage, Secretariat. It appears there is a region across most of these DJ genomes that begins at about 32,000 bp and contains ~15 small genes that are divergent across the different phages, and all the genes are separated by 20-100 bp gaps (no 1-4 bp overlaps or small <10 bp gaps). The issue is that the called start for many of these is NOT the LO, which in many cases would significantly reduce the gaps between these genes. It appears that the RBS data for the LO start is very poor for many of these (example: Z value 0.446, spacer 8, final score -8, compared to the "called" start of 1.77/11/-5.3). All of the genes are hypothetical, so there is no evidence of how these genes are related (an operon??). The coding potential may drop off at the LO on some of these, but it's not that different from many genes that have been called. We know that there are many aspects that impact translation initiation beyond the RBS/SD, such as RNA structure, operons, translation of previous genes, etc. What are your thoughts on selecting the LO and reducing gaps when the RBS data is so poor?
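To make the gap comparison concrete, here is a minimal sketch of how the gaps could be tabulated for the called start versus the LO start. It assumes forward-strand genes with 1-based inclusive coordinates; the function names and all coordinates in the usage note are hypothetical, not taken from Secretariat.

```python
def gaps_between(genes):
    """Return (upstream_name, downstream_name, gap_bp) for consecutive genes.

    `genes` is a list of (name, start, stop) tuples, forward strand,
    1-based inclusive. A negative gap means the genes overlap by that
    many bases (e.g. -4 for a 4 bp overlap)."""
    ordered = sorted(genes, key=lambda g: g[1])
    return [
        (n1, n2, s2 - e1 - 1)
        for (n1, _s1, e1), (n2, s2, _e2) in zip(ordered, ordered[1:])
    ]


def compare_starts(upstream_stop, called_start, lo_start):
    """Gap to the upstream gene for the called start versus the LO start."""
    return {
        "called": called_start - upstream_stop - 1,
        "LO": lo_start - upstream_stop - 1,
    }
```

For example, with an upstream gene ending at 32300, `compare_starts(32300, 32360, 32310)` gives a 59 bp gap for the called start and a 9 bp gap for the LO start. This only quantifies the gaps; it says nothing about whether the RBS score or coding potential supports either start.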
RS Pollenz
Posted in: Choosing Start Sites / SD scoring matrix
| posted 29 Jan, 2020 20:34
We installed the virtual machine and Windows 10 on a new MacBook Air, and the window (screen) is very small when you launch the VM. How do you change the size to make it bigger? The Mac controls the OS screen, not the VM screen. Thank you.
RS Pollenz
Posted in: Using WINE to run DNA Master on a Mac / Help with WINE
| posted 14 Jun, 2019 15:31

I have a new 2019 Dell laptop and have been trying to install DNA Master. I do not get all the files in the download (no library or ssleay32 files), and I do not see the DNAMas updated version showing up like I saw when I did this on an older machine (running Windows 7) that I borrowed prior to the conference last week. Thus, the program that is downloaded will not update from the 2012 version. I tried from both PhagesDB links, and I have uninstalled and tried several times, but get the same results. I noticed that when I initially download the file and open it in the downloads folder, the DNAmas.ex will not initially launch and gives an error saying that it will not run on a PC. Eventually the DNAmas.ex does turn into the DNA Master icon so I can launch the installer... but it only seems to contain the minimal files listed above.
RS Pollenz
Posted in: DNA Master / DNAS master install on 2019 PC