In Cluster EC phages, a gene has been annotated as an ERF family ssDNA binding protein or ERF family DNA pairing protein (e.g. Kowalski_48 or Megan_47). Check the official functions list for the current function name. The ERF protein of phage P22 plays a role in circularizing the linear genome upon entry into the cell (1-3).

Iyer, Koonin and Aravind 2002 (4) performed a phylogenetic analysis of these DNA binding proteins that has been used to define the conserved domain (pfam04404). To call a gene with this function, make sure it contains the defining elements below, in addition to supporting evidence from HHpred, BLAST and synteny.

These proteins should contain the following sequence elements.
1) A conserved motif: GuXXoYhp+YXhXXhh (where G is glycine, Y is tyrosine, u is a tiny residue, h is a hydrophobic residue, p is a polar residue, o is an alcohol residue, + is a basic residue, and X is any residue).
a. Example: Megan_47 contains the sequence GSAITYARRYALTAAT, which matches this motif (attached figure).
2) A DXD motif shortly downstream of element 1 (where D is aspartate and X is any residue).
a. Example: Megan_47 contains the sequence DND 8 residues downstream of the motif above (attached figure).
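
The two sequence elements above can be screened for with a short script. The following is only a minimal sketch, not an official annotation tool. The residue classes assigned to u, h, p, o and + are assumptions (the note does not list the residues in each class); in particular, a broad hydrophobic set that includes T is used so that the Megan_47 example matches, and narrower definitions would exclude it. The function names and the `max_gap` search window are hypothetical. A hit should still be confirmed against the alignment figure, HHpred, BLAST and synteny.

```python
import re

# ASSUMED residue classes; the note names the classes but not their members.
CLASSES = {
    "u": "GAS",            # tiny (assumption)
    "h": "ACFGHIKLMTVWY",  # hydrophobic, broad set including T (assumption)
    "p": "CDEHKNQRST",     # polar (assumption)
    "o": "ST",             # alcohol (assumption)
    "+": "HKR",            # basic (assumption)
}

MOTIF = "GuXXoYhp+YXhXXhh"  # motif string as given in the note


def motif_to_regex(motif):
    """Translate the motif shorthand into a compiled regular expression."""
    parts = []
    for symbol in motif:
        if symbol == "X":
            parts.append(".")                      # any residue
        elif symbol in CLASSES:
            parts.append("[" + CLASSES[symbol] + "]")
        else:
            parts.append(re.escape(symbol))        # literal residue (G, Y)
    return re.compile("".join(parts))


def find_erf_elements(protein, max_gap=30):
    """Report each motif match and the first DXD found within max_gap
    residues after the end of the motif (max_gap is a hypothetical cutoff)."""
    protein = protein.upper()
    motif_re = motif_to_regex(MOTIF)
    dxd_re = re.compile("D.D")
    hits = []
    for m in motif_re.finditer(protein):
        window = protein[m.end():m.end() + max_gap]
        dxd = dxd_re.search(window)
        hits.append({
            "motif_start": m.start() + 1,          # 1-based position
            "motif": m.group(),
            "dxd": dxd.group() if dxd else None,
            # number of residues between the motif end and the DXD start
            "dxd_gap": dxd.start() if dxd else None,
        })
    return hits


# SYNTHETIC test sequence (not the real Megan_47 protein): the motif quoted
# in the note, eight filler residues, then DND.
example = "MK" + "GSAITYARRYALTAAT" + "AAAAAAAA" + "DND" + "KL"
for hit in find_erf_elements(example):
    print(hit)
```

To use it on a real gene, pass the full translated protein sequence to `find_erf_elements` in place of the synthetic string; an empty list means the motif was not found under the assumed residue classes.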

The attached figure contains a multiple sequence alignment of genes from this cluster with the above elements highlighted.

These proteins are also predicted to have an α + β fold, with secondary structure composed of 5 helices and 3-4 sheets. If you run a secondary structure prediction on the full-length protein, compare the result with Figure 3 of reference 4.

This note was developed in collaboration with Christine Byrum during the 2023 SEA Faculty meeting.

References:
1) Botstein D, Matz MJ: A recombination function essential to the growth of bacteriophage P22. J Mol Biol. 1970, 54: 417-440.
2) Weaver S, Levine M: Recombinational circularization of Salmonella phage P22 DNA. Virology. 1977, 76: 29-38.
3) Poteete AR: Location and sequence of the erf gene of phage P22. Virology. 1982, 119: 422-429.
4) Iyer LM, Koonin EV, Aravind L: Classification and evolutionary history of the single-strand annealing proteins, RecT, Redbeta, ERF and RAD52. BMC Genomics. 2002, 3: 8. doi: 10.1186/1471-2164-3-8. PMID: 11914131; PMCID: PMC101383. https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-3-8
Edited 03 Jun, 2023 19:55