Two Pham 24076 genes in a row with different functions in cluster E

| posted 16 Jun, 2018 02:21
In cluster E phages there are often two genes in a row, around gene 75 in the right half of the genome, that both belong to the same pham, 24076. The first gene is usually called NKF, but HHpred currently identifies it as a "DprA-like DNA processing chain A". The hit has good coverage (99.75) and a probability of 99.3% (aa 1-137 of the cluster E protein to aa 42-185 of the target; 1e-20). The same function is also applied to a different pham, 30776, an example of which is found in a cluster M2 phage, GenevaB15. The second pham 24076 gene in the cluster E phages is often assigned an HNH endonuclease function. Is this correct? Unusual? Common?
Edited 16 Jun, 2018 02:23
| posted 16 Jun, 2018 22:53
There are quite a few examples of two genes in a row being in the same pham. This stems from a decision (made very early on) about what to do when a protein aligns well to proteins in two different phams. The decision was to combine the two phams into a single larger pham and retire the two older phams. This usually happens when there is one long protein: some smaller proteins align well to the first part of the long protein, and other smaller proteins align to the second part.
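
To make the merging rule concrete, here is a minimal sketch in Python. It is not the actual pham-building code. It assumes we already have a list of protein pairs that "align well" and groups any proteins connected through such pairs. The protein names and the `build_phams` function are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def build_phams(proteins, aligned_pairs):
    """Group proteins so that any two linked by a chain of good
    alignments end up in the same group (hypothetical helper)."""
    parent = {p: p for p in proteins}

    def find(p):
        # Walk up to the group representative, shortening the path as we go.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in aligned_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # combine the two groups into one

    groups = defaultdict(set)
    for p in proteins:
        groups[find(p)].add(p)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical proteins: two short ones matching the first part of a long
# protein, two short ones matching its second part, and the long protein.
short_proteins = ["short_N1", "short_N2", "short_C1", "short_C2"]
short_pairs = [("short_N1", "short_N2"), ("short_C1", "short_C2")]

# Without the long protein there are two separate groups.
phams_before = build_phams(short_proteins, short_pairs)

# The long protein aligns to a member of each group, so they are combined.
all_proteins = short_proteins + ["long_protein"]
all_pairs = short_pairs + [("long_protein", "short_N1"),
                           ("long_protein", "short_C1")]
phams_after = build_phams(all_proteins, all_pairs)

print(len(phams_before), len(phams_after))
```

In this sketch the two groups of short proteins never align to each other directly. They end up together only because the long protein bridges them, which is how one group can contain proteins with two quite different functions.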

In the case of pham 24076, it looks like ShereKhan_Draft gene 75 is that larger protein. This situation is neither common nor exceptionally rare. An example that has gotten a lot of discussion lately is the split lysin A proteins (do a forum search for "split lysin A").

This doesn't often help with functional annotation, other than to explain why members of a pham can sometimes have two quite different functional annotations that are both justified by the evidence.
Edited 16 Jun, 2018 22:53
