SEA-PHAGES | New function? Hfq RNA binding protein

Link to this post | posted 04 Feb, 2022 20:28

uOttawaPHAGE

We are annotating JohnDoe, a cluster AZ phage. Gene 53 in related phages has been annotated as NKF (by us and others), but we see high probability on HHpred for similarity to Hfq RNA binding proteins. The only two RNA binding protein options on the approved list are "RNA binding protein" and "Ro RNA binding protein". Can this new function be added?

See attached doc.

All other AZ phages with this gene will need to be corrected.

Thanks!
Adam

Link to this post | posted 10 Feb, 2022 03:48

debbie

We'll need to do some further investigation, I think. Without experimental data, I am a bit uncomfortable calling this anything more specific than an "RNA binding protein". I have attached an article that describes 3 different RNA binding protein families. How would one differentiate between them? Knowing that would help in this decision.

Still thinking.
debbie

Link to this post | posted 10 Feb, 2022 04:08

uOttawaPHAGE

Hi Debbie. We'll use "RNA binding protein", but I'll also have the students gather more evidence to see if they can differentiate between the different RNA binding proteins. We will change this annotation in all the AZ phages that have this gene as we complete the AZ harmonization.

Link to this post | posted 10 Feb, 2022 04:17

balish

Full disclosure: I don't know anything specific about RNA binding proteins, but I have ideas about how to address questions like these.

It looks to me as though the similarity of this protein to Hfq is restricted to the predicted fold, as opposed to actual amino acid sequence similarity. Accordingly, CD-Search against the Conserved Domain Database doesn't recognize anything in it. And of course, finding that kind of thing is one of the reasons we use HHpred.

If we take the fold prediction to be reliable, then one would certainly be within reason to hypothesize that it's an RNA binding protein. But I don't think the HHpred result should stand as the only evidence. In the absence of doing actual biochemistry, I think that one could look for support for that hypothesis by asking which amino acids of Hfq have interactions with RNA (which I'm sure is well-established and described in the literature) and ask whether at least some of the amino acids at the corresponding positions in this protein (as predicted by HHpred) look like they could have similar interactions, accounting for side chain biochemical properties like charge or hydrophobicity.
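
The residue-by-residue comparison described above could be sketched roughly as follows. This is only an illustration: the side-chain grouping is a simplifying assumption, the function name `compare_positions` is hypothetical, and the example pairs are invented rather than taken from any real HHpred alignment.

```python
# Coarse side-chain classes; this grouping is a simplifying assumption.
SIDE_CHAIN_CLASS = {
    **dict.fromkeys("KRH", "positive"),
    **dict.fromkeys("DE", "negative"),
    **dict.fromkeys("STNQ", "polar"),
    **dict.fromkeys("FWY", "aromatic"),
    **dict.fromkeys("AVLIMC", "hydrophobic"),
    **dict.fromkeys("GP", "special"),
}


def compare_positions(aligned_pairs):
    """Label each (reference, query) residue pair from an alignment."""
    report = []
    for ref, query in aligned_pairs:
        ref_class = SIDE_CHAIN_CLASS.get(ref)
        if ref == query:
            verdict = "identical"
        elif ref_class is not None and ref_class == SIDE_CHAIN_CLASS.get(query):
            verdict = "similar"
        else:
            # Covers gaps ("-"), unknown letters, and differing classes.
            verdict = "different"
        report.append((ref, query, verdict))
    return report


# Hypothetical pairs for illustration only; not from a real alignment.
example_pairs = [("K", "R"), ("Q", "Q"), ("Y", "F"), ("D", "S"), ("F", "-")]
for ref, query, verdict in compare_positions(example_pairs):
    print(ref, query, verdict)
```

In practice the reference positions would be the Hfq residues reported in the literature to contact RNA, paired with whatever the HHpred alignment places opposite them.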

Either way, I'd argue against calling it any specific variety of RNA binding protein, like Hfq. In my view, the absence of sequence similarity (even if the structure is conserved) means that the Occam's razor argument is that it's convergent with Hfq, rather than homologous to it, and I don't think we should give the same names to proteins that are convergent. So my vote is not to call it Hfq, but to consider calling it an RNA binding protein if it has a handful of key amino acids in the right places based on Hfq structure/function.

-Mitch

Edited 10 Feb, 2022 04:19

Link to this post | posted 10 Feb, 2022 05:28

uOttawaPHAGE

Thanks, Mitch. I also know little about RNA binding proteins, but given the HHpred results we will investigate whether key residues are conserved. Hfq is a phage protein, so the idea that this protein and Hfq evolved from a common ancestor seems possible? I'm not sure whether convergence or common descent is the simpler explanation. However, I haven't looked carefully at the sequence similarity (or lack of it).

Link to this post | posted 10 Feb, 2022 14:11

balish

Adam, the way I see it, the approach to identifying homology between any two genes/proteins is to seek evidence of sequence similarity; similar 3-D structures point to related function, but not necessarily common descent. Convergence is the null hypothesis. It's certainly not the case that evidence of common descent between genes/proteins with very limited sequence similarity can never be found, but I just don't see evidence of sequence similarity with these particular proteins using a couple of different tools (BLAST, CD-Search). That doesn't mean it didn't happen; but if it can't be detected, convergence is the default. Maybe someone will find evidence of sequence similarity using more sophisticated tools.
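
As a rough illustration of what "sequence similarity" means at its simplest, percent identity over an existing pairwise alignment could be computed as sketched below. The function name `percent_identity` and the toy strings are assumptions for illustration. Identity alone does not show that a match is above background; tools like BLAST report a statistical score (an E-value) for that purpose.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned strings invented for illustration; not real protein sequences.
print(percent_identity("MKG-LQ", "MRGALQ"))
```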

But let's say it were the case that this protein and Hfq could be demonstrated by some means to have a common ancestor. Is that enough to call it Hfq? If the criteria of sequence similarity and having key side chains in similar places are met, then I think the best we could say is that it's in a family, or superfamily, with Hfq. I don't know what the cutoff is or should be, but I would argue that if no sequence similarity above background can be detected, it isn't useful to give proteins the exact same designation.

There are a lot of ways a protein could evolve to bind RNA (or DNA), and there's been an enormous amount of opportunity to explore possibilities over evolutionary time - especially for bacteriophages, which replicate so often. It wouldn't surprise me at all to learn that in terms of 3-dimensional structure, similar solutions have been arrived at multiple times, and that this protein and Hfq have everything in the right place to do similar things.

That's my two cents, anyway (maybe three cents; I'll shut up now). I look forward to seeing what we find out about this protein - it's a fascinating case!

Link to this post | posted 15 Feb, 2022 18:22

debbie

Hi all,
Sally Molloy also weighed in:
I attached a paper describing prokaryotic Hfq proteins (see figure 2). I think the protein actually has the conserved residues required for the two Sm domains of an Hfq protein including:

1) It has the conserved G in Beta2 that is found in all Sm proteins.
2) It has the highly conserved hydrophobic residues characteristic of the first Sm domain.
3) It has the highly conserved G of Sm1 but is missing the second highly conserved D of Sm1.
4) It has the absolutely conserved Q of alpha helix 1 and the highly conserved Y/F in Sm1.

It is missing the YKH motif of Sm2 but instead has an HRS motif (the eukaryotic motif here is simply RG).

So it's pretty similar in terms of secondary structure and conserved amino acids to Gram-positive Hfq proteins. I think we can at least call it an RNA binding protein and maybe an Hfq protein.
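
The checklist above can be tallied in a few lines. This is just a bookkeeping sketch: the dictionary restates the features as reported in this post, and the names `sm_features` and `summarize` are hypothetical.

```python
# Features as reported in the post above (True = present in the phage protein).
sm_features = {
    "conserved G in Beta2": True,
    "conserved hydrophobic residues of the first Sm domain": True,
    "highly conserved G of Sm1": True,
    "second highly conserved D of Sm1": False,
    "absolutely conserved Q of alpha helix 1": True,
    "highly conserved Y/F in Sm1": True,
    "YKH motif of Sm2": False,  # an HRS motif is reported instead
}


def summarize(features):
    """Split a feature checklist into present and missing names."""
    present = [name for name, found in features.items() if found]
    missing = [name for name, found in features.items() if not found]
    return present, missing


present, missing = summarize(sm_features)
print(f"{len(present)} of {len(sm_features)} features present")
for name in missing:
    print("missing:", name)
```

A tally like this only organizes the evidence; how many conserved features are enough to justify a more specific name is still a judgment call.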

Cheers,
Sally Molloy

I personally am inclined to call these proteins "RNA binding proteins".
debbie
