
The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.


Called an orpham but isn't?

| posted 05 Apr, 2022 21:14
We are annotating the singleton Arzan. Gene 58 (in the Phamerator draft; 36,809 to 37,624) is listed as being in pham 100827, but on BLASTp it matches members of pham 101472 with very low e-values. This looks like a mistake. Can we use the pham 101472 Starterator report?
| posted 06 Apr, 2022 16:29
All information is informative. I guess it depends on how you use it! It might be helpful to do a protein alignment.
debbie
| posted 06 Apr, 2022 16:36
I think the alignment is nearly identical, which is why we think it is an error. Do errors like this happen?
| posted 06 Apr, 2022 16:50
Hi Adam,

Actually, they're not nearly identical; there's only 32% amino acid identity between Arzan 58 and the next best match in the DB! (Which is Heisenberger 58 as of me writing this.)

That's quite a lot of differences, and while it's probably reasonable to consider these homologs, they're still different enough that:

1. Phamerator's algorithm put them in different phams and
2. trying to make start calls on Arzan 58 based on Heisenberger 58 is not advisable.

The AA sequences are 32% identical, but the nucleotide sequences show almost no similarity at all. Using a Starterator report isn't a good idea in this situation, since Shine-Dalgarno (SD) sites and potential start codons have had lots of evolutionary time to change.

This one you'll have to call on its own!
–Dan
| posted 06 Apr, 2022 18:28
Thanks Dan, sort of embarrassing. When I said "nearly identical," I was basing that not on the amino acid comparison but on the e-values in BLASTp and on the two proteins being almost the same length. Obviously low e-values don't mean identity, and I should have looked at this with the students. Thanks for your help on this. Still learning... A
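For anyone who wants to check identity directly rather than inferring it from e-values, here is a minimal Python sketch that computes percent identity from two already-aligned protein sequences. The function name and the toy sequences are made up for illustration (they are not the Arzan 58 or Heisenberger 58 sequences), and the denominator used here, the full alignment length including gapped columns, is one common convention and an assumption; other tools may count differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: identity = identical columns / alignment length, where the
    alignment length includes gapped columns. Columns in which both
    sequences have a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        columns += 1
        if a == b and a != gap:
            matches += 1

    if columns == 0:
        return 0.0
    return 100.0 * matches / columns


# Hypothetical toy alignment, for illustration only.
toy_a = "MKT-AYLLG"
toy_b = "MRTQAYI-G"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Two proteins of similar length can give a very low e-value while sharing only about a third of their residues, so a number like this is a more direct check than the e-value before borrowing start calls from another pham.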
 