
The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.


How is the RBS spacer distance counted?

| posted 07 Feb, 2025 21:07
How is the RBS spacer distance counted in DNA Master? At which position in the Shine-Dalgarno sequence does the count start?

The Bioinformatics Guide describes how the spacer distance is calculated, but its description doesn't seem to match what I see in DNA Master. The article indicates that the count starts from the G in brackets: TAAGGAG[G]TA. (This raised other questions for me, since I don’t think that G is even in the consensus sequence in the Kibler6 scoring matrix. See attachment.)
https://seaphagesbioinformatics.helpdocsonline.com/article-24

When I look at the output of SD sequences identified in DNA Master, it looks to me like the count starts earlier in the consensus sequence (where I have inserted the "|" symbol): TAA|GGAGGTA

Examples of sequence upstream of annotated starts:
• TCAAGACAA|GGAGACAAGCGCC spacer distance 13. Gene 12 Andre, z-score 3.17.
• ACCCGCTCACAGAGA|GGAAGAA spacer distance 7. Gene 19 Andre, z-score 3.15.
• AAACCATGAAA|GGAAGGCCATC spacer distance 11. Gene 8, MulchExplorer, Z-score 3.22.
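
The counting convention suggested by these examples can be checked with a short sketch. It assumes that each listed sequence ends immediately before the annotated start codon and that the "|" marks where the count begins. The function name `spacer_from_marker` is hypothetical and is not part of DNA Master.

```python
def spacer_from_marker(upstream: str, marker: str = "|") -> int:
    """Count the bases from the marker to the end of the upstream sequence.

    Assumption: the sequence ends immediately before the start codon,
    so the number of bases after the marker is the distance being tested.
    """
    if upstream.count(marker) != 1:
        raise ValueError("expected exactly one marker in the sequence")
    _, after = upstream.split(marker)
    return len(after)


# Sequences and reported spacer distances copied from the examples above.
examples = [
    ("TCAAGACAA|GGAGACAAGCGCC", 13),
    ("ACCCGCTCACAGAGA|GGAAGAA", 7),
    ("AAACCATGAAA|GGAAGGCCATC", 11),
]

for seq, reported in examples:
    counted = spacer_from_marker(seq)
    print(f"{seq}: counted {counted}, reported {reported}")
```

Under those assumptions, the counted value equals the reported spacer distance in all three cases, which is consistent with the count starting at the "|" position rather than at the bracketed G.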

I'm also curious whether the spacer distances you see in the Karlin Medium Shine-Dalgarno spacer matrix in the local settings of DNA Master are counted the same way as the spacer distances in the "Choose ORF start" RBS window (accessed through the Frames window in DNA Master).

A few extra questions on RBS sequence analysis in DNA Master:

Does the z-score in the RBS table take the spacer distance into account? I know the final score does and the raw score does not, but I'm not sure about the z-score. Are there any good resources that could help students interpret these scores and connect them to some basic concepts in statistics?

Looking at this made me realize I also have some questions about the Kibler6 consensus sequence matrix and what it is looking for. The consensus sequence highlighted in the Kibler6 matrix cuts off earlier than I would have expected given other descriptions of the SD consensus sequences. See attached document for screenshots and descriptions of what I've been trying to sort out.

Thank you so much!

(Edit: seems my word doc attachment won't upload)
Edited 08 Feb, 2025 15:17
| posted 08 Feb, 2025 15:39
Hi Kristen,
I will send you to the literature for this. When Dr. Lawrence adopted this method of SD determination, he used the work of Dennis Kibler and Sam Karlin. I was under the impression that the spacer distance was the distance between the end of the Shine-Dalgarno sequence and the start codon. Since SD sequences rarely match the lambda sequence, and the imperfect matches have weighting schemes, I don't know how to count. The manual has a Kibler reference, and here is a Karlin reference.

Go to the preference settings of DNA Master and note that we have picked Kibler 6 and Karlin Medium (quite arbitrarily). But the matrices provided are quite involved.

Other questions:
The Z score uses the final score. But note that both numbers reflect similar relationships.
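
For students connecting these scores to basic statistics: in general, a z-score expresses how many standard deviations a value lies from the mean of its comparison set. The sketch below illustrates only that general definition. The candidate scores are made-up numbers, and which set of scores DNA Master uses for the mean and standard deviation is not stated in this thread, so both the data and the choice of the population standard deviation are assumptions.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Return the z-score of each value relative to the whole list.

    Assumption: the population standard deviation is used; DNA Master's
    own calculation may differ.
    """
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        raise ValueError("all scores are identical; z-scores are undefined")
    return [(s - mu) / sigma for s in scores]


# Hypothetical final scores for five candidate starts (not real DNA Master output).
hypothetical_final_scores = [-6.2, -5.1, -4.8, -3.0, -2.1]

for score, z in zip(hypothetical_final_scores, z_scores(hypothetical_final_scores)):
    print(f"final score {score:5.1f} -> z-score {z:+.2f}")
```

A candidate whose score is well above the mean of its comparison set gets a large positive z-score, which is one way to explain to students why a higher z-score marks a candidate that stands out from the alternatives.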

"Karlin Medium Shine-Dalgarno spacer matrix in the local settings of DNA Master" - the DNA Master settings (if you are using the most recent version of DNA Master) are the Karlin Medium scores, unless you change your settings in the preferences.

Remember, both Kibler and Karlin are applying math to a 'best fit' situation. Some may not be easy to recognize. And sometimes an SD sequence is not even in play, so the numbers are quite useless.

Let me know if you want to chat more about this.
best,
debbie
 