
The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.


Issues with BLAST results in PECAAN

| posted 12 Jun, 2018 19:08
After the symposium and faculty meeting, I'm definitely sold on using PECAAN for my annotations. I'm currently annotating a phage and I'm having a bit of trouble with the way the BLAST results work.

1) Most of my NCBI BLAST results don't have any phage alignments, though they do show up in the DNA Master BLAST (see attachment). Do I need to change settings or something to get these results to mirror those that I get in DNA Master?

2) PECAAN is exporting BLAST results whose boxes are not checked. My understanding is that only the checked information would be exported for use in parsing the DNA Master file.

Any help is greatly appreciated!

Cheers,

Evan
Edited 12 Jun, 2018 19:10
| posted 13 Jun, 2018 19:22
Great questions.

PECAAN does not save the actual alignments; it saves only the hit table with the results you see. If you want alignments, you have to use either the NCBI web-based BLAST or DNA Master.

As for the listing in the notes, the issue is the structure of the non-redundant database, which is the default database everyone searches with BLAST. This database is made much smaller and easier to search by combining all identical sequences into a single entry. Thus, if there are three phage genes with exactly the same amino acid sequence and you search the non-redundant protein database, you will get one hit with three names. In the NCBI web BLAST results you will see the title of just one sequence; the rest are somewhat hidden behind a link with a tiny triangle and wording like "see 2 more Title(s)". If you click the link, you will see the full descriptions for all the other proteins.
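
To make the collapsing idea concrete, here is a minimal Python sketch. It is not how NCBI actually builds the database; the function name `collapse_identical`, the titles, and the short sequences are all made up for illustration. It only shows how three identical sequences end up as one entry carrying three titles.

```python
from collections import defaultdict

def collapse_identical(records):
    """Group (title, sequence) pairs so each distinct sequence appears once.

    Hypothetical helper for illustration only; not NCBI's real procedure.
    """
    titles_by_seq = defaultdict(list)
    for title, seq in records:
        titles_by_seq[seq].append(title)
    # One entry per distinct sequence, carrying every title that shared it.
    return [(titles, seq) for seq, titles in titles_by_seq.items()]

# Made-up records: three phage proteins share one sequence, a fourth differs.
records = [
    ("gp12 [Phage A]", "MKTLLV"),
    ("gp15 [Phage B]", "MKTLLV"),
    ("gp9 [Phage C]", "MKTLLV"),
    ("gp3 [Phage D]", "MSTNPK"),
]

entries = collapse_identical(records)
for titles, seq in entries:
    extra = len(titles) - 1
    # Mimic the web view: show the first title and count the hidden ones.
    suffix = f" (see {extra} more title(s))" if extra else ""
    print(titles[0] + suffix)
```

Four submitted proteins become two database entries, and a search that matches the first entry returns a single hit with three names attached.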

For PECAAN, it was decided to explicitly list all three in the Description column, since any one of the three genes could carry the relevant functional annotation. For the same reason, all three are also used to create the full notes.

In your case, the two submitted proteins, "3-oxoadipate CoA-transferase" and "CoA transferase subunit A", are identical, so they collapse into one entry in the non-redundant database with two descriptions.
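
As a rough sketch of what listing every description means for your hit (this is an assumption about the general idea, not PECAAN's actual code; `expand_hit`, the dictionary layout, and the e-value are hypothetical placeholders):

```python
def expand_hit(hit):
    """Return one row per title for a hit that stands for several identical proteins.

    Hypothetical helper; PECAAN's real implementation is not shown in this thread.
    """
    return [
        {"description": title, "evalue": hit["evalue"]}
        for title in hit["titles"]
    ]

# One non-redundant entry with the two descriptions from this thread.
# The e-value is a placeholder, not a real result.
hit = {
    "titles": ["3-oxoadipate CoA-transferase", "CoA transferase subunit A"],
    "evalue": 1e-50,
}

rows = expand_hit(hit)
# Every description contributes to the notes, whichever one is displayed first.
notes = "; ".join(row["description"] for row in rows)
print(notes)
```

Both descriptions reach the notes because they belong to the same database entry, so they travel together with the one hit.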
Edited 13 Jun, 2018 19:28
| posted 13 Jun, 2018 19:33
Thanks so much Chris, that helps me understand it much better.
 