
The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

Welcome to the forums at seaphages.org. Please feel free to ask any questions related to the SEA-PHAGES program. Any logged-in user may post new topics and reply to existing topics. If you'd like to see a new forum created, please contact us using our form or email us at info@seaphages.org.

All posts created by ClaireRinehart

| posted 05 Jul, 2024 21:52
I have noticed that NCBI is revising whole blocks of phages and submitting them as new submissions with different accession numbers than the original SEA-PHAGES submissions. This is easy to detect when looking at the new NCBI file because a new reference (1) has been added ahead of the original reference (2):
LOCUS YP_010057231 46 aa linear PHG 10-JAN-2023
DEFINITION HNH endonuclease [Mycobacterium phage Cane17].
ACCESSION YP_010057231
VERSION YP_010057231.1
DBLINK BioProject: PRJNA485481
DBSOURCE REFSEQ: accession NC_054716.1
KEYWORDS RefSeq.
SOURCE Mycobacterium phage Cane17
ORGANISM Mycobacterium phage Cane17
Viruses; Duplodnaviria; Heunggongvirae; Uroviricota;
Caudoviricetes; Ceeclamvirinae; Bixzunavirus; Bixzunavirus cane17.
REFERENCE 1 (residues 1 to 46)
CONSRTM NCBI Genome Project
TITLE Direct Submission
JOURNAL Submitted (07-MAY-2021) National Center for Biotechnology
Information, NIH, Bethesda, MD 20894, USA
REFERENCE 2 (residues 1 to 46)
AUTHORS Fast,K.M., Castleberry,S., Jones,I.K., Larrimore,J.D., Long,C.A.,
Pritchett,N.C., Keener,T., Sandel,M.W., Bollivar,D.W.,
Garlena,R.A., Russell,D.A., Pope,W.H., Jacobs-Sera,D. and
Hatfull,G.F.
TITLE Direct Submission
JOURNAL Submitted (28-JUL-2018) Biology, Illinois Wesleyan University, 1312
Park Street, Bloomington, IL 61701, USA
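
For anyone who wants to flag these records in bulk, here is a minimal Python sketch of the check described above: it walks the REFERENCE blocks of a flat-file record and reports whether any of them lists the "NCBI Genome Project" consortium. The function names are hypothetical, the parsing is plain string matching rather than any NCBI library, and it assumes the record text looks like the excerpt in this post.

```python
import re

def reference_submitters(flatfile_text):
    """Return (reference_number, submitter) pairs from a GenBank/GenPept flat file.

    The submitter is the CONSRTM value if that line comes first in the
    reference block, otherwise the first AUTHORS line (continuation lines
    are ignored).
    """
    refs = []
    current = None
    for line in flatfile_text.splitlines():
        stripped = line.strip()
        ref_match = re.match(r"REFERENCE\s+(\d+)", stripped)
        if ref_match:
            current = {"num": int(ref_match.group(1)), "who": None}
            refs.append(current)
        elif current is not None and current["who"] is None:
            who_match = re.match(r"(CONSRTM|AUTHORS)\s+(.*)", stripped)
            if who_match:
                current["who"] = who_match.group(2)
    return [(r["num"], r["who"]) for r in refs]

def has_ncbi_resubmission(flatfile_text):
    """True if any reference names the NCBI Genome Project consortium."""
    return any(
        who is not None and "NCBI Genome Project" in who
        for _, who in reference_submitters(flatfile_text)
    )

# Shortened copy of the reference section shown in this post.
sample = """\
REFERENCE 1 (residues 1 to 46)
CONSRTM NCBI Genome Project
TITLE Direct Submission
REFERENCE 2 (residues 1 to 46)
AUTHORS Fast,K.M., Castleberry,S., Jones,I.K., Larrimore,J.D., Long,C.A.,
TITLE Direct Submission
"""

print(reference_submitters(sample))
print(has_ncbi_resubmission(sample))
```

A record that only carries the original author reference would return False, which is one rough way to separate original submissions from the revised ones.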

The major problem that I have with this is that we are not able to see the evidence or documentation that led to this huge change. This causes problems for our students, who see an overwhelming block of identical functions in NCBI, usually without noticing the original submissions with different functions. Sometimes the original submissions are visible in the NCBI BLAST results, as shown in the PECAAN output below:

HNH endonuclease [Mycobacterium phage Cane17]
>gb|AXQ51660.1| hypothetical protein SEA_CANE17_46 [Mycobacterium phage Cane17] >gb|QAY13996.1| hypothetical protein SEA_COLT_48 [Mycobacterium phage Colt]

and other times the original SEA_ evidence is sorted way down the list of results.
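
As a rough illustration of how the original SEA_ entries could be pulled to the top of a hit list, the Python sketch below splits a defline like the one above into its entries and sorts those carrying a SEA_ locus tag first. It assumes the `>gb|ACCESSION| function LOCUS_TAG [organism]` layout shown in this post; the regular expressions and function names are hypothetical and not part of PECAAN or BLAST.

```python
import re

# One entry looks like: >gb|AXQ51660.1| hypothetical protein SEA_CANE17_46 [Mycobacterium phage Cane17]
DEFLINE_ENTRY = re.compile(
    r">?gb\|(?P<accession>[^|]+)\|\s*(?P<function>.*?)\s*\[(?P<organism>[^\]]+)\]"
)
# Assumed locus tag shape: SEA_<PHAGENAME>_<gene number>
LOCUS_TAG = re.compile(r"\bSEA_[A-Z0-9]+_\d+\b")

def parse_defline(defline):
    """Split a merged BLAST defline into one dict per gb| entry."""
    entries = []
    for match in DEFLINE_ENTRY.finditer(defline):
        function = match.group("function")
        tag = LOCUS_TAG.search(function)
        if tag:
            # Separate the locus tag from the function text.
            function = function.replace(tag.group(0), "").strip()
        entries.append({
            "accession": match.group("accession"),
            "function": function,
            "locus_tag": tag.group(0) if tag else None,
            "organism": match.group("organism"),
        })
    return entries

def sea_entries_first(entries):
    """Stable sort that moves entries with a SEA_ locus tag to the front."""
    return sorted(entries, key=lambda entry: entry["locus_tag"] is None)

defline = (
    ">gb|AXQ51660.1| hypothetical protein SEA_CANE17_46 [Mycobacterium phage Cane17] "
    ">gb|QAY13996.1| hypothetical protein SEA_COLT_48 [Mycobacterium phage Colt]"
)

for entry in sea_entries_first(parse_defline(defline)):
    print(entry["accession"], entry["locus_tag"], entry["function"], sep=" | ")
```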

We instruct our students to go with the PhagesDB results that are supported by HHPred or CDD evidence.

The NCBI results are great for confirming the 1:1 start correlations.

Enjoy!
Claire Rinehart
Posted in: Functional Annotation > RefSeq and INSDC name disagreements in NCBI Blast for Functional Assignment
| posted 20 May, 2024 14:57
You can't enter un-called tRNAs into PECAAN. Sorry.
You can get PECAAN to rerun the tRNA and tmRNA search by clicking the Admin menu button and then selecting the Phage option. Next, enter the name of your phage into the search box and, once you see your phage, click the Edit button at the right-hand edge of your phage's line. You will then be given the option to 'Rerun the tRNA and tmRNA' search.
Posted in: PECAAN > New Features in PECAAN
| posted 27 Apr, 2024 01:25
I verified that the Starterator and Database are both version 561 but pham 161446 at URL: http://phages.wustl.edu/starterator/Pham161446Report.pdf gives me a 404 error, page not found.

I checked many of the other Starterator phams and they seem to be working now that you have updated.
Thanks for your help.
Posted in: Starterator > Pham not found in Starterator
| posted 02 Mar, 2024 22:24
Debbie,
I have also found that the starts of Luxx genes 21 and 22 have been changed to bring this genome into "conformity" with the other EE genomes. Looking at the data, I again disagree with these changes.
Please educate me on the rationale.
Thanks,
Claire Rinehart

Luxx Gene 21 (reverse gene)
Starterator calls start 15583, which has very poor Z- and Final Scores but captures all of the coding capacity. Start 15577 is just six bases shorter and has viable Z- and Final Scores, while capturing almost all of the coding capacity. The best-scoring start is at 15520, with excellent Z- and Final Scores, but it loses 63 bases of coding capacity that fall in the tail region of the typical plot but reach into the peak region of the atypical plot. My choice is start 15577.

Starterator
Info for manual annotations of cluster EE:
•Start number 3 was manually annotated 1 time for cluster EE.
•Start number 5 was manually annotated 9 times for cluster EE.
•Start number 6 was manually annotated 78 times for cluster EE.
•Start number 7 was manually annotated 2 times for cluster EE.
•Start number 10 was manually annotated 2 times for cluster EE.
•Start number 12 was manually annotated 5 times for cluster EE.
•Start number 14 was manually annotated 1 time for cluster EE.
•Start number 15 was manually annotated 2 times for cluster EE.

Gene: Luxx_21 Start: 15583, Stop: 15083, Start Num: 6
Candidate Starts for Luxx_21:
(2, 15784), (Start: 5 @15595 has 9 MA's), (Start: 6 @15583 has 78 MA's), (Start: 7 @15577 has 2 MA's), (9, 15559), (Start: 10 @15553 has 2 MA's), (Start: 12 @15520 has 5 MA's), (16, 15475), (17, 15439), (18, 15424), (19, 15397), (20, 15385), (22, 15343), (23, 15319), (25, 15289), (28, 15259), (29, 15250), (30, 15229), (31, 15226), (33, 15202), (38, 15118),

Ribosomal binding scores:
Direction  Start  Stop   Length  Gap   Spacer  Z-score  Final Score  Codon
Reverse    15784  15083  702     -122  14      2.391    -4.856       ATG
Reverse    15595  15083  513     67    7       0.6      -8.437       GTG
Reverse    15583  15083  501     79    12      0.503    -7.945       ATG
Reverse    15577  15083  495     85    10      1.201    -6.398       ATG
Reverse    15559  15083  477     103   10      0.549    -7.710       ATG
Reverse    15553  15083  471     109   10      1.201    -6.398       ATG
Reverse    15520  15083  438     142   13      2.987    -3.155       ATG
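
The trade-off between score and coding capacity for Luxx_21 can be worked through with a short Python sketch. The numbers are copied from the table above; the function names are hypothetical, and the gap formula (measured to the Luxx_22 stop at 15663) is an assumption inferred from the reported values rather than taken from PECAAN documentation. Final Scores are negative, so the sketch treats values closer to zero as better.

```python
GENE_STOP = 15083       # stop of Luxx_21 (reverse gene)
NEIGHBOR_STOP = 15663   # stop of Luxx_22; assumed reference point for the gap
CALLED_START = 15583    # start currently called by Starterator

# (start, z_score, final_score) copied from the ribosomal binding table
CANDIDATES = [
    (15784, 2.391, -4.856),
    (15595, 0.6, -8.437),
    (15583, 0.503, -7.945),
    (15577, 1.201, -6.398),
    (15559, 0.549, -7.710),
    (15553, 1.201, -6.398),
    (15520, 2.987, -3.155),
]

def reverse_length(start, stop=GENE_STOP):
    """Length in bases of a reverse-strand gene (start coordinate > stop coordinate)."""
    return start - stop + 1

def reverse_gap(start, neighbor_stop=NEIGHBOR_STOP):
    """Bases between this start and the neighboring stop; negative means overlap."""
    return neighbor_stop - start - 1

def bases_lost(candidate_start, called_start=CALLED_START):
    """Coding bases given up relative to the called start (negative means gained)."""
    return called_start - candidate_start

def summarize(candidates=CANDIDATES):
    """Build one row per candidate, best Final Score (closest to zero) first."""
    rows = [
        {
            "start": start,
            "length": reverse_length(start),
            "gap": reverse_gap(start),
            "lost": bases_lost(start),
            "z": z_score,
            "final": final_score,
        }
        for start, z_score, final_score in candidates
    ]
    return sorted(rows, key=lambda row: row["final"], reverse=True)

for row in summarize():
    print(row)
```

Start 15520 sorts first on score but shows 63 bases lost, while 15577 gives up only 6, which is the comparison behind the choice argued above.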

Luxx Gene 22 (reverse gene)
Starterator calls the start at 15893. As you can see below, start 15893 has one of the poorest Z- and Final Scores. A better choice would be 15923 or 15818. In looking at the coding capacity below, Start 15818 would give up a large portion of coding capacity. However, start 15923 would even capture the atypical coding capacity and is my start of choice.

Starterator
Gene: Luxx_22 Start: 15893, Stop: 15663, Start Num: 5
Candidate Starts for Luxx_22:
(3, 15935), (4, 15923), (Start: 5 @15893 has 100 MA's), (6, 15818), (7, 15809), (8, 15791), (10, 15740), (11, 15728),

Ribosomal binding scores:
Direction  Start  Stop   Length  Gap  Spacer  Z-score  Final Score
Reverse    15935  15663  273     468  15      1.495    -6.714
Reverse    15923  15663  261     480  15      2.237    -5.221
Reverse    15893  15663  231     510  15      0.936    -7.839
Reverse    15818  15663  156     585  9       2.137    -4.595
Reverse    15809  15663  147     598  18      2.137    -6.122
Edited 03 Mar, 2024 10:55
Posted in: Cluster EE Annotation Tips > Genome Curation - a must read!
| posted 02 Mar, 2024 21:55
Debbie,
I was using the GenBank submission of Luxx (cluster EE) to evaluate the annotations of our students' practice genomes that are based on Luxx. I found that gene 18 had a -34 gap start called instead of the -4 that we originally called.

Start  Stop   Length  Gap   Spacer  Z-score  Final Score  Codon  Direction
14090  14578  489     -259  6       0.875    -8.104       GTG    Forward
14315  14578  264     -34   16      1.587    -6.724       GTG    Forward
14327  14578  252     -22   10      1.637    -5.522       GTG    Forward
14345  14578  234     -4    16      2.066    -5.760       GTG    Forward
14426  14578  153     77    6       1.917    -6.007       ATG    Forward
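
To make the gap column concrete, here is a small Python sketch that recomputes the length of each candidate and backs out the upstream stop implied by each reported gap. The convention gap = start - upstream_stop - 1 is an assumption inferred from the numbers in the table, and the upstream stop itself is not stated in this post; the function names are hypothetical.

```python
GENE_STOP = 14578  # stop of Luxx gene 18 (forward gene)

# (start, reported_gap) copied from the table above
ROWS = [(14090, -259), (14315, -34), (14327, -22), (14345, -4), (14426, 77)]

def forward_length(start, stop=GENE_STOP):
    """Length in bases of a forward-strand gene."""
    return stop - start + 1

def implied_upstream_stop(start, gap):
    """Invert the assumed convention gap = start - upstream_stop - 1."""
    return start - gap - 1

def overlap_bases(gap):
    """A negative gap means the gene overlaps its upstream neighbor by that many bases."""
    return max(0, -gap)

# Every row should point at the same upstream stop if the convention holds.
implied = {implied_upstream_stop(start, gap) for start, gap in ROWS}
print(implied)  # {14348}

for start, gap in ROWS:
    print(start, forward_length(start), gap, overlap_bases(gap))
```

All five rows agree on a single implied upstream stop, so the -4 and -34 gaps correspond to 4-base and 34-base overlaps with the upstream gene under this convention.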

I read the Cluster EE forum notes and see that Luxx was modified to bring the group into "conformity". I would like to learn what would justify calling the gene 18 start at 14315, other than the fact that the other 84 genomes in Starterator use that site.

Thanks,
I am still learning.
Claire
Edited 02 Mar, 2024 22:42
Posted in: Cluster EE Annotation Tips > Genome Curation - a must read!
| posted 22 Mar, 2022 14:31
The NCBI BLAST is now operating and the backlog has been significantly reduced. Thanks again.

Claire
Posted in: PECAAN > PECAAN not BLASTing?
| posted 22 Mar, 2022 03:46
Thanks, Steve, for pointing this out. You are right. It has been a week for some submissions. We are working on getting the backlog resolved.

Claire
Posted in: PECAAN > PECAAN not BLASTing?
| posted 02 Jul, 2021 16:52
PECAAN has been modified to output the tRNA report so that it now passes the QC workflow.
Posted in: PECAAN > PECAAN and tRNA notes problem?
| posted 01 Jun, 2021 21:35
PECAAN Annotation Tutorial Videos
* Finding closest relatives https://youtu.be/5jqoHZwacAM
* Compare genome to closest relatives https://youtu.be/6dh9yiWR2yw
* How the locations of genes are predicted https://youtu.be/51YurlcyJKk
* How to add and delete genes https://youtu.be/aNmH541DGMA
* SEA PHAGES annotation guide https://youtu.be/4MYjl0T5cKY
* Starterator, PhagesDB & NCBI BLAST https://youtu.be/85JwOLoBwFU
* Gaps & Ribosomal Binding Sites https://youtu.be/dj-YcygwP3s
* GM coding capacity, LORF & start sites https://youtu.be/L_AIQ1rAUxg
* tRNA and tmRNA https://youtu.be/n07izcvyUGE
* Assigning a function https://youtu.be/VV97ZP7ZpG0

PECAAN Admin Tutorial Videos
* Why PECAAN? https://youtu.be/KlVepaPHA3g
* How to put a Phage Genome into PECAAN https://youtu.be/Vxf9Bs1QysY
* How to put Users into PECAAN https://youtu.be/3l1pMuMEXgg
* How to update Official Function List https://youtu.be/wiKv9A0cX_c
Edited 02 Jun, 2021 10:55
Posted in: PECAAN > YouTube Videos for Students and Faculty
| posted 11 Jul, 2020 23:59
Well, here we are again. This time the Cluster A1 phage is STLscum and it has all of the features described above. I notice that others fitting the criteria above have now called this a superinfection immunity protein or superinfection exclusion protein, including Swag_38, LastResort_38, Jabith_72, and Niza_72. All of these have 2-3 transmembrane domains and are good matches to pfam 14373. Would you reconsider naming this group of proteins "superinfection exclusion protein" after JSwag_38? This seems more appropriate since this protein contributes to the exclusion of super-infecting genomes to the periplasmic space.
Thanks,
Claire
Posted in: Request a new function on the SEA-PHAGES official list > Superinfection Immunity Protein