The official website of the HHMI Science Education Alliance-Phage Hunters Advancing Genomics and Evolutionary Science program.

All posts created by cdshaffer

| posted 25 Jan, 2025 17:52
Database 584 has been released and the following whole phage reports have been added to the collection. See the original message in this thread for more info and this folder to download your report:
https://wustl.box.com/v/Actino-phage

Phage in release 584:
Babushka, Chavito, CheeseDanish, DanHam62, Donatella, Egi03, Fongie,
GoodLuckBabe, Guinevere, HeadMave, HerbBucket, Iqorha, JellyBread, JimmyPG,
LadyJasley, Marlia, Mentos, Milomuff, MUWow, Panchaali, Parvaparticeps, Penoan,
Pize, Primadonna, PrinceCharming, Red305, Socotra, SoJulia, Solea,
Stormer, Trufflozitus, TwoBits, Violeta, Wogge42
Edited 06 Feb, 2025 06:18
Posted in: Starterator / Whole phage starterator reports
| posted 22 Jan, 2025 22:38
TL;DR: improvements to the back-end now allow for whole phage starterator reports to be generated for all new phage. See phage list at bottom to see if your phage is in release 583. Reports can be found at https://wustl.box.com/v/Actino-phage



I have had some time to update the infrastructure and get Starterator to run inside a Docker container. This has allowed me to move the processing to the WashU cluster with significantly more processing power.

I have added code to now run "Whole phage" reports in bulk on every new phage. As an initial release I am posting the whole phage reports for all phage which have been added since early Dec 2024.

The whole phage reports focus the Starterator results on a single phage and have three benefits. First, the reports sort the image section so that track #1 is the gene for your phage; users will not have to sort through the tracks to find their gene. Second, the summary notes have been shortened to cover only your phage, making this section shorter and easier to read. Finally, a table near the start of the report compares each currently annotated start to the other manually annotated starts from that pham and summarizes the comparison. This allows very rapid identification of starts that agree with the consensus found within the pham and those that do not.
The downside is that for large phage from large clusters the whole phage report itself can be large and cumbersome, so depending on your phage it may still work better for you and your students to use the individual pham reports.
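
To illustrate the kind of comparison that summary table makes, here is a minimal sketch. It is not Starterator's code: the function name, the integer start labels, and the simple "most frequently called start" rule are all assumptions made up for this example.

```python
from collections import Counter

def compare_to_consensus(my_start, other_manual_starts):
    """Compare one annotated start to the most common manually annotated start.

    Hypothetical helper: start labels are plain integers here, and the
    "consensus" is simply the most frequently called start in the pham.
    """
    counts = Counter(other_manual_starts)
    if not counts:
        return {"consensus": None, "agrees": None, "support": 0, "total": 0}
    consensus, support = counts.most_common(1)[0]
    return {
        "consensus": consensus,
        "agrees": my_start == consensus,
        "support": support,
        "total": len(other_manual_starts),
    }

# Made-up start calls for illustration only.
result = compare_to_consensus(5, [5, 5, 7, 5, 3])
print(result)
```

A start that disagrees with the consensus is not necessarily wrong; it is just the one worth a closer look.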

I have generated reports for all phage added since early December. See the complete list below.

Here is the folder:
https://wustl.box.com/v/Actino-phage
P.S. If you have an older phage and would like a whole phage starterator report, let me know by email and I will add it to the next run. All questions, comments, and suggestions should also be sent to me.

Chris Shaffer

Here is the list of phage found in this collection:

Ablatia, Acai, AdaS, Akino08, Alatato, Amo99, AngelCake, Artiphact, Astralis,
Axiom, BabyMoney, BaronJohn, Beleetus, Belieber, BetterYeti, BigDome, Bimmel,
BlessJoy, BlueShadow, Bouchard, CallaLilly, CarisSwetlik, Carrillo, Caspiboi,
ChickenTender, Chilliams, Chlochlo, ChocoMunchkin, Circuit, Citrus, CookieDog,
CupcakePrincess, Dancer, Demure, Destructrice, DillyDally, DirtPie, Dodo,
DogYard, EGUnicorn, ELee24, ElectricPheel, EmoNemo, ForDig, Forrestell,
FosterFrank, FreakyGoo, FuegoCuervo, GUPitcher, Gandionco, Garnacho, Gavriela,
Glorp, Glotell, GoldDust, GoldenEssence, Goodgraces, GoongGoong, Gravel,
Greenbelt, Guzman, Gwennie1, Halloweekend, HamCheese, HazuAndZazu,
HollowPurple, HotPotato, Ichiang, Invectra, Iridessa, IsHungry, Issa7, Jabb,
Jakelyne, Jankie, Jesabah, JessellCookie, Jezreel, Jingles, KNG13, KSunshine22,
KendraB23, KevinMinion, Khan1, Kinny, Koan, Kolhan, Kropertea, Kubulix,
LadyAstra, LeoJr, Lethe, LilSmirk, Linayshia, Lishka, LordBart, Losacky,
LucySwiss, Lukepolites, MacKat, MajinBuu, MakCheese, Marsus, Maruru,
MaterMagnus, Mesmerelda, MiamiPanther, Moonflower, MsUbiquitous, Myram,
Natasha, Nathan, Navi1117, NoSwimming, Nostromo, Nova53, November, Ollypop,
Olympi, Pelletreau, Phanita, PhedwardCullen, PhigPhack, PhillyJawn, Phroglets,
Pochacco, Poochiewood, Powelldog, PrairieDogTown, Raid, Rattail, RazorC, RazzB,
RedRaider, RenegadeRaider, Rikishi, Riptide, Rita130, Rockabye, RustyBoy,
SVoro, Scissor2024, Scotia, Scuba, Seldom, SenorClean, Shaffner, ShawBrad,
Shrubaron, Shuckle, ShyRosie, SilentWarrior, SirFrank, Skitty, Smelly,
Softsoap, Spooks, Starburst, StuartMinion, Sunshine23, SwainyDoc, SwissCheezer,
Symere, Talon44, Talos, TicTac, TinaBug, TinyTerror, Toad24, Toneprano,
Towmatter, TownLake, Trackstar, TriFive, Tubberson, Underpass, Utopia,
Vitaenoii, Vitus, Vivum, WaddleDee, Wardwill, Westrich, Westy, Windest,
Worcestershire, Wrackline, Yeshua, Zahlia, Zebo, Zhuangyuan, Zippen, Zodiariah
Edited 22 Jan, 2025 22:40
Posted in: Starterator / Whole phage starterator reports
| posted 09 Jul, 2024 21:25
The RecA-like proteins in the BN cluster are not placed in the same pham as the highly likely RecA (exemplified by Spud_205). See forum post #5567 for details.
As such, care should be taken when annotating a protein as a RecA, and the less specific term "ASCE ATPase" should be used unless there is clear evidence for the presence of all the important features needed to support the RecA annotation (again, see topic 5567).
Posted in: Cluster BN Annotation Tips / RecA in cluster BN
| posted 29 Jun, 2024 18:59
Recent crystal structures have supported the annotation of both a large and a small terminase subunit. See crystal 7JOQ. If you have sufficiently good matches to this crystal, it can support the identification of the small terminase in BE phage. If you have a small terminase, be sure to call the other terminase the large terminase. If you can only find support for one, simply annotate it as "terminase".
Edited 03 Jul, 2024 17:03
Posted in: Cluster BE Annotation Tips / terminase
| posted 08 Jun, 2024 13:50
As of 2020, we are calling all single endolysins simply "endolysin". See this discussion for details: https://seaphages.org/forums/topic/4656/
Posted in: Cluster BL Annotation Tips / lysin A
| posted 08 Jun, 2024 13:44
As of 2020, we are calling all single endolysins simply "endolysin". See this discussion for details: https://seaphages.org/forums/topic/4656/
Posted in: Cluster BG Annotation Tips / lysin A
| posted 28 May, 2024 17:57
As far as I know, there is no way to use just the graphical interface to get the list of all proteins with a certain functional call. You can do this easily with the command line. Just to get you started, I created a few files; this is really easy to do if you know how to search the Actino_draft database.

First, I checked the variation in the terms that you might be interested in. Here is a file with all the various functional terms in use that include "primase": http://phages.wustl.edu/primase_terms.txt

Next, here is a list of all phams where at least one member is annotated with a term that starts with "DNA primase". There are 70 of those phams that you could look at in more detail: http://phages.wustl.edu/phams_with_primase.txt

Finally, I created a long tab-delimited list that reports the phage, the genes, the phams, and the function where the function starts with "DNA primase…". This list is just over 3600 entries, so you would probably want to download it, open it in Excel or similar, and filter. You can download it here: http://phages.wustl.edu/primase_phams.txt
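
If you would rather filter the tab-delimited list with a script than in Excel, a minimal Python sketch might look like the following. The column order (phage, gene, pham, function) is an assumption based on the description above, the function name is hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io

def filter_by_function(lines, prefix="DNA primase"):
    """Return rows whose function column starts with `prefix`.

    Assumes four tab-separated columns in this order:
    phage, gene, pham, function (an assumption about the file layout).
    """
    reader = csv.reader(lines, delimiter="\t")
    return [row for row in reader if len(row) >= 4 and row[3].startswith(prefix)]

# Made-up rows for illustration only; not real database entries.
sample = io.StringIO(
    "PhageA\tPhageA_10\t1234\tDNA primase\n"
    "PhageA\tPhageA_11\t5678\tDNA helicase\n"
    "PhageB\tPhageB_52\t1234\tDNA primase/helicase\n"
)
hits = filter_by_function(sample)
for phage, gene, pham, function in hits:
    print(phage, gene, pham, function, sep="\t")
```

To run it on the downloaded file, open the file (for example `open("primase_phams.txt", newline="")`) and pass the file object in place of `sample`. Check the real column order first.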
Posted in: Functional Annotation / GG cluster DNA primse/helicase
| posted 26 Apr, 2024 17:26
You can always run the search manually:

1. In PECAAN, go to the sequence tab, then select and copy the amino acid sequence.
2. Go to the HHPRED web server: https://toolkit.tuebingen.mpg.de/tools/hhp
3. Paste in your sequence.
4. Select databases: for the typical search, I have students add two databases (UniProt and Pfam) to search in addition to the default PDB database. You add these by selecting them in the "Select structural/domain databases" menu.
5. Click submit.
6. Wait. The time varies, but a search usually takes 1 to 3 minutes.
Results are kept for a few days (links will be in the left column if you come back later from the same computer). Most of the scores you see in PECAAN are in the "Hitlist" section of the results.
As a bonus, at the top you will see a graphical representation of the locations of the hits which can help you see the overall domain structure of your protein. And at the bottom you get full alignments which can be helpful in a deep dive into exactly what does and does not match between your protein and the hit.
Posted in: PECAAN / HHPred not updating in PECAAN
| posted 28 Feb, 2024 18:10
Yes. If you are using a dot plot tool to compare genomes and it checks both strands, you are good. In your case, if large sections of one genome are inverted in another genome (and thus on the other strand), this will be seen in the dot plot as long diagonal lines that change slope from positive to negative.

However, the protocols as posted on QUBES use Gepard, which is really fast but only compares the top strand of each sequence. So to look for similarity when you suspect one sequence is inverted, you would need to compare the reverse complement of one of the phage to the normal strand sequence of the other.

Other programs like NCBI BLASTN compare both strands (use the "compare two sequences" check box). BLASTn can be quite a bit slower when dealing with multiple phage sequences, and may fail totally if your sequences are too long, but if you want to look for large-scale similarity and you are not sure which strand to look at, BLASTn will probably do better. I would do an initial assessment with BLAST on a single genome vs. a single genome, and once I knew which strands to compare I could do the final comparisons in Gepard.
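
If you need the reverse complement of a genome for a top-strand-only comparison, a minimal Python sketch is below. The function names are my own, the FASTA handling assumes a single record, and ambiguity codes other than N are passed through unchanged; treat it as an illustration rather than part of the QUBES protocol.

```python
# Complement table for DNA, including N; other ambiguity codes are left unchanged.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence string."""
    return seq.translate(COMPLEMENT)[::-1]

def reverse_complement_fasta(text):
    """Reverse-complement a single-record FASTA string, keeping the header."""
    lines = text.strip().splitlines()
    header, seq = lines[0], "".join(lines[1:])
    rc = reverse_complement(seq)
    # Re-wrap the sequence at 60 characters per line.
    wrapped = "\n".join(rc[i:i + 60] for i in range(0, len(rc), 60))
    return f"{header} reverse complement\n{wrapped}\n"

# Toy sequence for illustration only.
example = ">demo\nATGCCGTA\nNNAC\n"
print(reverse_complement_fasta(example))
```

Save the output as a new FASTA file and load it as one of the two sequences in the dot plot.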
Edited 28 Feb, 2024 22:48
Posted in: Bioinformatic Tools and Analyses / Phage Comparative Genomics Lab Manual - QUBES Resource
| posted 22 Feb, 2024 21:31
When I use that sequence in an HHPRED search, I get an alignment to roughly the first half of crystal 5LD9, the JAMM/MPN(+) protease (amino acids 10-90). On the PDB page for the crystal, it looks like the crystal has the same amino acid coordinates as the native protein, so I can use those ~10-90 coordinates when I look at the literature on this protein. According to this paper, the active site residues of the JAMM protease motif are ExnHxHx7Sx2D. This motif has a nice match in the phage protein (the HxH are at 73 and 75, and the S and D are also there at the correct distance), so I think this phage protein is also, like JAMM/MPN(+), a metalloprotease.

So now the question is more an issue of nomenclature/semantics. Should there be two terms in the approved list (something like "metalloprotease HEXXH type" and "metalloprotease EHHSD type"), or should we lump the HEXXH and EHHSD types together under the same "metalloprotease" term and update the approved terms list to say something like "Typically has HEXXH motif, but other metalloprotease motifs (e.g. ExnHxHx7Sx2D) have been described and can be used to support this function if present"?
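
For anyone who wants to check a protein for either motif quickly, here is a minimal Python sketch using regular expressions. The patterns are my own reading of the motifs as written above, the 1-to-80 residue bound on the variable "xn" spacer is an arbitrary assumption, and the demo sequence is synthetic. A regex hit by itself does not establish function; it only tells you where to look in the alignment.

```python
import re

# HEXXH: H, E, any two residues, H.
HEXXH = re.compile(r"HE..H")

# ExnHxHx7Sx2D: E, a variable spacer (bounded at 1-80 residues here, an
# assumption), then H-x-H, seven residues, S, two residues, D.
JAMM = re.compile(r"E.{1,80}?(H).(H).{7}(S).{2}(D)")

def jamm_positions(protein):
    """Return 1-based positions of the H, H, S, D residues of the first match, or None."""
    m = JAMM.search(protein)
    if m is None:
        return None
    return tuple(m.start(i) + 1 for i in range(1, 5))

# Synthetic sequence built to contain the JAMM-style motif; not a real protein.
demo = "MK" + "E" + "A" * 20 + "HAH" + "A" * 7 + "S" + "AA" + "D" + "KK"
print("JAMM-style motif (H, H, S, D positions):", jamm_positions(demo))
print("HEXXH present:", HEXXH.search(demo) is not None)
```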
Edited 22 Feb, 2024 21:40
Posted in: Functional Annotation / Metalloprotease without HEXXH motif?