4. Using PSORTdb
cPSORTdb is designed to allow convenient retrieval of PSORTb predictions by genome. Newly sequenced genomes from NCBI are automatically analyzed by PSORTb v.3, and the results are stored in cPSORTdb. You can access the data by typing the first few letters of your genome of interest and selecting it from the drop-down list of genomes.
When you select a genome name from the precomputed genomes page or from a general search result, you will be taken to the genome summary page, which lists the following information for the genome:
A: Predicted localization breakdown - a pie chart showing the percentage of proteins predicted to reside in each of the possible localizations for the listed genome
B: Basic information for the organism - taxonomic information, organism type and number of predicted genes in the genome
C: Prediction results - number of proteins within the genome predicted to be localized to each of the possible localizations for the organism type. Each localization label links to a list of all the proteins predicted to reside in that localization. Both PSORTb v.3 and PSORTb v.2 predictions are available for all genomes.
D: Download - click the button to download a tab-delimited version of prediction results for all the proteins in this genome.
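The downloaded tab-delimited file can be summarized offline to reproduce the per-localization counts and percentages shown on the summary page. The following is a minimal sketch only: the column names `Protein`, `Localization` and `Score`, the function name `localization_breakdown`, and the sample rows are all assumptions made for illustration, so check the header line of your own download and pass the matching column name.

```python
import csv
import io
from collections import Counter


def localization_breakdown(handle, column="Localization"):
    """Return {localization: (count, percent)} for a tab-delimited results file.

    `column` is an assumed header name; adjust it to match your download.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    if not reader.fieldnames or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in header: {reader.fieldnames}")
    counts = Counter(row[column] for row in reader)
    total = sum(counts.values())
    return {loc: (n, 100.0 * n / total) for loc, n in counts.items()}


# Invented sample rows, used only to show the expected shape of the input.
SAMPLE = (
    "Protein\tLocalization\tScore\n"
    "example_protein_1\tCytoplasmic\t9.97\n"
    "example_protein_2\tCytoplasmic\t8.96\n"
    "example_protein_3\tExtracellular\t9.73\n"
    "example_protein_4\tCytoplasmic membrane\t10.00\n"
)

breakdown = localization_breakdown(io.StringIO(SAMPLE))
for loc, (n, pct) in sorted(breakdown.items()):
    print(f"{loc}: {n} proteins ({pct:.1f}%)")
```

For a real download, replace `io.StringIO(SAMPLE)` with an open file handle, for example `open(path, newline="")`.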
A: Database selection - allows the user to choose between ePSORTdb and cPSORTdb.
B: Drop-down menu - selects a field to search against.
C: Drop-down menu of selections or autocomplete text field - as the user starts entering keywords, it suggests the keywords most likely to match the letters entered.
D: Add or remove button - allows the user to add further search criteria to narrow down a search.
Database selection: Searches can be carried out against ePSORTdb and cPSORTdb (A).
Flexible database browsing options: PSORTdb can be browsed by selecting one or more of the search categories listed in the table below. When you choose a particular field from the drop-down menu (B), a drop-down menu of possible values appropriate to this field, or a text field with autocomplete capabilities (C), appears on the right, showing you the correct keywords or syntax to use. You can then either highlight keywords from the drop-down menu or type your own keywords into the text field (C), and then choose the appropriate taxonomic name for the organism or organism group you are looking for.
The following table lists the possible searching criteria used for browsing PSORTdb:
Field name | Possible values |
Localization | Archaea / Gram-positive bacteria: Cytoplasmic, Cytoplasmic membrane, Cell wall, Extracellular; Gram-negative: Cytoplasmic, Cytoplasmic membrane, Periplasmic, Outer membrane, Extracellular; Gram-positive with outer membrane: same as Gram-negative; Gram-negative with no outer membrane: Cytoplasmic, Cytoplasmic membrane, Extracellular |
Secondary localization | Flagellar, Fimbrial, Host-associated, Phage-associated, Capsule, Gas vesicle, OMV, T3SS, Spore, Spore outer coat, Toga, S-layer |
Domain | Archaea or Bacteria |
Protein name | Name of protein |
Gram stain | Gram-positive or Gram-negative |
Organism name | Name of organism as listed in NCBI completed microbial genomes |
Phylum name | Name of phylum from the list of NCBI completed microbial genomes |
Class name | Name of class from the list of NCBI completed microbial genomes |
Order name | Name of order from the list of NCBI completed microbial genomes |
Family name | Name of family from the list of NCBI completed microbial genomes |
Genus name | Name of genus from the list of NCBI completed microbial genomes |
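The Localization row of the table depends on organism type, which is easy to get wrong when preparing a search. As an illustrative sketch, the mapping can be written as a lookup table and used to check a value before searching. The dictionary keys and the function names here are labels chosen for this example, not identifiers used by PSORTdb.

```python
# Localization values per organism type, transcribed from the table above.
_GRAM_NEGATIVE = (
    "Cytoplasmic",
    "Cytoplasmic membrane",
    "Periplasmic",
    "Outer membrane",
    "Extracellular",
)
_GRAM_POSITIVE = (
    "Cytoplasmic",
    "Cytoplasmic membrane",
    "Cell wall",
    "Extracellular",
)

# Keys are hypothetical labels for this sketch.
ALLOWED_LOCALIZATIONS = {
    "Archaea": _GRAM_POSITIVE,
    "Gram-positive": _GRAM_POSITIVE,
    "Gram-negative": _GRAM_NEGATIVE,
    "Gram-positive with outer membrane": _GRAM_NEGATIVE,
    "Gram-negative with no outer membrane": (
        "Cytoplasmic",
        "Cytoplasmic membrane",
        "Extracellular",
    ),
}


def allowed_localizations(organism_type):
    """Return the localization values listed for an organism type."""
    try:
        return ALLOWED_LOCALIZATIONS[organism_type]
    except KeyError:
        raise ValueError(f"unknown organism type: {organism_type!r}") from None


def is_valid_localization(organism_type, localization):
    """True if the localization is listed for the given organism type."""
    return localization in allowed_localizations(organism_type)


print(is_valid_localization("Gram-negative", "Periplasmic"))
print(is_valid_localization("Gram-positive", "Periplasmic"))
```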
A brief explanation of columns from left to right:
RefSeq Accession - NCBI's RefSeq accession (identifier) for each protein.
Name - protein name with a hyperlink to protein information and details of PSORTb prediction results.
Organism - organism to which the protein belongs. Clicking on the organism name leads to the PSORTb prediction summary page for that particular genome.
Localization - localization prediction results made by PSORTb.
Score - confidence score of PSORTb's subcellular localization prediction result.
Clicking on one of the column headers sorts the results by that column. Results of browsing and keyword searches can be viewed page by page. The user can also download the result list by clicking on the button in the top right corner of the page.
Both ePSORTdb and cPSORTdb datasets can be searched using BLASTP (searching the protein database using amino acid sequence input) and BLASTX (searching the protein database using nucleotide sequence input). One or more sequences can be submitted at the same time; these must be in FASTA format. A sequence within a FASTA file consists of three parts:
A title line, which must begin with a '>' symbol, and may be followed by any type of text.
A newline character at the end of the title line.
The sequence itself, which continues until the end of the file or the next '>' is reached.
An example of FASTA format is shown below:
>sp|O52956|A85A_MYCAV Antigen 85-A precursor (85A)
MTLVDRLRGAVAGMPRRLVVGAAGAALLSGLIGAVGGSATAGAFSRPGLPVEYLQVPSAAMG
RDIKVQFQSGGANSPALYLLDGMRAQDDFNGWDINTPAFEWYNQSGISVAMPVGGQSSFYSD
WYKPACGKAGCTTYKWETFLTSELPQYLSAQKQVKPTGSGVVGLSMAGSSALILAAYHPDQF
VYAGSLSALLDPSQGMGPSLIGLAMGDAGGYKAADMWGPKEDPAWARNDPSLQVGKLVANNT
RIWVYCGNGKPSDLGGDNLPAKFLEGFVRTSNLKFQDAYNGAGGHNAVWNFDANGTHDWPYW
GAQLQAMKPDLQSVLGATPGAGPATAAATNAGNGQGT
For more information, see the description at NCBI.
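Before submitting several sequences at once, it can help to check locally that the input splits into records as expected. The sketch below follows the three-part structure described above (title line starting with '>', newline, sequence up to the next '>' or the end of the file). The function name `parse_fasta` and the toy records are made up for illustration, and this is not the parser used by PSORTdb or BLAST.

```python
def parse_fasta(text):
    """Split FASTA-formatted text into a list of (title, sequence) tuples."""
    records = []
    title, chunks = None, []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new title line closes the previous record, if any.
            if title is not None:
                records.append((title, "".join(chunks)))
            title, chunks = line[1:], []
        elif title is None:
            raise ValueError("sequence data found before the first '>' title line")
        else:
            chunks.append(line)
    if title is not None:
        records.append((title, "".join(chunks)))
    return records


# Toy input: two short, invented records; the first spans two sequence lines.
EXAMPLE = """>seq1 example protein
MTLVDRLRGA
VAGMPRRLVV
>seq2
MKKLA
"""

for title, sequence in parse_fasta(EXAMPLE):
    print(title, len(sequence))
```

Sequence lines are joined without separators, so line wrapping in the input does not affect the resulting sequence.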
Results of a BLAST search are presented through the standard BLASTP layout, displaying the retrieved proteins with their associated parameters.
Summary of the retrieved proteins with their Score and E-value. Clicking on the name of the protein (e.g. antigen 85-A) will take you to the complete PSORTdb entry for the protein. Click on "BLAST Alignment" to view the sequence alignment of your query protein with the listed BLAST hit. Selecting the genome name will take you to the genome summary page. You can also click the "BLAST Report" tab in the top right corner to view the plain text version of the BLAST results.
From the Downloads tab, the user can choose to download any of the following:
For individual prokaryotic genomes, a FASTA file of all predicted proteins in the genome, as well as PSORTb v3.0 prediction results for each genome.
Complete ePSORTdb and cPSORTdb dataset (PSORTb v3.0).
We're always looking for new proteins to add to our database!
If you think you've got a good candidate, please submit it to us!