Proteomic Tool Kit

Tools and resources for the analysis, visualization and characterization of proteomic data, including publicly available proteomic databases of interest to plant science researchers.

1001 Proteomes

An Arabidopsis thaliana non-synonymous SNP browser


AraCyc

AraCyc is a metabolic pathway database for Arabidopsis thaliana that contains information about both predicted and experimentally determined pathways, reactions, compounds, genes and enzymes. The Omics viewer is software for displaying large-scale data, such as microarray gene expression results or proteomic data, in the context of biochemical pathways.


ARAMEMNON

ARAMEMNON is a database of plant membrane proteins that uses Arabidopsis thaliana as the reference model plant. The database also holds all putative membrane proteins of five other plant species, including rice and maize. It stores data on protein topology, predicted homologies and sequences.


AraPPINet

Protein-protein interactions (PPIs) play fundamental roles in various cellular processes. Here, we present a new version of computational interactome that contains more than 345,000 predicted PPIs involving about 51.2% of the Arabidopsis proteins.

Compared to the earlier version, the updated AraPPINet displays a higher accuracy in predicting protein interactions through performance evaluation with independent datasets.


AT_CHLORO

The AT_CHLORO database stores sub-plastidial (envelope, stroma, thylakoids; Ferro et al., Mol Cell Proteomics 2010; Bruley et al., Frontiers Plant Sci 2012) and sub-thylakoidal (grana, stroma-lamellae) localization of Arabidopsis thaliana chloroplast proteins obtained from quantitative proteomics experiments, as well as curated function and localization of proteins (Tomizioli et al., Mol Cell Proteomics 2014).

The AT_CHLORO database links to other public web sites (TAIR, PPDB, AtProteome, SUBA, POGs, Aramemnon).

The MASCP Gator (Joshi et al., Plant Physiol. 2011), the aggregation portal for the visualization of Arabidopsis proteomics data, gathers information from a variety of online proteomics resources including AT_CHLORO.

The AT_CHLORO database stores information for proteins that have been identified in different sub-fractions obtained from A. thaliana chloroplasts (Ferro et al., 2010).


BRENDA

Comprehensive enzyme information edited from the scientific literature for many proteins across a range of organisms including Arabidopsis.

From the abstract: The BRENDA (BRaunschweig ENzyme Database) enzyme information system is the main collection of enzyme functional and property data for the scientific community. The content covers information on function, structure, occurrence, preparation and application of enzymes as well as properties of mutants and engineered variants. The number of manually annotated references is more than 100,000, the number of ligand structures almost 100,000. BRENDA now provides new viewing options such as the display of the statistics of functional parameters and the 3D view of protein sequence and structure features. Furthermore a ligand summary shows comprehensive information on the BRENDA ligands. The enzymes are linked to their respective pathways and can be viewed in pathway maps. It is possible to submit new, not yet classified enzymes to BRENDA, which then are reviewed and classified by the International Union of Biochemistry and Molecular Biology.


Center for Eukaryotic Structural Genomics (CESG)

The Center for Eukaryotic Structural Genomics aims to increase the production of available 3-D protein structures. As part of this project, CESG has produced a large number of Arabidopsis ORF Gateway clones, protein expression clones, small amounts of purified protein and over 40 3-D structures. Information on all ORFs studied to date is available by BLAST search. Protocols for producing recombinant proteins are also available.


eGenPub, a text mining system for extending computationally mapped bibliography for UniProt Knowledgebase by capturing centrality



Links to an extensive range of proteomics analysis software.

IntAct of EMBL and EBI

Protein-Protein Interaction Database from a range of organisms, including Arabidopsis.

From the abstract: IntAct is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. As from September 2011, IntAct contains approximately 275 000 curated binary interaction evidences from over 5000 publications.

MIAPE: The Minimum Information About a Proteomics Experiment

MIAPE sets the standards for describing proteomic experiments and samples. The website includes links to a list of papers published in Nature Biotechnology on standards for different proteomic analysis techniques. 


MRMaid

MRMaid helps you design SRM assays by suggesting peptides and product ions to monitor based on millions of experimental spectra from the PRIDE database.


Super-database containing information on the Arabidopsis thaliana proteome.

From the abstract: The pep2pro dataset is an organ-specific characterisation of the Arabidopsis thaliana proteome containing 14522 identified proteins based on 2.6 million peptide spectrum assignments. This dataset provides evidence of protein expression and reveals organ-specific processes.

PhosPhAt, the Arabidopsis Protein Phosphorylation Site Database

Phosphorylation site database: The Arabidopsis Protein Phosphorylation Site Database (PhosPhAt 4.0) contains information on Arabidopsis phosphorylation sites that were identified by mass spectrometry in large-scale experiments by different research groups. Specific information about the peptide properties, their annotated biological function and the experimental and analytical context is given. For a majority of peptides, the annotated mass spectrum is displayed interactively.

Phosphorylation site predictor: The PhosPhAt service has a built-in plant-specific phosphorylation site predictor trained on the experimental dataset for serine, threonine and tyrosine phosphorylation (pSer, pThr, pTyr). Protein sequences or Arabidopsis AGI gene identifiers can be submitted to the predictor.
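Since resources such as this predictor accept AGI gene identifiers, it can help to check a list of identifiers locally before submitting it. The following is a minimal sketch in Python, not part of PhosPhAt itself. It assumes the usual AGI layout of "AT", a chromosome code (1 to 5, C or M), "G", five digits and an optional ".N" gene model suffix; the function names are hypothetical.

```python
import re

# Assumed AGI layout: AT + chromosome (1-5, C, M) + G + five digits,
# optionally followed by a ".N" gene model suffix (e.g. AT1G01010.1).
AGI_PATTERN = re.compile(r"AT[1-5CM]G\d{5}(\.\d+)?", re.IGNORECASE)


def normalize_agi(identifier):
    """Return the upper-cased identifier if it looks like an AGI code, else None."""
    candidate = identifier.strip()
    if AGI_PATTERN.fullmatch(candidate):
        return candidate.upper()
    return None


def split_valid_invalid(identifiers):
    """Separate a list of raw strings into normalized AGI codes and rejects."""
    valid, invalid = [], []
    for raw in identifiers:
        agi = normalize_agi(raw)
        if agi is None:
            invalid.append(raw)
        else:
            valid.append(agi)
    return valid, invalid


if __name__ == "__main__":
    good, bad = split_valid_invalid(["at1g01010.1", "ATCG00020", "not-a-gene"])
    print("valid:", good)
    print("rejected:", bad)
```

A check like this only confirms the format of an identifier; it says nothing about whether the locus exists in a given genome annotation.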


Plant Protein Phosphorylation DataBase

P3DB version 2.0 hosts protein phosphorylation data for 6 species from 23 experimental studies, containing 11,601 phosphoproteins, harboring 32,963 phosphosites. Datasets for a number of plant species are available, including Arabidopsis, rice and maize. 

From the abstract: With a web-based user interface, the database is browsable, downloadable and searchable by protein accession number, description and sequence. A BLAST utility was integrated and a phosphopeptide BLAST browser was implemented to allow users to query the database for phosphopeptides similar to protein sequences of their interest.


plprot

The plastid protein database lets users run a BLAST search or search for plastid type information across several plant plastid proteomes.

From the abstract: plprot was established as a plastid proteome database to provide information about the proteomes of chloroplasts, etioplasts and undifferentiated plastids.


ProMEX

ProMEX is a mass spectral reference database. The database consists of tryptic peptide fragmentation mass spectra derived from plants.
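For readers unfamiliar with the term, tryptic peptides are the fragments produced when trypsin cuts a protein after lysine (K) or arginine (R). The sketch below illustrates that idea with a simple in silico digest in Python. It is an illustration only, not how ProMEX or any other listed resource generates its data, and it assumes the commonly used simplified rule that trypsin does not cleave when the next residue is proline (P). The function name and the example sequence are hypothetical.

```python
def tryptic_digest(sequence, min_length=1):
    """Split a protein sequence into peptides using a simplified trypsin rule.

    Assumed rule: cleave after K or R unless the following residue is P.
    Missed cleavages and modifications are not modelled.
    """
    seq = sequence.upper()
    peptides = []
    start = 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    # Keep whatever remains after the last cleavage site.
    if start < len(seq):
        peptides.append(seq[start:])
    return [p for p in peptides if len(p) >= min_length]


if __name__ == "__main__":
    # Made-up sequence, used only to show the cleavage rule.
    for peptide in tryptic_digest("MKRPAKGLLR"):
        print(peptide)
```

Real search engines also allow for missed cleavages and length limits, so the peptides observed in a spectral library will not always match such an idealised digest.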

Proteomics Standards Initiative

The Proteomics Standards Initiative (PSI) aims to define community standards for data representation in proteomics to facilitate data comparison, exchange and verification.


Structures of 33 Arabidopsis thaliana proteins described and visualised.

Seed Proteome

Seed proteome databases for Arabidopsis and sugar beet (coming soon). Includes protein catalogues and protocols. 

SUBA, SUB-cellular location database for Arabidopsis proteins

A tool to investigate subcellular localisation of proteins in Arabidopsis through the unification of disparate datasets. The web accessible interface allows the construction of powerful user based queries resulting in a one-stop-shop for protein localisation.

From the abstract: The localisation data in SUBA encompasses 10 distinct subcellular locations, >6743 non-redundant proteins and represents the proteins encoded in the transcripts responsible for 51% of Arabidopsis expressed sequence tags. The SUBA database provides a powerful means by which to assess protein subcellular localisation in Arabidopsis.

The Plant Proteome Database

PPDB is a Plant Proteome DataBase for Arabidopsis thaliana and maize (Zea mays) allowing users to search protein-encoding gene models in Arabidopsis, maize and rice. Every predicted protein in all species can be searched for experimental and other information (even if not experimentally identified).

From the abstract: Experimental identification is based on in-house mass spectrometry (MS) of cell type-specific proteomes (maize), or specific subcellular proteomes (e.g. chloroplasts, thylakoids, nucleoids) and total leaf proteome samples (maize and Arabidopsis). So far more than 5000 accessions both in maize and Arabidopsis have been identified. In addition, more than 80 published Arabidopsis proteome datasets from subcellular compartments or organs are stored in PPDB and linked to each locus.

The Predicted Arabidopsis Interactome Resource (PAIR)

A database of Arabidopsis protein-protein interactions, predicted and experimentally reported, collected from the major interaction databases.

From the abstract: The predicted Arabidopsis interactome resource comprises 5990 experimentally reported molecular interactions in Arabidopsis thaliana together with 145,494 predicted interactions. PAIR predicts interactions by a fine-tuned support vector machine model that integrates indirect evidence for interaction, such as gene co-expressions, domain interactions, shared GO annotations, co-localizations, phylogenetic profile similarities and homologous interactions in other organisms (interologs). These predictions were expected to cover 24% of the entire Arabidopsis interactome, and their reliability was estimated to be 44%. PAIR features a user-friendly query interface, providing rich annotation on the relationships between two proteins. A graphical interaction network browser has also been integrated into the PAIR web interface to facilitate mining of specific pathways.
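To put the figures quoted above in perspective, the short Python sketch below works through the arithmetic they imply. It rests on two assumptions that are not stated in the source: that "reliability" means the fraction of predicted interactions that are real, and that "coverage" means the fraction of the whole interactome represented by those real predictions. Under a different reading of either term the numbers would change.

```python
PREDICTED_INTERACTIONS = 145_494  # predicted interactions quoted for PAIR
RELIABILITY = 0.44                # assumed: fraction of predictions that are real
COVERAGE = 0.24                   # assumed: fraction of the interactome covered


def expected_true_predictions(predicted, reliability):
    """Number of predictions expected to be real interactions."""
    return predicted * reliability


def implied_interactome_size(predicted, reliability, coverage):
    """Interactome size implied if the real predictions make up `coverage` of it."""
    return expected_true_predictions(predicted, reliability) / coverage


if __name__ == "__main__":
    true_hits = expected_true_predictions(PREDICTED_INTERACTIONS, RELIABILITY)
    total = implied_interactome_size(PREDICTED_INTERACTIONS, RELIABILITY, COVERAGE)
    print(f"expected real interactions among predictions: {true_hits:,.0f}")
    print(f"implied size of the full interactome: {total:,.0f}")
```

On this reading, roughly 64,000 of the predictions would be real, which would put the full Arabidopsis interactome in the region of 267,000 interactions.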

UniProt Knowledgebase

The mission of UniProt is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information. As well as the protein knowledgebase, there is a tool to search for sequence clusters and a sequence archive.