
Commonly Used Websites and Databases for Proteomics

Date: 2015-09-15 Author: Leading Biology Click: 2636



1. Protein Databases

a) UniProt (The Universal Protein Resource) 

Website: http://www.uniprot.org/     

               http://www.ebi.ac.uk/uniprot/

Brief introduction: The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. Its mission is to provide the scientific community with a high-quality, freely accessible resource of protein sequence and functional information. UniProt is a collaboration between the European Bioinformatics Institute (EMBL-EBI), the SIB Swiss Institute of Bioinformatics and the Protein Information Resource (PIR). It comprises three components: the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), and the UniProt Archive (UniParc). The UniProt website provides detailed information on protein function, domain structure, post-translational modifications, modification sites, sequence variation, secondary and tertiary structures, etc. It also links to other databases, including sequence databases, three-dimensional structure databases, 2-D gel electrophoresis databases, and protein family databases.
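
Sequences downloaded from UniProtKB in FASTA format carry their accession, entry name and a few annotation tags in the header line. The following is a minimal sketch of how such a header could be split into fields. The function name parse_uniprot_header and the example header are illustrative assumptions, and the regular expressions cover only the common "sp|..." and "tr|..." header layout, not every variant.

```python
import re

# Assumed layout: >db|ACCESSION|ENTRY_NAME protein name OS=... OX=... GN=... PE=... SV=...
HEADER_RE = re.compile(r"^>(sp|tr)\|([^|]+)\|(\S+)\s+(.*)$")


def parse_uniprot_header(header):
    """Split a UniProtKB-style FASTA header into its main fields (hypothetical helper)."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        raise ValueError("not a UniProtKB-style FASTA header: %r" % header)
    db, accession, entry_name, rest = match.groups()
    # Two-letter KEY=value tags (OS = organism, GN = gene name) follow the protein name.
    tags = dict(re.findall(r"\b([A-Z]{2})=(.+?)(?=\s[A-Z]{2}=|$)", rest))
    protein_name = re.split(r"\s[A-Z]{2}=", rest, maxsplit=1)[0]
    return {
        "reviewed": db == "sp",  # "sp" marks Swiss-Prot (reviewed), "tr" marks TrEMBL
        "accession": accession,
        "entry_name": entry_name,
        "protein_name": protein_name,
        "organism": tags.get("OS"),
        "gene": tags.get("GN"),
    }


# Illustrative header line, used here only as sample input.
example = ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2"
record = parse_uniprot_header(example)
print(record["accession"], record["gene"], record["organism"])
```

The accession extracted this way is the identifier that the cross-linked databases mentioned above typically use to refer back to a UniProtKB entry.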


b) PIR (Protein Information Resource)

Website: http://pir.georgetown.edu/

Brief introduction: The Protein Information Resource (PIR) is an integrated public bioinformatics resource that supports genomic, proteomic and systems biology research. It is dedicated to providing timely, high-quality and extensive annotation. It consists of iProClass, PIRSF, PIR-PSD, PIR-NREF, and UniProt. It is also cross-referenced with more than 90 biological databases covering protein families, protein functions, protein networks, protein interactions and genomes.


c) BRENDA (enzyme database)

Website: http://www.brenda-enzymes.org

Brief introduction: BRENDA is the main collection of enzyme functional data available to the scientific community. This database provides enzyme classification and nomenclature, as well as data on biochemical reactions, specificity, structure, cell localization, extraction methods, literature, applications, modifications, and related diseases.


d) CORUM (collection of experimentally verified mammalian protein complexes)

Website: http://mips.gsf.de/genre/proj/corum/index.html

Brief introduction: CORUM is a database of mammalian protein complexes. It provides data on protein complex names, subunits, functions, related literature, etc.


e) CyBase (cyclic protein database)

Website: http://research1t.imb.uq.edu.au/cybase

Brief introduction: CyBase is a database of cyclic proteins. It provides data on the sequences and structures of cyclic proteins, along with prediction services.


f) DB-PABP

Website: http://pabp.bcf.ku.edu/DB_PABP/

Brief introduction: DB-PABP is a database of polyanion-binding proteins. Interactions between polyanion-binding proteins and polyanions play an important role in intracellular localization, transport, and protein folding. In addition, many proteins associated with neurodegenerative diseases are polyanion-binding proteins. This database provides data on identified polyanion-binding proteins and is cross-referenced to the NCBI protein database.


g) IUPHAR-DB

Website: http://www.iuphar-db.org

Brief introduction: IUPHAR-DB is a database of G protein-coupled receptors (GPCRs) and ion channels. It provides a comprehensive description of the genes and their functions, with information on protein structures, ligands, expression profiles, signal transduction mechanisms, and diversity.


h) GLIDA

Website: http://pharminfo.pharm.kyoto-u.ac.jp/services/glida/

Brief introduction: GLIDA is short for GPCR-Ligand Database. It was developed for those who work in GPCR-related drug discovery and need information on both GPCRs and their known ligands. It provides data on GPCRs, ligands and their interactions, homologous receptor networks, and conserved recognition regions in order to support new drug discovery.


i) LOCATE

Website: http://locate.imb.uq.edu.au/

Brief introduction: LOCATE is a curated database that houses data describing the subcellular localization of mammalian proteins. The subcellular locations are determined by a high-throughput, immunofluorescence-based assay and by manual review of peer-reviewed publications.


j) InterPro

Website: http://www.ebi.ac.uk/interpro/

Brief introduction: InterPro is a resource that provides functional analysis of protein sequences by classifying them into families and predicting the presence of domains and important sites. It is used by research scientists interested in the large-scale analysis of whole proteomes, genomes and metagenomes, as well as by researchers seeking to characterise individual protein sequences. To classify proteins in this way, InterPro uses predictive models, known as signatures, provided by several different databases (referred to as member databases) that make up the InterPro consortium.


k) OKCAM

Website: http://okcam.cbi.pku.edu.cn

Brief introduction: OKCAM is an ontology-based, human-centered knowledgebase for cell adhesion molecules (CAMs). CAMs are essential elements of cell-cell communication and are important for the proper development and plasticity of a variety of organs and tissues. OKCAM provides ready access to CAM data and to the associated ontology system.


2. Proteome Databases

a) GELBANK 

Website: http://gelbank.anl.gov

Brief introduction: GELBANK is a publicly available database of annotated two-dimensional gel electrophoresis (2DE) patterns of biological systems from organisms with completed genomes. The database can be searched by relative molecular mass, isoelectric point and protein sequence for quick retrieval.
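
A search by relative molecular mass and isoelectric point amounts to a range filter over gel spot records. The sketch below illustrates that idea only. The GelSpot class, the find_spots function and the three records are made up for illustration and do not reflect GELBANK's actual schema or interface.

```python
from dataclasses import dataclass


@dataclass
class GelSpot:
    """Hypothetical record for one annotated spot on a 2DE gel."""
    name: str
    mass_kda: float  # relative molecular mass, in kDa
    pi: float        # isoelectric point


def find_spots(spots, mass_range, pi_range):
    """Return the spots whose mass and pI both fall inside the given closed ranges."""
    low_mass, high_mass = mass_range
    low_pi, high_pi = pi_range
    return [
        spot for spot in spots
        if low_mass <= spot.mass_kda <= high_mass and low_pi <= spot.pi <= high_pi
    ]


# Made-up example data, not taken from any database.
spots = [
    GelSpot("spot_A", 15.1, 8.7),
    GelSpot("spot_B", 42.0, 5.3),
    GelSpot("spot_C", 66.5, 5.8),
]

hits = find_spots(spots, mass_range=(40.0, 70.0), pi_range=(5.0, 6.0))
print([spot.name for spot in hits])
```

Because experimental mass and pI estimates from gels are approximate, a range query of this kind is usually more practical than an exact match.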


b) SWISS-2DPAGE

Website: http://www.expasy.org/ch2d/

Brief introduction: SWISS-2DPAGE is an annotated two-dimensional polyacrylamide gel electrophoresis (2-D PAGE) and SDS-PAGE database established in 1993. It contains data on proteins identified on various 2-D PAGE and SDS-PAGE reference maps from human, mouse, Escherichia coli, Saccharomyces cerevisiae and Dictyostelium discoideum. You can locate these proteins on the 2-D PAGE maps or display the region of a 2-D PAGE map where one might expect to find a protein from UniProtKB/Swiss-Prot.


c) SysPIMP (Systematical Platform for Identifying Mutated Proteins)

Website: http://pimp.starflr.info/

Brief introduction: The Systematical Platform for Identifying Mutated Proteins (SysPIMP) is a web-based platform for efficiently identifying human disease-related mutated sequences from mass spectrometry (MS) data. When an amino acid residue of a protein changes, the masses of the affected peptides change as well, so a disease-related mutation can be detected from the resulting change in the mass spectrum.
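
The mass change caused by a single amino acid substitution can be worked out from standard monoisotopic residue masses. The sketch below is an illustration of that arithmetic, not SysPIMP's own algorithm. The function names are hypothetical, and the peptide pair (a glutamate-to-valine substitution) is chosen only as an example.

```python
# Standard monoisotopic residue masses in daltons (amino acid minus one water).
MONOISOTOPIC = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of the water added for the free N- and C-termini


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(MONOISOTOPIC[residue] for residue in sequence) + WATER


def substitution_shift(reference, variant):
    """Mass shift produced by replacing one residue with another."""
    return MONOISOTOPIC[variant] - MONOISOTOPIC[reference]


# Illustrative peptide pair differing by one E -> V substitution.
wild_type = "VHLTPEEK"
mutant = "VHLTPVEK"

shift = peptide_mass(mutant) - peptide_mass(wild_type)
print("mass shift: %.5f Da" % shift)
```

A search engine that looks for mutated peptides can use shifts like this one to explain spectra that do not match the reference sequence. Note that leucine and isoleucine have identical masses, so an L/I substitution cannot be detected by mass alone.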


d) Sys-BodyFluid

Website: https://omictools.com/sys-bodyfluid-tool    

Brief introduction: Sys-BodyFluid is a systematic database for human body fluid proteome research. It contains data on several body fluid proteomes, including plasma/serum, urine, human milk, saliva, bronchoalveolar lavage fluid, synovial fluid, nipple aspirate fluid, tear fluid, seminal fluid, cerebrospinal fluid and amniotic fluid.


e) BloodExpress

Website: http://hscl.cimr.cam.ac.uk/bloodexpress/

Brief introduction: BloodExpress is a web server for browsing gene expression in mouse blood cells. It integrates 271 individual microarray experiments derived from 15 distinct studies covering most characterised mouse blood cell types. Gene expression information has been discretised to absent/present/unknown calls.
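
Discretising expression values into absent/present/unknown calls can be pictured as a simple thresholding step. The sketch below shows one possible scheme. The function name and both thresholds are hypothetical placeholders, since the article does not state how BloodExpress derives its calls.

```python
def call_expression(signal, absent_below=50.0, present_above=200.0):
    """Map a microarray signal to an absent/present/unknown call.

    The thresholds are arbitrary illustrative values, not BloodExpress parameters.
    """
    if signal is None:
        return "unknown"  # no measurement available
    if signal < absent_below:
        return "absent"
    if signal > present_above:
        return "present"
    return "unknown"      # ambiguous intermediate signal


# Made-up signals for one gene across four cell types.
signals = {"cell_type_1": 12.0, "cell_type_2": 480.0, "cell_type_3": 120.0, "cell_type_4": None}
calls = {cell: call_expression(value) for cell, value in signals.items()}
print(calls)
```

Reducing values to three states makes experiments from different studies and platforms easier to compare, at the cost of discarding quantitative detail.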


f) CentrosomeDB (human centrosomal proteins database)

Website: http://centrosome.dacya.ucm.es

Brief introduction: The human centrosomal proteins database (CentrosomeDB) is a new-generation database of centrosomal proteins for human and Drosophila melanogaster.


g) ConsensusPathDB

Website: http://cpdb.molgen.mpg.de

Brief introduction: ConsensusPathDB is a meta-database that integrates different types of functional interactions from heterogeneous interaction data resources. Physical protein interactions, metabolic and signaling reactions and gene regulatory interactions are integrated in a seamless functional association network that simultaneously describes multiple functional aspects of genes, proteins, complexes, metabolites, etc. With 155,432 human, 194,480 yeast and 13,648 mouse complex functional interactions (originating from 18 databases on human and eight databases each on yeast and mouse interactions), ConsensusPathDB was, at the time of writing, the most comprehensive publicly available interaction repository for these species. It is cross-referenced with multiple databases and provides network data such as protein interactions, biochemical reactions and gene regulation.


h) Proteome Analysis Database

Website: http://www.ebi.ac.uk/proteome/

Brief introduction: The Proteome Analysis Database is an online application of InterPro and CluSTr for the functional classification of proteins in whole genomes. The SWISS-PROT group at EBI developed the Proteome Analysis Database using existing resources to provide comparative analysis of the predicted protein coding sequences of the complete genomes of bacteria, archaea and eukaryotes. The two main projects used, InterPro and CluSTr, give a new perspective on families, domains and sites and cover 31-67% (InterPro statistics) of the proteins from each of the complete genomes. CluSTr covers the three complete eukaryotic genomes and the incomplete human genome data. The Proteome Analysis Database is accompanied by a program designed to carry out InterPro proteome comparisons of any one proteome against one or more of the other proteomes in the database.


i) HPRD (Human Protein Reference Database)

Website: http://www.hprd.org/

Brief introduction: The Human Protein Reference Database (HPRD) represents a centralized platform to visually depict and integrate information pertaining to domain architecture, post-translational modifications, interaction networks and disease association for each protein in the human proteome.


j) NOPdb

Website: http://www.lamondlab.com/NOPdb3.0/

Brief introduction: The Nucleolar Proteome Database (NOPdb) archives data on more than 700 proteins identified by multiple mass spectrometry (MS) analyses of highly purified preparations of human nucleoli, the most prominent nuclear organelle. It is searchable by gene name, nucleotide or protein sequence, Gene Ontology term or motif, or by limiting the range of isoelectric points and/or molecular weights, and it links to other databases (e.g. LocusLink, OMIM and PubMed).


k) EndoNet

Website: http://endonet.bioinf.med.uni-goettingen.de/ 

Brief introduction: EndoNet is an information resource about regulatory networks of cell-to-cell communication. It provides information on hormones, hormone receptors, the sources (i.e. cells, tissues and organs) where the hormones are synthesized and secreted, and where the respective receptors are expressed. Also, this database focuses on the regulatory relations between them.


3. Protein Interaction and Network Databases

a) 3DID (3D interacting domains)

Website: http://3did.irbbarcelona.org         

               http://gatealoy.pcb.ub.es/3did/

Brief introduction: The database of three-dimensional interacting domains (3DID) is a collection of domain-domain interactions in proteins for which high-resolution three-dimensional structures are known. It describes GO-based functional annotations and interactions between yeast proteins from large-scale interaction discovery studies. Also, it contains templates for interactions between two globular domains as well as novel domain-peptide interactions. Fast retrieval can be performed by domain or motif name, protein sequence, GO code, PDB ID and Pfam code.


b) DOMINE

Website: http://domine.utdallas.edu

Brief introduction: DOMINE is a comprehensive collection of known and predicted protein domain (domain-domain) interactions. It provides interactions inferred from PDB entries, and those that are predicted by 13 different computational approaches using Pfam domain definitions. DOMINE contains a total of 26,219 domain-domain interactions (among 5,410 domains) out of which 6,634 are inferred from PDB entries, and 21,620 are predicted by at least one computational approach.
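
The three counts quoted above overlap: the structure-inferred and predicted sets add up to more than the total. Assuming every interaction in DOMINE belongs to at least one of the two categories (an assumption, not something the description states), the size of the overlap follows from inclusion-exclusion, as this short worked calculation shows.

```python
# Counts quoted in the DOMINE description.
total_interactions = 26219
inferred_from_pdb = 6634
predicted = 21620

# Inclusion-exclusion, assuming each interaction is inferred, predicted, or both.
both = inferred_from_pdb + predicted - total_interactions
pdb_only = inferred_from_pdb - both
predicted_only = predicted - both

print("inferred from PDB and also predicted:", both)
print("inferred from PDB only:", pdb_only)
print("predicted only:", predicted_only)
```

Under that assumption, roughly a third of the structure-inferred interactions are also recovered by at least one computational method.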


c) PiSite (Database of Protein interaction sites)

Website: http://pisite.hgc.jp

Brief introduction: PiSite is a web-based database of protein interaction sites that uses multiple binding states in the PDB. PiSite provides not only the interaction sites of a protein from a single PDB entry, but also the interaction sites of a protein gathered from multiple PDB entries, including similar proteins. In PiSite, the binding sites of protein chains are identified by first searching the PDB for the same proteins in different binding states, and then mapping those binding sites onto the query proteins.


d) Binding MOAD

Website: http://www.BindingMOAD.org

Brief introduction: The Mother of All Databases (MOAD) is a subset of the Protein Data Bank (PDB), containing every high-quality example of ligand-protein binding. Binding MOAD's goal is to be the largest collection of well resolved protein crystal structures with clearly identified biologically relevant ligands annotated with experimentally determined binding data extracted from literature. This database can provide related ligands for proteins of known structure with detailed annotations and experimentally derived affinity data.


e) Phospho.ELM

Website: http://phospho.elm.eu.org

Brief introduction: Phospho.ELM is a manually curated database of eukaryotic phosphorylation sites. This resource includes data collected from published literature as well as high-throughput data sets. The entries provide information about the phosphorylated proteins and the exact position of known phosphorylated instances, the kinases responsible for the modification (where known) and links to bibliographic references.


f) SuperSite

Website: http://bioinformatics.charite.de/supersite

Brief introduction: SuperSite is a database of metabolite and drug binding sites in proteins. It provides information on binding mechanisms, recognition mechanisms, and conserved binding sites.


g) STITCH

Website: http://stitch.embl.de/

Brief introduction: STITCH is a database of known and predicted interactions between chemicals and proteins. It is the sister resource of STRING, a database of protein-protein interaction networks integrated over the tree of life. The STRING database aims to provide a critical assessment and integration of protein-protein interactions, including direct (physical) as well as indirect (functional) associations. Version 10.0 of STRING covers more than 2000 organisms, which has necessitated novel, scalable algorithms for transferring interaction information between organisms.


h) Reactome

Website: http://www.reactome.org

Brief introduction: Reactome is an open-source, open-access, manually curated and peer-reviewed pathway database. Its goal is to provide intuitive bioinformatics tools for the visualization, interpretation and analysis of pathway knowledge to support basic and clinical research, genome analysis, modeling, systems biology and education. The cornerstone of Reactome is a freely available, open-source relational database of signaling and metabolic molecules and their relations, organized into biological pathways and processes. It provides network diagrams of biochemical processes, detailed annotation of the protein molecules involved in them, and extensive cross-references to other databases such as UniProt, KEGG, OMIM, etc.


i) PID (Pathway Interaction Database)

Website: http://pid.nci.nih.gov

Brief introduction: The Pathway Interaction Database (PID) is a resource that supports access to and display of information about biomolecular interactions and cellular processes assembled into signaling pathways. It was developed jointly by NCI and Nature. It provides known pathways for human cell signal transduction, regulation of activity and major cellular processes, and can be queried by entering a molecule name or a process name.


j) UniHI (Unified Human Interactome database)

Website: http://www.unihi.org

Brief introduction: UniHI (Unified Human Interactome database) is a database of human protein-protein interactions that can be queried by protein name, pathway, etc. Users can enter gene or protein identifiers from different organisms to obtain physical and regulatory interaction partners in the human interactome. UniHI integrates human protein-protein and transcriptional regulatory interactions from 15 distinct resources. The UniHI database includes tools (i) to search for molecular interaction partners of query genes or proteins in the integrated dataset, (ii) to inspect the origin, evidence and functional annotation of retrieved proteins and interactions, (iii) to visualize and adjust the resulting interaction network, (iv) to filter interactions based on method of derivation, evidence and type of experiment as well as on gene expression data or gene lists, and (v) to analyze the functional composition of interaction networks.


k) VirHostNet

Website: http://virhostnet.prabi.fr/

Brief introduction: VirHostNet release 2.0 is a knowledgebase dedicated to the network-based exploration of virus-host protein-protein interactions. The new interface is based on the Cytoscape web library and provides user-friendly access to a complete and accurate resource of virus-virus and virus-host protein-protein interactions, as well as their projection onto the corresponding host cell protein interaction networks. The database can be queried by gene, protein or pathway keywords.


l) Bionemo (molecular information on biodegradation metabolism) 

Website: http://bionemo.bioinfo.cnio.es

Brief introduction: Molecular information on biodegradation metabolism (Bionemo) database mainly provides information on proteins and genes directly implicated in biodegradation metabolism, including information on protein sequence, domain, structure; gene sequence, regulatory elements and transcription unit. In addition to this, it also includes biodegradation metabolic pathway maps, related biochemical reactions, etc.


m) PMAP

Website: http://www.proteolysis.org

Brief introduction: The Proteolysis MAP (PMAP) is an integrated web resource focused on proteases. PMAP is a database for analyzing proteolytic events and pathways, intended to help protease researchers reason about proteolytic networks and metabolic pathways.


4. Protein 3D Structure Databases

a) PDB (Protein Data Bank)

Website: http://www.rcsb.org/pdb

Brief introduction: The Protein Data Bank (PDB) was established as the first open-access digital data resource in all of biology and medicine. Today it is a leading global resource for experimental data central to scientific discovery. PDB is a database of biological macromolecular structures, providing three-dimensional structural data, sequence details, and biochemical properties of biological macromolecules such as proteins and nucleic acids. Through an internet information portal and downloadable data archive, the PDB provides access to 3D structure data for large biological molecules (proteins, DNA, and RNA), the molecules of life found in all organisms on the planet.


b) iSARST

Website: http://sarst.life.nthu.edu.tw/   

               http://140.113.15.73/~lab/iSARST/srv/

Brief introduction: The integrated Service of Structural similarity search Aided by Ramachandran Sequential Transformation (iSARST) is an efficient protein structure alignment database. In this service, developers implement two protein structural similarity search methods, SARST and CPSARST. Besides, three outstanding structural alignment tools, FAST, TM-align and SAMO, are recruited as refinement engines. SE is applied to improve structure-based sequence alignments.


5. Protein Motif Databases

a) CDD (Conserved Domain Database)

Website: http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml

Brief introduction: The Conserved Domain Database (CDD) is a resource for the annotation of functional units in proteins. Its collection of domain models includes a set curated by NCBI, which uses 3D structure to provide insights into sequence/structure/function relationships. Users can obtain the conserved domain information contained in a protein sequence through the CD-Search service, and use it to analyze and predict the function of the protein.


b) Blocks

Website: http://blocks.fhcrc.org

Brief introduction: Blocks is a database of conserved regions of protein families. It contains ungapped aligned segments (blocks) corresponding to the most highly conserved regions.


c) CPDB (database of circular permutation in proteins)

Website: http://sarst.life.nthu.edu.tw/cpdb

Brief introduction: CPDB is the Circular Permutation Database. A circular permutation (CP) of a protein can be visualized as if its original termini were linked and new ones created elsewhere. Many well-known protein families have CP members, and some studies indicate that the protein structure database contains many instances of CP, yet efficient CP search tools are rarely available. CPSARST (Circular Permutation Search Aided by Ramachandran Sequential Transformation) is one such tool.
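
The "link the termini and cut elsewhere" picture of a circular permutation can be demonstrated on plain sequences. The sketch below is a deliberately simplified toy that only detects exact circular shifts of a string. It is not how CPSARST works (that tool compares structures and tolerates divergence), and the function name and sequences are made up.

```python
def circular_shift(original, candidate):
    """Return the cut position k such that candidate == original[k:] + original[:k].

    Returns None when candidate is not an exact circular shift of original.
    A result of 0 means the two sequences are identical.
    """
    if len(original) != len(candidate):
        return None
    # Joining the sequence to itself mimics linking the original termini:
    # every circular shift then appears as a substring of the doubled sequence.
    position = (original + original).find(candidate)
    return position if position != -1 else None


# Made-up sequences for illustration.
native = "MKTAYIAKQR"
permuted = "IAKQRMKTAY"   # new N-terminus created at position 5 of the native chain
unrelated = "MKTAYIAKRQ"

print(circular_shift(native, permuted))
print(circular_shift(native, unrelated))
```

Real circular permutants rarely share identical sequences, which is why dedicated structure-based search tools are needed in practice.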


d) MegaMotifbase

Website: http://caps.ncbs.res.in/MegaMotifbase/index.html

Brief introduction: MegaMotifbase is a database of structural motifs for protein structures related at the family and/or superfamily level. MegaMotifbase also provides three-dimensional orientation patterns of the identified motifs in terms of inter-motif distances and torsion angles. Important applications of structural motifs are also provided in several crucial areas such as similar sequence and structure search, multiple sequence alignment and homology modeling.


e) Minimotif Miner

Website: http://mnm.engr.uconn.edu

Brief introduction: Minimotif Miner (MnM) analyzes protein queries for the presence of short contiguous peptide motifs that have a known function in at least one other protein (minimotifs). Minimotif functions include post-translational modification (PTM) of the minimotif, binding to a target protein or molecule, and protein trafficking. It is a motif detection database that provides a service for searching for motifs in protein sequences.
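
Scanning a query sequence for short contiguous motifs can be sketched with regular expressions. The example below is an illustration of the general idea rather than MnM's implementation. The scan_motifs function is hypothetical, the query sequence is made up, and the single pattern used is the widely known N-glycosylation sequon written in regular-expression form, not an entry taken from MnM.

```python
import re

# Motif name -> regular expression. N-x-[ST]-x with neither x being proline.
MOTIFS = {
    "N-glycosylation sequon": r"N[^P][ST][^P]",
}


def scan_motifs(sequence, motifs):
    """Return (motif name, 1-based start position, matched text) for every hit.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    hits = []
    for name, pattern in motifs.items():
        for match in re.finditer("(?=(%s))" % pattern, sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return hits


# Made-up query sequence.
query = "MKNASGPNPTLNGTA"
for name, start, text in scan_motifs(query, MOTIFS):
    print(name, start, text)
```

Short patterns like this match by chance quite often, so a hit indicates a candidate site that still needs supporting evidence.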


f) Pfam

Website: https://pfam.xfam.org/

Brief introduction: The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Pfam version 32.0 was produced at the European Bioinformatics Institute using a sequence database called Pfamseq, which is based on UniProt release 2018_04.


6. Predictive Databases

a) InterPreTS (Interaction Prediction through Tertiary Structure)

Website: http://www.russell.embl.de/cgi-bin/interprets2

Brief introduction: InterPreTS (Interaction Prediction through Tertiary Structure) is a web-based service for predicting protein-protein interactions. Given a pair of query sequences, it first searches for homologues in a database of interacting domains (DBID) derived from known three-dimensional complex structures.


b) Predictome

Website: http://predictome.bu.edu

Brief introduction: Predictome is a database of putative functional links between the proteins of 44 genomes. The links are based on three computational methods (chromosomal proximity, phylogenetic profiling and domain fusion) and on large-scale experimental screenings of protein-protein interaction data.
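
Of the three computational methods, phylogenetic profiling is the easiest to show in a few lines: proteins that are present and absent in the same set of genomes are proposed to be functionally linked. The sketch below is a toy version of that idea under stated assumptions. The function name, the mismatch cutoff and the profiles are all made up, and Predictome's actual scoring is not described in this article.

```python
from itertools import combinations


def profile_links(profiles, max_mismatches=0):
    """Propose links between proteins whose presence/absence profiles nearly agree.

    profiles maps a protein name to a tuple of 0/1 flags, one per genome.
    """
    links = []
    for (name_a, prof_a), (name_b, prof_b) in combinations(sorted(profiles.items()), 2):
        mismatches = sum(a != b for a, b in zip(prof_a, prof_b))
        if mismatches <= max_mismatches:
            links.append((name_a, name_b))
    return links


# Made-up profiles across five genomes (1 = homologue present, 0 = absent).
profiles = {
    "protA": (1, 0, 1, 1, 0),
    "protB": (1, 0, 1, 1, 0),
    "protC": (0, 1, 1, 0, 1),
    "protD": (1, 0, 1, 0, 0),
}

print(profile_links(profiles))
print(profile_links(profiles, max_mismatches=1))
```

With only five genomes, identical profiles arise easily by chance. The signal becomes meaningful as the number of genomes grows, which is one reason such links are treated as putative.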
