2. Structural Databases
Introduction
Biological structural databases are specialized repositories that store three-dimensional (3D)
structural data for biomolecules such as proteins, nucleic acids, and their complexes. These
databases describe the spatial arrangement of atoms within molecules, which helps researchers
understand molecular functions and interactions and supports drug discovery. Structural
databases are essential for bioinformatics, computational biology, and structural
bioinformatics, enabling researchers to:
Visualize molecular structures.
Predict protein-ligand interactions.
Model unknown structures based on known homologs.
Understand disease mechanisms at the molecular level.
Classification of Structural Databases
Structural databases are categorized based on the type of biological data they store and analyze.
Category                                      Examples                                       Description
Protein Structure Databases                   RCSB PDB, PDBe, PDBsum                         Experimentally determined 3D protein structures.
Protein Classification Databases              SCOP, CATH, Pfam                               Structural domains and evolutionary relationships.
Nucleic Acid Structure Databases              NDB, RNAcentral                                3D structures of DNA and RNA molecules.
Protein-Ligand Interaction Databases          BindingDB, ChEMBL, DrugBank                    Information on protein-small molecule interactions.
Computational Structure Prediction Databases  AlphaFold DB, ModBase, SWISS-MODEL Repository  Predicted protein structures and homology models.
Tools for Analyzing Structural Data
PyMOL
Chimera
Discovery Studio Visualizer
2.1. Retrieve and Visualize a Protein Structure from PDB
Aim: To retrieve the 3D structure of a protein from the Protein Data Bank (PDB) and visualize
it using a PDB viewer (PyMOL/Chimera).
Protocol:
1. Go to PDB website: https://www.rcsb.org
2. Search for a protein (e.g., hemoglobin) by entering its PDB ID (1A3N) in the search box.
3. Filter the results using the options in the left-side panel.
4. Select one of the filtered entries and take a snapshot of its experimental data and validation summary.
5. Explore the details of the structure, sequence, and ligands.
6. Download the structure file: Click Download Files → PDB format (.pdb).
7. Open the file in any PDB viewer. In PyMOL, load it with: load 1A3N.pdb
8. Visualize the structure:
a) Identify any ligands present
b) Use the cartoon representation (show cartoon)
c) Color individual chains (e.g., color blue, chain A)
d) Rotate and analyze the structure, and label any ligands
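Steps 6 to 8 can also be approached from a script. The sketch below is a minimal, hedged example in Python: it builds a download URL for a PDB entry and lists the chains and non-water HETATM groups found in PDB-format text. The function names (rcsb_pdb_url, download_pdb, summarize_pdb) are hypothetical helpers written for this exercise, and the URL pattern on files.rcsb.org is an assumption that should be checked against the current RCSB download documentation. Note that HETATM records can also hold modified residues, so the "ligands" reported here are candidates to confirm in the viewer.

```python
from urllib.request import urlretrieve


def rcsb_pdb_url(pdb_id: str) -> str:
    """Build the (assumed) RCSB download URL for a PDB-format entry."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def download_pdb(pdb_id: str, destination: str) -> str:
    """Download the entry to `destination` and return the local path."""
    path, _headers = urlretrieve(rcsb_pdb_url(pdb_id), destination)
    return path


def summarize_pdb(pdb_text: str) -> dict:
    """List chain IDs (from ATOM records) and non-water HETATM groups."""
    chains = set()
    ligands = set()
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM") or len(line) < 26:
            continue
        # Fixed-width PDB columns: residue name 18-20, chain 22, residue number 23-26.
        res_name = line[17:20].strip()
        chain_id = line[21]
        res_seq = line[22:26].strip()
        if record == "ATOM":
            chains.add(chain_id)
        elif res_name != "HOH":  # skip water molecules
            ligands.add((res_name, chain_id, res_seq))
    return {"chains": sorted(chains), "ligands": sorted(ligands)}


# Example usage (requires network access, so it is left commented out):
# path = download_pdb("1A3N", "1A3N.pdb")
# with open(path) as handle:
#     print(summarize_pdb(handle.read()))
```

The chain IDs and ligand names returned by summarize_pdb can then be used directly in PyMOL selections, for example when coloring a chain or labeling a ligand.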
Results and discussions:
2.2. Classify a Protein Domain Using SCOP, CATH, and PROSITE
Aim: To explore the classification and functional details of a protein molecule.
Protocol:
1. Go to CATH Database: https://www.cathdb.info
2. Search for a protein by PDB ID (e.g., 1A3N) or by name.
3. View classification:
1. Superfamilies
a) GO Diversity, EC diversity, species diversity
b) CATH classification and Domains
c) Functional families
d) Structural neighborhood
2. Domains
1. Domain in chain
4. Repeat for SCOP: Visit http://scop.mrc-lmb.cam.ac.uk and search for 1A3N.
1. Family – show ancestry
2. Pfam/InterPro – overview, proteins, domain architecture
3. Domains – explore the details
5. PROSITE: Visit https://prosite.expasy.org and search for motifs in the TP53 protein sequence.
1. Paste the sequence and run the scan.
2. Observe the top hit, scores, active site, and amino acid count.
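To make the PROSITE step more concrete, the following sketch shows how a PROSITE-style pattern can be translated into a regular expression and scanned against a sequence. It is a simplified illustration and not the PROSITE scanning tool itself: it handles only pattern entries (not profiles), and the helper names prosite_to_regex and find_motifs are hypothetical. The example pattern N-{P}-[ST]-{P} is the N-glycosylation site pattern; the short test sequence is made up for illustration.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE pattern such as 'N-{P}-[ST]-{P}.' into a regex."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # '<' anchors the match at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # '>' anchors the match at the C-terminus
            suffix, element = "$", element[:-1]
        # Split an element into its core and an optional repeat: x(2) or x(2,4).
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":  # 'x' means any residue
            core = "."
        elif core.startswith("{"):  # '{ABC}' means any residue except A, B, C
            core = "[^" + core[1:-1] + "]"
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def find_motifs(sequence: str, pattern: str) -> list:
    """Return (1-based start, matched text) for every hit, including overlaps."""
    # The lookahead lets overlapping occurrences be reported separately.
    compiled = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in compiled.finditer(sequence)]


# Made-up sequence, used only to show the call:
print(find_motifs("AANGSAA", "N-{P}-[ST]-{P}."))
```

For the exercise itself, the sequence would be the TP53 protein sequence pasted into PROSITE; hits reported by a script like this should be compared with the positions and scores shown on the PROSITE results page.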
Results and discussions:
2.3. Retrieve a Predicted Protein Structure Using AlphaFold DB
Aim: Retrieve a predicted protein structure using AlphaFold DB.
Protocol:
1. Go to AlphaFold DB: https://www.alphafold.ebi.ac.uk
2. Search for a protein using a gene name (e.g., TP53).
3. Download the predicted 3D structure in PDB format.
4. Explore the structural details
o Confidence Analysis
1. pLDDT score (per-residue confidence, 0–100):
>90: Highly confident
70–90: Good model
50–70: Low confidence, use with caution
<50: Very low confidence, structure may be unreliable
2. Predicted Aligned Error (PAE):
1. Estimates the expected position error between pairs of residues; helps assess how confidently domains are placed relative to each other.
o Domain details
Number of domains
Q score
Lengths
Boundaries
o Pathogenicity details
Show the predicted pathogenicity of missense mutations
o Confidence of the corresponding PDB structure
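The confidence analysis above can be summarized from the downloaded file. AlphaFold PDB-format files store the per-residue pLDDT in the B-factor column, so a short script can read it back and report how much of the model falls into each band. The sketch below is a minimal example under stated assumptions: the helper names (plddt_band, residue_plddt, band_fractions) are hypothetical, and the treatment of the exact boundary values (90 falls in "good model", 70 in "good model", 50 in "low confidence") is an assumption, since the bands listed above do not say which side a boundary belongs to.

```python
from collections import Counter


def plddt_band(score: float) -> str:
    """Map a pLDDT score to the confidence bands listed in the protocol."""
    if score > 90:
        return "highly confident"
    if score >= 70:
        return "good model"
    if score >= 50:
        return "low confidence"
    return "very low confidence"


def residue_plddt(pdb_text: str) -> dict:
    """Read one pLDDT value per residue from the C-alpha ATOM records."""
    scores = {}
    for line in pdb_text.splitlines():
        # Atom name is in columns 13-16; B-factor (pLDDT here) in columns 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            key = (line[21], int(line[22:26]))  # (chain ID, residue number)
            scores[key] = float(line[60:66])
    return scores


def band_fractions(scores: dict) -> dict:
    """Fraction of residues that fall into each confidence band."""
    counts = Counter(plddt_band(value) for value in scores.values())
    total = sum(counts.values())
    return {band: count / total for band, count in counts.items()}


# Example usage with a file downloaded in step 3 (file name is a placeholder):
# with open("alphafold_model.pdb") as handle:
#     print(band_fractions(residue_plddt(handle.read())))
```

A large fraction of residues in the two lowest bands suggests that those regions, often disordered segments, should not be interpreted structurally.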
Results and discussions:
Conclusion: