Protein-Ligand Interaction

  • BindingDB: BindingDB contains data for over 1.2 million compounds and 9.2k targets, supporting research, education, and practice in drug discovery, pharmacology, and related fields.
    Publication Link
  • BindingNet: BindingNet is a dataset for analyzing protein-ligand interactions, containing modeled poses for compounds similar to the crystal ligands found in PDBbind, along with corresponding activities from ChEMBL.
    Code Last Commit Publication Link
  • BioLiP2: BioLiP is a semi-manually curated database of high-quality, biologically relevant ligand-protein binding interactions, intended to support ligand-protein docking, virtual ligand screening, and protein function annotation.
    Publication Link
  • Leak Proof PDBBind: This work presents a cleaned PDBBind dataset of non-covalent binders, reorganized to avoid data leakage, allowing for more generalizable binding affinity prediction.
    Code Last Commit Publication Link
  • PDBBind: Forging the Basis for Developing Protein–Ligand Interaction Scoring Functions
    Publication Link
  • PDBBind+:
    Link
  • PDBscreen: PDBscreen provides a dataset with multiple data augmentation strategies suitable for training protein-ligand interaction prediction methods.
    Link
  • PLAS-20k: Extended Dataset of Protein-Ligand Affinities from MD Simulations for Machine Learning Applications
    Publication Link
  • PLINDER: Described by its authors as the largest and most annotated protein-ligand interaction (PLI) dataset to date, comprising 449,383 PLI systems, each with over 500 annotations; similarity metrics at the protein, pocket, interaction, and ligand levels; and paired unbound (apo) and predicted structures.
    Code Last Commit Publication
  • SIU: Million-Scale Structural Small Molecule-Protein Interaction Dataset for Unbiased Bioactivity Prediction
    Code Last Commit Publication Link
  • BioLiP2-Opt: PDBBind Optimization to Create a High-Quality Protein-Ligand Binding Dataset for Binding Affinity Prediction
    Code Last Commit Publication Link
  • PDBBind-Opt: PDBBind Optimization to Create a High-Quality Protein-Ligand Binding Dataset for Binding Affinity Prediction
    Code Last Commit Publication Link