Protein Ligand Interaction
- BindingDB: BindingDB contains data for over 1.2 million compounds and 9.2k targets, supporting research, education, and practice in drug discovery, pharmacology, and related fields.
- BindingNet: BindingNet is a dataset for analyzing protein-ligand interactions, containing modeled poses for compounds similar to the crystal ligands found in PDBbind, along with corresponding activities from ChEMBL.
- BioLiP2: BioLiP is a semi-manually curated database of high-quality, biologically relevant ligand-protein binding interactions, intended to serve ligand-protein docking, virtual ligand screening, and protein function annotation.
- Leak Proof PDBBind: This work presents a cleaned PDBBind dataset of non-covalent binders, reorganized to avoid data leakage, allowing for more generalizable binding affinity prediction.
- PDBBind: Forging the Basis for Developing Protein–Ligand Interaction Scoring Functions
- PDBBind+:
- PDBscreen: PDBscreen provides a dataset with multiple data augmentation strategies suitable for training protein-ligand interaction prediction methods.
- PLAS-20k: Extended Dataset of Protein-Ligand Affinities from MD Simulations for Machine Learning Applications
- PLINDER: Described by its authors as the largest and most annotated dataset to date, comprising 449,383 protein-ligand interaction (PLI) systems, each with over 500 annotations, similarity metrics at the protein, pocket, interaction, and ligand levels, and paired unbound (apo) and predicted structures.
- SIU: Million-Scale Structural Small Molecule-Protein Interaction Dataset for Unbiased Bioactivity Prediction
- BioLiP2-Opt: PDBBind Optimization to Create a High-Quality Protein-Ligand Binding Dataset for Binding Affinity Prediction
- PDBBind-Opt: PDBBind Optimization to Create a High-Quality Protein-Ligand Binding Dataset for Binding Affinity Prediction