Features
Docking of a compound library into one protein structure.
Docking of a compound library into several protein structures. Outputs the compounds that are among the top-scorers for each protein structure.
A small fraction of the library (0.1-1%) is docked, then a machine-learning model is built to predict the consensus scores of the rest of the library. This is repeated iteratively to gradually improve the model, which greatly reduces computation time for large libraries.
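The iterative scheme above can be sketched as a small dock-train-predict loop. This is only an illustration under stated assumptions: the k-nearest-neighbour predictor stands in for whatever model DockM8 actually trains, the one-number "descriptor" and the `toy_dock` function are hypothetical placeholders for real features and real consensus scores, and docking the best-predicted compounds in each round is one possible selection strategy rather than the documented one.

```python
import random


def knn_predict(train, query, k=3):
    """Predict a score as the mean score of the k nearest docked compounds."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    return sum(score for _, score in nearest) / len(nearest)


def iterative_screen(library, dock, n_rounds=3, batch_size=5, seed=0):
    """library: name -> descriptor tuple; dock: name -> score (lower is better)."""
    rng = random.Random(seed)
    names = sorted(library)
    # Round 0: dock a small random fraction of the library.
    docked = {name: dock(name) for name in rng.sample(names, batch_size)}
    for _ in range(n_rounds):
        remaining = [n for n in names if n not in docked]
        if not remaining:
            break
        train = [(library[n], s) for n, s in docked.items()]
        predicted = {n: knn_predict(train, library[n]) for n in remaining}
        # Dock the compounds the current model ranks best, then retrain.
        for n in sorted(remaining, key=predicted.get)[:batch_size]:
            docked[n] = dock(n)
    # Final model: predict scores for everything that was never docked.
    train = [(library[n], s) for n, s in docked.items()]
    predicted = {
        n: knn_predict(train, library[n]) for n in names if n not in docked
    }
    return docked, predicted


# Hypothetical toy library: 100 compounds described by a single number.
library = {f"cpd{i:03d}": (float(i),) for i in range(100)}


def toy_dock(name):
    """Placeholder for a real docking run returning a consensus score."""
    return abs(library[name][0] - 40.0)


docked, predicted = iterative_screen(library, toy_dock)
```

With these settings only 20 of the 100 toy compounds are ever docked; the other 80 receive predicted scores only.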
Standardisation of the compound library is handled by the ChEMBL Structure Pipeline and includes salt removal, bond standardisation and tautomer standardisation.
Bento, A. P., Hersey, A., Félix, E., Landrum, G., Gaulton, A., Atkinson, F., Bellis, L. J., de Veij, M., & Leach, A. R. (2020). An open source chemical structure curation pipeline using RDKit. Journal of Cheminformatics, 12(1), 1–16. DOI
Protonation of the library can be handled in one of two ways:
Generation of conformers is handled by Gypsum-DL (https://durrantlab.pitt.edu/gypsum-dl/).
Ropp, P. J., Spiegel, J. O., Walker, J. L., Green, H., Morales, G. A., Milliken, K. A., Ringe, J. J., & Durrant, J. D. (2019). Gypsum-DL: An open-source program for preparing small-molecule libraries for structure-based virtual screening. Journal of Cheminformatics, 11(1), 1–13. DOI
Addition of hydrogens and prediction of tautomers can be done automatically via the Protoss API:
If a binding mode of a ligand is available, this can be used as a reference to define the pocket of the target enzyme using the ligand's radius of gyration. This was shown to be an effective method to determine the optimal binding site size: Feinstein, W. P.; Brylinski, M. Calculating an Optimal Box Size for Ligand Docking and Virtual Screening against Experimental and Predicted Binding Pockets. J. Cheminform. 2015, 7 (1), 18. DOI
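As a rough illustration of the radius-of-gyration approach, the sketch below centres a cubic box on the ligand centroid and scales its edge from the ligand's radius of gyration. The function names are hypothetical, the radius of gyration is computed without mass weighting, and the default factor of 2.9 is the approximate ratio reported by Feinstein and Brylinski; whether DockM8 uses exactly this factor is an assumption.

```python
import math


def radius_of_gyration(coords):
    """Return (centroid, Rg) for a list of (x, y, z) atom coordinates.

    Unweighted (geometric) radius of gyration: the root-mean-square
    distance of the atoms from their centroid.
    """
    n = len(coords)
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
    mean_sq = sum(
        sum((c[i] - centroid[i]) ** 2 for i in range(3)) for c in coords
    ) / n
    return centroid, math.sqrt(mean_sq)


def docking_box(coords, scale=2.9):
    """Cubic box centred on the ligand, with edge length = scale * Rg."""
    centroid, rg = radius_of_gyration(coords)
    edge = scale * rg
    return {"center": centroid, "size": (edge, edge, edge)}


# Hypothetical two-atom "ligand" whose radius of gyration is 1.0 Å.
ligand = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
box = docking_box(ligand)
```

In practice the coordinates would come from the reference ligand file, and the resulting centre and size would be passed to the docking program as its search-space definition.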
If a binding mode is not known or only the apo form of the target is available, DoGSiteScorer can be used to search for suitable pockets, ranking them by different metrics (volume, druggability score, etc.).
Volkamer, A., Kuhn, D., Rippmann, F., & Rarey, M. (2012). DoGSiteScorer: a web server for automatic binding site prediction, analysis and druggability assessment. Bioinformatics, 28(15), 2074–2075. DOI
Gnina is a molecular docking program with integrated support for scoring and optimizing ligands using convolutional neural networks. It is a fork of smina, which is a fork of AutoDock Vina.
McNutt, A. T., Francoeur, P., Aggarwal, R., Masuda, T., Meli, R., Ragoza, M., Sunseri, J., & Koes, D. R. (2021). GNINA 1.0: molecular docking with deep learning. Journal of Cheminformatics, 13(1), 1–20. DOI
smina is a fork of AutoDock Vina that is customized to better support scoring function development and high-performance energy minimization.
Koes, D. R., Baumgartner, M. P., & Camacho, C. J. (2013). Lessons learned in empirical scoring with smina from the CSAR 2011 benchmarking exercise. Journal of Chemical Information and Modeling, 53(8), 1893–1904. DOI
The docking algorithm PLANTS is based on a class of stochastic optimization algorithms called ant colony optimization (ACO).
Korb, O., Stützle, T., & Exner, T. E. (2009). Empirical scoring functions for advanced Protein-Ligand docking with PLANTS. Journal of Chemical Information and Modeling, 49(1), 84–96. DOI
QuickVina-W is a version of AutoDock Vina using Average Sum of Proximity relative Frequencies (ASoF).
Hassan, N. M.; Alhossary, A. A.; Mu, Y.; Kwoh, C.-K. Protein-Ligand Blind Docking Using QuickVina-W With Inter-Process Spatio-Temporal Integration. Sci. Rep. 2017, 7 (1), 15451. DOI
QuickVina 2 is a parallelised version of AutoDock Vina.
Alhossary, A.; Handoko, S. D.; Mu, Y.; Kwoh, C.-K. Fast, Accurate, and Reliable Molecular Docking with QuickVina 2. Bioinformatics 2015, 31 (13), 2214–2216. DOI
Classically, of the poses generated by a docking algorithm, the best-scoring one is selected for further analysis. DockM8 supports this by selecting the best pose from every docking algorithm, or from only one of them.
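A minimal sketch of best-pose selection is shown below. The record layout (`compound`, `program`, `pose_id`, `score`) is hypothetical and not DockM8's internal format. It assumes that a lower score is better and that scores from different programs are not directly comparable, so one pose is kept per compound and program.

```python
def best_poses(poses, program=None):
    """Keep the best-scoring pose per (compound, program).

    poses: list of dicts with 'compound', 'program', 'pose_id', 'score'.
    program: if given, only poses from that docking program are considered.
    Assumes a lower score means a better pose.
    """
    best = {}
    for pose in poses:
        if program is not None and pose["program"] != program:
            continue
        key = (pose["compound"], pose["program"])
        if key not in best or pose["score"] < best[key]["score"]:
            best[key] = pose
    return best


# Hypothetical docking output for a single compound.
poses = [
    {"compound": "cpd1", "program": "GNINA", "pose_id": "cpd1_GNINA_1", "score": -7.2},
    {"compound": "cpd1", "program": "GNINA", "pose_id": "cpd1_GNINA_2", "score": -8.1},
    {"compound": "cpd1", "program": "PLANTS", "pose_id": "cpd1_PLANTS_1", "score": -65.0},
    {"compound": "cpd1", "program": "PLANTS", "pose_id": "cpd1_PLANTS_2", "score": -60.3},
]

all_best = best_poses(poses)                    # best pose from every program
gnina_best = best_poses(poses, program="GNINA")  # best pose from one program
```

For scoring functions where a higher value is better, the comparison would need to be reversed.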
Docking poses can also be clustered. For each compound, a similarity matrix is generated between all of its poses, using metrics such as RMSD, symmetry-corrected RMSD (spyRMSD), electrostatic shape similarity (espsim), protein-ligand interaction fingerprint similarity (SPLIF), shape similarity (USRCAT), and 3DScore, each calculated for every pair of poses. A clustering algorithm is then used to output representative poses according to each metric. Clustering can be done via either the K-Medoids or the Affinity Propagation algorithm.
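The clustering step can be illustrated with a small, dependency-free K-Medoids sketch over a pairwise distance matrix. Plain RMSD with identical atom ordering is used here as a stand-in for the metrics listed above (no symmetry correction), the farthest-first initialisation is a simplification chosen for determinism, and the function names are hypothetical; this is not DockM8's own implementation.

```python
import math


def rmsd(pose_a, pose_b):
    """Plain RMSD between two poses with the same atom order."""
    sq = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(pose_a, pose_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(sq / len(pose_a))


def distance_matrix(poses):
    n = len(poses)
    return [[rmsd(poses[i], poses[j]) for j in range(n)] for i in range(n)]


def assign(dist, medoids):
    """Assign every pose index to its closest medoid."""
    clusters = {m: [] for m in medoids}
    for i in range(len(dist)):
        clusters[min(medoids, key=lambda m: dist[i][m])].append(i)
    return clusters


def k_medoids(dist, k, max_iter=100):
    """Return {medoid index: member indices}; medoids are the representatives."""
    n = len(dist)
    # Start from the most central pose, then add the farthest remaining ones.
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        candidates = [i for i in range(n) if i not in medoids]
        medoids.append(
            max(candidates, key=lambda i: min(dist[i][m] for m in medoids))
        )
    for _ in range(max_iter):
        clusters = assign(dist, medoids)
        # New medoid = member with the smallest total distance to its cluster.
        updated = [
            min(members, key=lambda c: sum(dist[c][j] for j in members))
            if members else m
            for m, members in clusters.items()
        ]
        if sorted(updated) == sorted(medoids):
            break
        medoids = updated
    return assign(dist, medoids)


# Hypothetical two-atom poses of one compound, forming two groups.
poses = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(0.2, 0.0, 0.0), (1.2, 0.0, 0.0)],
    [(5.0, 0.0, 0.0), (6.0, 0.0, 0.0)],
    [(5.1, 0.0, 0.0), (6.1, 0.0, 0.0)],
]
clusters = k_medoids(distance_matrix(poses), k=2)
```

The keys of `clusters` are the indices of the representative poses, one per cluster. The same function works with any of the other metrics once they are converted to distances.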
spyRMSD: Meli, R., & Biggin, P. C. (2020). Spyrmsd: Symmetry-corrected RMSD calculations in Python. Journal of Cheminformatics, 12(1), 1–7. DOI
espsim: Bolcato, G., Heid, E., & Boström, J. (2022). On the Value of Using 3D Shape and Electrostatic Similarities in Deep Generative Methods. Journal of Chemical Information and Modeling, 62(6), 1388–1398. DOI
3DScore: Plewczynski, D., Łaźniewski, M., Grotthuss, M. von, Rychlewski, L., & Ginalski, K. (2011). VoteDock: Consensus docking method for prediction of protein–ligand interactions. Journal of Computational Chemistry, 32(4), 568–581. DOI
USRCAT: Schreyer, A. M., & Blundell, T. (2012). USRCAT: real-time ultrafast shape recognition with pharmacophoric constraints. Journal of Cheminformatics, 4(1), 27. DOI
SPLIF: Da, C., & Kireev, D. (2014). Structural Protein–Ligand Interaction Fingerprints (SPLIF) for Structure-Based Virtual Screening: Method and Benchmark Study. Journal of Chemical Information and Modeling, 54(9), 2555–2561. DOI
Any of the scoring functions available in DockM8 can be used to rescore the docked poses and output the best-scoring pose. Please see the Rescoring tab for more details about which scoring functions are available.