Machine learning methods for drug discovery and toxicology: How to develop a QSAR model using Python

This course gives you the tools and knowledge to develop your own QSAR models for bioactivity or toxicology prediction, using open-source libraries in your own Python scripts. You will learn the basics of QSAR modelling and the general workflow for developing a model, including the rationale behind each step and the tools to implement it: data management and curation, calculation and selection of molecular descriptors, different machine-learning algorithms, statistical evaluation of the models, and evaluation of the applicability domain.

This is an online course that you can follow at your own pace. On our platform you will have access to recorded lessons, text explanations, interactive Python exercises and further resources. The course is practical: it includes several hands-on exercises and ends with a project in which you develop a complete QSAR model by yourself. You will not be alone, though: a tutor from ProtoQSAR will follow your progress on the platform, give you feedback on your assignments and be available by e-mail. You will also be able to interact with your colleagues and instructors through internal forums and chats, and there will be a series of live videoconferences to answer questions.

For more details, please download the course content.

Prerequisites: The tasks require basic knowledge of Python (additional learning resources and links to external material will be available on the platform for beginners).

Mode: Online
Pre-registration: Fill in this form
When: 2 November 2023 to 31 January 2024
Information:
Estimated hours: 60
Course language: English
Price: 480 €*
* 280 € for students (requirements in the form) or for registrations before 15 October 2023.

Course overview:

Introduction of basic concepts
  • Overview of different computational approaches
  • QSAR model workflow
  • Regression vs Classification
  • Statistical analysis of models
Basic Python techniques for chemoinformatics
  • DataFrame import and analysis
  • Molecule characterization
  • Chemical and biological data curation
QSAR model development
  • Data compilation and curation
  • Calculation of molecular descriptors
  • Train/test splitting
  • Feature reduction and scaling
  • Algorithm selection
  • Hyperparameter optimization
  • Model metrics and validation
  • Applicability domain
  • External prediction
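
To give a feel for how the model-development steps listed above fit together, here is a minimal sketch in plain Python. It is not the course material: the dataset is synthetic, the two "descriptors" are made-up numbers, and the choices of a k-nearest-neighbours regressor, leave-one-out selection of k, and a nearest-neighbour distance threshold for the applicability domain are illustrative assumptions only.

```python
import math
import random
import statistics

def make_synthetic_dataset(n=60, seed=0):
    """Hypothetical data: two made-up descriptors and a noisy linear activity."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        d1, d2 = rng.uniform(0, 10), rng.uniform(100, 500)
        X.append([d1, d2])
        y.append(0.5 * d1 + 0.01 * d2 + rng.gauss(0, 0.2))
    return X, y

def train_test_split(X, y, test_fraction=0.25, seed=0):
    """Shuffle the row indices and hold out a fraction of them as the test set."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(X) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train], [X[i] for i in test],
            [y[i] for i in train], [y[i] for i in test])

def fit_scaler(X):
    """Column means and standard deviations, computed on training data only."""
    cols = list(zip(*X))
    return [statistics.mean(c) for c in cols], [statistics.stdev(c) for c in cols]

def scale(X, means, stds):
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]

def knn_predict(X_train, y_train, x, k):
    """Average the activity of the k training compounds closest to x."""
    nearest = sorted(zip(X_train, y_train), key=lambda p: math.dist(p[0], x))[:k]
    return statistics.mean([target for _, target in nearest])

def r_squared(y_true, y_pred):
    mean_y = statistics.mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def loo_r2(X, y, k):
    """Leave-one-out R^2 on the training set, used here to choose k."""
    preds = [knn_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], k)
             for i in range(len(X))]
    return r_squared(y, preds)

def nn_distance(X_ref, x):
    """Distance from x to its nearest neighbour in X_ref."""
    return min(math.dist(r, x) for r in X_ref)

# Data compilation (synthetic here) and train/test splitting
X, y = make_synthetic_dataset()
X_train, X_test, y_train, y_test = train_test_split(X, y)

# Scaling: fit on the training set, then apply to both sets
means, stds = fit_scaler(X_train)
X_train_s, X_test_s = scale(X_train, means, stds), scale(X_test, means, stds)

# Hyperparameter optimization: pick k by leave-one-out R^2 on the training set
best_k = max((1, 3, 5, 7), key=lambda k: loo_r2(X_train_s, y_train, k))

# Model metrics on the held-out test set
test_pred = [knn_predict(X_train_s, y_train, x, best_k) for x in X_test_s]
test_r2 = r_squared(y_test, test_pred)

# Applicability domain: largest nearest-neighbour distance seen within training
threshold = max(nn_distance(X_train_s[:i] + X_train_s[i + 1:], x)
                for i, x in enumerate(X_train_s))

# External prediction for a compound far outside the training descriptor range
outlier = scale([[50.0, 2000.0]], means, stds)[0]
outlier_in_domain = nn_distance(X_train_s, outlier) <= threshold

print(f"best k = {best_k}, test R^2 = {test_r2:.2f}")
print(f"outlier inside applicability domain: {outlier_in_domain}")
```

In a real project the descriptors would be computed from chemical structures and the model would usually come from an established machine-learning library; the point here is only the order of the steps and the fact that scaling and the applicability-domain threshold are derived from the training set alone.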