ncibtep@nih.gov

Bioinformatics Training and Education Program

ATOM Modeling Pipeline (AMPL) for Drug Discovery

When: June 8, 2021, 1:00 pm - 2:00 pm ET

To Know

Where:
Online Webinar
This class has ended.

About this Class

Do you want to know how to use machine learning (ML) to accelerate drug discovery? Join us on June 8, 1:00 pm – 2:00 pm ET, for the first in a series of workshops on how to use the ATOM Modeling PipeLine (AMPL), an open-source, conda-based software package that automates key drug discovery steps. AMPL is designed to take molecular binding data (e.g., IC50, Ki) and carry out key ML steps with minimal user intervention. The first workshop will introduce AMPL and highlight its capabilities for creating ML-ready datasets. Follow-on workshops will be offered during the summer and will cover modeling methods and inference.

Location: Webex
Registration: Not required
Presenter: Sarangan Ravichandran, PhD, PMP, Senior Data Scientist, ATOM Consortium/Frederick National Laboratory for Cancer Research (FNLCR), and Adjunct Professor in Bioinformatics, Hood College
Supporting materials: Tutorial and "AMPL: A Data-Driven Modeling Pipeline for Drug Discovery"

The workshop on June 8 will include two parts: a short presentation followed by a hands-on tutorial.

Part 1: A 20-minute presentation that will cover the following topics:
  • Introduction to small-molecule binding and the database sources
  • Issues associated with data ingestion and curation
  • Exploratory data analysis of the ingested and curated datasets
  • Use of different featurization methods, such as molecular fingerprints or molecular properties (molecular weight, number of hydrogen-bond acceptors, etc.)
  • Creation of ML-ready datasets
Part 2: A 35-minute AMPL code demonstration followed by a 5-minute Q&A. We will share a Python Jupyter notebook that will cover the following ML steps: data ingestion/curation, featurization, and visualization to create ML-ready datasets. Here are the key sections of the notebook:
  • Highlights of AMPL functions that are designed to address the common issues encountered during the data ingestion and curation of drug discovery or small-molecule-focused projects
  • Introduction of the extensible AMPL featurizer module and a demonstration of how simple keyword choices can lead to the computation of a range of different feature sets
  • Exploratory Data Analysis and visualization code templates that can be adopted for other drug discovery projects with very little modification
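As a rough illustration of the kind of curation step the notebook addresses, the sketch below converts IC50 values to pIC50 and averages replicate measurements per compound. It uses only the Python standard library and does not call any AMPL functions; the function names, compound IDs, and values are hypothetical and chosen for illustration only.

```python
import math
from collections import defaultdict
from statistics import mean


def ic50_nm_to_pic50(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    # 1 nM = 1e-9 M, so -log10(x * 1e-9) = 9 - log10(x).
    return 9.0 - math.log10(ic50_nm)


def curate(records):
    """Drop missing or non-positive IC50 values and average replicates.

    records: iterable of (compound_id, ic50_nm or None) pairs.
    Returns a dict mapping compound_id to the mean pIC50.
    """
    grouped = defaultdict(list)
    for compound_id, ic50_nm in records:
        if ic50_nm is None or ic50_nm <= 0:
            continue  # unusable measurement; skip it
        grouped[compound_id].append(ic50_nm_to_pic50(ic50_nm))
    return {cid: mean(values) for cid, values in grouped.items()}


# Hypothetical raw measurements (IC50 in nM); not real assay data.
raw = [
    ("cmpd-1", 100.0),
    ("cmpd-1", 1000.0),  # replicate of cmpd-1
    ("cmpd-2", None),    # missing value
    ("cmpd-3", 10.0),
    ("cmpd-4", -5.0),    # invalid value
]
curated = curate(raw)
print(curated)
```

Averaging on the log scale (pIC50) rather than on raw IC50 values is one common choice for combining replicates; whether it suits a given dataset is a judgment for the project at hand.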
To learn more about the software, visit the AMPL GitHub repository.

Questions? Contact the NCI Data Science Learning Exchange.