Skip to content

RollerCoaster1899/QSAR_Model_Generation

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 

Repository files navigation

Quantitative Structure Activity Relationship

  1. Importing necessary libraries: The code imports several libraries such as NumPy, ChemBL API, Pandas, RDKit, scikit-learn, and matplotlib. These libraries provide functionality for data manipulation, molecular analysis, machine learning, and data visualization.

  2. Defining functions for data retrieval and model building:

    • target_retrieval(search): Retrieves information about a target from ChemBL API based on a search term.
    • target_info_retrieval(search): Retrieves activity data for a specific target from ChemBL API.
    • build_qsar_model(data): Cleans the activity data, calculates pIC50 values, generates molecular fingerprints, and constructs a QSAR (Quantitative Structure-Activity Relationship) model using the Random Forest algorithm.
    • evaluate_molecule(mol_smiles, model): Predicts the pIC50 value for a given molecule using the trained QSAR model.
  3. Defining additional utility functions:

    • plot_predicted_vs_real(y_test, y_pred): Generates a scatter plot comparing the predicted pIC50 values to the real pIC50 values.
    • identify_pharmacophores(feature_importance, fingerprint_columns, threshold): Identifies important pharmacophores (fingerprint features) based on their feature importance values.
    • draw_molecule(smiles): Draws a molecule using its SMILES string representation.
  4. Example usage:

    • Retrieves target information and activity data for a given search term.
    • Builds a QSAR model using the activity data.
    • Evaluates a molecule using the trained model and prints the predicted pIC50 value.
    • Prints target information and activity data.
    • Generates a plot comparing the predicted and real pIC50 values.

This code provides a basic framework for retrieving target information, building QSAR models, and evaluating molecules using molecular fingerprints. It also includes utility functions for data visualization and identifying important pharmacophores.

About

Calculate molecular descriptors based on SMILES structure

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published