
Random Forest-Based Compound ATC Classification Using Structural and Physiochemical Information | Biomedgrid

Journal: American Journal of Biomedical Science & Research (Vol.17, No. 3)

Publication Date:

Authors : ;

Page : 252-258

Keywords : Drug repurposing; Compound classification; ATC classification; Anatomical Therapeutic Chemical classification; Drug mode of action; Random forest;



There are nearly 4,000 FDA-approved drugs (including salts) for different indications, and an additional 4,600 compounds are under investigation in clinical trials. These drugs and compounds represent a mere fraction of the total chemical space, approximated to be greater than 10e12. Approximately 1.8 million compounds from the chemical space represent potential candidates for FDA approval and are currently being classified via traditional experimental methods, which are costly and laborious yet not always deterministic. This highlights the need for automated Anatomical Therapeutic Chemical (ATC) classification; the ATC system classifies drugs and drug-like substances at five levels. ATC level one and level two classifications are available for only 2,739 compounds in ChEMBL. In this research, two random forest-based ensemble models are trained to predict the ATC classes of the 1.8M preclinical compounds available in ChEMBL, using structural and physiochemical properties as features. On independent testing, we obtained micro F1 scores of 0.69 and 0.81 for the ATC level one and level two models, respectively, and the ATC level one model achieved higher accuracy (71.9%) than existing ATC level one classification methods. This research illustrates how improved classification of preclinical compounds may enhance the performance of future in-silico drug repurposing methods and help explain the mode of action of preclinical compounds.
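
The abstract reports micro F1 scores but does not spell out how they are computed. As a rough illustration only, the sketch below computes a micro-averaged F1 by pooling true positives, false positives, and false negatives over all compounds and classes, i.e. F1 = 2TP / (2TP + FP + FN). It assumes a multi-label setting in which a compound may carry more than one ATC level one code; that setting, the function name `micro_f1`, and the toy label sets are assumptions made for this example and are not taken from the paper.

```python
from typing import Sequence, Set


def micro_f1(true_labels: Sequence[Set[str]],
             predicted_labels: Sequence[Set[str]]) -> float:
    """Micro-averaged F1 for multi-label predictions.

    Counts are pooled over every compound and every class before
    the F1 formula is applied, so frequent classes weigh more.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true and predicted label lists differ in length")

    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        tp += len(truth & pred)   # codes predicted and correct
        fp += len(pred - truth)   # codes predicted but wrong
        fn += len(truth - pred)   # codes missed by the model

    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical toy data: ATC level one codes for four compounds.
y_true = [{"A"}, {"C", "N"}, {"J"}, {"L"}]
y_pred = [{"A"}, {"C"}, {"J", "D"}, {"N"}]

score = micro_f1(y_true, y_pred)
print(f"micro F1 = {score:.2f}")
```

In this toy case the pooled counts are TP = 3, FP = 2, and FN = 2, so the score is 6 / 10 = 0.60. Under micro-averaging a class with many compounds influences the score more than a rare class, which is worth keeping in mind when ATC classes are unevenly populated.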

Last modified: 2024-06-24 22:01:14