Chemistry-Based Modeling on Phenotype-Based Drug-Induced Liver Injury Annotation: From Public to Proprietary Data

Mohammad Moein*, Markus Heinonen, Natalie Mesens, Ronnie Chamanza, Chidozie Amuzie, Yvonne Will, Hugo Ceulemans, Samuel Kaski, Dorota Herman

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review



Drug-induced liver injury (DILI) is an important safety concern and a major reason for withdrawing drugs from the market. Recent advances in machine learning have led to a wide range of in silico DILI prediction models based on the chemical structure of molecules (fingerprints). Existing publicly available DILI data sets used for model building are based on the interpretation of drug labels or patient case reports, resulting in a typical binary clinical DILI annotation. We developed a novel phenotype-based annotation that processes hepatotoxicity information extracted from repeated-dose in vivo preclinical toxicology studies using INHAND annotation, to provide a more informative and reliable data set for machine learning algorithms. This work resulted in a data set of 430 unique compounds covering diverse liver pathology findings, which were used to develop multiple DILI prediction models trained on the publicly available data (TG-GATEs) using the compounds' fingerprints. We demonstrate that the DILI labels of TG-GATEs compounds can be predicted well, and we show how the differences between TG-GATEs and the external test compounds (Johnson & Johnson) impact the models' generalization performance.
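As a rough illustration of why differences between training and external compounds can matter for fingerprint-based models, the sketch below computes Tanimoto similarity between binary fingerprints and uses a 1-nearest-neighbor rule to assign a label. This is not the modeling approach of the paper, which the abstract does not specify; the fingerprints, labels, and function names are made up for illustration only.

```python
from typing import FrozenSet, List, Tuple

# A binary fingerprint represented as the set of indices of its "on" bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two binary fingerprints."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_neighbor(
    query: Fingerprint, train: List[Tuple[Fingerprint, int]]
) -> Tuple[int, float]:
    """Return (label, similarity) of the most similar training compound."""
    best_fp, best_label = max(train, key=lambda item: tanimoto(query, item[0]))
    return best_label, tanimoto(query, best_fp)


# Toy, made-up training set (1 = liver finding, 0 = no finding).
train = [
    (frozenset({1, 2, 3, 4}), 1),
    (frozenset({1, 2, 3, 5}), 1),
    (frozenset({10, 11, 12}), 0),
]

in_domain = frozenset({1, 2, 3, 4, 5})   # resembles the training compounds
out_of_domain = frozenset({20, 21, 22})  # shares no bits with any of them

label_in, sim_in = nearest_neighbor(in_domain, train)
label_out, sim_out = nearest_neighbor(out_of_domain, train)

print(f"in-domain:     label={label_in}, nearest similarity={sim_in:.2f}")
print(f"out-of-domain: label={label_out}, nearest similarity={sim_out:.2f}")
```

In this toy setting the second query has zero similarity to every training compound, so its assigned label is essentially arbitrary. A low nearest-neighbor similarity is one simple signal that a test compound lies outside the chemical space covered by the training data.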

Original language: English
Pages (from-to): 1238–1247
Journal: Chemical Research in Toxicology
Issue number: 8
Early online date: 2022
Publication status: Published - 21 Aug 2023
MoE publication type: A1 Journal article-refereed

