Unveiling optimal molecular features for hERG insights with automatic machine learning

We developed MaxQsaring, a novel universal framework that integrates molecular descriptors, fingerprints, and deep-learning pre-trained representations to predict the properties of compounds. In a case study on human ether-à-go-go-related gene (hERG) blockage prediction, MaxQsaring achieved state-of-the-art performance on two challenging external datasets by automatically selecting optimal feature combinations, and it identified the top 10 important interpretable features, which could be used to build a high-accuracy decision tree. The models’ predictions align well with empirical hERG optimization strategies, demonstrating that they are interpretable enough for practical use. Deep-learning pre-trained representations moderately improved the performance of the predictive models, but their contribution to generalizability, particularly for compounds with novel scaffolds, appeared to be comparatively minimal. MaxQsaring also excelled in the Therapeutics Data Commons (TDC) benchmarks, ranking first in 19 out of 22 tasks, which shows its potential for universal, accurate compound property prediction that could raise the success rate of early drug discovery, still a formidable challenge.
