MAGED: Multimodal attentive graph learning with gene expression dynamics on knowledge graphs for TCM target prediction

As a traditional medical system with millennia of practical experience, Traditional Chinese Medicine (TCM) has accumulated rich clinical knowledge in disease prevention and treatment (Fu et al., 2022; Wu et al., 2024). Nevertheless, the mechanistic connections between macroscopic clinical outcomes and the underlying molecular actions remain insufficiently elucidated. Studies suggest that herbal medicines act through a "systems pharmacology" mechanism in which multiple components act synergistically on multiple targets (Chen et al., 2023; Zhang et al., 2024), yet their chemical complexity and combinatorial rules make experimental characterization costly and challenging (Song et al., 2024). Conventional target identification requires the concurrent discovery of active compounds and their targets, and often suffers from long cycles, high cost, and high false-positive rates (Chen et al., 2020; Cui et al., 2022). Structure-based prediction is also limited: synergy among compounds introduces noise; large molecules in mineral- and animal-derived medicines lack stable structures; and chemical transformations during herbal processing (paozhi) are not fully characterized. Computational methods are therefore needed to complement and guide experimental efforts.

Computational methods for Drug-Target Interaction (DTI) prediction have been increasingly applied to Herb-Target Interaction (HTI) prediction due to shared molecular mechanisms. Existing approaches fall into three categories: structure-based (e.g., molecular docking), network-based (topological propagation/embedding), and deep learning-based methods. Structure-based methods require high-quality 3D structures, limiting their coverage (Sydow et al., 2019). Network-based approaches use graph topology but often ignore node attributes like chemical or functional features (Wang et al., 2019). With the growth of biological data, deep learning models (e.g., DNNs, GNNs, Transformers) have excelled in binary classification/link prediction by automating feature extraction from large datasets (Chen et al., 2024; Zhao et al., 2025).

Recent research in this field has increasingly adopted knowledge graph-based approaches to systematically integrate heterogeneous evidence from clinical observations and scientific literature, and to support target inference through graph reasoning algorithms such as random walk, graph embedding, and graph neural networks. Representative methods include heNetRW (Yang et al., 2018), which constructs a heterogeneous herb-target network and ranks putative targets via a random-walk strategy, and HTINet2 (Duan et al., 2024), which further incorporates knowledge graph embeddings into a deep learning framework, leveraging herbal properties and clinical treatment information to enhance the identification of herb-target interactions. Despite these advances, current methods remain limited in their ability to jointly model the theoretical foundations of TCM and the underlying molecular regulatory mechanisms. On the one hand, TCM emphasizes holistic principles and “pattern-based treatment” (zhengzhi), where therapeutic effects are closely linked to multidimensional herb attributes, including the Four Natures, Five Flavors, meridian tropism, and functional properties. Capturing such rich semantic information requires more expressive representations, such as those derived from pretrained large language models. On the other hand, molecular-level dynamics induced by herbal interventions, such as gene expression perturbations and causal biological pathways, have not been systematically incorporated into existing inference frameworks. This omission constrains a model's ability to trace the mechanistic route from pharmacological action to symptom improvement. Consequently, a key scientific challenge lies in developing a cross-level, multimodal learning framework capable of jointly modeling macroscopic symptoms, herbal attributes, and microscopic biological pathways. Such a unified model is essential for improving the reliability of target prediction and enhancing the interpretability of the underlying therapeutic mechanisms.

Multimodal learning integrates multi-source heterogeneous data, improving robustness through cross-modal complementarity (Peng et al., 2024; Ren et al., 2023). Yet most HTI prediction methods rely on a single data type and do not integrate KG structural priors with functional omics data (e.g., transcriptomics). Furthermore, synergistic modeling of multi-source data remains underexplored. Interpretability is also critical: integrating causal reasoning (e.g., DeMAND (Woo et al., 2015)) with prior regulatory networks helps ensure biological plausibility and traceability. Biomedical knowledge graphs have already demonstrated strong potential in integrating multi-omics data and facilitating target discovery (Chandak et al., 2023; Cui et al., 2025; Serra et al., 2025). Therefore, developing novel computational frameworks capable of synthesizing multi-scale biological regulatory information with TCM theory holds significant scientific value for accurately predicting herbal targets and systematically elucidating their pharmacological mechanisms.

To address these gaps, we frame HTI prediction as a link prediction task in a knowledge graph and propose an end-to-end multimodal graph attention framework. Our model integrates semantic herb attributes (from TCM theory) and transcriptomic functional perturbations (e.g., pathway enrichment scores) into dynamic contextual representations. These are injected into graph message passing and attention mechanisms to enhance biologically relevant signals and suppress noise. Specifically, we encode textual herb attributes using a pre-trained language model and combine them with normalized enrichment scores (NES) via a learnable multimodal encoder. The resulting context vectors gate attention weights in a heterogeneous graph, modulating relationships and neighbor importance. We use a hierarchical attention design: direction-aware attention for causal regulatory edges and context-aware attention for herb associative edges. We build a cross-scale biomedical KG incorporating TCM concepts (e.g., symptoms, herb properties) and molecular interactions (e.g., mRNA regulation, protein-DNA binding, non-coding RNA). Training on this graph enables simultaneous use of semantic, symptomatic, and biomolecular information, significantly improving HTI prediction accuracy and interpretability.
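
The context-gating idea described above can be illustrated with a minimal sketch. It is not the MAGED implementation: the fusion rule (project, sum, tanh), the sigmoid gate, the renormalization step, and all names (`fuse_context`, `gated_attention`, `W_text`, `W_nes`, `W_gate`) are assumptions made for illustration, and the toy vectors stand in for a language-model text embedding and a vector of NES values.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [dot(row, v) for row in W]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_context(text_emb, nes, W_text, W_nes):
    # Assumed fusion: project each modality to the hidden size, sum, squash.
    a = matvec(W_text, text_emb)
    b = matvec(W_nes, nes)
    return [math.tanh(x + y) for x, y in zip(a, b)]

def gated_attention(h_src, h_nbrs, ctx, W_gate):
    # Base attention: scaled dot product between the herb node and each neighbor.
    d = len(h_src)
    base = softmax([dot(h_src, h) / math.sqrt(d) for h in h_nbrs])
    # Context gate: one value in (0, 1) per neighbor, driven by the context vector.
    key = matvec(W_gate, ctx)
    gates = [sigmoid(dot(key, h)) for h in h_nbrs]
    # Gate the base weights, then renormalize so they sum to 1 again.
    gated = [a * g for a, g in zip(base, gates)]
    total = sum(gated)
    alpha = [g / total for g in gated]
    # Aggregated message: attention-weighted sum of neighbor embeddings.
    message = [sum(a * h[i] for a, h in zip(alpha, h_nbrs)) for i in range(d)]
    return alpha, message

# Toy inputs (hypothetical values, hidden size 2).
text_emb = [0.2, -0.1, 0.4]   # stand-in for a pretrained text embedding
nes = [1.5, -0.8]             # stand-in for pathway enrichment scores
W_text = [[0.1, 0.0, 0.2], [0.0, 0.3, -0.1]]
W_nes = [[0.5, 0.0], [0.0, 0.5]]
W_gate = [[1.0, 0.0], [0.0, 1.0]]

ctx = fuse_context(text_emb, nes, W_text, W_nes)
h_herb = [0.3, -0.2]
h_nbrs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5]]
alpha, message = gated_attention(h_herb, h_nbrs, ctx, W_gate)
print(alpha, message)
```

In this sketch the gate rescales the attention weights rather than replacing them, so a neighbor that the context vector disfavors is down-weighted but never removed outright. In a trained model the weight matrices would be learned, and separate parameters would presumably be used for each edge type.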
