Variational temporal deconfounder network for individualized treatment effect estimation with longitudinal observational data

Estimating the causal effect of a treatment, intervention, or exposure on an outcome is critical for evidence-based medicine: it informs regulatory decision-making, shapes clinical guidelines, and aids healthcare professionals in treatment selection. Randomized controlled trials (RCTs) are the gold standard for estimating the causal effect of treatments, typically in the form of the average treatment effect (ATE) [1]. However, RCTs are not always feasible or ethical (e.g., randomly assigning individuals to smoke), and they can be time-consuming and costly and may lack generalizability to real-world populations [2,3].

Over the past decade, the widespread adoption of electronic health record (EHR) systems has made large-scale real-world data (RWD) available for research. These data complement traditional clinical trials by offering insights into treatment effectiveness and safety across diverse settings [4]. Recognizing the value of real-world evidence (RWE), the U.S. Food and Drug Administration (FDA) has issued guidance on using RWD sources (e.g., EHRs, administrative claims) to support regulatory decisions [5,6]. Additionally, personalized healthcare has emerged as a major focus in modern medical research, offering the potential to tailor treatments to individual patient characteristics [5]. Because patients respond heterogeneously to the same treatment, personalized medicine depends on estimating the individualized treatment effect (ITE) rather than a population average alone. In recent years, research into ITE estimation using more accessible observational real-world data has flourished, with the aim of advancing personalized medicine [6].

Despite growing interest in real-world, longitudinal data for personalized treatment strategies, robust ITE estimation in observational settings remains challenging. Confounding bias arises when certain variables (confounders) influence both the outcomes and treatments [7], potentially skewing causal estimates. Although bias from measured confounders can often be mitigated by adjusting for them in the model, hidden confounders (those that are unobservable or unmeasured) pose a more difficult obstacle [8]. For instance, in cancer care, factors such as drug resistance and toxicity can significantly affect both treatment decisions and patient outcomes, yet may be missing or unrecorded in EHRs. Furthermore, real-world longitudinal data often involve multiple, time-varying treatments and high-dimensional patient information [9], which makes it harder to capture the full complexity of treatment effects. Consequently, there is a pressing need for methodologies that can (1) handle the complexities of longitudinal data, (2) account for hidden confounders, and (3) effectively model heterogeneous treatment effects. Addressing these challenges paves the way for truly individualized care, offering clinicians and policymakers actionable insights derived from diverse, real-world patient populations.
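To make the hidden-confounding problem concrete, the following is a minimal illustrative simulation and not part of the proposed method. It assumes a single static confounder `u`, a binary treatment `t`, a linear outcome `y`, and a constant true effect of 1.0; all names and coefficients are hypothetical choices made for illustration. The sketch compares a naive difference in means, which ignores `u`, with a regression estimate that adjusts for `u` as if it had been recorded.

```python
import math
import random


def simulate(n=50_000, seed=0, true_effect=1.0):
    """Simulate (u, t, y) where u drives both treatment assignment and outcome."""
    rng = random.Random(seed)
    u, t, y = [], [], []
    for _ in range(n):
        ui = rng.gauss(0.0, 1.0)                    # confounder (hidden in practice)
        p_treat = 1.0 / (1.0 + math.exp(-2.0 * ui))  # higher u -> more likely treated
        ti = 1 if rng.random() < p_treat else 0
        yi = true_effect * ti + 2.0 * ui + rng.gauss(0.0, 1.0)
        u.append(ui)
        t.append(ti)
        y.append(yi)
    return u, t, y


def mean(xs):
    return sum(xs) / len(xs)


def naive_estimate(t, y):
    """Difference in mean outcome between treated and untreated, ignoring u."""
    treated = [yi for ti, yi in zip(t, y) if ti == 1]
    control = [yi for ti, yi in zip(t, y) if ti == 0]
    return mean(treated) - mean(control)


def residualize(v, u):
    """Residuals from a least-squares regression of v on u (with intercept)."""
    mv, mu = mean(v), mean(u)
    slope = (sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
             / sum((ui - mu) ** 2 for ui in u))
    return [(vi - mv) - slope * (ui - mu) for ui, vi in zip(u, v)]


def adjusted_estimate(u, t, y):
    """Coefficient on t when regressing y on t and u (residual-on-residual form)."""
    rt = residualize(t, u)
    ry = residualize(y, u)
    return sum(a * b for a, b in zip(rt, ry)) / sum(a * a for a in rt)


if __name__ == "__main__":
    u, t, y = simulate()
    print("naive estimate:   ", round(naive_estimate(t, y), 3))
    print("adjusted estimate:", round(adjusted_estimate(u, t, y), 3))
```

Under these assumptions, the naive estimate is pushed well above the true effect because treated patients tend to have larger `u`, while the adjusted estimate stays close to 1.0. In real EHR data `u` is not recorded, so this kind of direct adjustment is unavailable, which is the difficulty the paragraph above describes.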

To address the above challenges, we propose a novel framework called Variational Temporal Deconfounder Network (VTDNet), which consists of three key components: a temporal transformer-based encoder-decoder structure, which adjusts latent variables for the time-varying nature of longitudinal data; a Treatment Block, which mitigates bias introduced by hidden confounders by inferring the interrelationships among multiple treatments within the dataset; and a Potential Outcome Block, which estimates the outcomes of interest by incorporating treatment information through an attention mechanism. The proposed VTDNet models dynamic treatment effects from high-dimensional data observed at irregular time points, while accounting for hidden confounders over time.

The main contributions of this paper can be summarized as follows:

• We address the challenge of estimating ITE in a longitudinal setting where multiple treatments may be administered over time.

• We propose a new framework, VTDNet, which consists of three components designed to mitigate temporal unmeasured confounding bias.

• To validate the effectiveness of our proposed framework, we conducted a simulation study incorporating hidden confounders to evaluate VTDNet’s ability to infer these latent variables.

• We assess the model's performance using two real-world datasets: (1) the Medical Information Mart for Intensive Care (MIMIC-III) dataset, which contains time-series patient data from intensive care units (ICU) and primarily captures short-term acute changes, and (2) the National Alzheimer’s Coordinating Center (NACC) database, which provides longitudinal data on cohorts either at risk for or diagnosed with Alzheimer’s disease, capturing patterns of long-term chronic progression. Our results demonstrate that the VTDNet model effectively estimates ITE in the presence of hidden confounders.

• We performed stratification analysis, revealing substantial heterogeneity in patient responses to the interventions, and identified key patient characteristics that distinguish high- and low-benefit subgroups.
