Comparing Digital Versus Face-to-Face Delivery of Systemic Psychotherapy Interventions: Systematic Review and Meta-Analysis of Randomized Controlled Trials


IntroductionBackground

Digital delivery of mental health interventions has gained increasing prominence in recent decades. Digital delivery provides solutions to many mental health intervention obstacles, such as limited access [], geographical distance [], financial constraints [], and transportation issues []. These factors, and not least the COVID-19 pandemic [], have contributed to increased delivery of digital modalities of mental health care []. One current line of research uses randomized controlled trials (RCTs) to determine the efficacy of digitally delivered family therapies [] and parenting interventions [], including but not limited to systemic approaches. This research supports the efficacy of the digital delivery of interventions and finds no evidence for digitally delivered interventions being inferior to face-to-face interventions [-]. However, most of the current literature does not compare face-to-face and digitally delivered interventions directly, does not differentiate between the types of digital delivery modalities (eg, therapist-guided videoconferencing and self-guided web-based programs), and does not consider how the modality differentially impacts specific disorders, outcomes, or populations. Furthermore, these studies aggregate findings across broad and diverse contextual definitions of family therapy and parenting interventions instead of applying unified and stringent theory-informed definitions. Although these reviews (eg, McLean et al [] and Florean et al []) do not exclusively focus on a unified and stringent definition, they do provide evidence for systemic approaches in the broad context of family interventions. Many family therapists consider themselves to be systemic therapists, and there is an overlap between family therapy and systemic therapy (ST). However, ST is grounded in a theoretical treatment model, and unlike family therapy, it is not primarily defined by the therapy setting []. Thus, ST provides a unified and stringent definition that has practical consequences for research and clinical practice.

ST is a conceptual framework for mental health interventions that focuses on interpersonal relations, interactions, social surroundings and resources, perspectives, and constructions of problems; attempted solutions are appreciated and used as integral parts of the intervention [,,]. Consequently, most family therapy and parenting interventions can be said to contain elements of ST to varying degrees while also potentially including elements from other conceptual frameworks, such as cognitive behavioral therapy and psychoeducation []. Conversely, as ST interventions often focus on families as important resources requiring incorporation into intervention delivery [], they can, but need not be, similar to other family therapy or parenting interventions in terms of including other family members in their setting. The efficacy of ST as a face-to-face intervention is well documented in various reviews and meta-analyses for a range of psychological disorders for youth and adult populations [,-,]. In line with developments for other psychotherapeutic interventions, the implementation of digital delivery modalities of ST has rapidly increased in recent years [-]. The implementation of videoconferencing is particularly promising, and the experiences of practitioners implementing this delivery modality are overall positive []. Despite such indications, there are gaps in the literature [,]. There is an urgent need to provide an evidence base for practitioners to understand how face-to-face and digital delivery of the same ST intervention compare (compare McLean et al [], Florean et al [], and Fairburn and Patel []).

The need for greater granularity regarding the comparative efficacy of digital and face-to-face delivery modalities is in line with the more general requirement to identify “conditions under which systemic therapy works best” []. Thus, we simultaneously address both the need for a systematic review and meta-analysis that investigates (1) how various types of ST interventions, delivered in their traditional face-to-face form, compare to the same ST intervention delivered digitally with regard to their efficacy and (2) delivery modality as key contextual factors that modulate ST intervention outcomes.

Research Questions

We conducted a systematic review and meta-analysis to address the following research questions: (1) What are the characteristics, quality, and resulting evidence from published RCTs comparing the efficacy of ST interventions using digital treatment modalities with ST interventions using face-to-face treatment modalities across settings, populations, and outcomes? (2) If a meta-analysis of RCTs comparing ST interventions in digital and face-to-face modalities can be conducted, what is the difference in delivery modalities for mental disorder outcomes?


MethodsOverview

This systematic review followed reporting guidelines as laid out in the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) 2020 statement ( 2020 checklist) []. We aimed to identify, evaluate, and synthesize findings from RCTs comparing digital and face-to-face delivery modalities for ST interventions across settings, target populations, and outcomes. This systematic review was preregistered (PROSPERO ID: CRD42022335013), with no deviation from the registered protocol.

Inclusion and Exclusion Criteria

We applied the following inclusion and exclusion criteria for articles to be considered for our systematic review within the following categories: (1) study design, (2) participants, (3) interventions, (4) digital delivery and face-to-face delivery modalities, (5) outcomes, and (6) article type.

RCTs of any type were included (eg, parallel, cluster, and quasi-RCTs). No minimum number of participants was prespecified.

Only those articles were deemed eligible for inclusion that involved individual participants of all ages and genders (1) with one or more diagnoses of a mental disorder, (2) identified as at risk in one or more mental or behavioral health domains, or (3) adversely affected by salient circumstances (eg, belonging to a structurally disadvantaged community and caring for a family member). Articles involving participants at the system level (eg, families) were also included as long as 1 member of the system fulfilled at least one of the aforementioned criteria. No further exclusion criteria were applied to increase the number of eligible studies.

We only included articles reporting on interventions delivered via at least 2 distinct delivery modalities: digital delivery and face-to-face delivery. Digital delivery was defined as exclusively relying on an external electronic medium (eg, videoconferencing hardware and software, telephone, and synchronous and asynchronous text-based conversations) for intervention delivery. While this allowed heterogeneity of digital delivery modalities across some dimensions (eg, synchronicity and extent of interaction with mental health practitioners), it kept other dimensions stable across all digital delivery modalities (use of electronic medium and no mental health practitioner physically copresent with recipients of intervention). Face-to-face delivery was defined as exclusively relying on in-person intervention delivery. This allowed for the use of external tools or media to supplement the intervention while specifying that intervention delivery had to occur solely between people physically present in the same physical space (eg, a clinician’s office).

To increase the number of eligible studies, any relevant mental, behavioral, or somatic health efficacy; adherence and attrition; and further outcomes were included in this review. No a priori categorization of outcomes was applied. Efficacy outcomes were defined as any outcomes pertinent to improvement in relevant mental, behavioral, and somatic health domains. Further outcomes were defined as any outcomes not directly pertinent to improvement in relevant health domains (eg, subjective satisfaction rating of program delivery).

There were no restrictions imposed on the article type in terms of language or date. Articles published in peer-reviewed journals as full-text articles and other types of peer-reviewed full-text articles were included. Unpublished data (eg, dissertations), as well as study protocols, clinical trial registries, and non–peer-reviewed articles, were excluded.

Intervention Inclusion Criteria

Only articles reporting on ST interventions delivered via both digital and face-to-face delivery modalities were included. ST was operationalized based on 4 adapted definition criteria of ST [], which were applied to the descriptions of the identified constituent primary parts (CPPs; eg, core sessions) of interventions. The definition criteria were derived from the following definition of ST (referred to as systemic therapy) with our adjustments in square brackets []:

We use systemic/systems-oriented therapy/therapies (ST) as a general term for a major therapeutic orientation that can be distinguished from other major approaches (e.g., CBT or psychodynamic therapy). We define systemic therapy as a form of psychotherapy that (1) perceives behavior and mental symptoms within the context of the social systems people live in; (2) focuses on interpersonal relations and interactions, social constructions of realities, and[/or] the recursive causality between symptoms and interactions; (3) includes family members and[/or] other important persons (e.g., teachers, friends, professional helpers) directly or indirectly through systemic questioning, hypothesizing, and specific interventions; and (4) appreciates and utilizes clients’ perspectives on problems, resources, and[/or] preferred solutions.

For each intervention, the appropriate unit of primary intervention delivery (as opposed to parts of the intervention labeled supplementary, additional, or optional, etc) was determined following the study’s authors (eg, core sessions, core components, etc.). The total number of the intervention’s CPPs was determined. If the intervention description did not afford to determine an appropriate unit of intervention delivery, the total number of CPPs was defined as 1.

For each description of content for any CPP, we assessed if it met any of our definition criteria of ST. Notably, any particular definition criterion could, in theory, be met by a set of CPP descriptions >1, allowing different content descriptions to meet the same criterion. This allowed a range of different interventions to be classified as being similar in kind (ie, being ST interventions), reflecting the integrative nature of clinical ST practice [,], rather than requiring all interventions to have identical CPPs.

All definition criteria of ST needed to be met by the respective CPPs, with at least 2 of the 4 definition criteria being met completely by all CPPs. If any of the definition criteria of ST were not completely met, at most 2 of the 4 definition criteria needed to be met by a minimum of at least 50% of all CPPs (or, in the case of uneven numbers of CPPs, the closest possible number below the theoretical half point). The same CPP could satisfy >1 criterion. Definition criterion (1) could additionally be met by the relevant conceptual background provided.

illustrates an example for 1 trial included in our study. Our approach used to screen trials against inclusion criteria for all trials is detailed in [,-].

Textbox 1. Identified constituent primary parts (CPPs) of 1 trial against systemic therapy (ST) definition criteria for Behavioral Family Systems Therapy for Diabetes (BFST-D) intervention [33].

Number and unit of CPPs and 4 primary intervention components

Perceives behavior and mental symptoms within the context of the social systems people live inFamily functioning and maladaptive parent-child interactions were identified as central barriers to diabetes treatment adherence.Focuses on interpersonal relations and interactions, social constructions of realities, or the recursive causality between symptoms and interactions: 3 of 4 CPPsCPP1: family-problem solving (as a family, defining the problem, generating solutions, making decisions, implementing and monitoring results, and refining ineffective solutions)CPP2: communication training (instruction, feedback, modeling, and rehearsal of approaches toward improving maladaptive communicative patterns)CPP4: family restructuring (functional and structural approaches toward changing maladaptive or ineffective family system patterns and characteristics, such as weak parental coalitions or cross-generational coalitions).Includes family members and other important persons (eg, teachers, friends, professional helpers) directly or indirectly through systemic questioning, hypothesizing, and specific interventionsAll CPPs are delivered to the caregiver-adolescent dyad.Appreciates and uses clients’ perspectives on problems, resources, and preferred solutions: 2 of 4 CPPs are as follows:CPP1 and CPP3: cognitive restructuring (addressing beliefs, attitudes, and attributions that could negatively affect effective interactions).Search StrategyDatabase Search

The following databases were searched from the earliest available date until March 15, 2022: PubMed, Embase (via Ovid), Cochrane CENTRAL (via Ovid), CINAHL (via EBSCO), PsycINFO (via EBSCO), and PSYNDEX (via EBSCO). The search strategy followed the guideline from Bramer et al [], and the original search string was developed for use in PubMed and adapted manually for all other databases. The search strings contained a set of the following: (1) 1 compiled term for intervention type, (2) 1 compiled specificity term to define the target domain, and (3) 1 compiled delivery modality term. For the identification of RCT study design, we used the RobotSearch AI RCT highly sensitive filter [], which provides high sensitivity in identifying RCTs [-]. [,,,-] provides the full search strings, references of the published sources upon which we based our search terms, and explanations of the logical structure. On March 17, 2022, we conducted complementary searches in ClinicalTrials.gov, the International Clinical Trials Registry Platform, and the EU Clinical Trials Register (all via the respective standard website interface). On May 4, 2022, backward citation screening (articles cited) was conducted on identified relevant reviews and meta-analyses ( [,,,,,-]). On May 10, 2022, forward citation screening (articles citing included articles) was conducted via ISI Web of Science, backward citation screening on included articles was conducted via PubMed, and a “related articles” screening was conducted on included articles (first 20 articles listed as related articles) via PubMed.

Article Selection and Screening

Duplicates were removed using Automated Systematic Search Deduplicator []. The resulting list of search results was split into 2 sets (sorting by title in EndNote [version 20; Clarivate] and allocating alternating sets of approximately 100 consecutive references; number of articles=1816 and 1817) that were each assigned to 2 pairs of authors (PE and JB; MB and JB), with each author independently screening titles and abstracts using manualized standard operating procedures that we developed accommodating our inclusion and exclusion criteria. All discrepancies at the title and abstract screening stage were resolved through discussion between all 3 involved authors (PE, MB, and JB). PE and MB independently screened at the full-text stage. Discrepancies at this stage were resolved through discussion between PE and MB. GM was consulted as an independent author of this study in cases of persisting discrepancies after discussions at any stage. Because on several occasions a trial underlying an included article contributed data to >1 included article, article-trial correspondence was determined either via trial registration numbers (for 7 articles) or via references to articles on the same trial, unique trial names, as well as overlap in article authors, article dates, sample sizes and characteristics, interventions, and procedures (for 5 articles).

Data Extraction

Data were independently extracted by PE and MB using the Cochrane data collection form [] for intervention reviews ( [,,,-]). Discrepancies were resolved through discussion.

Risk of Bias

The authors (PE and MB) independently assessed the risk of bias (RoB) using the revised Cochrane Risk of Bias (version 2) tool for randomized trials [] and following the process outlined in chapter 8 of the Cochrane handbook []. Discrepancies were resolved through discussion. RoB figures were created using robvis [].

Data Synthesis and Analysis

We used RevMan (version 5.4) [] to create forest plots and calculate, depending on the variable type and type of analysis, mean differences (MDs), standardized MDs, risk ratios, and associated CIs at the 95% level. Where possible, we calculated missing SDs using the Cochrane collaboration RevMan calculator []. We conducted meta-analyses only in cases of acceptable levels of clinical and methodological diversity and where sufficient data were available []. We conducted meta-analyses using random effects models as we expected heterogeneity across articles to be high. Statistical heterogeneity was assessed using the I2 statistic. MDs and CIs were calculated where meta-analytic analysis was not possible, but nonaggregated analysis was still deemed appropriate. This was to increase qualitative comparability across studies with nonnegligible heterogeneity in statistical approaches and reported results. All relevant reported results are listed in [,,,-,-]. We grouped MDs of outcomes according to their CIs into (1) fully located within the range from –0.20 to 0.20 (equivalence), (2) fully located outside this range (superiority), or (3) partially located within and partially located outside this range (neither equivalence nor superiority). In the absence of an established a priori minimal important difference, we accept Cohen d cutoff of 0.2 (small effect) as the defined range from –0.20 to 0.20 to indicate equivalence []. We interpreted a CI of an MD falling within, but not exceeding, this range as evidence for equivalence, a CI of an MD falling entirely outside this range as evidence for nonequivalence or superiority of one delivery modality, and a CI falling within and exceeding this range as insufficient evidence to determine equivalence or nonequivalence on the respective measure. Where calculating MDs and CIs was impossible due to missing information, reported results on differences between digital and face-to-face delivery modalities were provided (). Study authors were contacted to provide missing information. A funnel plot was not created because we identified <10 trials [].


ResultsSearch Results

A flowchart of our search and screening process is depicted in []. The primary search for this study retrieved 4977 references and 3633 after removal of duplicates. Additional references (n=2) were identified through complimentary searches. The resulting 3635 references were screened at the title and abstract level. On the basis of our inclusion and exclusion criteria, 97.1% (3529/3635) articles were excluded during the title and abstract screening, and 2.9% (106/3635) articles were screened at the full-text screening stage. Overall, 0.3% (12/3633) of the initially identified articles reporting on 4 trials were included in this systematic review. Some (3/12, 25%) of the articles reporting on 2 trials [,,] are only reported in and , as they could not be included in further meta-analytic aggregation or discrete analysis of MDs and CIs.

Figure 1. PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flowchart of the articles included in the systematic review. RCT: randomized controlled trial. *Results from CINAHL (EBSCO): n=432; PSYNDEX (EBSCO): n=53; PsycINFO (EBSCO): n=505; Embase (Ovid): n=2177; Cochrane (Wiley): n=533; PubMed: n=4159; **RobotSearch AI, Marshall et al []. Table 1. Characteristics of included studies (N=754)a.Trial and articlesTrial sample size (% of N; F2Fb; DDc)Youth age (y), mean (SD)Male, n (%)FeatureDIdF2FIeSessions, mean (SD)BFST-Df
Duke et al [], 201690 (11.9%; 46; 44)15.02 (1.75)55 (61.1)HbA1cg >9%VideoconferencingClinic5.8 (3.3)
Harris et al [], 201590 (11.9%; 46; 44)14.9h (1.7)55 (61.1)hHbA1c >9%Video conferencingClinic5.8 (3.3)
Freeman et al [], 201392 (12.2%; 45; 47)15.1i42 (45.7)hHbA1c >9%VideoconferencingClinic—j
Riley et al [], 201582 (10.9%)k14.155 (61.1)HbA1c >9%VideoconferencingClinic6.3 (3.4)
PAASl
Murry et al [], 2019421 (55.8%; 141; 141; CCm: 136)—195h (46)PRBnIIoGSp—
Murry et al [], 2019412 (54.6%; 137; 138; CC: 137)11.0191h (46)PRBIIGS—
Murry et al [], 2018412 (54.6%; 137; 138; CC: 137)11.4191 (46.4)PRBIIGS4.0 (3.0)
F-PSTq
Kurowski et al [], 2020150 (19.9%; 34; 56; SGr: 60)16.5 (1.1)96 (64)TBIsVideoconferencing+OPt,uClinic6.3 (2.6)
Wade et al [], 2019150 (19.9%; 34; 56; SG: 60)16.5 (1.1)96 (64)TBIVideoconferencing+OPuClinic6.3 (2.6)
Wade et al [],2019150 (19.9%; 34; 56; SG: 60)16.5 (1.1)96 (64)TBIVideoconferencing+OPuClinic6.3 (2.6)
Wade et al [],2019150 (19.9%; 34; 56; SG: 60)15.5 (1.5)v96 (64.0)TBIVideoconferencing+OPuClinic6.3 (2.6)
SUCCEATw
Truttmann et al [], 2020102 (13.5%; 50; 52)14.9 (1.9)9 (8.9)ANxOPGWSy6.5 (2.0)

aSum of n at the trial level. In case of inconsistencies in reported n across individual articles for each trial, the mode n was selected.

bF2F: face-to-face delivery.

cDD: therapist-guided digital delivery.

dDI: digital implementation.

eF2FI: face-to-face implementation.

fBFST-D: Behavioral Family Systems Therapy for Diabetes.

gHbA1C: glycated hemoglobin.

hCalculated based on percentages and overall sample size provided.

iCalculated based on mean ages, sample sizes, and SDs provided.

jNot available.

kOnly participants who completed the Child Depression Inventory measure (only those aged <18 y eligible).

lPAAS: Pathways for African American Success.

mCC: control condition.

nPRB: potential risk behavior.

oII: interactive interface.

pGS: group session.

qF-PST: Family Problem-Solving Therapy.

rSG: self-guided digital delivery.

sTBI: traumatic brain injury.

tOP: online program.

uTwo distinct digital delivery conditions: (1) videoconferencing+web-based program and (2) online program.

vMost likely a clerical mistake in reporting.

wSUCCEAT: Supporting Carers of Children and Adolescents with Eating Disorders in Austria.

xAN: anorexia nervosa.

yGWS: group workshop.

Characteristics of Included ArticlesNumber of Trials, Articles, and Sample Size

A total of 12 articles reporting on 4 trials were identified. One trial was reported on by a single article [] (Supporting Carers of Children and Adolescents with Eating Disorders in Austria [SUCCEAT] trial), while the remaining 92% (11/12) of articles reported on a total of 3 trials. Of all 12 articles, 58% (n=7) of articles provided trial registration numbers, allowing for formally confirmed identification of the reported trial.

For the purposes of this study, the 4 trials are abbreviated according to their respective intervention names as follows: Behavioral Family Systems Therapy-Diabetes (BFST-D) trial [,,,], Family Problem-Solving Therapy (F-PST) trial [-], Pathways for African American Success (PAAS) trial [,,], and the SUCCEAT trial [].

As articles reporting on the BFST-D trial and articles reporting on the PAAS trial varied with regard to sample sizes ( and ), the total number of randomized participants across all trials could not be unambiguously determined. Given that for both trials, only 1 article each reported a different sample size as compared to all other articles reporting on the same trial, we reported the mode sample size for the total number of participants randomized to all conditions in all trials, including a control condition (N=754), and for the total number of participants randomized to face-to-face and digital delivery conditions (n=617). The same principle was applied for all reported participant characteristics, which varied in terms of proportions and absolute numbers depending on variation in reported sample sizes.

Participants

Some (4/12, 33%) articles reported on interventions delivered to youths with poorly controlled type 1 diabetes and their parents (BFST-D trial), 25% (n=3) articles reported on interventions delivered to parents, siblings, and youths identified as being at disproportionate risk for contracting HIV and sexually transmitted infections, teen parenthood, and substance use (PAAS trial). Moreover, 33% (4/12) articles reported on interventions delivered to youths with traumatic brain injury and their parents and other family members (F-PST trial), and 8% (1/12) articles reported on interventions delivered to parents of youths with anorexia nervosa (SUCCEAT trial). As there is inconsistent terminology between and within articles, the term parent is used here and throughout this paper to refer to parents, grandparents, legal guardians, and caregivers included in the respective articles. All identified interventions either involved families in intervention delivery (BFST-D, PAAS, and F-PST trials) or in terms of interpersonal target outcomes (SUCCEAT trial). Additional participant information is provided in and [,,,-,-]. Participant characteristics were reported to be similar across conditions in all articles, with the exception of the F-PST trial, which included a lower proportion of White participants in the face-to-face delivery condition, probably linked to quasi-randomization involving place of residence in the allocation procedure.

Face-to-Face and Digital Interventions and Delivery Format

Interventions were provided by health professionals in 75% (9/12) of articles (BFST-D, F-PST, and SUCCEAT trials) and by trained community leaders in 25% (3/12) articles (PAAS trial). Intervention content was similar across delivery modalities for all trials, with all articles explicitly mentioning the same intervention being delivered across conditions and all CPPs of interventions being delivered in both conditions ( and ).

All therapist-guided digital delivery was provided via videoconferencing software. Self-guided intervention delivery was provided via access to web-based modules for 33% (4/12) of the articles (F-PST trial) and for 25% (3/12) of the articles (PAAS trial) via an interactive interface visually representing users through avatars in a virtual environment. Face-to-face delivery was reported as individual in-clinic sessions for 67% (8/12) of the articles (BFST-D and F-PST trials), as group sessions at community centers for 25% (3/12) of the articles (PAAS trial), and in-clinic workshops for multiple individuals for 8% (1/12) of the articles (SUCCEAT trial).

Outcomes

Across all 12 articles, 56 outcome measures (ie, dependent variables) were reported, including cases of parent-report and youth-report measures of the same construct. Primary outcomes (8/56, 14%), 71% (40/56) secondary outcomes, and 14% (8/56) further, dropout, or attrition outcomes were identified. We have provided an overview of all identified outcomes in [,,-,-].

We summarized and structured data along three domains: (1) outcome type (ie, youth, parent, family functioning, and dropout, attrition, or further), (2) delivery modality type (face-to-face, therapist-guided digital, and self-guided digital), and (3) measurement time point (postintervention test and follow up test) to reduce clinical and methodological diversity within outcome data categories. We provide a more detailed overview of delivery modality types (eg, videoconferencing or in-clinic sessions) in , an overview of outcome measures (ie, dependent variables) in , and measurement time points in .

The application of these 3 domains to the categorization of data yielded 16 potential outcome categories: youth ( [,,-]); parent (); family functioning ( []); and adherence, attrition, and further outcomes ( [,,,,]) for either therapist-guided digital versus face-to-face delivery () or self-guided digital versus face-to-face delivery modalities ( and 0) at either posttest ( and 0) or follow-up test () time points.

Risk of BiasOverview

The 12 articles were rated as high risk for allocation concealment. For sequence generation, 75% (9/12) articles were rated as having some concerns. For randomization, 2 of the 4 trials used quasi-randomization. Some (9/12, 75%) articles were rated as having some concerns in terms of missing data. Overall proportions of RoB levels across articles are presented in ; individual RoB assessments at the article level are presented in [,,,-,-]. Moreover, 92% (11/12) articles reported on 3 trials, with inconsistent reporting at times, which constituted a challenge for assessing the RoB, particularly for the domain of selective outcome reporting, as multiple articles reporting on the same trial were found to simultaneously overlap in some reported outcomes and diverge for other outcomes (eg, inconsistent reporting of the Diabetes Family Conflict Scale across articles reporting on the BFST-D trial).

Figure 2. Total proportions of articles’ risk of bias assessments by assessment domain. Figure 3. Risk of bias assessments of all included articles. D1: bias arising from the randomization process; D2: bias due to deviations from intended intervention; D3: bias due to missing outcome data; D4: bias in measurement of the outcome; D5: bias in selection of the reported result. Sequence Generation

Of all 12 articles, 8% (n=1) reported a quasi-randomization temporal sequencing design (SUCCEAT trial); 33% (n=4) reported a quasi-randomized, multicenter RCT design (F-PST trial), in which participants were unequally allocated to the digital and face-to-face delivery conditions based on their place of residence (all participants who lived outside a 40-kilometer radius from the site were randomly allocated to the digital delivery conditions, and participants within the radius were randomly allocated to all conditions, favoring the face-to-face delivery condition by a ratio of 2:1); and 58% (n=7) reported randomized controlled trials (BFST-D and PAAS trials). Of those 7 articles, 57% (n=4) reported block randomization without providing information on allocation sequence concealment, increasing the RoB from sequence generation (BFST-D trial). Overall, 25% (3/12) of the included articles were rated low RoB arising from the randomization, while 75% (9/12) were rated as giving rise to some concerns.

Allocation Concealment

The question of adequate allocation blinding procedures (for participants and intervention providers) is a common concern in psychotherapy outcome research []. This also applies to the articles included in this review. Given the absence of a universally agreed-upon alternative approach [], all articles were rated as high RoB in this domain.

Handling of Incomplete Data

For any of the 12 articles, not all outcome data were available. Some (2/12, 17%) of the articles provided evidence of data missing at random [] or treated attrition as an outcome variable [] and were rated as having low RoB. Most (9/12, 75%) articles did not adequately analyze if missing outcome data did not potentially bias the results and were rated as giving rise to some concerns.

Outcome Assessment

RoB arising from the measurement of the outcome variables was rated low risk for 83% (10/12) of the articles and with some concerns for 17% (2/12) of the articles that used a large proportion of outcome measures for which judgments regarding validity and reliability could not be made [,] ().

Selective Reporting

As indicated earlier, RoB from selective reporting was challenging to assess given the interaction of multiple articles reporting on the same trial and inconsistencies across articles on the same trials. This was further exacerbated in the case of a trial for which no preregistration could be identified (PAAS trial) and of an article exclusively reporting on unregistered outcomes of a preregistered trial []. Some (5/12, 42%) articles were rated as giving rise to some concerns, 9% (5/12) were rated as having low RoB, and 17% (2/12) of the articles were rated as having high RoB.

Appropriateness of a Meta-Analysis to Determine the Efficacy of Delivery Modalities, Adherence, and Attrition

Meta-analytic aggregation was deemed inappropriate for all delivery modality comparisons of youth, parent, and family functioning outcomes due to the relatively low number of trials within the respective outcome categories, combined with the presence of substantial clinical and methodological diversity across trials []. Meta-analytic aggregation of the number of sessions attended and attrition was conducted because heterogeneity was lower and the number of eligible trials was higher for these outcomes as compared to the other outcome measures reported and included ( [,,] and [,,]). The substantial clinical and methodological diversity across trials remains problematic for drawing meaningful conclusions using meta-analytic aggregations. However, a preliminary quantitative synthesis was included as it still serves as a starting point for discussion and future research.

Figure 4. Meta-analysis of number of sessions attended in face-to-face versus digital delivery modality comparison. Figure 5. Meta-analysis of drop-out (event) risk ratio in face-to-face versus digital delivery modality comparison. Nonaggregated Analyses for Face-to-Face and Digital Delivery Modalities

Of all 56 reported outcomes (ie, dependent variables) across all trials and articles, 25% (n=14) outcomes from 2 trials (13/14, 93% in the F-PST trial and 1/14, 7% in the SUCCEAT trial) showed MDs with corresponding CIs falling either (1) fully within our predefined range of minimal important difference of –0.20 to 0.20 or (2) fully outside of this range. These outcomes are reported in and . All remaining 75% (42/56) outcomes showed MDs with corresponding CIs partially located within and partially located outside this range. Among these, 4% (2/56) further outcomes, 2% (1/56) family functioning outcome, and 2% (1/56) youth outcome were reported as substantially different between delivery modalities in the original articles while still including our range for the minimal important difference. shows a full list of all MDs and CIs of outcomes.

Table 2. Outcomes with substantial differences (95% CI excluding 0) or equivalence across all trials (N=252; sum of n at the trial level).Outcome type, postintervention test and follow-up test, trial, and articleOutcomeType of DDaF2Fb delivery, nF2F delivery, mean (SD)DD, nDD, mean (SD)Mean difference (95% CI)DirectionalityYouth
Postintervention test

F-PSTc; Wade et al [], 2019


PedsQLd, parent reportTGe2262 (15.71)4370.9 (18.49)–8.90 (–17.48 to –0.32)Favors DD


PedsQL, parent reportSGf2262 (15.71)5171.7 (18.85)–9.70 (–18.06 to –1.34)Favors DD
Follow-up test

F-PST; Wade et al [], 2019


HBIg somatic, youth reportTG229.8 (4.92)446.6 (5.90)3.20 (0.50 to 5.90)Favors DD


HBI cognitive, parent reportSG2219.1 (8.58)4812.3 (9.91)6.80 (2.25 to 11.35)Favors DD


HBI somatic, youth reportSG229.8 (4.92)486.8 (5.89)3.00 (0.35 to 5.65)Favors DD


PedsQL, parent reportSG2267 (16.18)4876.6 (18.98)–9.60 (–18.23 to –0.97)Favors DDParent
Postintervention test

F-PST; Wade et al [], 2019


CES-DhTG2215.1 (1.81)4312.8 (1.53)2.30 (1.42 to 3.18)Favors DD


BSIiTG2256.1 (2.46)4353.3 (2.02)2.80 (1.61 to 3.99)Favors DD


CES-DSG2215.1 (1.81)5113.5 (1.42)1.60 (0.75 to 2.45)Favors DD


BSISG2256.1 (2.46)5154.7 (1.88)1.40 (0.25 to 2.55)Favors DD

SUCCEATj; Truttmann et al [], 2020


SCLk-90-RTG480.24 (0.26)460.31 (0.36)–0.07 (–0.20 to 0.06)Equivalence
Follow-up test

F-PST; Wade et al [], 2019


BSITG2251.7 (2.42)4453.3 (2.02)–1.60 (–2.77 to –0.43)Favors F2F delivery


CES-DSG2211.8 (2.03)4813.2 (1.58)−1.40 (−2.36 to −0.44)Favors F2F delivery

Wade et al [], 2019


P-PElTG229.0 (1.4)437.8 (2.1)1.20 (0.34 to 2.06)Favors F2F delivery

aDD: digital delivery.

bF2F: face-to-face.

cF-PST: Family Problem-Solving Therapy.

dPedsQL: Pediatric Quality of Life Inventory.

eTG: therapist guided.

fSG: self-guided.

gHBI: Health and Behaviour Inventory.

hCES-D: Center for Epidemiological Studies Depression Scale.

iBSI: Brief Symptom Inventory.

jSUCCEAT: Supporting Carers of Children and Adolescents with Eating Disorders in Austria.

kSCL: Symptom Checklist 90-Revised Global Severity Index.

lP-PE: Parent Rated Program Evaluation.

Of the 14 outcomes falling fully within or fully outside the minimal important difference range, superiority of the digital delivery condition was found for 71% (n=10) outcomes. Of these 10 outcomes, 60% (n=6) were youth outcomes and 40% (n=4) were parent outcomes. The quality of life of youths, reported by parents (Pediatric Quality of Life Inventory, parent report) at post intervention test, was substantially higher for both therapist-guided and self-guided digital delivery modalities compared to face-to-face delivery modalities and remained substantially higher for the self-guided digital delivery condition at follow-up. Somatic symptoms of youths, reported by them (Health and Behaviour Inventory somatic, youth report) at follow-up, were substantially higher in the face-to-face delivery condition compared to both therapist-guided and self-guided digital delivery conditions. In addition, cognitive symptoms of youths, reported by parents (Health and Behaviour Inventory cognitive, parent report) at follow-up, were substantially higher in the face-to-face delivery condition compared to the self-guided digital delivery condition. Depressive symptoms of parents (Center for Epidemiological Studies Depression Scale) at post intervention test were substantially higher in the face-to-face delivery condition compared to both therapist-guided and self-guided digital delivery conditions. Similarly, mental symptoms of parents (Brief Symptom Inventor) at post intervention measurement were substantially higher in the face-to-face delivery condition compared to both therapist-guided and self-guided digital delivery conditions.

Superiority of the face-to-face delivery condition was found for 3 outcomes. All of these outcomes were parent outcomes from the F-PST trial at follow-up. In contrast to comparisons at post intervention measurement, depressive symptoms of parents (Center for Epidemiological Studies Depression Scale) at follow-up were substantially higher in the self-guided digital delivery condition compared to the face-to-face delivery condition. Similarly, contrasting with comparisons at post intervention measurement, mental symptoms of parents (Brief Symptom Inventory) at follow-up were substantially higher in the therapist-guided digital delivery condition compared to the face-to-face delivery condition. Program evaluation by parents was more favorable in the face-to-face delivery condition compared to the therapist-guided digital delivery condition.

Only 2% (1/56) of outcomes fell fully within the minimal important difference range. In the SUCCEAT trial, psychopathological symptoms of parents (Symptom Checklist 90-Revised Global Severity Index) at post intervention test were equal between therapist-guided digital and face-to-face delivery conditions.

Meta-Analytic Aggregation of Adherence and Attrition

A total of 9 articles reporting on all 4 trials (BFST-D trial [,,], PAAS trial [], F-PST trial, and SUCCEAT trial) reported on the number of sessions attended, and all (11/12, 92%) articles, except for 1 (PAAS trial []), reporting on all trials reported on attrition. The number of sessions attended and attrition were meta-analyzed, and the meta-analytic aggregations are reported in and . The heterogeneity between trials was moderate (I2=48% for number of sessions attended) to high (I2=61% for attrition), and given the low number of included trials, this heterogeneity estimate might be inaccurate [], and conclusions from the meta-analyses should thus be interpreted with caution.


DiscussionPrincipal Findings

We identified 12 articles reporting on 4 trials from which data were extracted (). Of note, despite broad inclusion criteria for participants, participants included only youths or parents of youths. Recipients of interventions were youths with poor diabetic control, traumatic brain injuries, increased risk behavior likelihood, and parents of youths with anorexia nervosa. A total of 56 relevant outcomes were identified. Two trials provided digital intervention delivery via videoconferencing, one via an interactive graphic interface, and the other via a web-based program. RoB judgments varied substantially across domains and articles, with largely some concerns judgments for randomization, missing outcome data, and selective reporting. Applying a minimal important difference criterion to indicate equivalence or superiority, nonaggregated analyses of MDs and CIs between delivery modalities yielded varying types of comparative outcomes across the included studies: superiority of the digital delivery modality and superiority of the face-to-face delivery modality (CIs falling fully outside the range of the minimal important difference) and equivalence between delivery modalities (CIs falling fully inside the range of the minimal important difference), with most comparative outcomes, indicating neither superiority of one modality nor equivalence between delivery modalities (CIs falling within and outside the range of the minimal important difference). Overall, our analyses did not reveal any strong evidence for the superiority of face-to-face compared to digital delivery conditions and might tentatively instead indicate a higher proportion of favorable effects of digital delivery modalities for certain outcomes and certain contexts. Due to the qualitative nature of this analysis, the limited number of studies included in this review, and the large heterogeneity between studies, results need to be interpreted with caution. In addition, more research is needed to draw meaningful conclusions on the comparative efficacy of digital versus face-to-face delivery modalities of ST interventions.

We found a limited and insufficient amount of evidence to meta-analytically estimate the comparative efficacy of digital versus face-to-face delivery modalities of ST interventions with regard to most psychological, somatic, and behavioral parent and youth outcomes, as well as family functioning and further outcomes at either postintervention test or follow-up test. Meta-analytic analysis was deemed appropriate only for attrition and the number of sessions attended. We compared digital versus face-to-face delivery modalities for attrition and the number of sessions attended, showing no evidence for the superiority of either delivery modality in these domains.

Comparison With Prior Work

Our findings are largely consistent with related findings from a recent meta-analysis on family therapy, including 3 trials reporting no evidence for a difference at postintervention test between therapist-guided digital versus face-to-face delivery of interventions for youth outcomes and parent outcomes []. Similarly, our findings are consistent with findings from a recent meta-analysis on therapist-guided digital parenting interventions that included 4 trials reporting no evidence for a difference at posttest for youth outcomes between therapist-guided digital and face-to-face delivery conditions []. None of this prior work, including ours, either strongly supports the superiority of face-to-face over digital delivery conditions or vice versa.

This study adds to research as the first to contribute to the existing body of research from an ST standpoint comparing RCTs of digital and face-to-face delivery modalities of the same intervention. While showing considerable overlap with family therapy, couple therapy, and parenting interventions, ST interventions are characterized by a

Comments (0)

No login
gif