Predicting Postoperative Recurrence Using a Support Vector Machine for Patients With Esophageal Squamous Cell Carcinoma: Machine Learning Modeling Development and Validation Study


Introduction

Esophageal cancer poses a threat to public health due to its high morbidity and mortality rates [,]. Postoperative tumor node metastasis (TNM) staging is the most valuable index for evaluating the prognosis of patients with esophageal squamous cell carcinoma (ESCC). However, it is an index that can only be confirmed after surgery and can only provide a theoretical basis for postoperative treatment strategies. Thus, it is of little significance for the planning of surgical strategies before surgery in individual patients, especially for those with a poor physical condition or in whom performing surgery is difficult, as it is difficult to evaluate the TNM staging of these patients.

Numerous factors affect the postoperative recurrence rate in patients with ESCC [], including postoperative complications, Eastern Cooperative Oncology Group (ECOG) performance status, clinicopathological characteristics, tumor markers, and inflammatory as well as nutritional indicators. These inflammatory and nutritional indicators include the neutrophil-to-lymphocyte ratio (NLR), C-reactive protein-to-prealbumin ratio (CPR) [], platelet×C-reactive protein multiplier (P-CRP), lymphocyte-to-monocyte ratio, Glasgow prognostic score (GPS), and other inflammatory markers [], all of which affect postoperative survival and prognosis. Tumor marker levels are widely recognized for their prognostic value in predicting postoperative recurrence of esophageal cancer [,]. However, studies that include other clinical indicators are relatively limited. A comprehensive and systematic analysis of preoperative blood indicators, patient-specific conditions, intraoperative factors such as duration of surgery and blood loss, and postoperative complications is necessary to identify key risk factors. Screening these indicators and constructing an optimal predictive model will have significant clinical value in improving postoperative management and patient outcomes.

Recently, an increasing number of studies have focused on predictive models. Here, we present the development and validation of a clinical prediction model using a support vector machine (SVM). The SVM was used to develop a more robust model for predicting postoperative recurrence compared with other approaches. As a new data mining methodology, SVM has been applied to predict tumor progression and clinical outcomes by integrating molecular markers and clinical features [-]. Furthermore, this method is suitable for small patient cohorts, where independent and random assignment into 3 groups enhances the reliability of analysis and validation. Given its advantages, SVM is likely to continue to provide valuable insights into the accurate prediction of the recurrence of ESCC [,]. We collected information on commonly used clinical blood indicators and surgical data, and the patients were followed up to analyze potential risk factors. Through iterative combinations of these factors weighted by their relative importance, we developed an optimal recurrence prediction model. Our goal is to integrate the indices from the optimal SVM model into an artificial intelligence model for patients with ESCC who have not yet had an individualized treatment plan developed.


Methods

Patients and Follow-Up

Baseline data were obtained from the medical records of patients diagnosed with ESCC between June 2014 and November 2016 at Jinling Hospital. Data were abstracted in December 2016 by 2 independent researchers (MQX and ZSJ), and a study database was created, from which basic information was used for follow-up. These patients were followed up until October 2021. The collected data primarily comprised perioperative information, including basic information, blood indicators (inflammation, infection, and tumor markers), presence or absence of adjuvant therapy, intraoperative blood loss, duration of surgery, and postoperative complications. Follow-ups were conducted approximately every 3 months, primarily through phone calls. If the patients could not be reached, we obtained their contact details from the outpatient department, and additional attempts were made to establish communication. When phone contact was unsuccessful, we sent letters or conducted home visits. Patients who remained unreachable were considered lost to follow-up and were excluded from the study.

Inclusion and Exclusion Criteria

We included patients who met the following criteria: (1) patients had a diagnosis of ESCC confirmed by a postoperative histopathological examination, (2) they had radical resection for ESCC, (3) they had complete clinical and follow-up data, and (4) the surgery was performed by the same surgeon. Patients who met the following criteria were excluded: (1) they had liver or kidney dysfunction or hematological disease; (2) they had a concurrent or previous history of other malignant tumors; (3) they had perioperative death, defined as mortality due to serious complications within 1 month postoperatively; and (4) they were receiving preoperative chemoradiotherapy.

Statistical Analysis

Data were analyzed using SPSS (version 22.0; IBM Corp) and R (version 3.6.1; R Foundation for Statistical Computing []). Univariate and multivariate analyses of the relative prognostic importance of parameters were performed using the Cox proportional hazards model. An SVM implicitly maps input data into a high-dimensional feature space using a kernel function []; learning occurs in this feature space by means of the “kernel trick.” Owing to the popularity of SVMs in machine learning and pattern classification, numerous SVM packages are available, such as LIBSVM and kernlab. In this study, we used the R package kernlab. The SVM models were developed using perioperative data, inflammation markers, and tumor markers to predict ESCC recurrence. For SVM model 1 (SVM1) to SVM6, we first evaluated all potential predictors through correlation analysis and Cox proportional hazards regression. Candidate variables showing statistically significant associations with esophageal cancer recurrence (P<.05) underwent receiver operating characteristic (ROC) curve evaluation. All risk factors were ranked by area under the curve (AUC) and iteratively pruned to optimize the SVM model’s predictor set. An identical approach was applied for SVM7 to SVM10. The predictor sets were as follows: SVM1 included all preoperative markers (ECOG, NLR, CPR, CY211, squamous cell carcinoma antigen [SCC], P-CRP, GPS, and age); SVM2 included factors in SVM1 excluding P-CRP; SVM3 included factors in SVM2 excluding GPS; SVM4 included factors in SVM3 excluding SCC; SVM5 included factors in SVM4 excluding age; SVM6 included factors in SVM5 excluding CPR (final variables: ECOG, NLR, and CY211); SVM7 included TNM, adjuvant therapy, differentiation, tumor size, and complications; SVM8 included factors in SVM7 excluding complications; SVM9 included factors in SVM8 excluding tumor size; and SVM10 included factors in SVM9 excluding differentiation. ROC curve analysis was performed for each SVM model, and the AUC values were used to assess the predictive ability of the SVM models for recurrence.
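The nested predictor sets described above can be written down compactly. The following Python sketch is only an illustration of the pruning sequence stated in the text; the study itself used the R package kernlab, and the function name `nested_models` and the variable labels are hypothetical names introduced here.

```python
# Illustrative reconstruction of the SVM1-SVM10 predictor sets from the text.
# All identifiers are hypothetical labels, not names used by the authors.
PREOP_ALL = ["ECOG", "NLR", "CPR", "CY211", "SCC", "P-CRP", "GPS", "age"]
PREOP_DROP_ORDER = ["P-CRP", "GPS", "SCC", "age", "CPR"]  # SVM2 ... SVM6

POSTOP_ALL = ["TNM", "adjuvant therapy", "differentiation",
              "tumor size", "complications"]
POSTOP_DROP_ORDER = ["complications", "tumor size", "differentiation"]  # SVM8 ... SVM10


def nested_models(start, drop_order, first_index):
    """Return {model name: predictor list}, removing one predictor per step."""
    current = list(start)
    models = {f"SVM{first_index}": list(current)}
    for offset, dropped in enumerate(drop_order, start=1):
        current = [name for name in current if name != dropped]
        models[f"SVM{first_index + offset}"] = list(current)
    return models


MODELS = {
    **nested_models(PREOP_ALL, PREOP_DROP_ORDER, 1),
    **nested_models(POSTOP_ALL, POSTOP_DROP_ORDER, 7),
}

for name, predictors in MODELS.items():
    print(name, predictors)
```

Each entry of `MODELS` would then be the feature list passed to whichever SVM implementation is used to fit that model.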

All patients included in the study were randomly assigned to the test, validation 1 (Val1), or validation 2 (Val2) groups. Using the SVM algorithm, each group was further assigned to a high- or low-risk of recurrence group. In the test group, we combined several predictive indicators of recurrence to stratify patients into high- and low-risk subgroups. The predictive performance of this integrated predictive model was then validated in 2 independent cohorts (Val1 and Val2). The Kaplan-Meier method was used to calculate and plot recurrence curves, further validating the ability of the SVM models to distinguish patients with a high and low risk of recurrence. Sensitivity, specificity, the Youden index, the positive predictive value (PPV), and the negative predictive value (NPV) were assessed to evaluate the practical value of the model. χ2 tests were used to analyze differences in sensitivity, specificity, PPV, and NPV among the SVM models. A calibration curve was created using the Hosmer-Lemeshow goodness-of-fit test to assess the degree of calibration of the model to ensure its accuracy and reliability. All tests were 2-sided, and P<.05 was considered statistically significant.
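For readers less familiar with these measures, the sketch below shows how sensitivity, specificity, the Youden index, PPV, NPV, and accuracy follow from a 2×2 confusion matrix. It is a minimal Python illustration; the function name `diagnostic_metrics` and the counts in the usage line are hypothetical and are not study data.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic measures from confusion-matrix counts.

    tp: recurrences predicted as high risk
    fn: recurrences predicted as low risk
    fp: non-recurrences predicted as high risk
    tn: non-recurrences predicted as low risk
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": sensitivity + specificity - 1,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
m = diagnostic_metrics(tp=80, fn=20, fp=30, tn=70)
print({name: round(value, 4) for name, value in m.items()})
```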

Ethical Considerations

This study was approved by the Institutional Ethics Review Board (IERB 2018NZKY-021‐03) of the Ethics Committee of Jinling Hospital. Verbal informed consent was obtained by telephone during follow-up communications. Standard university hospital guidelines, in accordance with the principles detailed in the Declaration of Helsinki, were followed in handling patient tissues and publication, ensuring confidentiality and anonymity. All participants who completed the survey received a complimentary disease knowledge resource as a token of appreciation and compensation for their participation.


Results

Basic Patient Information

We collected data from 311 patients with postoperative ESCC, which included 241 men (77.5%) and 70 women (22.5%) with a median age of 66 years (range 40-83 y). Preoperative data, blood indicators, intraoperative blood loss, duration of surgery, TNM stage, degree of differentiation, postoperative adjuvant therapy, and complications are shown in . The results of quantitative correlation analysis between preoperative tumor markers and postoperative clinical indicators are shown in . Postoperative complications included pulmonary infection, incision infection, gastrointestinal dysfunction, recurrent nerve injury, severe pulmonary infection, respiratory failure, hydropneumothorax, anastomotic fistula, anastomotic or thoracic fistula, and hemorrhage requiring re-operation. On October 15, 2021, 144 (46.3%) patients were recurrence-free, whereas 167 (53.7%) had a recurrence. The postoperative follow-up period ranged from zero to 93.5 months (median 36 mo), concluding in October 2021. The postoperative disease-free survival (DFS) was 78.7% at 1 year, 59% at 3 years, and 53.6% at 5 years (see ).

Risk Factors for Recurrence and Predictive Ability

According to univariate and multivariate Cox regression analyses, age, ECOG performance status, NLR, CPR, CY211, TNM staging, and postoperative complications were identified as independent risk factors (see Tables 1 and 2). Among the individual markers, TNM staging showed the highest predictive ability as measured using the AUC (AUC=0.676, 95% CI 0.615-0.737), followed by postoperative adjuvant therapy (AUC=0.633, 95% CI 0.570-0.695), ECOG performance status (AUC=0.633, 95% CI 0.571-0.695), and NLR (AUC=0.599, 95% CI 0.536-0.663). The predictive ability of CY211, CPR, tumor size, and cell differentiation was lower (see Table 3).

Table 1. Risk factors affecting the recurrence of patients with esophageal squamous cell carcinoma (ESCC) by Cox single factor analysis.

| Clinical parameters | B | SE | Wald | df | Univariate HR (95% CI) | P value |
| --- | --- | --- | --- | --- | --- | --- |
| Age (years) | −0.335 | 0.168 | 3.993 | 1 | 0.715 (0.515-0.994) | .05 |
| Gender | −0.168 | 0.203 | 0.689 | 1 | 0.845 (0.568-1.258) | .41 |
| ECOG | 1.162 | 0.183 | 40.492 | 1 | 3.198 (2.235-4.574) | <.001 |
| NLR | 0.69 | 0.171 | 16.395 | 1 | 1.995 (1.428-2.786) | <.001 |
| LMR | −0.214 | 0.167 | 1.634 | 1 | 0.808 (0.582-1.121) | .20 |
| P-CRP | 0.342 | 0.168 | 4.151 | 1 | 1.407 (1.013-1.955) | .04 |
| GPS | 0.394 | 0.168 | 5.511 | 1 | 1.483 (1.067-2.062) | .02 |
| CRP (mg/dL) | 0.242 | 0.167 | 2.101 | 1 | 1.274 (0.918-1.769) | .15 |
| CPR | 0.592 | 0.17 | 12.176 | 1 | 1.808 (1.297-2.523) | <.001 |
| SCC (ng/mL) | 0.419 | 0.17 | 6.064 | 1 | 1.521 (1.089-2.122) | .01 |
| CY211 (ng/mL) | 0.651 | 0.172 | 14.388 | 1 | 1.918 (1.370-2.685) | <.001 |
| Surgical method | −0.12 | 0.176 | 0.465 | 1 | 0.887 (0.628-1.253) | .50 |
| Tumor location | 0.078 | 0.163 | 0.229 | 1 | 1.081 (0.785-1.490) | .63 |
| Intraoperative blood loss | 0.228 | 0.191 | 1.423 | 1 | 1.256 (0.864-1.828) | .23 |
| Operative time | −0.106 | 0.168 | 0.394 | 1 | 0.900 (0.647-1.251) | .53 |
| Tumor size | 0.611 | 0.189 | 10.485 | 1 | 1.843 (1.273-2.669) | .001 |
| T | 0.666 | 0.173 | 14.894 | 1 | 1.947 (1.388-2.731) | <.001 |
| N | 1.407 | 0.176 | 63.598 | 1 | 4.085 (2.890-5.773) | <.001 |
| TNM | 1.337 | 0.174 | 59.174 | 1 | 3.807 (2.708-5.351) | <.001 |
| Cell differentiation | 0.815 | 0.174 | 21.909 | 1 | 2.258 (1.606-3.176) | <.001 |
| Adjuvant therapy | 0.867 | 0.171 | 25.835 | 1 | 2.380 (1.704-3.325) | <.001 |
| Complications | 0.464 | 0.171 | 7.401 | 1 | 1.591 (1.139-2.222) | .007 |

HR: hazard ratio; ECOG: Eastern Cooperative Oncology Group; NLR: neutrophil-to-lymphocyte ratio; LMR: lymphocyte-to-monocyte ratio; P-CRP: platelet × C-reactive protein multiplier; GPS: Glasgow prognostic score; CRP: C-reactive protein; CPR: C-reactive protein-to-prealbumin ratio; SCC: squamous cell carcinoma antigen; T: size or extent of the primary tumor; N: regional lymph nodes; TNM: tumor node metastasis.
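The columns of a Cox regression table are linked by standard relationships: the hazard ratio is exp(B), the 95% CI is exp(B ± 1.96 × SE), and the Wald statistic is (B/SE)². The Python sketch below applies these to the ECOG row of Table 1 (B=1.162, SE=0.183). The function name `cox_summary` is a hypothetical helper, and because the published B and SE are rounded, the recomputed values agree with the table only approximately (for example, a Wald statistic of about 40.3 versus the reported 40.492).

```python
import math


def cox_summary(b, se, z=1.96):
    """Hazard ratio, 95% CI, and Wald statistic from a Cox coefficient and its SE."""
    return {
        "hr": math.exp(b),
        "ci_low": math.exp(b - z * se),
        "ci_high": math.exp(b + z * se),
        "wald": (b / se) ** 2,
    }


# ECOG row of Table 1 (rounded inputs, so results are approximate).
ecog = cox_summary(b=1.162, se=0.183)
print({name: round(value, 3) for name, value in ecog.items()})
```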

Table 2. Risk factors affecting the recurrence of patients with esophageal squamous cell carcinoma (ESCC) by Cox multiple factor regression analysis.

| Clinical markers | B | SE | Wald | df | Multivariate HR (95% CI) | P value |
| --- | --- | --- | --- | --- | --- | --- |
| Age (≥66 vs <66 years) | −0.552 | 0.174 | 10.03 | 1 | 0.576 (0.409-0.810) | .002 |
| ECOG (≥1 vs <1) | 1.362 | 0.199 | 46.682 | 1 | 3.905 (2.642-5.772) | <.001 |
| NLR (≥2.43 vs <2.43) | 0.553 | 0.183 | 9.118 | 1 | 1.739 (1.214-2.489) | .003 |
| CPR | 0.539 | 0.18 | 9.001 | 1 | 1.714 (1.206-2.438) | .003 |
| CY211 (≥2.65 vs <2.65 ng/mL) | 0.526 | 0.178 | 8.777 | 1 | 1.692 (1.195-2.396) | .003 |
| TNM (III+IV vs I+II) | 1.389 | 0.18 | 59.238 | 1 | 4.010 (2.816-5.712) | <.001 |
| Complications | 0.533 | 0.182 | 8.571 | 1 | 1.704 (1.193-2.435) | .003 |

HR: hazard ratio; ECOG: Eastern Cooperative Oncology Group; NLR: neutrophil-to-lymphocyte ratio; CPR: C-reactive protein-to-prealbumin ratio; TNM: tumor node metastasis.

Table 3. The area under the curve (AUC) of receiver operating characteristic (ROC) curves with preoperative and postoperative clinical markers in predicting postoperative recurrence in patients with esophageal squamous cell carcinoma (ESCC).

| Clinical indexes | AUC (95% CI) | P value |
| --- | --- | --- |
| Reference | 0.500 (—) | — |
| Age (years) | 0.445 (0.381-0.510) | .10 |
| ECOG | 0.633 (0.571-0.695) | <.001 |
| NLR | 0.599 (0.536-0.663) | .003 |
| P-CRP | 0.539 (0.475-0.604) | .23 |
| GPS | 0.543 (0.478-0.607) | .20 |
| CPR | 0.591 (0.527-0.654) | .006 |
| SCC (ng/mL) | 0.555 (0.491-0.619) | .09 |
| CY211 (ng/mL) | 0.598 (0.534-0.661) | .003 |
| Tumor size | 0.573 (0.510-0.637) | .03 |
| TNM | 0.676 (0.615-0.737) | <.001 |
| Cell differentiation | 0.571 (0.507-0.635) | .03 |
| Adjuvant therapy | 0.633 (0.570-0.695) | <.001 |
| Complications | 0.550 (0.486-0.614) | .13 |

AUC: area under the curve; —: not applicable; ECOG: Eastern Cooperative Oncology Group; NLR: neutrophil-to-lymphocyte ratio; P-CRP: platelet × C-reactive protein multiplier; GPS: Glasgow prognostic score; CPR: C-reactive protein-to-prealbumin ratio; SCC: squamous cell carcinoma antigen; TNM: tumor node metastasis.
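An AUC such as those in Table 3 can be read as the probability that a randomly chosen patient with recurrence has a higher marker value (or model score) than a randomly chosen patient without recurrence, with ties counted as one half. The Python sketch below implements that pairwise definition. It is a generic illustration, not the software routine used in the study, and the scores and labels in the usage line are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5).

    scores: marker values or model scores, higher meaning higher predicted risk
    labels: 1 for recurrence, 0 for no recurrence
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 3 of the 4 positive/negative pairs are ordered correctly.
print(auc([0.9, 0.4, 0.6, 0.3], [1, 1, 0, 0]))
```

An AUC of 0.5 corresponds to the reference line (no discrimination), which is why markers with a CI that includes 0.5 have nonsignificant P values.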

SVM Combined With the ROC Model for Predicting Recurrence

The SVM model combined with ROC analysis was used to predict recurrence. In the test, Val1, and Val2 groups, the sensitivity of SVM2 (all preoperative markers except P-CRP) for predicting recurrence was 94.12%, 70.59%, and 60.98%, respectively, with a specificity of 98.21%, 63.33%, and 56.86%, respectively. The sensitivity of SVM6 (ECOG, NLR, and CY211) in the test, Val1, and Val2 groups was 67.86%, 60.47%, and 68.18%, respectively, with a specificity of 86%, 63.33%, and 64.91%, respectively. The sensitivity of SVM7 (TNM, adjuvant therapy, differentiation, tumor size, and complications) in the test, Val1, and Val2 groups was 92.86%, 76.74%, and 84.09%, respectively, with a specificity of 76%, 61.67%, and 71.93%, respectively (see ). No significant difference was observed between SVM2 and SVM7 in sensitivity or specificity (P>.05). However, SVM6 had a lower sensitivity for predicting recurrence than SVM7. The sensitivity of SVM6+8 for predicting recurrence was 94%, 79.59%, and 72.73% in the test, Val1, and Val2 groups, respectively, with a specificity of 98.11%, 69.84%, and 78.43%, respectively. These sensitivities were comparable with those of SVM6+TNM, and the specificities were higher than those of SVM6+TNM (P<.001; see ).

Table 4. Comparison among different marker combinations obtained before surgery and after surgery and all markers according to sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy in predicting patients’ recurrence. Values are for the test + validation group 1 + validation group 2 (n=310).

| Variable combinations | Sensitivity, % | χ2 | P value | Specificity, % | χ2 | P value | PPV, % | χ2 | P value | NPV, % | χ2 | P value | Accuracy, % | χ2 | P value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SVM 2 | 76.22 | 3.804 | .05 | 73.05 | 0.526 | .468 | 70.78 | 0.003 | .96 | 78.21 | 2.000 | .16 | 74.52 | 0.429 | .51 |
| SVM 2 vs SVM 6 | | 3.819 | .05 | | 0.237 | .63 | | 0.872 | .35 | | 2.406 | .12 | | 2.854 | .09 |
| SVM 6 | 65.73 | 14.830 | <.001 | 70.66 | 0.057 | .81 | 65.73 | 0.829 | .36 | 70.66 | 8.338 | .004 | 68.38 | 5.479 | .02 |
| SVM 7 | 85.31 | — | — | 69.46 | — | — | 70.52 | — | — | 84.67 | — | — | 76.77 | — | — |
| SVM 6+8 | 82.52 | 0.414 | .52 | 81.44 | 6.465 | .01 | 79.19 | 3.174 | .08 | 84.47 | 0.002 | .96 | 81.93 | 2.520 | .11 |
| SVM 6+8 vs SVM 6+9 | | 0.414 | .52 | | 3.339 | .07 | | 1.626 | .20 | | 0.042 | .84 | | 1.02 | .31 |
| SVM 6+9 | 85.31 | <0.001 | ≥.99 | 73.05 | 0.526 | .47 | 73.05 | 0.269 | .60 | 85.31 | 0.023 | .88 | 78.71 | 0.336 | .56 |
| SVM 6+TNM vs SVM 6+9 | | 0.901 | .34 | | 0.060 | .81 | | 0.146 | .70 | | 0.711 | .40 | | 0.590 | .44 |
| SVM 6+TNM | 81.12 | 0.901 | .34 | 71.86 | 0.231 | .63 | 71.17 | 0.017 | .90 | 81.63 | 0.466 | .50 | 76.13 | 0.036 | .86 |
| SVM 6+8 vs SVM 6+TNM | | 0.094 | .76 | | 4.282 | .04 | | 2.676 | .10 | | 0.441 | .51 | | 3.154 | .07 |

PPV: positive predictive value; NPV: negative predictive value; SVM: support vector machine; TNM: tumor node metastasis; —: not available.

P values refer to the corresponding comparisons. In the row giving each combination’s percentages, the χ2 test was used for comparisons among SVM 2, SVM 6, and SVM 7, and among SVM 6+8, SVM 6+9, SVM 6+TNM, and SVM 7, respectively. Rows labeled “vs” give the additional pairwise χ2 comparisons: SVM 2 and SVM 6; SVM 6+8 and SVM 6+9; SVM 6+TNM and SVM 6+9; and SVM 6+8 and SVM 6+TNM.
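As an illustration of how a χ2 comparison of two proportions in Table 4 can be computed, the Python sketch below applies an uncorrected Pearson χ2 test to a 2×2 table. The counts are an assumption: they are back-calculated from the reported specificities of SVM 6+8 (81.44%, taken as 136 of 167) and SVM 6+TNM (71.86%, taken as 120 of 167) and are not given in the paper. Under that assumption the statistic comes out close to the reported χ2 of 4.282 (P=.04). The function name `chi_square_2x2` is a hypothetical helper.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the two-sided P value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed counts: correctly and incorrectly classified non-recurrent patients.
# Row 1: SVM 6+8 (136 correct, 31 incorrect); row 2: SVM 6+TNM (120 correct, 47 incorrect).
chi2, p_value = chi_square_2x2(136, 31, 120, 47)
print(round(chi2, 3), round(p_value, 3))
```

Because both models are evaluated on the same patients, a paired test such as McNemar's would be an alternative; the sketch follows the unpaired χ2 test named in the Methods.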

Multifactor Integrated Analysis for the Prediction of Postoperative DFS

We generated a heatmap showing the high- and low-distribution profiles of risk factors affecting recurrence in patients with ESCC (see and ). Postoperative survival analysis revealed that the DFS of the predicted low recurrence risk group in the SVM6 and SVM6+TNM models was much longer than that of the predicted high recurrence risk group. A considerable difference in cumulative survival rates was also observed (see ).
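The Kaplan-Meier curves used to compare the predicted risk groups are built from a simple product-limit calculation: at each time with an observed recurrence, the running DFS estimate is multiplied by the fraction of at-risk patients who did not recur at that time. The Python sketch below is a minimal illustration of that estimator. The function name `kaplan_meier` and the toy follow-up data are hypothetical and are not study data.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of disease-free survival.

    times: follow-up time for each patient (for example, months)
    events: 1 if recurrence was observed at that time, 0 if censored
    Returns a list of (time, estimated DFS) at each time with a recurrence.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        recurrences = 0
        censored = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1] == 1:
                recurrences += 1
            else:
                censored += 1
            i += 1
        if recurrences > 0:
            survival *= 1 - recurrences / at_risk
            curve.append((t, survival))
        at_risk -= recurrences + censored
    return curve


# Hypothetical toy cohort: recurrences at months 1, 3, and 4; one patient censored at month 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
```

Running this separately for the predicted low- and high-recurrence-risk groups gives the two curves that a log-rank test would then compare.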

Figure 1. Heatmap of high and low distribution profiles of risk factors affecting the recurrence of patients with esophageal squamous cell carcinoma (ESCC). ECOG: Eastern Cooperative Oncology Group; NLR: neutrophil-to-lymphocyte ratio; TNM: tumor node metastasis.

Figure 2. Survival analyses were performed for the low-risk versus high-risk groups of SVM6, SVM7, and SVM6+TNM models. We randomly divided the 311 patients into test, Val1, and Val2 groups, and then each group was divided into a high-risk recurrence group and a low-risk recurrence group. Kaplan-Meier survival analysis showed that the average postoperative survival time of patients with esophageal squamous cell carcinoma (ESCC) in the low-risk recurrence group was longer than that of the high-risk group among test, Val1, and Val2 groups (SVM6, SVM7, and SVM6+TNM models; P<.001). SVM: support vector machine; PLRR: predicted low-recurrence-risk; PHRR: predicted high-recurrence-risk; TNM: tumor node metastasis.

Development and Validation of a Nomogram for the Prediction of DFS

A nomogram was developed using the available data to predict DFS. Vertical lines were drawn from the correct status of each prognostic factor on the top axis (points). Summing all points allowed for the projection of a vertical line from the “total points” axis to the bottom axes, facilitating the conversion into 1-, 3-, and 5-year DFS rates (see and ). The SVM6-based nomogram demonstrated reliable performance in predicting DFS, with an AUC of 0.769. Postoperative outcomes were predicted and evaluated with a sensitivity of 65.73%, specificity of 70.66%, and PPV of 83.54%. Similarly, the SVM6+TNM-based nomogram effectively predicted DFS with an AUC of 0.847 (see ), offering sufficient sensitivity (81.12%) and specificity (71.86%) for postoperative assessment. This nomogram provides valuable insights for guiding treatment decisions and follow-up plans in patients with ESCC. The calibration curves were used to evaluate the consistency of the nomogram (SVM6+TNM and SVM 6). The findings indicated a high degree of uniformity between the predicted and observed probabilities of survival in the training set and internal validation set (see and ).

Figure 3. The nomogram (SVM 6+TNM) predicted individual patient-level 1-, 3-, and 5-year disease-free survival (DFS) based on preoperative and postoperative clinical index. Vertical lines were drawn from the correct status of each prognostic factor to the top axis (points). After the addition of all the points, a vertical line was drawn from the “total points” axis to the bottom axes. This helps in the conversion into a 1-, 3-, and 5-year DFS probability. ECOG: Eastern Cooperative Oncology Group; NLR: neutrophil-to-lymphocyte ratio; TNM: tumor node metastasis; DFS: disease-free survival.

Table 5. Receiver operating characteristic curves for the support vector machines (SVMs) models using testing data, validation 1 data, and validation 2 data separately.

| Combinations | Test AUC (95% CI) | P value | Val 1 AUC (95% CI) | P value | Val 2 AUC (95% CI) | P value | Val 1+2 AUC (95% CI) | P value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Before surgery | | | | | | | | |
| SVM model 1 | 0.962 (0.919-1.000) | <.001 | 0.650 (0.547-0.753) | .007 | 0.579 (0.462-0.697) | .19 | 0.618 (0.540-0.696) | .004 |
| SVM model 2 | 0.962 (0.919-1.000) | <.001 | 0.670 (0.568-0.771) | .002 | 0.589 (0.472-0.707) | .14 | 0.633 (0.556-0.710) | .001 |
| SVM model 3 | 0.916 (0.856-0.977) | <.001 | 0.602 (0.491-0.713) | .08 | 0.602 (0.491-0.714) | .08 | 0.602 (0.524-0.681) | .01 |
| SVM model 4 | 0.930 (0.872-0.988) | <.001 | 0.558 (0.450-0.665) | .30 | 0.663 (0.552-0.774) | .006 | 0.606 (0.528-0.683) | .009 |
| SVM model 5 | 0.852 (0.771-0.932) | <.001 | 0.584 (0.477-0.690) | .13 | 0.666 (0.556-0.777) | .005 | 0.620 (0.543-0.697) | .003 |
| SVM model 6 | 0.769 (0.677-0.862) | <.001 | 0.619 (0.509-0.729) | .04 | 0.665 (0.558-0.773) | .004 | 0.642 (0.565-0.719) | .001 |
| After surgery | | | | | | | | |
| SVM model 7 | 0.844 (0.763-0.925) | <.001 | 0.692 (0.588-0.796) | <.001 | 0.780 (0.687-0.873) | <.001 | 0.736 (0.666-0.806) | <.001 |
| SVM model 8 | 0.853 (0.774-0.933) | <.001 | 0.730 (0.634-0.826) | <.001 | 0.744 (0.642-0.847) | <.001 | 0.736 (0.666-0.806) | <.001 |
| SVM model 9 | 0.753 (0.658-0.847) | <.001 | 0.677 (0.565-0.789) | .004 | 0.670 (0.569-0.771) | .002 | 0.673 (0.598-0.748) | <.001 |
| SVM model 10 | 0.720 (0.621-0.819) | <.001 | 0.627 (0.517-0.737) | .03 | 0.690 (0.583-0.797) | <.001 | 0.658 (0.581-0.734) | <.001 |
| Preoperative and postoperative markers | | | | | | | | |
| SVM 6+7 | 0.910 (0.851-0.969) | <.001 | 0.726 (0.615-0.837) | <.001 | 0.695 (0.592-0.798) | <.001 | 0.709 (0.633-0.784) | <.001 |
| SVM 6+8 | 0.961 (0.917-1.000) | <.001 | 0.747 (0.654-0.841) | <.001 | 0.756 (0.655-0.857) | <.001 | 0.750 (0.682-0.819) | <.001 |
| SVM 6+9 | 0.952 (0.904-1.000) | <.001 | 0.700 (0.601-0.798) | <.001 | 0.731 (0.628-0.834) | <.001 | 0.714 (0.643-0.785) | <.001 |
| SVM 6+10 | 0.838 (0.755-0.920) | <.001 | 0.717 (0.616-0.818) | <.001 | 0.721 (0.618-0.823) | <.001 | 0.718 (0.646-0.790) | <.001 |
| SVM 6+TNM | 0.847 (0.768-0.927) | <.001 | 0.684 (0.579-0.788) | .002 | 0.764 (0.667-0.860) | <.001 | 0.723 (0.651-0.794) | <.001 |

Val: validation; AUC: area under the curve; SVM: support vector machine; TNM: tumor node metastasis.

Figure 4. Calibration curve of 1-, 3-, and 5-year disease-free survival (DFS) in the training set and internal validation set. The error bars represent the 95% CI of these estimates. Val: validation.

Discussion

Principal Findings

Cancer recurrence remains a major challenge in oncology, significantly impacting patient prognosis. To address this, we developed a machine learning model that predicts recurrence risk, facilitating timely interventions to optimize DFS. Given that surgical and pharmaceutical standards in ESCC treatment generally provide consistent benefits in terms of mortality, DFS is influenced by a combination of multiple factors []. Currently, ESCC has a low DFS and imposes a high financial burden on patients, and solely relying on endoscopic follow-up to reduce postoperative recurrence has proven ineffective. In this study, we collected perioperative data from patients with ESCC and conducted follow-ups to develop an artificial intelligence–derived model capable of predicting postoperative recurrence. Implementing this approach is expected to improve DFS. While TNM staging is useful [], such staging can only be confirmed postoperatively and is only suitable for patients who have already undergone surgery, offering limited value in preoperative planning. Therefore, identifying predictive indicators for DFS preoperatively is important.

Assays for preoperative tumor markers and inflammatory factors [-] are cost-effective, convenient, and reliable for diagnosing, treating, and evaluating ESCC prognosis. Surgical factors, such as surgery type, duration of surgery, and intraoperative blood loss, are known risk factors for postoperative recurrence [-]; thus, we included these factors in our analysis. In addition, postoperative adjuvant therapy [] and complications [] affect prognosis. Collecting comprehensive perioperative data will assist in identifying independent risk factors and facilitate the development of a predictive model for postoperative recurrence. We identified three key findings. (1) Univariate and multivariate Cox regression analyses identified age, ECOG performance status, NLR, CPR, CY211, TNM stage, and postoperative complications as independent risk factors for esophageal cancer recurrence. While these factors showed predictive value, the discriminative ability of each of the other factors was lower than that of TNM staging alone (AUC=0.676; P<.001). (2) The sensitivity of SVM6+8 (combining SVM6 and SVM8, that is, SVM7 excluding complications) for predicting recurrence in patients with ESCC was comparable with that of SVM6+TNM (SVM6 combined with TNM staging), and its specificity was higher than that of SVM6+TNM. We used a nomogram to input the indexes in SVM6 into the artificial intelligence program for patients with ESCC who have not yet had an individualized plan developed. It can predict and evaluate the postoperative recurrence outcome of patients with ESCC with a sensitivity of 65.73%, specificity of 70.66%, and accuracy of 68.38%. For patients who have undergone surgery, the indicators in SVM6+TNM can be entered into the artificial intelligence program, which can predict and evaluate the postoperative recurrence outcomes of patients with ESCC with a sensitivity of 81.12%, specificity of 71.86%, and accuracy of 76.13%. (3) Survival analysis stratified patients into predicted low-recurrence-risk and high-recurrence-risk groups based on the SVM model; the predicted low-recurrence-risk group exhibited significantly prolonged disease-free survival and a markedly lower recurrence rate than the predicted high-recurrence-risk group. These findings may contribute to the formulation of personalized follow-up strategies in clinical practice.

Comparison With Previous Work

Numerous models have been developed to predict the overall survival in postoperative patients with ESCC, but only a few have focused on predicting postoperative recurrence, and their predictive accuracy remains low. Many models, such as logistic regression, decision trees, and random forests, are better suited for large cohort studies. By contrast, the SVM model is suitable for small cohorts that can be independently assigned to 3 groups: 1 test group and 2 verification groups. These groups can be randomly assigned, internally, to assess the practical value of the model. This study was conducted in accordance with the TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) guidelines, which provide guidance for transparently reporting studies that develop, validate, or update diagnostic or prognostic prediction models using clustered data []. We developed a model to predict postoperative recurrence based on perioperative data, which we deem essential for improving the overall survival rate of patients with ESCC. Surgical method, duration of surgery, and intraoperative blood loss were not identified as risk factors for postoperative recurrence. Considering the single-center study design and that intraoperative indicators are influenced by the experience and skill level of the surgeon, differences in the duration of surgery and intraoperative blood loss were relatively small, resulting in minimal impact on postoperative recurrence. As the prognostic significance of preoperative blood tests (inflammatory and tumor markers), postoperative pathological stage, and degree of differentiation in patients with ESCC has been previously confirmed [], we included these indicators into the SVM model. The optimal combination was continuously screened to predict the risk of recurrence of postoperative patients with ESCC to provide guidance for surgical evaluation.

Strengths and Limitations

Esophagectomy is currently the ideal treatment for patients with ESCC. However, because of the complexity, extensive trauma, and prolonged duration of the surgery, patients experience physiological stress and a high incidence of postoperative complications [], including anastomotic fistula, pulmonary infection, and respiratory failure. Because of swallowing difficulties and tumor-related metabolic consumption, malnutrition is commonly observed among patients with ESCC. Furthermore, the heightened stress response increases inflammation, weakens the immune system, and impairs tissue repair. Accordingly, reducing surgical risks and improving patient prognosis are crucial. In this study, we analyzed inflammatory indicators and identified NLR, P-CRP, GPS, and CPR as risk factors for postoperative ESCC recurrence [-]. Tumor markers are key factors influencing postoperative survival. The results also showed that SCC and CY211 were risk factors for postoperative ESCC recurrence. Additionally, age, ECOG [], NLR, CPR, TNM stage, and complications were identified as independent risk factors. Our study has the following strengths. First, these prognostic factors were incorporated into an SVM learning model to determine an optimal combination that can be integrated into an artificial intelligence model [] for a comprehensive evaluation of patient status and prognosis, thereby improving clinical practice. Second, the two validation cohorts further confirmed the model’s accuracy and generalizability. However, this study has some limitations. First, as a retrospective study, it is subject to selection bias. This study’s primary limitation involves potential selection bias from excluding patients lost to follow-up. We addressed this limitation by expanding our sample size, which minimized attrition effects and maintained adequate statistical power for robust conclusions. 
Second, given the extended follow-up period of this study, new postoperative adjuvant therapies have emerged in clinical practice. Our team has now updated the dataset with recently collected information from esophageal cancer surgery patients, which will enable further in-depth analysis. Finally, the sample size in this study was limited, including only retrospective data from a single health care institution, and randomized validation of the SVM model helps address the limitations of single-center data. However, external validation remains a critical step in ensuring the reliability, generalizability, and clinical applicability of research findings, even in studies with large sample sizes. Despite the advantages of a larger cohort, issues such as overfitting, selection bias, or dataset-specific artifacts may still arise. Thus, to further enhance its clinical usability, we plan to implement this predictive model across multiple hospitals.

Future Directions

In this study, we used the SVM model and analyzed the ROC curve to qualitatively and quantitatively evaluate the predictive ability of the model. In addition, a nomogram was generated to evaluate the DFS of patients with ESCC. Subsequently, treatment plans were adopted based on the predicted high- and low-risk of recurrence. Differences between the high- and low-risk groups guided individualized medical treatments, such as personalized surgical planning (or appropriate surgical procedures), optimization of radiotherapy and chemotherapy dosage and timing, and selection of appropriate follow-up intervals. Patients in the high-risk group for postoperative recurrence should undergo enhanced follow-up with close monitoring through gastroscopy, histopathological examination, and imaging studies. By contrast, follow-up schedules for the low-risk group should be based on blood test results to ensure appropriate monitoring. The development of this artificial intelligence model enables early prediction of postoperative recurrence risk in patients with ESCC while facilitating the generation of personalized medical plans, such as optimized postoperative radiotherapy and chemotherapy regimens as well as reasonable follow-up schedules []. By reducing unnecessary postoperative examinations, this model enhances the efficiency of follow-up care. It is particularly well-suited for use in towns and community health care settings to assist local medical practitioners in accurately assessing patient status, reducing the rate of recurrence of postoperative ESCC, and improving the 5-year survival rate.

Conclusion

Age, ECOG performance status, NLR, CPR, TNM, and complications were identified as independent risk factors for postoperative ESCC recurrence. These factors, which affect patient prognosis, were incorporated into the SVM learning model to determine the optimal risk-predictive combination. This model, integrated with an artificial intelligence model, provides a comprehensive assessment of patient status and prognosis, assisting the development of follow-up treatmen
