The study design is illustrated in Supplementary Figure S1. Using the TCGA database, we compared the expression of 433 genes related to oxidative stress in 473 CC tissues. We identified 36 down-regulated genes and 29 up-regulated genes, which had a false discovery rate (FDR) < 0.05 and a |log2 fold change (FC)|≥ 1.5. Comprehensive data on the differentially expressed oxidative stress-related genes (DEOSGs) can be found in Supplementary Table S3. The top 50 genes with the greatest increase and decrease are shown in Supplementary Figure S2A. Additionally, all DEOSGs are depicted in the volcano plot (Supplementary Figure S2B).
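The two thresholds stated above (FDR < 0.05 and |log2 FC| ≥ 1.5) can be written as a small filter. This is a minimal sketch rather than the authors' pipeline: the function name, the gene labels, and the toy values are hypothetical and are not taken from Supplementary Table S3.

```python
def classify_gene(log2_fc, fdr, fc_cutoff=1.5, fdr_cutoff=0.05):
    """Label a gene using the thresholds quoted in the text.

    A gene counts as differentially expressed only if it passes both
    the FDR cutoff and the absolute log2 fold-change cutoff.
    """
    if fdr >= fdr_cutoff or abs(log2_fc) < fc_cutoff:
        return "not significant"
    return "up" if log2_fc > 0 else "down"


# Hypothetical toy input: gene label -> (log2 fold change, FDR).
toy_results = {
    "GENE_A": (2.10, 0.001),   # passes both cutoffs, up-regulated
    "GENE_B": (-1.80, 0.020),  # passes both cutoffs, down-regulated
    "GENE_C": (1.20, 0.001),   # fold change too small
    "GENE_D": (3.00, 0.200),   # FDR too high
}

labels = {gene: classify_gene(fc, fdr) for gene, (fc, fdr) in toy_results.items()}
```

In a real analysis the fold changes and FDR values would come from the differential expression tool's output table.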
3.2 Screening of prognostic oxidative stress-related genes in CC
Using univariate Cox regression analysis in the TCGA database, we identified 23 prognostic oxidative stress-related genes for CC (P < 0.05). Genes with a hazard ratio (HR) less than 1 were considered low-risk genes, while genes with an HR greater than 1 were considered high-risk genes. The 95% confidence intervals for the HR values are shown in brackets in Supplementary Figure S3A. The waterfall plot illustrates the distribution of somatic mutations in the 23 DEOSGs (Supplementary Figure S3B), and the co-occurrence plot indicates the co-occurrence between these 23 DEOSGs (Supplementary Figure S3C).
3.3 Construction and evaluation of a prognostic model for CC based on oxidative stress-related genes
Through LASSO regression analysis, we identified 16 genes (NGF, IL13, RPS6KA5, TERT, CDKN2A, SERPINE1, CD36, PPARGC1A, ACADL, ACOX1, POMC, BDNF, MSRA, DDIT3, GSTM1, CPT2) based on the best λ value to construct a prognostic model (Supplementary Figure S4A–B). The risk score is calculated using the following formula: risk score = (0.0708 * NGF exp.) + (−2.4907 * IL13 exp.) + (−0.1846 * RPS6KA5 exp.) + (0.7210 * TERT exp.) + (0.1351 * CDKN2A exp.) + (0.1102 * SERPINE1 exp.) + (0.1850 * CD36 exp.) + (−0.3145 * PPARGC1A exp.) + (0.3644 * ACADL exp.) + (−0.0843 * ACOX1 exp.) + (0.1684 * POMC exp.) + (0.5691 * BDNF exp.) + (−0.2211 * MSRA exp.) + (0.2156 * DDIT3 exp.) + (−0.1499 * GSTM1 exp.) + (−0.2575 * CPT2 exp.). We divided the 473 individuals with colorectal cancer into low- and high-risk groups using the median risk score as the cutoff. PCA of the oxidative stress genes and of the model genes then separated the patients of the two risk groups (Supplementary Figure S4C–D). Analysis of the TCGA database revealed a significant difference in overall survival between the low- and high-risk groups (P < 0.001) (Supplementary Figure S4E). To validate the model, we used the GSE17538 dataset as an external validation cohort. This analysis also showed a statistically significant difference in overall survival between the low- and high-risk groups (P = 0.010) (Supplementary Figure S4F). Furthermore, in the TCGA cohort, the high-risk group exhibited significantly poorer progression-free survival (PFS) than the low-risk group (P < 0.001) (Supplementary Figure S4G).
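The risk score formula and the median split can be sketched as follows. The coefficients are copied from the formula above. The function names, the patient identifiers, and the handling of scores exactly at the median (assigned here to the low-risk group) are assumptions made for illustration only.

```python
from statistics import median

# LASSO coefficients as reported in the risk score formula.
COEFFICIENTS = {
    "NGF": 0.0708, "IL13": -2.4907, "RPS6KA5": -0.1846, "TERT": 0.7210,
    "CDKN2A": 0.1351, "SERPINE1": 0.1102, "CD36": 0.1850, "PPARGC1A": -0.3145,
    "ACADL": 0.3644, "ACOX1": -0.0843, "POMC": 0.1684, "BDNF": 0.5691,
    "MSRA": -0.2211, "DDIT3": 0.2156, "GSTM1": -0.1499, "CPT2": -0.2575,
}


def risk_score(expression):
    """Weighted sum of the 16 model genes' expression values.

    `expression` maps gene symbol -> expression value and must contain
    every model gene; a missing gene raises KeyError.
    """
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Assign each patient to 'high' or 'low' risk using the median score.

    Assumption: scores equal to the median fall in the low-risk group.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical usage with made-up scores for three patients.
example_groups = split_by_median({"P1": -0.8, "P2": 0.3, "P3": 1.9})
```

Because the weights are fixed, applying the model to an external cohort such as GSE17538 only requires expression values for the same 16 genes on a comparable scale.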
To evaluate whether the risk signature was an independent predictive variable, we used univariate and multivariate Cox regression models. In the univariate Cox regression analysis, a higher risk score was significantly associated with a worse patient prognosis (HR = 3.461, 95% CI 2.552–4.693) (Supplementary Figure S5A). After adjusting for other clinical variables, the multivariate Cox regression analysis demonstrated that the risk score remained an independent predictor for individuals with CC (HR = 2.973, 95% CI 2.124–4.162) (Supplementary Figure S5B). We assessed the sensitivity and specificity of the model using ROC analysis and found that the areas under the ROC curves (AUCs) at 1, 3, and 5 years were 0.706, 0.730, and 0.781, respectively (Supplementary Figure S5C). At five years, the prognostic model showed higher predictive accuracy than the other clinical characteristics in the TCGA cohort (Supplementary Figure S5D).
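As a reading aid for the hazard ratios above, the sketch below shows how an HR and its 95% CI relate on the log scale. It assumes the reported intervals are Wald intervals, exp(β ± 1.96 × SE), which the text does not state; the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile


def wald_summary(hr, ci_low, ci_high, z_crit=Z_95):
    """Recover the log hazard ratio, its standard error, and the Wald z.

    Assumes the interval was built as exp(beta +/- z_crit * SE).
    """
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_hr, se, log_hr / se


def wald_ci(log_hr, se, z_crit=Z_95):
    """Rebuild the confidence interval on the hazard-ratio scale."""
    return math.exp(log_hr - z_crit * se), math.exp(log_hr + z_crit * se)


# Univariate result reported in the text: HR = 3.461, 95% CI 2.552-4.693.
log_hr, se, z = wald_summary(3.461, 2.552, 4.693)
rebuilt_low, rebuilt_high = wald_ci(log_hr, se)
```

If the rebuilt interval matches the reported one to rounding precision, the interval is symmetric on the log scale, as expected for a Wald interval.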
To determine the association between each clinicopathological characteristic and the risk score, we analyzed age, sex, tumor grade, and TNM stage in the TCGA cohort. The risk score was positively associated with T stage, except for the T1 and T2 stages (Supplementary Figure S5E). A higher N stage (Supplementary Figure S5F) and M1 stage (Supplementary Figure S5G) were significantly associated with higher risk scores. The risk score was also positively associated with tumor grade, except for stages III and IV (Supplementary Figure S5H). However, there was no significant relationship between the risk score and age or sex in individuals with CC (all P > 0.05).
3.4 Construction and validation of a prognostic nomogram for CC based on oxidative stress-related genes
Based on the findings of the multivariate Cox regression analysis in the TCGA cohort, we constructed a nomogram using independent risk variables including age, sex, tumor grade, TNM stage, and risk score (Supplementary Figure S6A). To evaluate the nomogram's predictive ability, we plotted clinical ROC curves and calibration curves. The calibration curve (Supplementary Figure S6B) showed high consistency between the risk predicted by the nomogram and the observed 1-, 3-, and 5-year survival rates. The nomogram also exhibited higher predictive accuracy than the other clinical variables over a 5-year period, as indicated by the clinical ROC curve (AUC = 0.819) (Supplementary Figure S6C). Moreover, the Nomo risk score showed a significant association with overall survival in the univariate Cox regression analysis [HR = 1.309 (1.238–1.383), P < 0.001, Supplementary Figure S6D]. Additionally, the Nomo risk score was identified as an independent risk factor for overall survival in subjects with CC using multivariate Cox regression analysis [HR = 1.167 (1.075–1.266), P < 0.001, Supplementary Figure S6E].
3.5 Differential analysis of immune cells and GSVA analysis
The analysis of immune cell differences revealed significant differences in the abundance of memory B cells, plasma cells, resting memory CD4 T cells, NK cells, M0 macrophages, resting dendritic cells, activated dendritic cells, and eosinophils between the high- and low-risk groups of the TCGA cohort (P < 0.05) (Supplementary Figure S7A). The MCP-counter algorithm also identified significant differences in the abundance of NK cells, myeloid dendritic cells, and fibroblasts between the low-risk and high-risk groups (Supplementary Figure S7B). Furthermore, the GSVA analysis revealed that the high-risk group of the TCGA cohort had significantly elevated activity of the circadian rhythm (mammal), basal cell carcinoma, glycosaminoglycan biosynthesis (chondroitin sulfate), and ECM-receptor interaction pathways (Supplementary Figure S7C).
3.6 Screening and functional enrichment analysis of DEGs in TCGA cohort of CC
Based on the median risk score mentioned above, a differential expression analysis was conducted between the high- and low-risk individuals with CC in the TCGA cohort, which identified 70 differentially expressed genes (DEGs). Of these, 17 were down-regulated and 53 were up-regulated (Supplementary Table S4). The GO analysis of these 70 DEGs indicated that the enriched biological processes (BP) were mainly associated with muscle contraction and intermediate filament-based processes. The enriched cellular components were dominated by collagen-containing extracellular matrix and endoplasmic reticulum lumen. In terms of molecular function (MF), receptor ligand activity and signaling receptor activator activity were predominant (Supplementary Figure S8A-B). Furthermore, the KEGG pathway analysis revealed that the main signaling pathways involved were ECM-receptor interaction, Human papillomavirus infection, and the PI3K-Akt signaling pathway (Supplementary Figure S8C-D).
3.7 Identification of hub genes and prognostic analysis of CDKN2A and SERPINE1
Using the STRING database, we found that 63 of the 70 DEGs identified in CC were involved in the PPI network. The resulting PPI network consisted of 62 edges, with an average node degree of 1.97 and an average local clustering coefficient of 0.393. The PPI enrichment of this network was statistically significant (P < 0.05) (Fig. 1A). The cytoHubba plug-in was used to identify the top 15 hub genes based on node degree, resulting in the selection of FN1, CDKN2A, SFRP2, MYH11, SFRP4, CILP, COL9A3, SERPINE1, WIF1, COMP, ACTG2, KRT14, THBS2, CALB2, and KRT17 (Fig. 1B). Intersecting the top 15 hub genes with the genes used to construct the predictive model for CC identified CDKN2A and SERPINE1 (Fig. 1C). The Human Protein Atlas (HPA, https://www.proteinatlas.org/) offers an IHC-based assay for relative protein abundance [23]. In terms of oxidative stress, SERPINE1 exhibited a higher coefficient than CDKN2A (12.33 vs. 11.43). To examine the protein expression of SERPINE1 in CC and normal tissues, HPA data were used, which revealed a noticeable accumulation of SERPINE1 in CC tissues (Fig. 1D, E).
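Two of the figures in this subsection can be checked with a few lines of arithmetic. The sketch below assumes that the 63 DEGs in the PPI network are its nodes, so that the average node degree equals 2 × edges / nodes, and it intersects the hub gene list with the 16 genes of the prognostic model. Variable and function names are hypothetical.

```python
# Top 15 hub genes as listed in the text.
HUB_GENES = [
    "FN1", "CDKN2A", "SFRP2", "MYH11", "SFRP4", "CILP", "COL9A3", "SERPINE1",
    "WIF1", "COMP", "ACTG2", "KRT14", "THBS2", "CALB2", "KRT17",
]

# The 16 genes of the prognostic model.
MODEL_GENES = [
    "NGF", "IL13", "RPS6KA5", "TERT", "CDKN2A", "SERPINE1", "CD36", "PPARGC1A",
    "ACADL", "ACOX1", "POMC", "BDNF", "MSRA", "DDIT3", "GSTM1", "CPT2",
]


def average_node_degree(n_nodes, n_edges):
    """Each undirected edge contributes to the degree of two nodes."""
    return 2 * n_edges / n_nodes


# Assumption: the network has 63 nodes (the DEGs in the PPI network) and 62 edges.
avg_degree = average_node_degree(n_nodes=63, n_edges=62)

# Genes shared by the hub gene list and the prognostic model.
shared_genes = sorted(set(HUB_GENES) & set(MODEL_GENES))
```

Under this assumption the node and edge counts are consistent with the reported average node degree, and the intersection yields the two genes named in the text.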
Fig. 1
Construction of PPI networks and identification of hub genes. A The PPI network showed the interactions of the DEGs (interaction score = 0.4). B Visualization of the PPI network and the candidate hub genes according to the EPC ranking. C A Venn diagram shows the number of overlapping genes between the top 15 hub genes and the genes involved in the construction of the prognostic model for CC. D–E Immunohistochemical staining of SERPINE1 in normal tissue (D) and cancer tissue (E) of CC in the HPA database
3.8 Clinical correlation analysis and immune infiltration analysis of SERPINE1
The relationship between SERPINE1 expression and the clinicopathological characteristics of subjects with CC in the TCGA dataset was investigated. The results revealed a significant correlation between SERPINE1 expression and the cancer grade and TNM stages of individuals with CC. SERPINE1 expression was lower in T2 patients compared to T3 or T4 patients (P = 0.00025 and P = 0.0027, respectively) (Fig. 2A). Additionally, SERPINE1 expression was lower in N0 patients compared to N1 or N2 patients (P = 0.023 and P = 0.0058, respectively) (Fig. 2B). Moreover, SERPINE1 expression was lower in M0 patients compared to M1 patients (P = 0.046) (Fig. 2C). In terms of tumor stage, SERPINE1 expression was lower in Stage I patients compared to Stages II, III, or IV patients (P = 0.011, 0.0012, and 0.00062, respectively) (Fig. 2D). However, there was no correlation between SERPINE1 expression and gender or age (P = 0.60 and 0.32, respectively). Survival analysis using the Kaplan–Meier method revealed that patients with reduced SERPINE1 expression had significantly longer overall survival compared to those with elevated expression (Fig. 2E). To examine the connection between SERPINE1 and immune cell infiltration, a correlation study was conducted to assess the relationship between SERPINE1 expression and the abundance of 22 immune cell types. The results showed that B cells naive, macrophages M0, and activated mast cells were significantly more abundant in the high-expression group of SERPINE1, while plasma cells, T cells CD4 memory resting, NK cells resting, dendritic cells resting, and mast cells resting were more abundant in the low-expression group (Fig. 2F).
Fig. 2
Clinical correlation analysis and immune infiltration analysis of SERPINE1. A The relationship between SERPINE1 expression and T stage in TCGA cohort. B The relationship between SERPINE1 expression and N stage in TCGA cohort. C The relationship between SERPINE1 expression and M stage in TCGA cohort. D The relationship between SERPINE1 expression and tumor stage in TCGA cohort. E Kaplan–Meier curves for comparison of the overall survival between SERPINE1 low- and high-expression groups in the TCGA database (P < 0.001). F CIBERSORT score of 22 immune cell infiltrations among TCGA samples of SERPINE1 low-expression and high-expression groups
3.9 The expression of SERPINE1 in CC tissues and cell lines
Following the bioinformatic analysis, RT-qPCR and western blot were performed and confirmed significantly higher levels of SERPINE1 mRNA and protein in CC tissues than in para-tumor tissues (Fig. 3A, B). To validate these findings, three CC cell lines (SW480, CACO-2, and HT29) were selected for in vitro experiments, with human intestinal epithelial cells (NCM460) serving as the control group. The results consistently demonstrated overexpression of SERPINE1 in the CC cell lines (Fig. 3C, D). Furthermore, IHC analysis of human tissues provided additional confirmation of the prominent accumulation of SERPINE1 in tumor regions (Fig. 3E, F; Supplementary Figure S9).
Fig. 3
The expression of SERPINE1 in CC tissues and cell lines. A SERPINE1 mRNA expression in CC specimens relative to para-tumor tissues as detected by RT-qPCR. B SERPINE1 protein level in 6 CC patients relative to para-tumor tissues as detected by western blot. C SERPINE1 expression in 3 different CC cell lines (SW480, CACO-2, and HT29) relative to NCM460 as detected by RT-qPCR. D SERPINE1 expression in 3 different CC cell lines (SW480, CACO-2, and HT29) relative to NCM460 as detected by western blot. E–F Immunohistochemical analysis of SERPINE1 expression in CC patients. *p < 0.05; **p < 0.01; ***p < 0.001
3.10 Effects of SERPINE1 knockdown on CC cell viability, proliferation, and oxidative stress
To determine the role of SERPINE1 in the initiation and development of CC, we used siRNA-mediated knockdown to examine the impact of SERPINE1 silencing on the behavior of CC cells. Figure 4A–C demonstrates successful SERPINE1 knockdown in the CC cell lines. CCK-8 analysis revealed that suppressing SERPINE1 significantly reduced cell viability in the SW480 and CACO-2 cell lines (Fig. 4D, E). Similarly, colony formation assays indicated that the clonogenic capacity of the SW480 and CACO-2 cell lines was diminished after silencing SERPINE1 (Fig. 4F–I). These findings suggest that SERPINE1 plays a crucial role in promoting the progression of CC. Furthermore, it is well established that oxidative stress is involved in the occurrence and development of cancers. Therefore, we measured the levels of ROS, MDA, and GSH in SW480 and CACO-2 cells (Fig. 4J–M), which revealed a significant increase in MDA and ROS levels and a noticeable decrease in GSH levels after transfection with si-SERPINE1 compared to the si-ctrl group. In summary, the anti-tumor effect of SERPINE1 knockdown on CC may be partly attributed to the induction of oxidative stress.
Fig. 4
The effect of SERPINE1 knockdown on CC cell viability, proliferation, and oxidative stress in vitro. A The efficiency of SERPINE1 knockdown (si-SERPINE1) was assessed by western blot in SW480 and CACO-2 cells. B Quantitative analysis of western blot in SW480 cells. C Quantitative analysis of western blot in CACO-2 cells. D The viability of SW480 cells was assessed by CCK-8 assays. E The viability of CACO-2 cells was assessed by CCK-8 assays. F–G The clonogenic capacity of SW480 cells was assessed by colony formation assay. H–I The clonogenic capacity of CACO-2 cells was assessed by colony formation assay. J–K ROS level of SW480 and CACO-2 cells. L MDA level of SW480 and CACO-2 cells. M GSH level of SW480 and CACO-2 cells. *p < 0.05; **p < 0.01; ***p < 0.001