Multi omics network toxicology and in vitro experiments elucidate the role of benzo [a] pyrene in prostate cancer

Abstract

Background:

In recent years, growing attention has been paid to the role of Benzo [a]pyrene (BaP) in the development and progression of prostate cancer (PCa). However, the specific molecular mechanisms remain unclear. This study aims to explore the potential association between BaP and PCa and to identify key molecular targets that may underlie this relationship, using an integrative bioinformatics approach.

Methods:

This study initiated with a computational toxicology assessment of BaP’s carcinogenicity and endocrine-disrupting properties using the ProTox 3.0 platform. Subsequently, potential target genes linking BaP to PCa were identified by integrating multiple public databases. The overlapping genes underwent PPI network construction and visualization, followed by GO functional annotation and KEGG pathway enrichment analyses to elucidate the underlying biological mechanisms. Through screening 101 machine learning algorithm combinations, we identified the most relevant key genes associated with PCa progression. Molecular docking technology was then employed to evaluate the binding interactions between BaP/natural active products and these key targets. The CIBERSORT algorithm was utilized to analyze RRM2’s regulatory role in the PCa tumor microenvironment, complemented by pan-cancer analysis to investigate RRM2’s universal functions across various malignancies. Finally, in vitro cell experiments were conducted for validation.

Results:

This study further underscores the carcinogenic properties and endocrine-disrupting effects of BaP. Integration of multi-source databases identified 443 potential BaP-PCa targets. GO and KEGG enrichment analyses revealed that these targets are primarily involved in regulating cell proliferation, inflammatory responses, oxidative stress, and multiple oncogenic signaling pathways. Machine learning algorithm screening showed that the Enet (α = 0.1) model exhibited the best predictive performance and robustness. Through molecular docking, Kaplan-Meier survival analysis, and validation using the Human Protein Atlas (HPA) database, RRM2 was identified as a key regulatory gene and found to play a central role in BaP-mediated immunosuppression processes. Pan-cancer analysis demonstrated that RRM2 has universal functions across various malignancies. Molecular docking results indicated that seven known anti-tumor natural products exhibit significant binding affinity with RRM2. In vitro experiments demonstrated that BaP treatment was associated with increased RRM2 expression in prostate cancer cells, while baicalin treatment reduced this effect, providing preliminary experimental support for the bioinformatic predictions.

Conclusion:

This study delineates a potential mechanistic framework by which BaP may be associated with PCa progression through multi-target and multi-pathway mechanisms, highlighting RRM2 as a candidate core mediator. These findings provide a theoretical foundation for future experimental validation and epidemiological studies.

Background

Prostate cancer (PCa) is a clinically heterogeneous malignancy characterized by significant variability in its clinical manifestations (Wang et al., 2018). As the most frequently diagnosed malignancy and the second leading cause of cancer-related mortality among men globally, PCa poses a serious threat to public health and imposes a substantial healthcare burden (Siegel et al., 2024; Raychaudhuri et al., 2025). However, the precise molecular mechanisms underlying its high incidence remain incompletely understood. Beyond established risk factors such as age, ethnicity, and family history, the specific roles of environmental exposures, lifestyle factors, and dietary components in PCa pathogenesis require further in-depth investigation (Wang et al., 2018; Cui et al., 2024; Feijó et al., 2025).

Benzo[a]pyrene (BaP) is a classic environmental pollutant generated during the incomplete combustion of organic matter in industrial processes (Huang et al., 2025). Global exposure assessment data indicate a daily BaP intake of 50–613 ng/day in the general population, with dietary sources and air pollution significantly contributing to a cumulative cancer risk of up to 5.163 × 10−6 (Huang et al., 2025; Hummel et al., 2018). Furthermore, specific occupational settings can exacerbate exposure levels, with lifetime cumulative doses reaching 300 μg/m3·year (Araki et al., 2024). As a typical endocrine-disrupting chemical, BaP’s capacity to interfere with normal hormonal function has been clearly demonstrated (Zhang et al., 2016; Bukowska et al., 2022). Given the crucial role of hormonal signaling pathways in PCa development and progression, BaP exposure may disrupt endocrine homeostasis, thereby affecting core biological processes such as cell proliferation, differentiation, and apoptosis, ultimately promoting tumorigenesis (Feijó et al., 2025; Modica et al., 2023; Shi et al., 2017; Patke et al., 2024). Although existing studies have associated BaP exposure with various adverse health outcomes, the specific molecular mechanisms underlying its role in PCa initiation and progression await systematic elucidation.

Against this backdrop, the rapid advancement of bioinformatics provides crucial technical support for deciphering the complex mechanisms linking environmental exposures to disease occurrence (Li et al., 2025; Goldenberg et al., 2019). Network toxicology, as an emerging methodology that integrates multi-omics strategies with network analysis technologies, enables systematic exploration of the potential mechanisms through which toxicants induce disease, and has been widely applied to investigate the pathogenic mechanisms of environmental factors and identify key therapeutic targets (Goldenberg et al., 2019; Shi et al., 2024). Concurrently, machine learning algorithms have demonstrated significant advantages in disease risk prediction and core biomarker screening (Ngiam and Khor, 2019). Leveraging these methodological advances, this study integrates network toxicology, multi-omics analysis, machine learning algorithms, and molecular docking to generate hypotheses regarding the potential association between BaP exposure and PCa development, and to propose candidate molecular mechanisms that warrant further experimental investigation.

Materials and methodsData sources

We integrated transcriptomic data and matched clinical information from 1,028 prostate cancer (PCa) patients, sourced from: (1) the TCGA-PRAD cohort (accessed via the UCSC Xena platform) (https://xena.ucsc.edu/), and (2) three independent cohorts from the GEO database (GSE21032, GSE70770, and GSE116918) (https://www.ncbi.nlm.nih.gov/geo/summary/). To enhance statistical power, GSE21032 and GSE70770 were merged into a combined cohort (designated the GSE cohort). To ensure data reliability, patients lacking biochemical recurrence (BCR) status or with follow-up time less than 1 month were excluded. Detailed clinical characteristics of each cohort are summarized in Supplementary Tables S1 and S2. Additionally, the Human Protein Atlas (HPA) (https://www.proteinatlas.org/) and SangerBox databases (http://sangerbox.com/home.html) were employed for protein expression validation and pan-cancer analysis, respectively.

Toxicity prediction

ProTox 3.0 is an online computational platform for predicting the toxicity of small molecules (Ngiam and Khor, 2019). The SMILES structure of BaP, obtained from the PubChem database, was used to systematically predict its toxicity profile and related parameters on the ProTox 3.0 platform. This tool covers multiple toxicological endpoints, enabling a comprehensive assessment of BaP’s potential toxicity and safety risks, thereby providing a reference for subsequent experimental validation and risk management.

Acquisition of BaP target genes

To comprehensively identify potential targets of BaP, we queried four authoritative databases: SwissTargetPrediction, TargetNet, PharmMapper, and the Comparative Toxicogenomics Database (CTD). The canonical SMILES string and SDF structure file of BaP were obtained from the PubChem database. The following thresholds were applied to screen for high-confidence human targets: probability >0.01 for SwissTargetPrediction, AUC ≥0.7 for TargetNet, conformations = 300 and energy cutoff = 20.0 kcal/mol for PharmMapper, and Interactions >10 for the CTD database. During data standardization, the UniProtKB database was used to uniformly correct gene symbols and functional annotations of predicted targets, with Entrez Gene ID serving as the unique identifier.

Acquisition of PCa targets

Following previously reported methods [reference needed], transcriptomic data from 534 patients in the TCGA-PRAD cohort, comprising 483 tumor tissues and 51 normal control tissues (see Supplementary Table S2), were analyzed. Differential expression analysis was performed using the DESeq2 R package, with screening thresholds set at |log2FoldChange| > 0.5 and adjusted p-value <0.05. This threshold was selected to balance sensitivity while capturing a broader spectrum of potentially biologically relevant genes.

Identification of BaP-PCa targets and PPI network construction

Common targets between BaP and PCa were identified using Venn analysis. Subsequently, protein-protein interaction (PPI) analysis was performed on these overlapping targets using the STRING database (confidence score threshold ≥0.4). The network was imported in TSV format into Cytoscape 3.10.3 for visualization and topological analysis. The built-in “Centiscape 2.0″tool in Cytoscape was used to calculate multiple topological parameters, including degree centrality, closeness centrality, and betweenness centrality, to assess node importance within the network. Finally, genes were ranked based on closeness centrality, with higher values indicating greater hub importance.

Functional enrichment analysis

Gene Ontology (GO) systematically categorizes gene function into three classes: Cellular Component (CC), Molecular Function (MF), and Biological Process (BP). The Kyoto Encyclopedia of Genes and Genomes (KEGG) links genomic information with functional pathways at a systems level. In this study, the R package clusterProfiler was used to perform GO and KEGG functional enrichment analyses on the screened targets.

Development and evaluation of a BaP-PCa risk model using machine learning

We employed a comprehensive machine learning framework to evaluate the prognostic performance of BaP-related gene signatures, systematically assessing 101 combinations derived from 10 algorithms: Random Forest (RF), Support Vector Machine with radial basis function kernel (SVM), eXtreme Gradient Boosting (XGBoost), LightGBM, Elastic Net (EN), Ridge Regression, Lasso, Decision Tree (DT), k-Nearest Neighbors (KNN), and Neural Networks with a single hidden layer (NN), covering regularization-based regression, tree-based ensemble methods, and kernel-based modeling strategies. The 101 combinations correspond to different hyperparameter configurations and feature selection strategies. For each algorithm, we performed hyperparameter optimization using grid search combined with 10-fold cross-validation on the TCGA-PRAD training set, strictly tuning within each fold to prevent data leakage. The final model for each algorithm was selected based on the average performance across all folds. To mitigate overfitting, we required the models to maintain consistent predictive performance across three independent cohorts, TCGA-PRAD, GSE116918, and GSE46602, retaining only those robust models with a concordance index (C-index) > 0.65 in all three cohorts. More detailed information on the model framework can be found in our previous publication (Zhang K. et al., 2025; Xu et al., 2024).

Molecular docking and protein expression validation

Molecular docking was employed to elucidate the interaction mechanisms between BaP and core BaP-PCa target proteins. The molecular structure of BaP was obtained from the PubChem database, and the three-dimensional structures of the target proteins were sourced from the AlphaFold Protein Structure Database. AutoDockTools 1.5.7 was used for molecular and protein structure preparation and to perform docking simulations, predicting binding modes, binding affinity (expressed as binding free energy, ΔG), and potential functional impacts. A binding free energy lower than 0 kcal/mol indicates spontaneous binding, while an energy below −5.0 kcal/mol suggests a stable complex. Furthermore, to determine the protein expression levels of the core targets, the Human Protein Atlas (HPA) database was queried to examine the expression of core targets in PCa and normal prostate tissues.

Tumor microenvironment analysis

The CIBERSORT algorithm (Newman et al., 2015) was used to quantify the infiltration levels of 22 immune cell types in PCa tissues. Patients in the TCGA-PRAD cohort were divided into high and low expression groups based on the median RRM2 expression value to characterize changes in the tumor microenvironment (TME) composition. Pearson correlation analysis was subsequently performed to validate the correlation between RRM2 expression and the infiltration levels of immunosuppressive cells (regulatory T cells and M2 macrophages). To confirm the reliability of the CIBERSORT analysis results, differences in immunosuppression-related indicators between subgroups were further compared.

Pan-cancer analysis of the core protein RRM2

To further elucidate the key role of RRM2 in cancer development and progression, we investigated its pan-cancer expression patterns using the SangerBox platform and performed univariate Cox regression analysis to evaluate the association between RRM2 expression and progression-free survival (PFS) across different tumor types.

Cell culture and intervention

Human prostate cancer cells (DU145 and PC3) were purchased from the Cell Bank of the Chinese Academy of Sciences. Cells were cultured in a constant temperature incubator at 37 °C with 5% CO2, following the supplier’s recommended protocol. The formulations and doses of BaP and baicalin were based on previously reported methods and concentrations (Nwagbara et al., 2007). In these experiments, BaP was used at a concentration of 10 μM. Baicalin concentrations were determined based on the IC50 values from the dose-response curves reported in the literature (Miocinovic et al., 2005), with 100 μM used for treating DU145 cells and 150 μM for PC3 cells.

Western blot

Total protein was extracted from DU145 and PC3 cells using RIPA lysis buffer (Solarbio, China) supplemented with a protease inhibitor cocktail (Yeasen, China). Protein samples were separated by 10% SDS-polyacrylamide gel electrophoresis and subsequently transferred to PVDF membranes using a wet transfer method. The membranes were blocked with 5% skim milk at room temperature for 1 h, followed by incubation with an anti-RRM2 primary antibody (1:5000 dilution; catalog #11661-1-AP, Proteintech) at 4 °C overnight. After washing with TBST, the membranes were incubated with species-appropriate secondary antibodies at room temperature for 2 h.

EdU proliferation assay

To evaluate the effects of BaP and baicalin on the proliferation of DU145 cells, the 5-ethynyl-2′-deoxyuridine (EdU) cell proliferation assay kit (Beyotime Biotechnology, Shanghai, China) was used. Treated and untreated cells were seeded in 24-well plates at a density of 3 × 103 cells per well. After 24 h of culture, 20 μL of EdU working solution was added to each well and incubated at 37 °C for 2 h. Subsequently, cell fixation and staining were performed according to the manufacturer’s instructions. Finally, EdU-positive cells were observed and recorded using a fluorescence microscope to quantitatively analyze the cell proliferation rate.

Wound healing assay

To assess the effect of BaP and baicalin on cell migration capacity, a wound healing assay was performed. Treated and untreated DU145 cells were seeded in 6-well plates and cultured until fully confluent. A straight scratch was made on the cell monolayer using a sterile 200 μL pipette tip, and detached cells were gently washed away with PBS, establishing a standardized in vitro wound healing model. To exclude the influence of cell proliferation on migration analysis, the medium was replaced with low-serum migration medium. Images of the same field were captured at 0 h and 48 h post-scratch using a phase-contrast microscope (×10 objective), and cell migration capacity was quantitatively evaluated by comparing changes in scratch width.

Transwell invasion assay

To evaluate cell invasion capacity, Transwell® chambers (polycarbonate membrane, 8 μm pore size) were used. Treated and control cells were resuspended in serum-free medium and seeded into the upper chamber. RPMI-1640 medium containing 10% FBS was added to the lower chamber as a chemoattractant. After incubation at 37 °C with 5% CO2 for 24 h, cells that migrated to the lower surface of the membrane were fixed with 4% paraformaldehyde and stained with 0.1% crystal violet. Finally, five random fields per sample were photographed using an inverted microscope (×10 objective), and migrated cells were counted for quantitative analysis.

Statistical analysis

All statistical analyses were performed using R software (version 4.5.0). The nonparametric Wilcoxon rank-sum test was used to assess differences in gene expression and immune cell infiltration levels. Statistical significance was defined as a two-sided P-value <0.05. Data visualization was performed using the R package “ggplot2″ and the Sangerbox platform. The primary endpoint of this prognostic analysis was progression-free survival (PFS), defined as the time from initial diagnosis to biochemical recurrence (BCR) or the last follow-up. The BCR mentioned in the text serves as the event definition criterion for PFS.

ResultInitial assessment of BaP toxicity

Toxicological profiling using the ProTox 3.0 database indicated that Benzo[a]pyrene (BaP) presents a significant toxicological risk, demonstrating particularly high endocrine disruption activity and carcinogenic potential. Specifically, BaP showed high prediction scores for carcinogenicity (0.88) and mutagenicity (0.89). For endocrine disruption, BaP had a high probability of interacting with key receptors, including the aryl hydrocarbon receptor (AhR, 0.96), androgen receptor (AR, 0.78), estrogen receptor alpha (ERα, 0.64), and aromatase (0.83), comprehensive details are provided in Supplementary Table S4. These statistically significant predictions confirm a moderate-to-high health risk for BaP and provide a crucial toxicological foundation for our subsequent investigation into its mechanisms.

Identification of BaP-PCa targets

We consolidated target predictions from multiple databases, identifying 1,177 unique genes associated with BaP (Supplementary Table S5). Differential expression analysis in prostate tumor tissues revealed 4,865 differentially expressed genes (Supplementary Table S5). A Venn diagram analysis identified 443 overlapping genes at the intersection of BaP-related and prostate cancer (PCa)-related genes (Figure 1A; Supplementary Table S5), suggesting their potential role as therapeutic targets in BaP-mediated PCa development and progression.

Panel A shows a Venn diagram comparing PCa and BaP target gene sets with an overlap of 443 genes. Panel B displays a network of genes with yellow nodes indicating key targets. Panel C lists identified gene targets in yellow ovals. Panel D presents a network graph with central red nodes highlighting core genes. Panel E is a grouped horizontal bar chart summarizing biological processes (BP), cellular components (CC), and molecular functions (MF) enriched for overlapping genes, with values and category colors. Panel F is a dot plot showing enriched signaling pathways with color indicating –log10(p-value) and dot size reflecting gene count.

Identification of BaP-PCa overlapping targets and protein-protein interaction network analysis. (A) Venn diagram showing the intersection of BaP-associated genes (1,177 genes) and PCa-related differentially expressed genes (4,865 genes), identifying 443 overlapping genes as potential BaP-PCa targets. (B) Protein-protein interaction (PPI) network of the 443 overlapping targets, constructed using the STRING database with a confidence score threshold ≥0.4. (C) Core PPI network subnetwork extracted through topological analysis, highlighting hub proteins with high connectivity. (D) Topological analysis of the PPI network using Centiscape 2.0, ranking proteins by closeness centrality; higher values indicate greater hub importance. (E) Gene Ontology (GO) enrichment analysis of the 94 core genes, showing significantly enriched biological processes (BP), cellular components (CC), and molecular functions (MF). (F) Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of the 94 core genes, revealing key signaling pathways potentially involved in BaP-mediated PCa progression.

Construction of the BaP-PCa PPI network

The 443 overlapping genes were used to construct a PPI network using the STRING database (confidence score ≥0.4). After removing unconnected nodes, the final network contained 421 proteins. Visualization with Cytoscape 3.10.3 and subsequent topological analysis identified 94 core proteins within the network. Five key hub proteins, ALB, BCL2, PPARG, AGT, and IL1B, were identified as having the highest connectivity (Figures 1B–D). This network visualization clarifies the interactions among key targets and provides a foundation for exploring the molecular mechanisms linking BaP to PCa.

Functional enrichment analysis of BaP-PCa targets

GO enrichment analysis of the 94 core genes revealed their significant involvement in biological processes such as inflammatory response, cell proliferation, oxidative stress, and hormone secretion and transport (Figure 1E). KEGG pathway analysis indicated enrichment in several key pathways, including the MAPK, estrogen, Rap1, Ras, PI3K-Akt, and NF-kappa B signaling pathways, as well as pathways in chemical carcinogenesis (Figure 1F). These results suggest that BaP may promote PCa by disrupting inflammatory and cell proliferation processes, endocrine functions, and multiple established oncogenic pathways.

Development and validation of a BaP-PCa prognostic model

Using the 443 BaP-PCa targets, univariate Cox regression identified 34 genes significantly associated with PFS in the TCGA-PRAD and GSE cohorts (Supplementary Figure S1/S2). After systematically evaluating 101 algorithm combinations, the Elastic Net model [alpha = 0.1] demonstrated the best predictive performance. This optimal model incorporated 17 key genes (Figures 2A,B; Supplementary Table S6) and achieved a mean C-index of 0.714. Kaplan-Meier analysis confirmed that patients with high-risk scores had significantly shorter PFS in the TCGA-PRAD (Figure 2C), GSE (Figure 2D), and GSE116918 (Figure 2E) cohorts. Time-dependent ROC curves validated the model’s predictive accuracy, with 1-, 3-, and 5-year AUC values of 0.77, 0.77, and 0.72 in TCGA-PRAD (Figure 2F); 0.77, 0.74, and 0.73 in the GSE cohort (Figure 2G); and 0.68 and 0.74 for 3 and 5 years in GSE116918 (Figure 2H). Furthermore, the risk score was positively correlated with advanced clinical stage (Figures 2I,J). These findings highlight the prognostic value of this 17-gene signature and suggest these targets are critically involved in BaP-driven PCa progression.

Panel A shows a heatmap ranking multiple survival analysis models by C-index scores across three cohorts; panel B lists the top-performing models. Panels C, D, and E display Kaplan-Meier survival curves comparing high- and low-risk groups based on selected models. Panels F, G, and H are receiver operating characteristic curves for model performance. Panels I and J present boxplots comparing model output distributions across tumor stages and nodal status groups.

Development and validation of a BaP-PCa target-related prognostic model using machine learning. (A) Performance evaluation of 101 machine learning algorithm combinations for prognostic model construction. The Elastic Net model (α = 0.1) exhibited the highest average C-index (0.714) and was selected as the optimal model. (B) Coefficients of the 17 key genes included in the final Elastic Net model. (C–E) Kaplan-Meier survival curves for progression-free survival (PFS) comparing high-risk and low-risk groups stratified by median risk score in the TCGA-PRAD cohort (C) the combined GSE cohort (GSE21032 + GSE70770) (D) and the GSE116918 cohort (E). Log-rank p-values are shown. (F–H) Time-dependent receiver operating characteristic (ROC) curves for predicting 1-, 3-, and 5-year PFS in the TCGA-PRAD cohort (F) the combined GSE cohort (G) and the GSE116918 cohort (H). Area under the curve (AUC) values are indicated. (I,J) Correlation between risk score and clinical features, showing that higher risk scores are significantly associated with advanced pathological T stage (I) and higher Gleason score (J). *p < 0.05, **p < 0.01, ***p < 0.001.

Molecular docking of BaP with the 17-gene signature

We used molecular docking to evaluate the binding between BaP and the core BaP-PCa proteins. After excluding proteins with unsuitable binding pockets or a binding energy (ΔG) of zero, ten proteins were identified as forming stable complexes with BaP (ΔG < −5 kcal/mol). These were: ALDH1A3 (−9.8 kcal/mol, Figure 3A), CDC20 (−9.2, Figure 3B), CA2 (−7.2, Figure 3C), EXO1 (−9.2, Figure 3D), GRIN1 (−8.5, Figure 3E), PDE4D (−8.4, Figure 3F), PLK1 (−9.4, Figure 3G), RRM2 (−11.5, Figure 3H), and TTK (−9.6, Figure 3I). Although FOXN3 also exhibited a favorable energy score (−8.7 kcal/mol), it was excluded due to the absence of stabilizing hydrogen bonds with BaP. These nine proteins that stably bind BaP in silico are proposed as candidate key targets that may mediate the potential effects of BaP on PCa, based on computational predictions.

Nine-panel scientific illustration showing molecular docking of yellow-highlighted aromatic compounds in green protein binding sites, each labeled A through I. Dashed lines indicate atomic distances, with interacting residues and values annotated. Protein structures appear as semi-transparent ribbons in the background for spatial context.

Molecular docking of BaP with nine candidate target proteins identified by machine learning. (A) ALDH1A3 (binding energy: −9.8 kcal/mol). (B) CDC20 (−9.2 kcal/mol). (C) CA2 (−7.2 kcal/mol). (D) EXO1 (−9.2 kcal/mol). (E) GRIN1 (−8.5 kcal/mol). (F) PDE4D (−8.4 kcal/mol). (G) PLK1 (−9.4 kcal/mol). (H) RRM2 (−11.5 kcal/mol). (I) TTK (−9.6 kcal/mol). For each protein, the left panel shows the overall docking conformation of BaP within the protein binding pocket, and the right panel shows detailed interactions including hydrogen bonds (green dashed lines) and hydrophobic interactions. Binding free energy (ΔG) was calculated using AutoDockTools 1.5.7; values < −5.0 kcal/mol indicate stable complex formation.

Prognostic value and expression of the 9 core targets

We next assessed the association between the 9 core targets and PCa prognosis using Kaplan-Meier survival analysis across three independent cohorts (TCGA-PRAD, GSE, and GSE116918). Only five targets were consistently associated with patient outcomes. High expression of CDC20, EXO1, PLK1, and RRM2 was correlated with shorter PFS in all three cohorts, while high PDE4D expression was associated with longer PFS (Figure 4). Analysis of the HPA database revealed that protein levels of CDC20 and RRM2 were significantly elevated in tumor tissues (Figures 5A,B; Supplementary Table S7). PLK1 and PDE4D were highly expressed in both normal and tumor tissues (Figures 5C,D), and EXO1 data were unavailable. Given that RRM2 exhibited the strongest predicted binding affinity with BaP in docking studies, was consistently associated with poor prognosis across all cohorts, and showed elevated expression at both the transcript and protein levels, we selected it as a candidate for further exploratory analysis.

Grouped survival analysis charts display Kaplan-Meier survival probability curves for two patient groups, labeled "L" in red and "H" in blue, for five genes: CDC20, EXO1, PLK1, RRM2, and PDE4D across panels A to O. Each panel includes hazard ratios, p-values, confidence intervals, and the number of patients at risk at various time points, showing survival trends over time for each gene. Shaded areas indicate confidence intervals.

Prognostic significance of five core genes consistently associated with progression-free survival across three independent cohorts. (A–C) CDC20: Kaplan-Meier survival curves for PFS in the TCGA-PRAD cohort (A) the combined GSE cohort (B) and the GSE116918 cohort (C). (D–F) EXO1: survival curves in TCGA-PRAD (D) GSE (E) and GSE116918 (F). (G–I) PLK1: survival curves in TCGA-PRAD (G) GSE (H) and GSE116918 (I). (J–L) RRM2: survival curves in TCGA-PRAD (J) GSE (K) and GSE116918 (L). (M–O) PDE4D: survival curves in TCGA-PRAD (M) GSE (N) and GSE116918 (O). Patients were stratified into high-expression (red) and low-expression (blue) groups based on median gene expression levels. Log-rank p-values are shown for each comparison. High expression of CDC20, EXO1, PLK1, and RRM2 was consistently associated with shorter PFS across all three cohorts, while high PDE4D expression was associated with longer PFS.

Histological panels A, B, C, and D each contain two circular tissue samples labeled normal and tumor. Brown staining levels vary by panel; panel C shows strong staining in both samples, while other panels display weaker staining or slight differences between normal and tumor tissues.

Protein expression validation of core genes in normal and prostate cancer tissues using the Human Protein Atlas (HPA) database. Representative immunohistochemistry (IHC) staining images from normal prostate tissue (left) and prostate cancer tissue (right) for (A) CDC20, (B) RRM2, (C) PDE4D, and (D) PLK1. Scale bars represent 100 μm. CDC20 and RRM2 show markedly increased staining intensity in tumor tissues compared to normal tissues, consistent with their mRNA expression patterns. PDE4D and PLK1 show comparable expression levels between normal and tumor tissues.

The impact of RRM2 on the PCa microenvironment

Given prior evidence linking BaP to immunosuppression (Zhang Z. et al., 2025; Kang et al., 2022) and emerging studies implicating RRM2 in immune modulation (Tang et al., 2021; Li et al., 2022; Lee et al., 2023), we explored whether RRM2 expression is associated with immune infiltration patterns in PCa. Our CIBERSORT analysis revealed modest but statistically significant correlations between RRM2 expression and the infiltration levels of Tregs (r = 0.14, p < 0.001) and M2 macrophages (r = 0.21, p < 0.001) (Figures 6A-C). These observations suggest that RRM2-high tumors may exhibit a tendency toward an immunosuppressive microenvironment, although the modest correlation coefficients indicate that this association is subtle and likely context-dependent. Additionally, we observed upregulation of multiple immune checkpoint molecules in the RRM2-high group (Figure 6D). Collectively, these findings generate the hypothesis that RRM2 could be involved in immune modulation in PCa; however, experimental validation, including functional studies directly linking BaP exposure to RRM2-mediated immune remodeling, is required to establish a causal relationship.

Composite scientific figure with five panels labeled A to E. Panel A displays a grouped box plot comparing immune cell scores between low (red) and high (blue) groups across multiple cell types, showing statistical significance. Panel B presents a scatter plot with marginal histograms showing a positive correlation between M2 macrophage scores and RRM2 values. Panel C displays a similar scatter plot indicating a positive correlation between regulatory T cell scores and RRM2 values. Panel D includes stacked box plots comparing marker scores between two groups for several genes, highlighting significant differences. Panel E shows violin plots comparing gene expression between tumor (red) and normal (blue) tissues for various genes, marking statistical significance above several plots.

Association of RRM2 expression with tumor immune microenvironment and pan-cancer expression analysis. (A) Immune cell infiltration profiles in RRM2-high (n = 247) and RRM2-low (n = 247) groups from the TCGA-PRAD cohort, estimated using the CIBERSORT algorithm. Box plots show the relative abundance of 22 immune cell types. Statistical comparisons were performed using the Wilcoxon rank-sum test. (B) Pearson correlation analysis between RRM2 expression and regulatory T cell (Treg) infiltration (r = 0.14, p < 0.001). (C) Pearson correlation analysis between RRM2 expression and M2 macrophage infiltration (r = 0.21, p < 0.001). (D) Differential expression of immune checkpoint genes between RRM2-high and RRM2-low groups. Box plots show expression levels of 10 immune checkpoint molecules; 8 of 10 show significant upregulation in the RRM2-high group (*p < 0.05, **p < 0.01, ***p < 0.001). (E) Pan-cancer analysis of RRM2 expression across 33 TCGA cancer types. Box plots show RRM2 expression levels in tumor tissues (red) compared to matched normal tissues (blue) for each cancer type. Statistical significance: *p < 0.05, **p < 0.01, ***p < 0.001. ACC, adrenocortical carcinoma; BLCA, bladder urothelial carcinoma; BRCA, breast invasive carcinoma; etc.

Pan-cancer analysis of RRM2

A pan-cancer analysis using the SangerBox database revealed that RRM2 is significantly overexpressed in nearly all cancer types examined (Figure 6E). Furthermore, high RRM2 expression was associated with shorter PFS in multiple cancers, including GBMLGG, KIPAN, KIRP, PRAD, and others (Supplementary Figure S3). These results underscore the broad and significant role of RRM2 in tumorigenesis and progression, positioning it as a potential key molecular target in BaP-driven carcinogenesis.

Screening of potential natural product interventions

Given the increasing importance of natural products in anticancer drug discovery (Slika et al., 2022), we selected seven naturally occurring bioactive compounds: baicalin, ginsenoside, quercetin, curcumin, resveratrol, capsaicin, and berberine. The selection was based on the following criteria: (a) documented antitumor activity against prostate cancer as reported in the literature; (b) representation of diverse structural classes (including flavonoids, alkaloids, polyphenols, etc.) to evaluate the influence of structural variability on binding affinity; and (c) favorable safety profiles and promising potential for clinical application. Molecular docking analyses confirmed that all seven compounds bind stably to the RRM2 protein (Figures 7A–G). The calculated binding energies were as follows: baicalin (−10.8 kcal/mol), quercetin (−9.9 kcal/mol), berberine (−9.2 kcal/mol), curcumin (−8.5 kcal/mol), ginsenoside (−7.9 kcal/mol), resveratrol (−7.3 kcal/mol), and capsaicin (−6.5 kcal/mol). Given that baicalin exhibited the strongest binding affinity (lowest docking energy) and has been previously reported to possess therapeutic potential against prostate cancer, it was selected for subsequent analyses.

Seven molecular graphics labeled A to G show yellow molecules docked with protein structures in light purple, highlighting interaction sites and distances with green and yellow dashed lines. Panel H displays western blot images and two bar graphs for RRM2 protein expression in PC3 and DU145 cell lines under control, BaP, and BaP plus Baicalin conditions, demonstrating changes in protein expression levels.

Molecular docking of seven natural bioactive compounds with RRM2 and experimental validation of RRM2 expression modulation. (A–G) Molecular docking results showing the binding conformations of natural compounds with RRM2: (A) Baicalin (binding energy: −10.8 kcal/mol), (B) Quercetin (−9.9 kcal/mol), (C) Berberine (−9.2 kcal/mol), (D) Curcumin (−8.5 kcal/mol), (E) Ginsenoside (−7.9 kcal/mol), (F) Resveratrol (−7.3 kcal/mol), (G) Capsaicin (−6.5 kcal/mol). For each compound, the left panel shows the overall docking conformation within the RRM2 binding pocket, and the right panel shows detailed molecular interactions including hydrogen bonds (green dashed lines). (H) Western blot analysis of RRM2 protein expression in DU145 cells under different treatment conditions. Cells were treated with vehicle control, BaP (10 μM), baicalin (100/150 μM), or combination of BaP and baicalin for 48 h β-actin was used as a loading control. Bar graph shows quantitative analysis of RRM2 protein levels normalized to β-actin, expressed as fold change relative to control. Data are presented as mean ± SD from three independent experiments. **p < 0.01, ***p < 0.001 compared to control; #p < 0.01 compared to BaP alone.

In vitro effects of BaP and baicalin on PCa cells

Western blot analysis revealed that BaP treatment further increased RRM2 protein expression levels in PCa cells (DU145 and PC3) (Figure 7H). Subsequent functional assays demonstrated that BaP intervention significantly promoted PCa cell proliferation (EdU assay, Figure 8A), invasion (Transwell assay, Figure 8B), and migration (wound healing assay, Figure 8C). Conversely, baicalin intervention not only reversed the BaP-induced upregulation of RRM2 expression but also counteracted the BaP-mediated enhancement of proliferation, migration, and invasion in PCa cells. These findings further underscore that BaP may promote malignant progression of PCa through upregulating RRM2 expression, while baicalin may serve as a potential interventional agent against this effect.

Comments (0)

No login
gif