A Cross-Cohort Validated Plasma Lipid Biomarker Assay for Early Breast Cancer Detection Using Machine Learning

Abstract

Early detection of breast cancer remains essential for improving clinical outcomes, and complementary non-invasive approaches are needed to support existing screening methods, particularly for women with dense breast tissue. We have previously reported plasma lipid biomarker discovery using untargeted high-resolution liquid chromatography tandem mass spectrometry (LC–MS/MS). In this study, we performed biomarker confirmation and developed machine-learning models applied to targeted plasma lipid measurements for the non-invasive detection of early-stage breast cancer across international cohorts with independent external validation. Targeted LC–MS/MS was used to quantify candidate lipid panels in plasma samples from European discovery cohorts (n = 554) and an independent Australian cohort (n = 266) used for external validation. Data-driven feature selection identified a 15-lipid panel with strong performance in European cohorts (AUC ≥ 0.94). External validation prior to confidence stratification yielded 76% sensitivity, 64% specificity, and an AUC of 0.81 in the Australian validation cohort. Clinical assay development requires iterative panel and model testing to support translational feasibility and performance in the intended-use population. An analytically viable panel, excluding lipids requiring complex and costly synthesis, achieved comparable accuracy with improved assay robustness. Confidence-based analysis showed enhanced performance for predictions made with moderate to high confidence, with sensitivity up to 89% and AUC up to 0.85, suggesting that ongoing research should focus on strategies to enhance diagnostic model confidence. Importantly, model predictions were independent of breast density, tumour size, grade, subtype, and morphology, indicating biological specificity of the lipid signature. These results demonstrate that calibrated machine-learning models applied to plasma lipid biomarkers can support non-invasive breast cancer detection. Expanding training datasets to include greater diversity will further improve performance in the ongoing development of this lipid-based detection approach.

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

This study was funded by BCAL Diagnostics Ltd, Sydney NSW 2113, Australia

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

This study was conducted in accordance with the Declaration of Helsinki. Recruitment of subjects in Europe was approved by the Institutional Review Board of Ochsner Health, New Orleans, LA, USA (protocol code 2015.101). The sponsor of the biospecimen collection study was Capital Biosciences, Gaithersburg, MD, USA. Recruitment of subjects in Australia was approved by the Sydney Local Health District Human Research Ethics Committee (protocol code 2020/ETH02051).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

All data produced in the present study are available upon reasonable request to the authors

Comments (0)

No login
gif