Researchers have demonstrated that incorporating messenger ribonucleic acid (mRNA) data can significantly improve the pre-operative prediction of biochemical recurrence (BCR) in prostate cancer (PCa) patients. The study, published in PLOS ONE, highlights the potential of machine learning (ML) models utilizing mRNA variables to outperform traditional nomograms based solely on clinical information.
Enhancing BCR Prediction with mRNA
Radical prostatectomy (RP) is a primary treatment for PCa, but recurrence rates remain high, with BCR occurring in 20-40% of patients post-operatively. Predicting treatment failure pre-operatively could facilitate earlier decisions regarding primary and adjuvant therapies. Recent technological advancements now allow for routine acquisition of patient genetic data, enabling a precision medicine approach to inform therapeutic decisions.
The study retrospectively analyzed a cohort of 135 patients with clinical follow-up data and mRNA information comprising over 26,000 features. Researchers compared the performance of ML models, including random survival forest (RSF), boosted Cox, and regularized Cox models, against reference nomograms using only routine clinical information. The results indicated that including mRNA information significantly improved pre-operative BCR prediction.
Machine Learning Models Outperform Traditional Methods
The machine learning-based time-to-event models significantly outperformed reference nomograms that used only routine clinical information. The gain from including genetic information was observed in discrimination, calibration, and predictive performance. In the published results, the best-performing pipeline for each modeling strategy is highlighted in bold.
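The article does not state which discrimination metric the study used. A common choice for time-to-event models is Harrell's concordance index, which measures how often the patient predicted to be at higher risk is the one who recurs first. The following is a minimal sketch under that assumption; the function name and the toy follow-up data are illustrative and are not taken from the study.

```python
from typing import Sequence


def concordance_index(times: Sequence[float],
                      events: Sequence[int],
                      risk_scores: Sequence[float]) -> float:
    """Harrell's C: share of comparable patient pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i has an observed event
    (events[i] == 1) strictly before patient j's event or censoring time.
    Pairs with tied times are skipped in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk recurred first: correct order
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: months to BCR or last follow-up, event flag, model risk.
months = [5, 10, 15, 20]
recurred = [1, 1, 0, 1]
risk = [0.9, 0.1, 0.4, 0.7]
print(concordance_index(months, recurred, risk))  # 0.6
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a model that gains from mRNA features would be expected to move this number upward relative to a clinical-only nomogram.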
Specifically, pipelines using univariate Cox feature selection yielded significantly higher discrimination compared to those using correlation-based filtering only (p < 0.001), and combining both filtering strategies was optimal for boosted Cox and RSF models (p < 0.001). Overall, RSF achieved the highest discrimination performance, yielding statistically significant improvement over all other models (p < 0.001).
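To make the two filtering strategies concrete, the sketch below chains a correlation-based filter with a univariate Cox ranking. It rests on several assumptions not confirmed by the article: that correlation filtering means dropping one feature from each highly correlated pair, that the correlation filter runs first, and that features are ranked by the absolute Wald z-statistic of a one-covariate Cox model. The function names, the 0.9 threshold, and the gene labels are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlation_filter(features: Dict[str, Sequence[float]],
                       threshold: float = 0.9) -> List[str]:
    """Greedy filter: keep a feature only if |r| < threshold against all kept ones."""
    kept: List[str] = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


def univariate_cox_z(times: Sequence[float], events: Sequence[int],
                     x: Sequence[float], max_iter: int = 25) -> float:
    """Wald z-statistic of a one-covariate Cox model fitted by Newton-Raphson.

    Assumes x is on a modest scale (e.g. standardized expression values).
    """
    beta, info = 0.0, 0.0
    n = len(times)
    for _ in range(max_iter):
        grad, info = 0.0, 0.0
        for i in range(n):
            if not events[i]:
                continue
            # Risk set: patients still under observation at time t_i.
            w = [(x[j], math.exp(beta * x[j])) for j in range(n) if times[j] >= times[i]]
            s0 = sum(e for _, e in w)
            mean = sum(v * e for v, e in w) / s0
            grad += x[i] - mean
            info += sum(v * v * e for v, e in w) / s0 - mean * mean
        if info <= 0.0:
            return 0.0  # no information about this covariate
        step = grad / info
        beta += step
        if abs(step) < 1e-9:
            break
    return beta * math.sqrt(info)


def select_features(times: Sequence[float], events: Sequence[int],
                    features: Dict[str, Sequence[float]],
                    threshold: float = 0.9, top_k: int = 10) -> List[str]:
    """Correlation filter first, then keep the top_k features by |z|."""
    candidates = correlation_filter(features, threshold)
    ranked = sorted(candidates,
                    key=lambda name: abs(univariate_cox_z(times, events, features[name])),
                    reverse=True)
    return ranked[:top_k]


# Hypothetical toy cohort: six patients, all with observed BCR.
months = [1, 2, 3, 4, 5, 6]
recurred = [1, 1, 1, 1, 1, 1]
expression = {
    "gene_a": [2.0, 3.0, 1.0, 2.5, 0.5, 1.5],
    "gene_b": [4.1, 6.0, 2.0, 5.0, 1.0, 3.0],  # nearly a rescaled copy of gene_a
    "gene_c": [1.0, 2.0, 2.0, 1.0, 1.0, 2.0],
}
print(select_features(months, recurred, expression, top_k=1))
```

With over 26,000 mRNA features and 135 patients, a real pipeline would perform this selection inside each cross-validation fold so that the held-out patients do not influence which features are kept.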
Key mRNA Variables Identified
Model stability and feature selection analysis identified several key mRNA variables frequently selected across the top-performing models. These included DNAH8, ABCC11, ESM1, and PI15. PSA level was the only variable consistently picked by all models. DNAH8 has previously been reported as a predictor of poor prognosis in PCa. Increased ESM1 levels have been linked to progression and the development of metastasis. PI15 has been identified as a biomarker for discriminating metastatic progression.
Clinical Implications and Future Directions
This study demonstrates the potential of genetic information for improved pre-operative prediction of time-to-BCR in PCa. Incorporating mRNA measurements in pre-operative assessment would require extraction from tumor biopsy samples. As these samples are already required for routine clinical diagnosis, the additional workload would relate solely to the mRNA analysis, which has become more affordable and feasible in recent years. ML methodologies are needed to leverage the predictive capabilities of mRNA data, which calls for a shift away from conventional nomograms. Improving the interpretability of these ML models will be key to enabling their clinical integration.
Further validation of these findings on mRNA data acquired pre-operatively will be explored in follow-on work. External validation on a larger multi-center dataset, or on multiple single-center datasets covering more varied populations, will be key for future development and general clinical application.