Researchers at Nanjing Drum Tower Hospital have developed machine learning models that use readily available blood inflammation markers to predict treatment response in breast cancer patients receiving neoadjuvant chemotherapy (NAC). The study, published in Frontiers in Oncology, analyzed 209 patients and found that systemic inflammation markers can effectively predict pathological complete response (pCR), a key indicator of treatment success.
Inflammation Markers Show Predictive Power
The research team examined four systemic inflammation markers derived from routine blood tests: neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), lymphocyte-to-monocyte ratio (LMR), and neutrophil-to-monocyte ratio (NMR). These markers reflect the balance between pro-tumor inflammatory responses and anti-tumor immune activity.
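Each of the four markers is a simple quotient of two cell counts from a complete blood count. The sketch below illustrates the arithmetic; the `BloodCounts` type, the function name, and the sample values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class BloodCounts:
    """Absolute counts from a routine blood test, in 10^9 cells per litre."""
    neutrophils: float
    lymphocytes: float
    monocytes: float
    platelets: float


def inflammation_ratios(c: BloodCounts) -> dict[str, float]:
    """Return NLR, PLR, LMR, and NMR for one set of blood counts."""
    # Lymphocytes and monocytes appear as denominators, so they must be positive.
    if c.lymphocytes <= 0 or c.monocytes <= 0:
        raise ValueError("lymphocyte and monocyte counts must be positive")
    return {
        "NLR": c.neutrophils / c.lymphocytes,
        "PLR": c.platelets / c.lymphocytes,
        "LMR": c.lymphocytes / c.monocytes,
        "NMR": c.neutrophils / c.monocytes,
    }


# Hypothetical values for illustration only (not patient data from the study).
example = BloodCounts(neutrophils=4.0, lymphocytes=2.0, monocytes=0.5, platelets=250.0)
ratios = inflammation_ratios(example)
print(ratios)
```

Because all four ratios come from the same blood draw, no additional laboratory work is needed beyond the routine test.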
Among the 209 patients studied, 29 (13.9%) achieved pCR while 180 (86.1%) did not. The analysis revealed that patients without lymph node metastasis were more likely to achieve pCR (P=0.008), and HER2-positive patients showed significantly higher pCR rates (P=0.030).
Multivariate analysis identified three independent predictors of pCR: lymph node metastasis status (OR 0.347, 95% CI: 0.140-0.862, P=0.023), NLR (OR 0.376, 95% CI: 0.143-0.990, P=0.048), and LMR (OR 2.828, 95% CI: 1.081-7.400, P=0.034). Patients with lower NLR values, higher LMR values, and absence of lymph node metastasis demonstrated better treatment responses.
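An odds ratio below 1 means the factor lowers the odds of pCR, and one above 1 means it raises them. The reported figures can be checked for internal consistency. The sketch below assumes the P-values come from two-sided Wald tests and that the 95% confidence intervals are symmetric on the log-odds scale; both are common defaults for logistic regression but are assumptions here, and the function name is hypothetical.

```python
import math


def wald_p_from_or(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the log-odds coefficient, its standard error, and a Wald P-value.

    Assumes a 95% CI of the form exp(beta +/- z_crit * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided P-value under a standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, p


# (OR, CI lower, CI upper, reported P) as quoted in the article.
reported = {
    "lymph node metastasis": (0.347, 0.140, 0.862, 0.023),
    "NLR": (0.376, 0.143, 0.990, 0.048),
    "LMR": (2.828, 1.081, 7.400, 0.034),
}

recomputed = {}
for name, (or_, lo, hi, p_reported) in reported.items():
    beta, se, p = wald_p_from_or(or_, lo, hi)
    recomputed[name] = p
    print(f"{name}: beta={beta:.3f}, se={se:.3f}, p={p:.4f} (reported {p_reported})")
```

Under these assumptions, the recomputed P-values land within about 0.001 of the reported ones, which suggests the odds ratios, intervals, and P-values are mutually consistent.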
Machine Learning Optimization Reveals Best Algorithm
The researchers tested three machine learning algorithms, Support Vector Machine (SVM), Random Forest (RF), and K-Nearest Neighbors (KNN), to construct pCR prediction models. Random Forest performed best, with the lowest root mean square error (RMSE = 0.109) and the highest correlation coefficient (r = 0.94).
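For readers unfamiliar with the two metrics, RMSE measures the typical size of the prediction error (lower is better), while the correlation coefficient measures how closely predictions track the observed outcomes (closer to 1 is better). The following minimal sketch computes both; the function names and the sample values are hypothetical and do not reproduce the study's data or its exact evaluation procedure.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical outcomes (1 = pCR, 0 = no pCR) and predicted scores.
y_true = [1, 0, 0, 1, 0]
y_pred = [0.9, 0.1, 0.2, 0.8, 0.1]

error = rmse(y_true, y_pred)
corr = pearson_r(y_true, y_pred)
print(f"RMSE={error:.3f}, r={corr:.3f}")
```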
"RF outperformed SVM and KNN in terms of handling data complexity, robustness, and stability across different datasets," the authors noted. The Random Forest model's ensemble approach using multiple decision trees with bootstrap aggregating reduced overfitting and improved robustness, making it particularly effective for high-dimensional clinical data.
Clinical Implications for Survival Outcomes
The study's survival analysis revealed significant prognostic value for these inflammation markers. During a median follow-up of 68 months, patients with lower NLR values showed longer disease-free survival (P=0.029) and overall survival (P=0.041). Higher LMR values were associated with longer disease-free survival (P=0.044).
The absence of lymph node metastasis correlated with both longer disease-free survival (P=0.005) and overall survival (P<0.001). These findings suggest that systemic inflammation markers not only predict immediate treatment response but also provide valuable prognostic information for long-term outcomes.
Addressing Clinical Limitations
The researchers acknowledged several limitations of current biomarker approaches. "Many studies are trying to explore tumor biomarkers for breast cancer prognosis, but due to economic and technical limitations, most of them remain in the laboratory stage and have not been applied to clinical large-scale," they explained.
In contrast, peripheral blood inflammation markers offer significant advantages as they are "convenient, cheap and fast to operate" through widely available blood tests. This accessibility makes them particularly valuable for clinical implementation compared to more complex molecular biomarkers.
Model Performance and Validation
Hyperparameters for the Random Forest model were tuned by grid search with cross-validation, which selected n_estimators=200, max_depth=10, and max_features=2. The model's performance advantage is attributed to its ability to select key features automatically and to capture complex nonlinear relationships between predictors and outcomes.
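A grid search of this kind can be sketched with scikit-learn. Everything below is an illustration under stated assumptions: the data are synthetic, the three features merely stand in for lymph node status, NLR, and LMR, the candidate grid and the `roc_auc` scoring are arbitrary choices, and the use of a classifier (rather than a regressor) is assumed, since the article does not describe the authors' exact setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (NOT the study's data): 209 rows, 3 features.
rng = np.random.default_rng(0)
n = 209
X = np.column_stack([
    rng.integers(0, 2, n),      # lymph node metastasis flag (0 or 1)
    rng.normal(2.5, 1.0, n),    # NLR-like value
    rng.normal(4.0, 1.5, n),    # LMR-like value
])
# Outcome simulated so that the features carry some signal.
logit = -1.5 - 1.0 * X[:, 0] - 0.8 * (X[:, 1] - 2.5) + 0.5 * (X[:, 2] - 4.0)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

# Hypothetical candidate grid; it includes the values reported in the article.
param_grid = {
    "n_estimators": [50, 200],
    "max_depth": [5, 10],
    "max_features": [2],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,                # 3-fold stratified cross-validation
    scoring="roc_auc",
)
search.fit(X, y)

best_params = search.best_params_
print(best_params, round(search.best_score_, 3))
```

On synthetic data the selected combination carries no clinical meaning; the point is only the mechanics of trying each combination, scoring it by cross-validation, and keeping the best.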
According to the study, inflammation plays a crucial role throughout cancer treatment, both promoting cancer progression and suppressing anti-tumor immune responses. Neutrophils and monocytes can promote tumor angiogenesis, while lymphocytes play critical roles in anti-tumor immunity, which helps explain why these ratios serve as effective predictive markers.
Future Clinical Applications
The researchers emphasized that their model provides "a simple and cost-effective tool for personalized treatment strategies." The Random Forest prediction model could effectively evaluate NAC efficacy in breast cancer patients, supporting precision medicine approaches.
However, the study's single-center retrospective design and relatively long enrollment period present limitations. The authors noted that "breast cancer treatment methods have continuously evolved, and the emergence of targeted therapies has led to changes in NAC regimens" during the study period.
Future research will focus on expanding sample sizes, incorporating deep learning techniques, and conducting multi-center prospective trials to enhance the model's generalizability and accuracy. The goal is to develop more sophisticated prediction models that can better adapt to evolving treatment protocols and diverse patient populations.