Precision Medicine for L/GCMN and Melanoma 1
- Conditions
- Melanoma (Skin Cancer), Nevi and Melanomas
- Interventions
- Other: Gradient Boosting Survival Analysis (GBSA), Other: Concordance index
- Registration Number
- NCT06608420
- Lead Sponsor
- Fundacion Clinic per a la Recerca Biomédica
- Brief Summary
The primary objective of this study is to create a highly multidimensional, multicentric melanoma database that encompasses cohorts of children, adolescents and young adults (CAYA). This database will be used to perform survival analysis and to evaluate sentinel lymph node biopsy (SLNB) positivity in CAYA. The secondary objectives are the following:
* Adaptation and optimization of algorithms: optimize existing precision medicine algorithms, currently used in adult patient care, for application in pediatric and young adult populations.
* Implementation of transfer learning: given the limitations associated with pediatric and young adult data, the investigators intend to utilize transfer learning techniques. The study will employ a sequential waterfall methodology, whereby machine learning models trained on adult patient data will be fine-tuned using the more limited data from younger cohorts.
* Integration of expert medical opinion: integrate physicians' domain knowledge into the decision support system, through a comprehensive review of the existing literature and an evaluation of variable risk contributions within each patient group.
* AI-based prognostic models: to develop artificial intelligence-based models for the quantitative prognosis of melanoma across the three age groups: adults, young adults, and children.
- Detailed Description
Precis-Mel 1 is a unicentric observational study using retrospectively collected data. The proposed procedure starts from demographic and family data, genetic data, medical procedures and cancer treatments, cutaneous biopsies and similar records, builds a multidimensional dataset from them, and applies AI algorithms that produce survival curves and predict sentinel lymph node biopsy (SLNB) positivity in CAYA. The approach is presented in the following sub-sections:
* Data engineering: the multidimensional dataset is meticulously integrated via DBT and SQL queries on a PostgreSQL database. This results in a model-ready comprehensive table, maintaining the crucial temporal dimension of patient histories. Identifiers are assigned to maintain the integrity of the data trail and the connection between various patient events such as metastasis and death. Python-based transformations ensure that sequential patient events are contextually enriched by preceding occurrences. Operations include arithmetic aggregations, extremum calculations and string manipulations. Events are discretized over a standardized temporal frame (1-3 months) for uniform staging reference, also serving to consolidate any misaligned data instances.
* Model development: the approach employs survival analysis to address the particular challenges of the dataset, notably censoring, where an event of interest, such as death, does not occur within the observation window. Based on previous experience in modelling this problem, the investigators prefer Gradient Boosting Survival Analysis (GBSA), a non-deep-learning method, because it copes well with data scarcity. GBSA adapts the gradient boosting machine algorithm to survival analysis and, in particular, accommodates censored data. In survival analysis, each patient is represented by a triplet (xi, δi, Ti), where xi is the feature vector, Ti is the time to event or censoring, and δi indicates whether the observation is censored. The goal is to estimate the survival function S(t), the probability that a patient survives beyond time t, and the hazard function λ(t), the instantaneous event rate at time t among patients who have survived up to t. To adapt gradient boosting to survival modelling, the model replaces the usual loss function with the negative log partial likelihood, which allows the survival function to be estimated from censored data.
* Performance metrics: the investigators measure model performance using the concordance index (c-index), a metric particularly suited to survival analysis. The c-index checks, across comparable pairs of patients, whether the patient predicted to be at higher risk is the one who experiences the event earlier. A c-index of 1 indicates a perfect ranking of patient hazard given the input features, while 0.5 is equivalent to random ordering.
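The negative log partial likelihood and the c-index described above can be illustrated with a minimal pure-Python sketch. It is not the investigators' implementation: the toy cohort, the function names `neg_log_partial_likelihood` and `concordance_index`, and the simplification of no tied event times are all assumptions made for illustration. In GBSA the risk scores would come from the boosted ensemble; here they are fixed numbers.

```python
import math

# Hypothetical toy cohort: (risk_score, event_observed, time_in_months).
# event_observed = 1 means the event was seen, 0 means the record is censored
# (note this is the reverse of a "censored" flag).
cohort = [
    (2.0, 1, 5.0),
    (1.0, 1, 8.0),
    (0.5, 0, 10.0),
    (-1.0, 1, 12.0),
]


def neg_log_partial_likelihood(records):
    """Cox negative log partial likelihood, assuming no tied event times."""
    total = 0.0
    for score_i, event_i, time_i in records:
        if not event_i:
            continue  # censored records contribute only through risk sets
        # Risk set: every patient still under observation at time_i.
        risk_scores = [s for s, _, t in records if t >= time_i]
        total += score_i - math.log(sum(math.exp(s) for s in risk_scores))
    return -total


def concordance_index(records):
    """Harrell's c-index: share of comparable pairs ranked correctly."""
    concordant = 0.0
    comparable = 0
    for score_i, event_i, time_i in records:
        if not event_i:
            continue  # a pair is comparable only if the earlier time is an event
        for score_j, _, time_j in records:
            if time_i < time_j:
                comparable += 1
                if score_i > score_j:
                    concordant += 1.0  # higher risk, earlier event
                elif score_i == score_j:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable


print(neg_log_partial_likelihood(cohort))
print(concordance_index(cohort))
```

In this toy cohort the risk scores decrease as the observed times increase, so every comparable pair is ranked correctly. Reversing the scores raises the loss and lowers the c-index, which is the behaviour the training objective and the evaluation metric are meant to capture.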
Recruitment & Eligibility
- Status
- RECRUITING
- Sex
- All
- Target Recruitment
- 6000
- Inclusion Criteria
* Patients of any age with histopathologically confirmed melanoma
- Exclusion Criteria
* No melanoma diagnosis
* Informed consent not signed
* Records prior to 2012 (as the data might not accurately reflect current practices and treatment outcomes)
Study & Design
- Study Type
- OBSERVATIONAL
- Study Design
- Not specified
- Arm & Interventions
| Group | Intervention | Description |
| --- | --- | --- |
| Melanoma patients | Gradient Boosting Survival Analysis (GBSA) | The training dataset will consist of 6000 adult melanoma patients, while the adaptation dataset for children, adolescents and young adults (CAYA) will be of N = 120. |
| Melanoma patients | Concordance index | The training dataset will consist of 6000 adult melanoma patients, while the adaptation dataset for children, adolescents and young adults (CAYA) will be of N = 120. |
- Primary Outcome Measures
| Name | Time | Method |
| --- | --- | --- |
| Patient prognosis curves | 24 months | The main outcome of the study will be to obtain prognosis indicators, mainly survival curves and sentinel lymph node biopsy (SLNB) positivity, by training artificial intelligence-based models on tabular clinical data from children, adolescents and young adults (CAYA). |
- Secondary Outcome Measures
Name Time Method
Trial Locations
- Locations (1)
Hospital Clínic de Barcelona (Dermatology service)
🇪🇸 Barcelona, Catalonia, Spain