Study to Develop a Tool to Estimate the Kidney Function in Databases Without Laboratory Data
- Conditions
- Renal Function
- Interventions
- Other: No Intervention
- Registration Number
- NCT03605810
- Lead Sponsor
- Bayer
- Brief Summary
Scientific analyses are frequently performed on sources such as health insurance databases to study the use and effectiveness of drugs in real-world settings.
Kidney function is known to influence a patient's disease development and/or drug levels in the blood.
However, direct measures of kidney function are often not available in these databases.
This study plans to develop tools to classify the renal function of patients, which will help scientists identify patient cohorts (groups of patients sharing the same characteristics) for scientific analyses.
- Detailed Description
Renal impairment is a common comorbidity in patients with diverse main underlying diseases and a pathology accompanying increasing age. Renal function might be an important modifier of treatment effects.
Population-based administrative claims databases are increasingly used in large-scale comparative outcomes studies of drug treatments. However, claims databases often lack information on laboratory test results, which limits their usefulness in Real-World Evidence (RWE) research on patients with renal impairment.
There is a need to develop methods for identification of patients with renal dysfunction from healthcare administrative claims-based proxies.
The main objective of this study is to develop algorithms/models that predict eGFR values and/or classes for patients at a given time point, based on entries in a claims database (demographic characteristics, clinical diagnoses, procedures and drug treatments), for a general population and a variety of use cases (sub-populations of patients with atrial fibrillation, coronary artery disease, or type 2 diabetes mellitus). To achieve this, modern data-driven machine learning techniques will be applied to discover relationships between renal status, as measured by eGFR, and longitudinal patient-level data.
Model performance will also be evaluated (out-of-sample validation, benchmark tests, and performance differences between eGFR value prediction algorithms and classification models tailored to the pre-defined eGFR classes).
Recruitment & Eligibility
- Status
- COMPLETED
- Sex
- All
- Target Recruitment
- 5132200
Not provided
Not provided
Study & Design
- Study Type
- OBSERVATIONAL
- Study Design
- Not specified
- Arm & Interventions
| Group | Intervention | Description |
| --- | --- | --- |
| Atrial fibrillation (AF) sub-population | No Intervention | To be included in the AF sub-population, patients need to satisfy the inclusion criteria for the eGFR-population and have two inpatient or outpatient diagnoses for AF or atrial flutter on two different days within the study period, irrespective of the time points when eGFR is measured. Patients with at least one inpatient or outpatient diagnosis or procedure code for mitral stenosis and prosthetic valves within the study period will be excluded. |
| eGFR-population | No Intervention | To be included in the eGFR-population, patients have to have at least one recorded eGFR value in the OPTUM CDM database between January 1, 2007 and December 31, 2016, be adults (>18 years of age at the time of eGFR test) and have at least 370/180 days (180 days serves as sensitivity analysis) of continuous enrollment in medical and pharmacy insurance plans since the eGFR test date. |
| Coronary artery disease (CAD) sub-population | No Intervention | To be included in the CAD sub-population, patients need to satisfy the inclusion criteria for the eGFR-population and have at least one inpatient CAD diagnosis within the study period, irrespective of the time points when eGFR is measured. |
| Type 2 diabetes mellitus (T2DM) sub-population | No Intervention | To be included in the T2DM sub-population, patients need to satisfy the inclusion criteria for the eGFR-population and have at least two inpatient or outpatient diagnoses of T2DM on two different days within the study period, irrespective of the time points when eGFR is measured. |
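The eGFR-population criteria above can be read as a simple record filter. The following is a minimal sketch only, not the study's actual cohort-selection code: the record layout (`EgfrRecord` and its fields) and the function names are hypothetical, and the reading of ">18 years" as more than 18 completed years is an assumption, since the listing does not say whether 18-year-olds are included.

```python
from dataclasses import dataclass
from datetime import date

# Study period as stated in the eGFR-population description.
STUDY_START = date(2007, 1, 1)
STUDY_END = date(2016, 12, 31)


@dataclass
class EgfrRecord:
    """Hypothetical record layout; the real database schema is not given."""
    birth_date: date
    egfr_test_date: date
    enrollment_end: date  # last day of continuous medical + pharmacy coverage


def age_in_years(birth: date, on: date) -> int:
    """Completed years of age on a given date."""
    years = on.year - birth.year
    if (on.month, on.day) < (birth.month, birth.day):
        years -= 1
    return years


def in_egfr_population(rec: EgfrRecord, min_enrollment_days: int = 370) -> bool:
    """Apply the three stated criteria; pass 180 for the sensitivity analysis."""
    in_window = STUDY_START <= rec.egfr_test_date <= STUDY_END
    # Assumption: ">18 years" is read literally as more than 18 completed years.
    adult = age_in_years(rec.birth_date, rec.egfr_test_date) > 18
    enrolled_days = (rec.enrollment_end - rec.egfr_test_date).days
    return in_window and adult and enrolled_days >= min_enrollment_days


def has_diagnoses_on_two_days(diagnosis_dates: list[date]) -> bool:
    """AF and T2DM sub-populations require diagnoses on two different days."""
    return len(set(diagnosis_dates)) >= 2


rec = EgfrRecord(date(1960, 5, 1), date(2010, 3, 15), date(2010, 12, 31))
print(in_egfr_population(rec), in_egfr_population(rec, min_enrollment_days=180))
```

In this example the patient has 291 days of enrollment after the eGFR test, so the record fails the 370-day requirement but qualifies under the 180-day sensitivity analysis.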
- Primary Outcome Measures
Name: Performance of classification to predict eGFR
Time: From eGFR values starting and lasting 180d + 370d
Method: For numeric models, cross-validated performance is measured as correlation via r^2.
Class-based performance is measured as cross-validated sensitivity at pre-defined false discovery rates, with the following definitions of positives and negatives:
Observed eGFR class X:
* positive: eGFR measured at the beginning of the time frame is in class X
* negative: eGFR measured at the beginning of the time frame is not in class X
Class predicted by model:
* positive: predicted eGFR is in class X
* negative: predicted eGFR is not in class X
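The two performance measures described above can be illustrated on a single validation fold. This is a hedged sketch under stated assumptions, not the study's evaluation code: it assumes the model produces a numeric score for membership in class X, and that the decision threshold is chosen to maximize sensitivity while keeping the false discovery rate at or below the pre-defined limit. The listing does not say how thresholds were chosen, and the function names are hypothetical.

```python
def r_squared(observed: list[float], predicted: list[float]) -> float:
    """Squared Pearson correlation between observed and predicted eGFR."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def sensitivity_at_fdr(in_class_x: list[bool], scores: list[float],
                       max_fdr: float) -> float:
    """Best sensitivity over all score thresholds with FDR <= max_fdr.

    in_class_x: observed positives (measured eGFR is in class X).
    scores: hypothetical model scores; higher means "more likely class X".
    """
    total_pos = sum(in_class_x)
    if total_pos == 0:
        raise ValueError("no observed positives for class X")
    ranked = sorted(zip(scores, in_class_x), key=lambda pair: -pair[0])
    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Patients with tied scores are predicted positive together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        fdr = fp / (tp + fp)  # share of predicted positives that are wrong
        if fdr <= max_fdr:
            best = max(best, tp / total_pos)
    return best


labels = [True, True, False, True, False, False]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(sensitivity_at_fdr(labels, scores, max_fdr=0.25))
```

In a cross-validated setting, these functions would be applied to each held-out fold and the results averaged; the averaging scheme is likewise an assumption.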
- Secondary Outcome Measures
Name Time Method
Trial Locations
- Locations (1)
US OPTUM CDM database
🇺🇸 Whippany, New Jersey, United States