Stroke Prediction Through Internet Search Queries
- Conditions
- Stroke
- Acute Myocardial Infarction
- Registration Number
- NCT04755959
- Lead Sponsor
- Tel-Aviv Sourasky Medical Center
- Brief Summary
Cerebrovascular disease (stroke) is a leading cause of mortality and disability. Common risk assessment tools for stroke are based on the Framingham equation, which relies on traditional cardiovascular risk factors (e.g., hypertension, dyslipidemia, diabetes, smoking, atrial fibrillation). These tools estimate the likelihood of a general vascular "event", such as stroke or myocardial infarction, over the coming decade, but they do not assess the risk of an impending event, even though such an assessment would enable immediate preventive action (e.g., anticoagulants for atrial fibrillation; control of hypertension). Covert cerebrovascular disease is linked to subtle cognitive and motor deficits and to an increased risk of stroke. We hypothesize that subjects with impending stroke can be identified from features of their internet communication 0-12 months before the occurrence of acute clinical stroke. On this basis, we previously developed an internet-based algorithm that accurately identifies people at risk of stroke through cognitive changes manifested in their search queries. The purpose of this study is to validate that model and to train a new model by analyzing the Google queries of patients hospitalized with stroke at the Tel-Aviv Sourasky Medical Center. Patients with acute myocardial infarction and unaffected spouses will serve as controls.
- Detailed Description
In this study we will analyze Google queries of patients hospitalized with stroke at the Tel-Aviv Sourasky Medical Center between 2016 and 2020. After signing the informed consent form, consenting subjects will personally request their search data from the Google Takeout service, covering the period from up to 2 years before to one year after the stroke, and will give the researchers access by creating a one-time file of these data and sharing it with them. The control groups will consist of patients diagnosed with acute myocardial infarction (MI) and the MI/stroke-free spouses of patients. Recruitment will be primarily from two institutionally approved databases of stroke and acute MI. A total of 450 participants will be recruited, 150 in each group, based on prior experience with the minimum size of query-log dataset needed to construct a model and test its performance.
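The data window described above (up to 2 years before to one year after the event) could be applied to a participant's shared file roughly as follows. This is a minimal sketch only: the record layout (timestamp, query text), the function names, and the 365-day approximation of a year are assumptions made for illustration, since the structure of the Google Takeout file and the study's actual processing code are not described here.

```python
from datetime import date, datetime, timedelta

# Hypothetical record layout: (timestamp, query text).
# The real export format is not specified in this record.

def in_study_window(ts, event_date, years_before=2, years_after=1):
    """True if the query timestamp falls inside the study window.

    A year is approximated as 365 days (an assumption of this sketch).
    """
    start = event_date - timedelta(days=365 * years_before)
    end = event_date + timedelta(days=365 * years_after)
    return start <= ts.date() <= end

def filter_queries(records, event_date):
    """Keep only the queries made inside the window around the event."""
    return [(ts, q) for ts, q in records if in_study_window(ts, event_date)]

# Made-up example data, for illustration only.
records = [
    (datetime(2015, 1, 10, 9, 30), "query a"),    # more than 2 years before
    (datetime(2017, 11, 2, 21, 5), "query b"),    # inside the window
    (datetime(2018, 8, 1, 7, 0), "query c"),      # inside the window
    (datetime(2019, 12, 25, 12, 0), "query d"),   # more than 1 year after
]
event = date(2018, 3, 15)
kept = filter_queries(records, event)
print([q for _, q in kept])
```

Filtering on the participant's side, before the file is shared, would be one way to keep the shared data to the minimum needed; whether the study does this is not stated.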
Anonymity will be guaranteed through several measures: access to the Google Takeout data will be limited to the minimum required to perform the research and will be given only to members of the data analysis team at Tel-Aviv University, while access to the medical data will be provided only to researchers from the Tel-Aviv Sourasky Medical Center. Additionally, members of the Tel-Aviv University Partner Team will be obligated to refrain from any active attempt to identify subjects participating in the trial. All data shared with Tel-Aviv University will be stored on a local, encrypted hard disk. The data will be deleted from Tel-Aviv University's computers at the end of the project.
Our previously developed machine learning model predicted stroke in subjects, as compared with age-matched controls, with an area under the curve (AUC) of 0.972, which translates to a positive predictive value of 52.7% at a false-positive rate of 1%. This model will be used to predict, for each day, the likelihood that a person will undergo a stroke event on that day. The predictions will be compared with the known date of stroke or MI. Performance will be measured by the receiver operating characteristic (ROC) curve of the detection and the corresponding AUC. Additionally, a new model will be trained on the collected data and tested similarly, albeit using 10-fold cross-validation. Stroke detection sensitivity and specificity will be derived.
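The evaluation quantities named above can be illustrated with a short sketch. It is not the study's code: the function names are hypothetical, the scores and labels are made up, and the sensitivity and prevalence passed to `ppv` are illustrative placeholders, because this record does not state the values behind the reported 52.7%. The sketch shows the AUC computed as the probability that a randomly chosen case scores higher than a randomly chosen control, the positive predictive value as a function of sensitivity, false-positive rate and prevalence, and a simple unshuffled 10-fold split.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties between a case and a control count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ppv(sensitivity, false_positive_rate, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k folds, without shuffling."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

# Made-up scores: 1 = stroke, 0 = control.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(roc_auc(scores, labels))  # 8 of 9 pairs ranked correctly

# Illustrative inputs only; not the study's sensitivity or prevalence.
print(ppv(sensitivity=0.8, false_positive_rate=0.01, prevalence=0.5))

# 450 participants split into 10 folds of 45 test subjects each.
fold_sizes = [len(test) for _, test in k_fold_indices(450, 10)]
print(fold_sizes)
```

Because the positive predictive value depends on prevalence, the same sensitivity and false-positive rate give very different values in a balanced case-control sample and in the general population. A real analysis would normally shuffle or stratify the folds rather than assign them by index.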
Recruitment & Eligibility
- Status
- UNKNOWN
- Sex
- All
- Target Recruitment
- 450
- Inclusion Criteria
- One of the following clauses a, b or c:
- a. Discharge from the Tel-Aviv Sourasky Medical Center with an imaging-supported diagnosis of stroke, at any date from 06/2016 to 03/2020.
- b. Discharge from the Tel-Aviv Sourasky Medical Center with a clinical diagnosis of acute myocardial infarction supported by biochemical, electrocardiographic and/or imaging studies, at any date from 06/2016 to 03/2020.
- c. Spouse of a patient in clause "a" or "b", provided that the spouse did not undergo a clinical stroke within the same timeframe or earlier.
- Regular use of the internet prior to the event for which the subject has been included in this study.
- Consent to allow the Tel-Aviv University Partner Team (Prof. Ran Gilad-Bachrach and authorized lab members) access to their Google queries from 2 years before to one year after the stroke event.
- Exclusion Criteria
- A diagnosis of TIA or stroke not supported by imaging studies
- Diagnosed prior cognitive impairment
- Diagnosed co-existing neurodegenerative diseases
- Diagnosed and not fully treated hormonal or nutritional deficiency, with the exception of post-menopausal state or male hypogonadism.
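As a reading aid, the criteria above can be expressed as a single screening check. This is a hypothetical sketch: the `Candidate` fields, the group names, and the choice of the first day of 06/2016 and the last day of 03/2020 as window bounds are assumptions, and real screening would rest on clinical review rather than boolean flags.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Assumed day-level bounds for "06/2016 to 03/2020".
WINDOW_START = date(2016, 6, 1)
WINDOW_END = date(2020, 3, 31)

@dataclass
class Candidate:
    group: str                              # "stroke", "mi" or "spouse"
    discharge_date: Optional[date] = None   # patients only
    diagnosis_supported: bool = False       # imaging (stroke) or biochemical/ECG/imaging (MI)
    had_clinical_stroke: bool = False       # spouses only
    regular_internet_use: bool = False
    consented: bool = False
    prior_cognitive_impairment: bool = False
    neurodegenerative_disease: bool = False
    untreated_deficiency: bool = False      # hormonal or nutritional, not fully treated

def is_eligible(c: Candidate) -> bool:
    """Apply clauses a-c, the two further inclusion items, and the exclusions."""
    if c.group in ("stroke", "mi"):
        clause = (c.discharge_date is not None
                  and WINDOW_START <= c.discharge_date <= WINDOW_END
                  and c.diagnosis_supported)
    elif c.group == "spouse":
        clause = not c.had_clinical_stroke
    else:
        clause = False
    excluded = (c.prior_cognitive_impairment
                or c.neurodegenerative_disease
                or c.untreated_deficiency)
    return clause and c.regular_internet_use and c.consented and not excluded

patient = Candidate("stroke", date(2018, 3, 15), diagnosis_supported=True,
                    regular_internet_use=True, consented=True)
print(is_eligible(patient))
```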
Study & Design
- Study Type
- OBSERVATIONAL
- Study Design
- Not specified
- Primary Outcome Measures
| Name | Time | Method |
| --- | --- | --- |
| Sensitivity of stroke detection | Three years | The sensitivity of the algorithm in detecting subjects who developed stroke, based on their internet queries 0-12 months prior to the date of the stroke. |
| Specificity of stroke detection | Three years | The specificity of the detection, based on comparison with the query profiles of a concurrent cohort with acute myocardial infarction within the same time frame and of the stroke patients' stroke-free spouses. |
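The two primary outcome measures can be computed from model scores once a decision threshold is fixed. The following is a minimal sketch under stated assumptions: the function name, the threshold of 0.5, and all scores are invented for illustration, with stroke patients as cases and the myocardial infarction and spouse groups pooled as controls.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) for a score threshold.

    labels: 1 = stroke, 0 = control (MI patient or stroke-free spouse).
    A score at or above the threshold counts as a positive detection.
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        detected = score >= threshold
        if label == 1:
            if detected:
                tp += 1
            else:
                fn += 1
        else:
            if detected:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)

# Made-up scores for illustration only.
stroke_scores = [0.91, 0.75, 0.40, 0.88]
control_scores = [0.10, 0.55, 0.30, 0.20, 0.65, 0.05]
scores = stroke_scores + control_scores
labels = [1] * len(stroke_scores) + [0] * len(control_scores)

sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
print(sens, spec)  # 3 of 4 cases detected; 4 of 6 controls correctly negative
```

Raising the threshold trades sensitivity for specificity, which is why the ROC curve, covering all thresholds, is reported alongside these two figures.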
- Secondary Outcome Measures