MedPath

An Interpretable Fundus Disease Report Generation System Based on Weak Labeling

Not yet recruiting
Conditions
Retinal Diseases
Choroidal Disease
Pathological Myopia
Age Related Macular Degeneration
Diabetic Retinopathy
Choroidal Neovascularization
Registration Number
NCT06918028
Lead Sponsor
Zhongshan Ophthalmic Center, Sun Yat-sen University
Brief Summary

This study aims to establish a multimodal fundus image report generation model: an interpretable system, trained on weakly labeled data, that analyzes multimodal images of multiple fundus diseases and automatically reports diagnoses and treatment decisions. We will construct an interpretable feature fusion network for the clinical and imaging features of fundus lesions, with the goal of extracting new imaging markers that can predict the occurrence and progression of various fundus lesions at an early stage. These markers will ultimately be verified in real clinical data, suggesting possible directions for exploring the molecular mechanisms of refractory fundus lesions and potentially providing new ideas for their precise prevention and treatment.

Detailed Description

1. AI models for multimodal fundus imaging in the diagnosis of retinal diseases. AI models have demonstrated significant potential in assisting the diagnosis of various retinal diseases based on multimodal fundus imaging. In recent years, AI has rapidly advanced in the field of fundus disease imaging diagnostics. High-accuracy diagnostic models can be developed using large datasets of precisely annotated single-modal images. Fundus photography, which provides clinicians with an initial diagnostic impression, is widely accessible and can be obtained using simple imaging devices or even mobile devices. For instance, Cen et al. trained a deep learning system using 249,620 precisely annotated fundus photographs to diagnose 39 common retinal diseases, achieving diagnostic accuracies exceeding 90% for each condition.

However, fundus photography alone offers limited disease information, making it challenging to differentiate between diseases with similar manifestations. In addition, its diagnostic accuracy depends heavily on image quality and clinician expertise, which may lead to missed or misdiagnosed cases.

Optical coherence tomography (OCT) provides a three-dimensional analysis of the retinal layers, clearly revealing the severity and location of pathologies such as intraretinal and subretinal fluid. OCT has become a standard tool for the diagnosis and differential diagnosis of retinal diseases and is essential for guiding the precise treatment and follow-up of conditions such as age-related macular degeneration and diabetic macular edema. AI-assisted OCT analysis can further enhance the follow-up and personalized treatment of retinal diseases. For example, De Fauw et al. utilized 14,884 OCT images to diagnose more than 10 retinal diseases and map the location of the lesions.

The accurate diagnosis of retinal diseases also relies on dynamic and functional evidence. Fundus fluorescein angiography (FFA) and indocyanine green angiography (ICGA) are indispensable for the location, characterization, and evaluation of the vascular function of the lesion. However, due to the complexity of interpreting angiographic images, the application of AI in FFA and ICGA analysis has only recently gained traction.

Moreover, most cases require multimodal imaging, including OCT, fundus photography, and angiography, to comprehensively locate and analyze the disease pathology. Additionally, integrating clinical data and patient medical history is crucial for an accurate diagnosis. Therefore, there is an urgent need to develop new AI models capable of integrating multi-modal data to assist clinicians in accurately diagnosing complex retinal diseases.

2. AI-Assisted Generation of Complex Medical Reports. The complexity and specialized nature of fundus imaging make image interpretation challenging, and the shortage of clinicians capable of generating precise reports further increases the workload of ophthalmologists and hinders early diagnosis and treatment of retinal diseases. AI-assisted report generation has the potential to address these challenges. Compared to simple image recognition and classification, developing AI models that generate textual interpretations from images is more complex, as it requires the machine to mimic a human-like understanding of image content. "Image-to-text" models incorporate various deep learning algorithms, including computer vision and natural language processing. Early models required millions of natural images to achieve satisfactory text generation, which is impractical for medical imaging due to limited datasets and extensive textual information. However, clinical reports generated by clinicians, based on comprehensive clinical data, detailed image analysis, and experience, can serve as high-quality training datasets for such models.

Currently, significant research efforts are focused on areas with large datasets and standardized report formats, such as chest X-rays, chest CT scans, and brain MRI. The widely used report generation databases include Open-IU, MIMIC-CXR, and PadChest, with MIMIC-CXR containing more than 270,000 chest radiograph reports. In ophthalmology, due to the complexity of fundus imaging interpretation and the relatively smaller size of the data set, research in this area is limited, particularly for highly specialized imaging modalities such as fundus angiography. Our team has successfully developed a fundus fluorescein angiography report generation dataset (FFA-IR) based on angiography images and the corresponding reports. Our report generation model can produce accurate bilingual reports (Chinese and English) for common and rare retinal diseases, with accuracy comparable to that of human retinal specialists, while significantly reducing report generation time.

3. Weak Annotation: A Promising Approach for Training New Disease Diagnosis and Classification Models. Traditional AI-assisted diagnostic models for medical imaging typically rely on large-scale, precisely annotated, high-quality images, as the accuracy of the training dataset is critical for model performance. When high annotation accuracy cannot be guaranteed, increasing the dataset size is often the only way to improve model performance. However, for complex datasets such as fundus angiography images, the large number of diseases and their variable manifestations exponentially increase the annotation difficulty. Additionally, clinical reports often contain uncertainties due to incomplete clinical data or varying levels of clinician expertise, leading to inaccuracies in diagnostic reports.

Weak annotation offers a promising way to reduce annotation costs, increase annotation efficiency, and improve model generalizability. In natural image processing, weak annotation has been widely studied, with large-scale natural image datasets enabling the training of such models. In medical AI research, Guo et al. pioneered the use of weakly annotated datasets derived from brain CT reports, automatically extracting low-quality keyword information to accurately identify and locate four common brain pathologies. This approach demonstrated excellent generalizability across different centers and imaging devices. However, these systems were limited to broad categories of diseases and single-modal images, lacking the ability to diagnose specific diseases or generate detailed reports.

Based on preliminary findings and the literature, we propose a hypothesis: can weakly annotated imaging reports, combined with AI deep learning techniques such as knowledge graphs and Transformers, be used to build an interpretable, multimodal, multi-disease fundus report generation system? This project aims to refine existing AI models and develop a system that helps generate imaging diagnoses and reports for multiple retinal diseases. The results will not only reduce the workload of ophthalmologists but also promote the widespread adoption of advanced fundus imaging techniques, ultimately improving the early diagnosis and treatment of blinding retinal diseases.

Recruitment & Eligibility

Status
NOT_YET_RECRUITING
Sex
All
Target Recruitment
9999
Inclusion Criteria
  • Disease group: All multimodal fundus examination images containing fundus lesions, acquired from January 2011 to December 2023, including fundus photography as well as OCT, OCTA, FFA, ICGA, B-ultrasound, and the corresponding imaging reports. Images may be clear or unclear, and reports may be complete or incomplete.
  • Normal group: All multimodal fundus examination images without fundus lesions, acquired from January 2011 to December 2023, including fundus photography as well as OCT, OCTA, FFA, ICGA, B-ultrasound, and the corresponding imaging reports. Images may be clear or unclear, and reports may be complete or incomplete.
Exclusion Criteria
  • Disease group: 1. The image has serious quality problems; 2. The diagnostic report lacks key information.
  • Normal group: 1. The image has serious quality problems; 2. The diagnostic report lacks key information.

Study & Design

Study Type
OBSERVATIONAL
Study Design
Not specified
Primary Outcome Measures
Name | Time | Method
Fundus Fluorescein Angiography (FFA) images with corresponding report | Baseline

Fundus Fluorescein Angiography (FFA) images with corresponding report were collected. FFA allows dynamic observation of changes in retinal blood vessels and lesions.

Indocyanine Green Angiography (ICGA) images with corresponding report | Baseline

Indocyanine Green Angiography (ICGA) images with corresponding report were collected. ICGA allows dynamic observation of changes in choroidal blood vessels and lesions.

Fundus photography with corresponding report | Baseline

Fundus photography images with corresponding report were collected. Fundus photography enables observation of the morphology of the retina, retinal blood vessels, and the optic nerve, as well as lesions on the retina.

Optical coherence tomography (OCT) images with corresponding report | Baseline

Optical coherence tomography (OCT) images with corresponding report were collected. OCT images provide observation of changes in retinal thickness, the morphology and manifestations of lesions in each retinal or choroidal layer, and lesions in the macular area and optic nerve.

Optical coherence tomography angiography (OCTA) images with corresponding report | Baseline

Optical coherence tomography angiography (OCTA) images with corresponding report were collected. OCTA images provide observation of the density, morphology, and manifestations of retinal blood vessels, as well as the morphology and manifestations of retinal lesions.

Secondary Outcome Measures
Name | Time | Method
Bilingual Evaluation Understudy (BLEU) analysis | Through study completion, an average of 1 year

Bilingual Evaluation Understudy (BLEU) measures how closely a machine-generated text matches reference texts, using overlapping n-grams to quantify the similarity between the generated and reference reports.
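As an illustration of how BLEU works, the sketch below implements a minimal sentence-level variant (clipped n-gram precision up to bigrams, combined by a geometric mean and multiplied by a brevity penalty); this is a simplification of the full corpus-level metric, and the report fragments in the example are hypothetical:

```python
from collections import Counter
import math

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=2):
    """Minimal sentence-level BLEU: geometric mean of clipped
    n-gram precisions, times a brevity penalty for short outputs."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(ngrams(candidate, n))
        ref = Counter(ngrams(reference, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

# Hypothetical generated and reference report fragments.
gen = "mild leakage in the macular area".split()
ref = "mild leakage in the macular region".split()
score = bleu(gen, ref)  # unigram precision 5/6, bigram precision 4/5
```

A perfect match scores 1.0; here one differing word lowers both the unigram and bigram precision.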

Metric for Evaluation of Translation with Explicit Ordering (METEOR) analysis | Through study completion, an average of 1 year

Metric for Evaluation of Translation with Explicit Ordering (METEOR) considers precision, recall, and word alignment, and incorporates stemming and synonym matching, to quantify the similarity between the generated and reference texts.
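Stripped of stemming, synonym matching, and the fragmentation penalty, the core of METEOR is a recall-weighted harmonic mean of unigram precision and recall; the sketch below shows only that core, with hypothetical report fragments:

```python
from collections import Counter

def meteor_core(candidate, reference, alpha=0.9):
    """Simplified METEOR: recall-weighted F-mean of unigram
    precision and recall (no stemming, synonymy, or chunk penalty)."""
    cand, ref = Counter(candidate), Counter(reference)
    matches = sum(min(c, ref[w]) for w, c in cand.items())
    if matches == 0:
        return 0.0
    p = matches / len(candidate)  # unigram precision
    r = matches / len(reference)  # unigram recall
    # F-mean = P*R / (alpha*P + (1-alpha)*R), weighting recall higher.
    return p * r / (alpha * p + (1 - alpha) * r)

# Hypothetical generated and reference report fragments.
gen = "dilated retinal vessels with leakage".split()
ref = "dilated vessels with late leakage".split()
f = meteor_core(gen, ref)  # 4 matched unigrams out of 5 on each side
```

With four of five words matched on both sides, precision and recall are both 0.8, so the F-mean is 0.8.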

Recall-Oriented Understudy for Gisting Evaluation (ROUGE) analysis | Through study completion, an average of 1 year

Recall-Oriented Understudy for Gisting Evaluation (ROUGE) is recall-focused: it measures the overlap of n-grams (contiguous sequences of n items such as words, characters, or symbols) between the generated and reference texts to quantify their similarity.
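The overlap-of-n-grams idea behind ROUGE-N recall can be illustrated in a few lines; the report fragments in the example are hypothetical:

```python
from collections import Counter

def rouge_n(candidate, reference, n=1):
    """ROUGE-N recall: fraction of reference n-grams that also
    appear in the generated text (counts clipped to the candidate)."""
    def grams(tokens):
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))
    ref, cand = grams(reference), grams(candidate)
    overlap = sum(min(c, cand[g]) for g, c in ref.items())
    return overlap / max(sum(ref.values()), 1)

# Hypothetical generated and reference report fragments.
ref = "no leakage is seen in the late phase".split()
gen = "no leakage in the late phase".split()
r1 = rouge_n(gen, ref, n=1)  # 6 of 8 reference unigrams recovered
```

Because ROUGE is recall-oriented, omitting reference words ("is seen") lowers the score even though everything the model wrote is correct.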

Pearson correlation analysis | Through study completion, an average of 1 year

Pearson correlation analysis determines the degree of linear relationship between human evaluations and automated assessments.
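The Pearson coefficient is a standard formula; the sketch below computes it for hypothetical human ratings and automated metric scores (both lists are illustrative, not study data):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length
    sequences of scores: covariance over the product of std devs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

human = [4.0, 3.5, 2.0, 5.0, 1.0]       # hypothetical human ratings
auto = [0.78, 0.70, 0.42, 0.90, 0.25]   # hypothetical automated scores
r = pearson(human, auto)  # close to 1: strong linear agreement
```

A value near 1 indicates the automated metric ranks reports much as human evaluators do; a value near 0 indicates no linear relationship.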

Intersection-Over-Union (IOU) analysis | Through study completion, an average of 1 year

Intersection-Over-Union (IOU), also known as the Jaccard similarity coefficient, is computed between the attention-map regions of lesion images and the ground-truth annotations to evaluate the accuracy of the model's interpretations.
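On binary masks, IOU reduces to the intersection count divided by the union count; the sketch below uses small hypothetical masks standing in for a thresholded attention map and a ground-truth lesion annotation:

```python
def iou(mask_a, mask_b):
    """Intersection-over-Union (Jaccard index) of two binary masks,
    given as same-shaped nested lists of 0/1 values."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += a & b   # pixel in both masks
            union += a | b   # pixel in either mask
    return inter / union if union else 1.0

# Hypothetical thresholded attention map and lesion annotation.
attention = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
ground_truth = [[0, 0, 1],
                [0, 1, 1],
                [0, 1, 0]]
score = iou(attention, ground_truth)  # 3 shared pixels / 5 total
```

An IOU of 1.0 means the model attends exactly to the annotated lesion; values near 0 mean the model's "explanation" looks at the wrong region even if its diagnosis is correct.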

Trial Locations

Locations (1)

Zhongshan Ophthalmic Center


Guangzhou, Guangdong, China

© Copyright 2025. All Rights Reserved by MedPath