MedPath

Platform for Medical Information Extraction From Incomplete Data

Conditions
Liver Cancer
Registration Number
NCT01813942
Lead Sponsor
National Taiwan University Hospital
Brief Summary

To support research, information extraction is needed to translate data in clinical text into a format suitable for analysis and statistics. Missing data is a frequent problem in medical research, so it is important to develop imputation methods with better stability and accuracy. This project aims to: provide data integration and extraction methods that handle structured and unstructured data sources more efficiently; provide a validation scheme that facilitates review of the results produced by information extraction modules; improve the quality of clinical data by comparing data from different sources and correcting errors and inconsistencies; handle clinical data that are time-series in nature and incomplete; increase the accuracy of data analysis and the quality of health care by improving the completeness and correctness of clinical data; and keep the methods in the platform flexible. The project focuses on clinical data from liver cancer patients, and we hope its methods can be extended to other diseases in the future by replacing the knowledge models.

Detailed Description

With the increasing adoption of Electronic Medical Record (EMR) systems, access to EMR data has become more convenient. However, it is still difficult to analyze all clinical data directly because a large number of records are in narrative format. Information extraction is therefore needed to translate data in clinical text into a format suitable for analysis and statistics. Missing data is a frequent problem in medical research, so it is important to develop imputation methods with better stability and accuracy. This project aims to: provide data integration and extraction methods that handle structured and unstructured data sources more efficiently; provide a validation scheme that facilitates review of the results produced by information extraction modules; improve the quality of clinical data by comparing data from different sources and correcting errors and inconsistencies; handle clinical data that are time-series in nature and incomplete; increase the accuracy of data analysis and the quality of health care by improving the completeness and correctness of clinical data; and keep the methods in the platform flexible. The project focuses on clinical data from liver cancer patients, and we hope its methods can be extended to other diseases in the future by replacing the knowledge models.
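
The record does not say which imputation method the platform uses. As a minimal illustration of what handling incomplete time-series data can mean, the sketch below applies last observation carried forward (LOCF), a simple and common baseline chosen here only as an assumption. The function name `impute_locf` and the sample values are hypothetical and do not come from the study.

```python
from typing import List, Optional, Sequence


def impute_locf(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing value (None) with the most recent observed value.

    Leading gaps stay None because there is no earlier observation to carry
    forward. LOCF is an assumed baseline, not the method named in the record.
    """
    filled: List[Optional[float]] = []
    last_seen: Optional[float] = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Hypothetical lab measurements for one patient across six visits.
lab_series = [None, 12.0, None, None, 30.5, None]
print(impute_locf(lab_series))
# prints [None, 12.0, 12.0, 12.0, 30.5, 30.5]
```

LOCF keeps the example short, but it can understate change between visits; a real platform would likely compare several imputation strategies for the stability and accuracy the description mentions.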

Recruitment & Eligibility

Status
UNKNOWN
Sex
All
Target Recruitment
10000
Inclusion Criteria

Not provided

Exclusion Criteria

Not provided

Study & Design

Study Type
OBSERVATIONAL
Study Design
Not specified
Primary Outcome Measures
Name
The number of patients correctly identified by the recurrence predictive model
Time
3 years
Method
The recurrence predictive model is developed from the incomplete data set and is used to predict the recurrence status of patients who received a specific treatment for liver cancer. The number of patients the model correctly identifies is the primary outcome measure.
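
The primary outcome measure above is a count of correct predictions. A minimal sketch of that count follows, assuming each patient has a predicted and an observed recurrence status stored as booleans. The function name `count_correct` and the sample data are hypothetical, and the record does not describe how the study actually tallies this figure.

```python
from typing import Sequence


def count_correct(predicted: Sequence[bool], observed: Sequence[bool]) -> int:
    """Count patients whose predicted recurrence status matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must cover the same patients")
    return sum(p == o for p, o in zip(predicted, observed))


# Hypothetical statuses for five patients (True = recurrence).
predicted_status = [True, False, True, True, False]
observed_status = [True, False, False, True, True]
print(count_correct(predicted_status, observed_status))
# prints 3
```

A raw count depends on cohort size, so it is usually reported alongside the total number of patients evaluated.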

Secondary Outcome Measures
Not provided

Trial Locations

Locations (1)

National Taiwan University Hospital

Taipei, Taiwan
