Enhancing Electronic Health Record Data Quality

By Michael Awood

September 6, 2023

The use of Electronic Health Record (EHR) data in biomedical research has surged, especially during the COVID-19 pandemic. These rich records are well suited to complex analyses, including machine learning and artificial intelligence. However, the quality of this data has drawn considerable concern.

Despite the critical role EHR data plays in healthcare and its frequent use in biomedical research, its quality is often overlooked. This presents a need for a standardised approach to evaluate EHR data quality.

In 1996, Wang and Strong proposed a data quality framework covering four categories: intrinsic, contextual, representational, and accessibility data quality. They highlighted that poor data quality could have social and economic impacts. Although their study was not healthcare-specific, their framework focused on the needs of data users rather than only on the data itself.

A 2013 review identified five aspects of EHR data quality: completeness, correctness, concordance, plausibility, and currency. The review found that these aspects were assessed through seven methods, including gold standard comparison and data element agreement. However, the definitions of these methods and dimensions often overlapped, indicating the need for a more standardised approach. Other data quality frameworks have been suggested, but they differ in how they record and discuss data quality. This shows a lack of consensus and adoption.

The development of automated tools for data quality assessment holds the potential to address the data quality challenge. These tools could streamline the process and enhance efficiency. The article suggests that future research should aim at creating tools that improve data integrity and reliability in patient care and research. The potential impact of this work is vast, with implications for disease tracking, patient care, and the advancement of medical science.
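As a rough illustration of what such an automated check might look like, the sketch below scores a handful of toy records on three of the dimensions named in the 2013 review: completeness, plausibility, and currency. Everything in it is an assumption made for illustration: the field names (patient_id, heart_rate, last_updated), the plausible heart-rate range of 20 to 300 bpm, and the one-year staleness cut-off are hypothetical and do not come from any published framework or tool.

```python
from datetime import date

# Hypothetical toy records; field names and values are illustrative only.
RECORDS = [
    {"patient_id": "p1", "heart_rate": 72, "last_updated": date(2023, 8, 30)},
    {"patient_id": "p2", "heart_rate": None, "last_updated": date(2023, 9, 1)},
    {"patient_id": "p3", "heart_rate": 640, "last_updated": date(2021, 1, 15)},
]


def completeness(records, field):
    """Fraction of records where `field` is present and not None."""
    if not records:
        return 0.0
    present = sum(1 for r in records if r.get(field) is not None)
    return present / len(records)


def implausible(records, field, low, high):
    """IDs of records whose recorded value falls outside [low, high]."""
    return [
        r["patient_id"]
        for r in records
        if r.get(field) is not None and not (low <= r[field] <= high)
    ]


def stale(records, as_of, max_age_days):
    """IDs of records last updated more than `max_age_days` before `as_of`."""
    return [
        r["patient_id"]
        for r in records
        if (as_of - r["last_updated"]).days > max_age_days
    ]


def quality_report(records, as_of):
    """Combine the three checks; thresholds here are assumed, not clinical guidance."""
    return {
        "heart_rate_completeness": completeness(records, "heart_rate"),
        "heart_rate_implausible": implausible(records, "heart_rate", 20, 300),
        "stale_records": stale(records, as_of, 365),
    }


if __name__ == "__main__":
    print(quality_report(RECORDS, date(2023, 9, 6)))
```

A real tool would need agreed definitions for each dimension and clinically validated thresholds, and that agreement is the gap the frameworks above point to.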

It is important that EHRs reflect true and accurate data to minimise downstream inefficiencies caused by poor data. This data also feeds predictive analytics and the algorithms behind various AI systems, where poor data could result in incorrect analyses and poor outcomes. At present, an EHR has numerous entry points because of the multitude of records it draws on. Data sources include researchers, medical providers, and increasingly, patients through patient-reported outcome measures (PROMs). Utilising the framework by Wang and Strong could provide a better understanding of data needs, leading to improved health records and health information exchanges (HIEs).

