TY - JOUR
T1 - A comparative study of clinical trial and real-world data in patients with diabetic kidney disease
AU - Kurki, Samu
AU - Halla-aho, Viivi
AU - Haussmann, Manuel
AU - Lähdesmäki, Harri
AU - Leinonen, Jussi V.
AU - Koskinen, Miika
N1 - Funding Information:
This study was funded by Bayer Oy (Finland).
Publisher Copyright:
© 2024, The Author(s).
PY - 2024/1/19
Y1 - 2024/1/19
N2 - A growing body of research is focusing on real-world data (RWD) to supplement or replace randomized controlled trials (RCTs). However, due to the disparities in data generation mechanisms, differences are likely and necessitate scrutiny to validate the merging of these datasets. We compared the characteristics of RCT data from 5734 diabetic kidney disease patients with corresponding RWD from electronic health records (EHRs) of 23,523 patients. Demographics, diagnoses, medications, laboratory measurements, and vital signs were analyzed using visualization, statistical comparison, and cluster analysis. RCT and RWD sets exhibited significant differences in prevalence, longitudinality, completeness, and sampling density. The cluster analysis revealed distinct patient subgroups within both RCT and RWD sets, as well as clusters containing patients from both sets. We stress the importance of validation to verify the feasibility of combining RCT and RWD, for instance, in building an external control arm. Our results highlight general differences between RCT and RWD sets, which should be considered during the planning stages of an RCT-RWD study. If they are, RWD has the potential to enrich RCT data by providing first-hand baseline data, filling in missing data or by subgrouping or matching individuals, which calls for advanced methods to mitigate the differences between datasets.
AB - A growing body of research is focusing on real-world data (RWD) to supplement or replace randomized controlled trials (RCTs). However, due to the disparities in data generation mechanisms, differences are likely and necessitate scrutiny to validate the merging of these datasets. We compared the characteristics of RCT data from 5734 diabetic kidney disease patients with corresponding RWD from electronic health records (EHRs) of 23,523 patients. Demographics, diagnoses, medications, laboratory measurements, and vital signs were analyzed using visualization, statistical comparison, and cluster analysis. RCT and RWD sets exhibited significant differences in prevalence, longitudinality, completeness, and sampling density. The cluster analysis revealed distinct patient subgroups within both RCT and RWD sets, as well as clusters containing patients from both sets. We stress the importance of validation to verify the feasibility of combining RCT and RWD, for instance, in building an external control arm. Our results highlight general differences between RCT and RWD sets, which should be considered during the planning stages of an RCT-RWD study. If they are, RWD has the potential to enrich RCT data by providing first-hand baseline data, filling in missing data or by subgrouping or matching individuals, which calls for advanced methods to mitigate the differences between datasets.
UR - http://www.scopus.com/inward/record.url?scp=85182669008&partnerID=8YFLogxK
U2 - 10.1038/s41598-024-51938-3
DO - 10.1038/s41598-024-51938-3
M3 - Article
C2 - 38243002
AN - SCOPUS:85182669008
SN - 2045-2322
VL - 14
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 1731
ER -