Study pipeline

Turning RWD into RWE

Real-world data (RWD) have mostly been collected for healthcare administration, delivery, or payment purposes, not with the goal of conducting research. Moreover, RWD sources differ from one another in many ways, including healthcare setting, source population, local policies and culture, context, and quality. For RWD sources to be used to answer research questions in a pharmacoepidemiologic study, the data need to go through a stepwise process that engages multiple experts and tools at each step. Patient-level data always stay local at the level of the Data Expert & Access Partner (DEAP), and only aggregated data are shared.

 

How do study teams turn RWD into real-world evidence (RWE)? The pipeline consists of the steps summarised below.

Syntactic harmonisation

DEAPs first extract, transform, and load (ETL) their local data into the ConcePTION common data model (CDM) to standardise the format. They then run quality checks with the INSIGHT tool.

Semantic harmonisation

The principal investigator (PI) prepares study variables by extracting relevant concepts from the data, such as outcomes, exposures, and confounders.

Application of epidemiological design

The study team applies the epidemiological design, developing a script that will use the variables to generate the analytical dataset.

Statistical analysis

Statisticians obtain estimates appropriate to answer the research question.

Results pooling

The DEAPs upload results from their data source to a secure digital research environment, where the results are pooled for final analysis.

Dissemination

The study team produces a final study report detailing background information, methodology and results, while also presenting at conferences and drafting manuscripts.


Real-World Data sources

The Data Expert & Access Partners (DEAPs) are the local experts in the VAC4EU network. They currently have access to 14 real-world data (RWD) sources from 8 European countries, which collectively cover health data on 180 million people. These data sources include primary care records, hospital records, vaccine and pregnancy registers, pharmacy dispensing records, death registries, claims data, and many other types of records.

Syntactic harmonisation

Because the studies conducted in VAC4EU use multiple RWD sources, and therefore multiple data formats, syntactic harmonisation is required. During the extract, transform, and load (ETL) process, participating DEAPs first extract the relevant sample from their data source, then transform it and load it into the ConcePTION common data model (CDM).
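As a rough illustration of what syntactic harmonisation involves, the sketch below converts records from two differently formatted sources into one shared layout. The source field names, the target columns, and both transform functions are hypothetical and chosen for the example only. They do not reproduce the actual ConcePTION CDM tables, and each transform would run locally at its own DEAP.

```python
from datetime import date, datetime

# Hypothetical target layout for illustration; NOT the real ConcePTION CDM.
COMMON_COLUMNS = ("person_id", "event_date", "event_code", "coding_system")


def transform_source_a(row: dict) -> dict:
    """Hypothetical source A: day/month/year dates and ICD-10 codes."""
    return {
        "person_id": str(row["patient"]),
        "event_date": datetime.strptime(row["dx_date"], "%d/%m/%Y").date(),
        "event_code": row["icd10"].strip().upper(),
        "coding_system": "ICD10",
    }


def transform_source_b(row: dict) -> dict:
    """Hypothetical source B: ISO dates and ICD-9 codes."""
    return {
        "person_id": str(row["id"]),
        "event_date": date.fromisoformat(row["date"]),
        "event_code": row["code"].strip(),
        "coding_system": "ICD9",
    }


# Made-up example records; each list stays at its own (hypothetical) DEAP.
source_a = [{"patient": 101, "dx_date": "03/02/2021", "icd10": " i40.0 "}]
source_b = [{"id": "B-7", "date": "2021-02-10", "code": "422.90"}]

events_a = [transform_source_a(r) for r in source_a]
events_b = [transform_source_b(r) for r in source_b]
```

After this step, both sources expose the same columns and data types, so one study script can be run unchanged against either of them.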

Semantic harmonisation

The variables the researchers are interested in are not necessarily recorded in the same languages and coding systems across RWD sources. Additionally, the samples extracted for these studies often include millions of people. The process of creating study variables out of this enormous volume of diverse data is called semantic harmonisation, since the concern is now definitions, concepts, and meaning rather than format. Study teams use a set of forms to help define adverse events of special interest (AESI), CodeMapper to support mapping between vocabularies, BRIDGE to specify variables and time anchoring, and several algorithms to prepare the variables. VAC4EU also has a pipeline to support internal nested validation studies.
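To make the idea of semantic harmonisation concrete, here is a minimal sketch that derives one study variable (date of first myocarditis diagnosis) from events recorded in two coding systems. The code list is illustrative only and is not a validated VAC4EU definition. The function names and the event layout are assumptions made for this example, and the sketch does not use CodeMapper or BRIDGE.

```python
from datetime import date

# Illustrative code prefixes per coding system; not a validated code list.
MYOCARDITIS_CODES = {
    "ICD10": ("I40", "I41", "I51.4"),
    "ICD9": ("422",),
}


def matches_concept(event: dict, code_list: dict) -> bool:
    """True if the event code starts with a listed prefix for its coding system."""
    prefixes = code_list.get(event["coding_system"], ())
    code = event["event_code"].replace(".", "")
    return any(code.startswith(p.replace(".", "")) for p in prefixes)


def first_event_date(events: list, code_list: dict):
    """Earliest date among matching events, or None if nothing matches."""
    dates = [e["event_date"] for e in events if matches_concept(e, code_list)]
    return min(dates) if dates else None


# Made-up events for one person, already in a common layout.
events = [
    {"event_date": date(2021, 3, 5), "event_code": "I40.0", "coding_system": "ICD10"},
    {"event_date": date(2021, 2, 1), "event_code": "J10", "coding_system": "ICD10"},
    {"event_date": date(2021, 2, 20), "event_code": "422.90", "coding_system": "ICD9"},
]

first_myocarditis = first_event_date(events, MYOCARDITIS_CODES)
```

Keeping the code list as data, separate from the matching logic, means the same function can be reused for other concepts by swapping in a different list.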

Application of epidemiological design

At this point in the pipeline, the exact steps may differ depending on which epidemiological design has been selected to answer the research question. The study team uses the study variables to select the study population, sample controls, match, censor, identify time windows, weight, and transform study concepts into time-anchored study variables. Often, several analytical datasets are produced for different study objectives.
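Two of the operations listed above, identifying time windows and censoring, can be sketched as follows. The 1 to 28 day risk window, the function names, and the example dates are assumptions for illustration. Actual windows and censoring rules depend on the study protocol and the chosen design.

```python
from datetime import date, timedelta


def risk_window(vaccination_date: date, exit_date: date,
                start_day: int = 1, end_day: int = 28):
    """Return (start, end) of the post-vaccination risk window.

    The window is censored at exit_date (end of follow-up). Returns None
    when follow-up ends before the window starts.
    """
    start = vaccination_date + timedelta(days=start_day)
    end = min(vaccination_date + timedelta(days=end_day), exit_date)
    return (start, end) if start <= end else None


def days_at_risk(window) -> int:
    """Number of days in the window, counting both endpoints."""
    return 0 if window is None else (window[1] - window[0]).days + 1


def event_in_window(event_date, window) -> bool:
    """True if the event date falls inside the (possibly censored) window."""
    if window is None or event_date is None:
        return False
    return window[0] <= event_date <= window[1]


# Made-up example: full follow-up versus follow-up ending after 10 days.
full = risk_window(date(2021, 1, 10), exit_date=date(2021, 12, 31))
censored = risk_window(date(2021, 1, 10), exit_date=date(2021, 1, 20))
```

Here `days_at_risk(full)` counts the complete window, while `days_at_risk(censored)` counts only the days observed before follow-up ended.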

Statistical analysis

Statisticians obtain estimates appropriate to answer the research question. These include, but are not limited to, descriptive statistics, distributions, rates, and models. By this point, any information that would identify individuals in the dataset has been removed, in compliance with local privacy regulations. These regulations sometimes also require masking and secondary disclosure control.
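As a small worked example of one such estimate and of masking, the sketch below computes an incidence rate per 10,000 person-years and suppresses small counts before results are tabulated. The threshold of 5, the rate denominator, and the function names are assumptions for illustration. Actual disclosure rules vary by data source and local regulation.

```python
def incidence_rate(events: int, person_days: int, per: int = 10_000) -> float:
    """Events per `per` person-years, with person-time supplied in days."""
    person_years = person_days / 365.25
    return events / person_years * per


def mask_small_count(count: int, threshold: int = 5) -> str:
    """Report counts from 1 to threshold - 1 as '<threshold'; zeros stay visible."""
    return f"<{threshold}" if 0 < count < threshold else str(count)


# Made-up aggregate: 10 events over 365,250 person-days (1,000 person-years).
rate = incidence_rate(events=10, person_days=365_250)
reported_counts = [mask_small_count(c) for c in (0, 3, 12)]
```

Secondary disclosure control goes one step further: it also checks that a masked cell cannot be recovered from row or column totals, which this sketch does not attempt.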

Results pooling

The statisticians now share the full study script with the DEAPs via GitHub. The complete study script includes the final components of steps 2, 3 and 4 (semantic harmonisation, application of epidemiological design, and statistical analysis) and a graphical representation of these. The script also includes post-processing, which consolidates the results into a set of tables and figures. This allows each DEAP to review the results from their data source with greater ease. The DEAPs then upload the results from their data source to a secure digital research environment (DRE). Currently, we use the anDREa platform through UMC Utrecht. On the DRE, results from different DEAPs are pooled in a table or meta-analysis.
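One common way to pool aggregated estimates from several data sources is a fixed-effect, inverse-variance meta-analysis on the log scale. The sketch below shows that calculation. The source does not state which pooling method VAC4EU uses, so the method, the function name, and the example numbers are all assumptions for illustration.

```python
import math


def pool_fixed_effect(estimates):
    """Inverse-variance pooling of (log_effect, standard_error) pairs.

    Each data source is weighted by 1 / SE^2, so more precise sources
    contribute more to the pooled estimate.
    """
    weights = [1 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1 / total)
    return pooled, pooled_se


# Made-up log rate ratios and standard errors from two hypothetical DEAPs.
site_results = [(math.log(2.0), 0.2), (math.log(1.0), 0.2)]

pooled_log_rr, pooled_se = pool_fixed_effect(site_results)
pooled_rr = math.exp(pooled_log_rr)
ci_low = math.exp(pooled_log_rr - 1.96 * pooled_se)
ci_high = math.exp(pooled_log_rr + 1.96 * pooled_se)
```

Only the aggregated estimate and its standard error from each source enter the calculation, which is consistent with patient-level data staying local.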

Dissemination

The study team produces a final study report detailing background information, methodology and results, while also presenting at conferences and drafting manuscripts.

Real-World Evidence

Evidence generated from these studies helps regulators and public health bodies make timely, informed decisions about vaccine safety and effectiveness, especially during public health emergencies or when new vaccines are introduced. For example, at the end of 2020, when several vaccines had been developed for COVID-19, multiple marketing authorisation holders contacted VAC4EU to conduct post-authorisation safety studies, and VAC4EU delivered regular reports monitoring vaccine safety throughout 2021.

In addition to regulatory use, VAC4EU promotes transparency and scientific collaboration by sharing findings through peer-reviewed publications, public resources, and conference presentations. Through this process, we enable evidence-based decision-making on vaccines across Europe.