Enhancing the Analytic Capacity of Colorec_AstraZe_2006_78 using a Statistical Linkage Method to Append Socioeconomic and Health Care Access Variables from the Medical Expenditure Panel Survey

Unique Dataset ID: Colorec_Multipl_2006_251
Sponsor: Multiple
Data Provider: RTI International
Total Study Enrolled Patients: 0
Comparator (Control) Arm Enrolled ID: N/
URL: N/A
Study Phase: Clinical Study Phase III, Clinical Study Phase IIB
Blinding Method: Other
Type(s) of data: Only comparator arm data
Intervention Type: Other
Dataset Type: Other

Clinical Trial Title

Trial Summary and Conditions


Data Summary

The enhanced dataset includes all linkages achieved between the comparator arm patients in the PDS dataset Colorec_AstraZe_2006_78 and colorectal cancer survivors from the Medical Expenditure Panel Survey (MEPS). In addition to the set of linkages, the dataset includes observations representing MEPS cancer survivors who were eligible for linkage but for whom no linkage was formed. Variables in the dataset include demographic characteristics, health status and perceptions, health care access, health insurance, medical care use and expenditures, and a variable that indicates the set of criteria used to achieve the linkage. Data users are advised to read the documentation accompanying the dataset before beginning analysis, because it explains the data structure and linkage methods. The documentation also includes instructions on how to access the source data files for the MEPS cancer survivors. A crosswalk is provided to explain differences between the MEPS content in the linked dataset and the content in the source files.

Study Objectives

This data enhancement project seeks to advance the mission of the PDS platform by enabling new explorations into the potential influence of health care access, socioeconomic factors, and health behaviors on the patient-level efficacy and outcomes data contained in the PDS online service.

This was achieved using a statistical linkage method in which patient-level records from the PDS dataset Colorec_AstraZe_2006_78 were matched with colorectal cancer survivors from the nationally representative Medical Expenditure Panel Survey (MEPS). Linkage criteria were based on age, sex, race, tumor location, and a quality of life assessment called the EQ-5D index score. Using the EQ-5D score as a linkage criterion reduces the large number of many-to-many exact matches that would have occurred using only age, race, and sex. A sixth, optional linkage criterion based on BMI category is also provided so that users may further restrict the set of linkages formed on age, sex, race, tumor location, and EQ-5D, as desired.

The addition of the MEPS data to the patient-level data within the PDS enclave will facilitate hypothesis-generating research that explores how much of the variation in patient outcomes is potentially attributable to differences in access to and utilization of basic health care services, to socioeconomic characteristics, and to health behaviors and preferences. It will support exploratory analyses designed to examine questions such as: How are variations in cancer patients' access to health care and income impacting patient outcomes in specific phase III clinical trials? What variations in patient outcomes are associated with specific demographic, socioeconomic, and health-related factors? Are the demographic characteristics of cancer patients enrolled in specific phase III clinical trials comparable to those of cancer patients with the same disease in the general population?
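The linkage logic described above can be illustrated with a small sketch. This is not the project's actual procedure, which is specified in the accompanying documentation. It only shows exact matching on a set of keys, how many-to-many linkages arise, how unlinked MEPS records can be retained, and how an optional BMI criterion narrows the result. All field names (`age_group`, `eq5d_band`, `bmi_cat`, and so on), the banding of age and EQ-5D, and the records themselves are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy records; field names, categories, and values are
# illustrative assumptions, not variables from the linked dataset.
pds_patients = [
    {"pds_id": "P1", "age_group": "60-69", "sex": "F", "race": "White",
     "tumor_site": "colon", "eq5d_band": 0.8, "bmi_cat": "normal"},
    {"pds_id": "P2", "age_group": "60-69", "sex": "F", "race": "White",
     "tumor_site": "colon", "eq5d_band": 0.8, "bmi_cat": "obese"},
    {"pds_id": "P3", "age_group": "50-59", "sex": "M", "race": "Black",
     "tumor_site": "rectum", "eq5d_band": 0.6, "bmi_cat": "overweight"},
]
meps_survivors = [
    {"meps_id": "M1", "age_group": "60-69", "sex": "F", "race": "White",
     "tumor_site": "colon", "eq5d_band": 0.8, "bmi_cat": "normal"},
    {"meps_id": "M2", "age_group": "70-79", "sex": "M", "race": "White",
     "tumor_site": "colon", "eq5d_band": 0.7, "bmi_cat": "normal"},
]

# The five base criteria named in the text (assumed here to be pre-banded).
BASE_KEYS = ("age_group", "sex", "race", "tumor_site", "eq5d_band")


def link(pds, meps, keys=BASE_KEYS):
    """Exact-match every MEPS record to all PDS records sharing `keys`.

    MEPS records with no match are kept with pds_id=None, mirroring the
    idea of retaining eligible-but-unlinked survivors.
    """
    # Index PDS patients by their tuple of key values.
    index = defaultdict(list)
    for patient in pds:
        index[tuple(patient[k] for k in keys)].append(patient)

    rows = []
    for survivor in meps:
        matches = index.get(tuple(survivor[k] for k in keys), [])
        if not matches:
            rows.append({"meps_id": survivor["meps_id"], "pds_id": None,
                         "criteria": None})
            continue
        for patient in matches:  # one MEPS record may link to many patients
            rows.append({"meps_id": survivor["meps_id"],
                         "pds_id": patient["pds_id"],
                         "criteria": "+".join(keys)})
    return rows


# Base linkage on the five criteria.
base = link(pds_patients, meps_survivors)

# Optional sixth criterion: also require the same BMI category.
with_bmi = link(pds_patients, meps_survivors, keys=BASE_KEYS + ("bmi_cat",))

for row in base:
    print(row)
```

In this toy data, survivor M1 links to two patients on the base criteria and to only one once BMI category is added, while M2 remains in the output as an unlinked record. The `criteria` field plays the role of the variable that records which set of criteria produced a linkage.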

Outcome Measures


Available Downloads:


PROTOCOL: Documentation of Data and Methods.docx

CRF: Codebook_linked_pds78_20180429_final FINAL.rtf

DATA DICTIONARY: MEPS Variable Crosswalk.xlsx

DATA (COMPARATOR ARM): linked_pds78_20180429_final.sas7bdat