For Researchers: Access Data
As a cancer researcher, you already know that high-quality clinical trial data is critical to investigating new hypotheses. But this vital data from completed clinical trials typically has been trapped within the virtual walls of those organizations that have conducted the trials.
The Project Data Sphere® (PDS) platform breaks down those walls and brings high-quality data from completed cancer clinical trials to a single, easily accessible research platform. By accessing data on the PDS platform, you’ll be able to aggregate previously locked-away data across tumor types and research organizations.
Research hypotheses, however, don’t run on high-quality data alone. That’s why the PDS platform also provides advanced analytical tools from SAS to support your explorations and investigations. These tools, available at no cost to registered researchers, provide not only core data preparation and statistical processing capabilities, but also advanced modeling, simulation, machine learning, and visualization solutions in support of your research activities.
Apply for Access
Access to the patient-level clinical trial data and embedded analytical tools is granted through an application process. Prospective users will need to complete a short online form and consent to the Project Data Sphere Cancer Research Platform Agreement (available for review as part of the registration process) in order to apply to become platform authorized users.
The authorization review and approval process is typically completed within seven business days.
The data content and analytical tools provided within the Project Data Sphere cancer research platform are accessible to researchers affiliated with life sciences companies, hospitals, institutions, and other organizations, as well as to independent researchers.
The online application form includes a request for a description of any planned research and the accompanying research goals. This optional research information enables Project Data Sphere to develop metrics regarding how the platform is intended to be used and has NO impact on the applicant approval process.
Data within the Project Data Sphere cancer research platform is available at two levels. All visitors to data.projectdatasphere.org, whether or not they have registered as authorized users, can search for data sets of interest using a variety of filters (e.g., data provider or tumor type) or free-text search terms. At this level, each data set exposes descriptive information only, such as the study title, a summary and description of the study, and the type of data available (comparator, active treatment, or both).
Authorized users additionally have access to the data set content, including the study protocol, annotated Case Report Form, data dictionary, and the patient-level data itself. The documents associated with each data set can be viewed directly through the platform application, and all data sets can be viewed and analyzed using the embedded platform analytical tools. In most cases, and based upon the requirements of the data provider, the patient-level data sets can be downloaded for review and investigation within the authorized user’s preferred analytical environment.
Authorized users seeking access to data sets provided to the Project Data Sphere cancer research platform via integration with the NCTN/NCORP Data Archive must, per NCI’s required business processes, complete an additional authorization process before data sets and supporting content can be accessed. This authorization process is directly available within the Project Data Sphere cancer research platform.
Data sets provided to the Project Data Sphere cancer research platform include four key components:
Protocol:
Each clinical trial has a master plan called a protocol. This plan explains how the trial will be conducted and outlines the criteria by which patients are to be selected for the trial, the procedures and tests that patients will receive, and the types of data that will be collected from the patients.
Annotated case report form (CRF):
A CRF is the paper or electronic instrument used to record the patient data in a clinical trial. An “annotated” CRF indicates how the recorded information relates to the structure of the stored electronic patient data.
Data dictionary:
The data dictionary describes the details of the electronic patient data on a field-by-field basis, indicating in which data tables individual fields can be found, how the individual data tables are related, and various levels of detail regarding the data fields themselves.
Patient-level data sets:
The patient-level data sets represent the individual data points that have been captured for each patient. Through careful understanding of the research protocol, the annotated CRF, and the data dictionary, data scientists can apply analytical tools to the patient-level data sets and discover new scientific insights. The patient-level data sets available within the Project Data Sphere cancer research platform can be investigated individually or aggregated for more comprehensive investigation.
Although industry data standards such as CDISC SDTM and ADaM (www.cdisc.org) are now widely adopted, the data sets provided to the Project Data Sphere platform may differ considerably in how they are structured. These differences stem from a variety of factors, including each provider organization’s interpretation of the standards, the maturity of the data standards at the time each trial was completed, and whether the trial was considered for registration purposes. Through ongoing curation and standardization efforts, Project Data Sphere is continually working to increase the efficiency with which researchers can aggregate and investigate patient-level data sets that span data providers, research domains (industry and academia), and data standards eras.
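To make the aggregation challenge concrete, here is a minimal Python sketch of one way a researcher might harmonize downloaded patient-level records from two providers before pooling them. It is an illustration under stated assumptions, not a description of how the platform or its SAS tools work. The provider names, the common field names, the `FIELD_MAPS` table, and all patient values are hypothetical. `USUBJID`, `AGE`, and `ARM` are standard SDTM demographics variables, while the second provider's field names are invented to stand in for a non-standard layout. In practice, the mapping would be derived by reading each data set's data dictionary and annotated CRF.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical mappings from each provider's field names to a common schema.
# A researcher would build these by consulting each data set's data dictionary.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "provider_a": {"USUBJID": "patient_id", "AGE": "age_years", "ARM": "arm"},
    "provider_b": {"subj": "patient_id", "age_at_entry": "age_years", "trt_group": "arm"},
}


def harmonize(rows: List[Row], provider: str) -> List[Row]:
    """Rename one provider's fields to the common schema and tag the source."""
    mapping = FIELD_MAPS[provider]
    harmonized = []
    for row in rows:
        # Fields missing from a row become None rather than raising an error.
        record = {common: row.get(source) for source, common in mapping.items()}
        record["source"] = provider
        harmonized.append(record)
    return harmonized


def aggregate(datasets: Dict[str, List[Row]]) -> List[Row]:
    """Pool harmonized rows from every provider into a single list."""
    combined: List[Row] = []
    for provider, rows in datasets.items():
        combined.extend(harmonize(rows, provider))
    return combined


# Made-up example records, one per hypothetical provider.
example = {
    "provider_a": [{"USUBJID": "A-001", "AGE": 61, "ARM": "Comparator"}],
    "provider_b": [{"subj": "B-17", "age_at_entry": 58, "trt_group": "Comparator"}],
}
combined = aggregate(example)
```

Keeping a `source` field on every pooled record preserves provenance, which matters when structural differences between providers later turn out to reflect real differences in how a variable was collected.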