Our Data-Sharing Philosophy

"The time seems right to pull together representatives for what we see as a federation of data-sharing entities. Our goal is clear: to collaborate with leaders of other data-sharing platforms in this federation."
Bill Louv, President of Project Data Sphere


We see power in data – power to generate solutions for cancer patients. We have dedicated ourselves to ensuring that valuable clinical trial data does not remain locked in information silos. To that end, we solicit de-identified patient-level data and make it freely available on the Project Data Sphere® platform.

But we share more than data. We and other like-minded organizations share the belief that secondary use of patient data can generate unique insights to improve patient outcomes. 

Collectively, we can do more to advance the utility of data sharing.

By creating a federation of data sharing platforms with clinical, imaging, and genomic data, we can amplify the value of data that has been provided by patients. This federation, more than simply polite collaboration, can drive toward common approaches to serve both patients and our respective organizations. For the federation to generate sustainable value, the creation and maintenance of common standards and best practices are critical:

  • Metadata: Investigators first need to find data of interest. Well-defined metadata that is consistent across data sharing platforms allows that data to be readily found wherever it resides.
  • Data: Enormous progress has been made on data standards such as CDISC, HL7, and DICOM. But these standards still leave too much room for interpretation in how data is represented, and adoption remains too limited. True progress can be measured by demonstrating how standardized data can be effectively and efficiently integrated across data sharing platforms to solve real-world healthcare research challenges.
  • De-identification: There are well-developed standards for de-identification of clinical trial data, although their application varies across data providers. De-identification of images and genomic data represents a new challenge that must be addressed consistently across organizations.
  • Patient Linkage: De-identification of different types of data is frequently performed asynchronously using different strategies, making it impossible to successfully integrate patient data from different domains after the de-identification process. New approaches to de-identification (and anonymization, which is related but critically different) must be developed to ensure the vitality of protected patient data.
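
To make the patient linkage point concrete, here is a minimal sketch of one possible approach: deriving the same keyed token for a patient in every data domain, so that records can still be joined after the original identifier is dropped. This is an illustration only, not a description of how Project Data Sphere or any other platform de-identifies data. The function names, record fields, and key are all hypothetical, and a keyed token on its own amounts to pseudonymization rather than full de-identification.

```python
import hashlib
import hmac


def linkage_token(patient_id: str, secret_key: bytes) -> str:
    """Derive a keyed token from a patient identifier (hypothetical helper).

    The identifier is normalized first so that formatting differences
    between source systems do not produce different tokens.
    """
    normalized = patient_id.strip().upper()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(records, secret_key):
    """Replace the 'patient_id' field of each record with a linkage token."""
    result = []
    for record in records:
        cleaned = {k: v for k, v in record.items() if k != "patient_id"}
        cleaned["token"] = linkage_token(record["patient_id"], secret_key)
        result.append(cleaned)
    return result


# Assumption: a single key is held by a trusted party and used for every domain.
key = b"example-key-held-by-a-trusted-party"

# Hypothetical records from two domains, processed independently.
clinical = [{"patient_id": "P-001", "arm": "A"}, {"patient_id": "P-002", "arm": "B"}]
imaging = [{"patient_id": "p-001 ", "modality": "CT"}]

clinical_deid = deidentify(clinical, key)
imaging_deid = deidentify(imaging, key)

# Join the two de-identified domains on the shared token.
imaging_by_token = {r["token"]: r for r in imaging_deid}
linked = [
    {**c, **imaging_by_token[c["token"]]}
    for c in clinical_deid
    if c["token"] in imaging_by_token
]
```

If each domain used its own key or its own strategy, the tokens would not match and the join above would return nothing, which is the integration failure described in the Patient Linkage item.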

A federation of data sharing platforms, working toward standardization and best practices across domains, is critical to enabling patient data to be fully brought to bear in discovering new biomarkers, developing new patient treatment strategies, understanding new safety concerns, and ultimately in improving outcomes for patients. These types of challenges will be solved by enabling data sharing organizations to focus on their respective specialties while avoiding data redundancy, and always keeping the patient in mind.

Project Data Sphere is driven to align other data sharing platforms with this vision. At the same time, we understand that this is not a trivial task! Creation and maintenance of standards and best practices require strong governance, financial commitment, and a long strategic view. Through our leadership, and the leadership of other data-sharing organizations, we will together achieve this vision.