HIPC Data Sharing Plan

Introduction

The Human Immunology Project Consortium (HIPC) was established with the specific goal of sharing data as widely and freely as possible in order to promote new research and generate new hypotheses. In addition to individual Center projects, the HIPC program includes an Infrastructure and Opportunities Fund (IOF) that supports pilot projects as well as shared research infrastructure for ongoing development of the immunology project network. This infrastructure may include, for example, shared databases, sample repositories, bioinformatics tools, sample-sparing assays, centralized laboratory resources, and other collaborative activities. It is essential that data be made available to HIPC investigators, as well as to those outside the HIPC Centers, for the purposes of data mining. The method of data storage must enable easy comparison of standardized results and meta-data. The ease with which data can be shared and compared is critical to the success and mission of the HIPC Centers.

The HIPC Data Sharing Plan is designed to enable the widest dissemination of data, while also protecting the privacy of the participants and the utility of the data, by de-identifying and masking potentially sensitive data elements. This approach is fully compliant with the NIH public data sharing policy (http://grants.nih.gov/grants/policy/data_sharing).

The HIPC Data Sharing Plan applies to all HIPC investigators (see definition below). HIPC Principal Investigators (PIs) are responsible for providing copies of this plan to their Center investigators as well as to any other collaborators who receive HIPC funds, including IOF support.

Relevant Definitions:

  • "Data" refers to all data associated with a HIPC-funded research project or clinical study, including the meta-data needed to interpret and mine the results.
  • A "data set," for the purposes of this document, is pre-defined in a formal written Data Set Completion Plan developed for each study by HIPC investigators in cooperation with NIAID Program Officers. When a previously defined data set is completed, the timeline for making the data available to the public begins (see item 2, "Defining Data Sets," under Data Management below). Note that data may be deposited into the Immunology Database and Analysis Portal (ImmPort, www.Immport.org) prior to the completion of a pre-defined data set. In this case, the timeline for making the data set public does not begin until the defined data set has been completed.
  • A "HIPC investigator" is any Project Leader, Core Leader or other investigator funded under one or more of the HIPC U01 or U19 grants, as well as other investigators supported by the IOF.
  • A "HIPC Principal Investigator (PI)" is the PI of a HIPC U01 or U19 grant.

Timeline for Public Data Sharing

Completed data sets, as described below, will remain private to the HIPC investigator who generated the data, and his/her designee, for 3 years to allow for analysis and publication by the HIPC investigator, or until acceptance of the first publication by the HIPC investigator, whichever comes first. When the earlier of these two conditions is met, the data set will be moved to a public space for access by outside investigators. NOTE: Abstracts and oral presentations are not considered publications for the purposes of this HIPC Data Sharing Plan.

The timeline for public availability will not begin until the data set is complete, although portions of a previously defined data set should be deposited into ImmPort prior to that point, in accordance with the timeline outlined in the Data Set Completion Plan.

Requests for embargo periods in excess of 3 years can be made with appropriate justification, but are expected to be very rare.
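The release rule described in this section can be illustrated with a minimal Python sketch. The function and parameter names are hypothetical and are not part of ImmPort or any HIPC tool. The sketch assumes that an acceptance date earlier than data set completion cannot trigger release before completion, since the timeline does not begin until the data set is complete. An approved longer embargo is modeled simply as a larger value of embargo_years.

```python
from datetime import date
from typing import Optional

EMBARGO_YEARS = 3  # default embargo period stated in the plan


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years; Feb 29 falls back to Feb 28."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def public_release_date(
    completed: date,
    first_pub_accepted: Optional[date] = None,
    embargo_years: int = EMBARGO_YEARS,
) -> date:
    """Earlier of the embargo end and first-publication acceptance.

    Assumption (not stated in the plan): an acceptance date before
    completion is clamped to the completion date.
    """
    embargo_end = add_years(completed, embargo_years)
    if first_pub_accepted is None:
        return embargo_end
    return min(embargo_end, max(first_pub_accepted, completed))


if __name__ == "__main__":
    # Illustrative dates only.
    print(public_release_date(date(2012, 1, 15)))
    print(public_release_date(date(2012, 1, 15), date(2013, 6, 1)))
```

Abstracts and oral presentations would not be passed as first_pub_accepted, because the plan does not count them as publications.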

Data Management

Central Database: HIPC investigators agree to deposit their data into the Immunology Database and Analysis Portal (ImmPort) system (www.Immport.org) according to a timeline determined together with the NIAID Program Officer for each study. To fulfill the HIPC data sharing objectives, the investigators will enter all study data and meta-data into ImmPort. If any additional or alternative databases are identified by the HIPC Steering Committee, these will also be acceptable platforms for data sharing.

  1. Standards for Complex Data Sets: To support the proper use and interpretation of data and meta-data, standards for minimum information will be applied. These standards will be defined by the HIPC Bioinformatics/Biostatistics Subcommittee, using existing standards as available and appropriate.
  2. Defining Data Sets: Given that each data set is unique, HIPC investigators and NIAID Program Officers will determine, in advance, through the Data Set Completion Plan:
    1. Requirements to clean and close data and/or data sets,
    2. Elements that comprise a complete data set,
    3. Center-specific requirements for IRB approval to release data for public use, and
    4. Expected timeline for the completion of each data set.

    Completion of each pre-defined data set will begin the 3-year timeline for making data publicly available.

    NOTE: Investigators receiving HIPC IOF funds will define the data sets for their projects in cooperation with the HIPC investigator who sponsored the IOF request and the NIAID Program Officer for that "parent" HIPC grant.

  3. Data in ImmPort:
    1. Data Deposition: In order to minimize bottlenecks within the database, HIPC investigators will deposit data into their private database space as they are cleaned and closed, which in most cases will be prior to data set completion. The 3-year timeline for making data available to the public will not begin until a data set is complete. The frequency for orderly deposition of data into the private space in ImmPort is to be defined in cooperation with the NIAID Program Officer. Prior to the end of the current funding period, the Center's PI will work with the NIAID Program Officer to set the timeline for final data deposition.
    2. Access to Private Data: The HIPC investigator determines the right of data access to his/her private data (incomplete data sets or data not yet released to the public). In addition, NIAID program staff and database staff will have access to all data submitted into ImmPort. NIAID staff will access the data for administrative purposes only. Database staff will have access to the data for quality control and oversight purposes only. The confidentiality agreement between the central database team and NIAID governs data privacy in ImmPort or other central databases.

      In some cases, NIAID staff may need to use private HIPC data for presentations to other Federal Agencies, or in response to Congressional inquiries. Such use of data is intended to promote NIAID initiatives, the HIPC program, and HIPC research goals. In these cases, NIAID will notify in advance those HIPC investigators responsible for generating the data, and provide an explanation of why access to the private data is needed. NIAID staff will make every effort to maintain the confidentiality of private data except for situations where public reporting (e.g. transcripts or slide presentations) is required by Federal policy or law.
    3. Voluntary Data Sharing: Individual HIPC investigators may choose to share their data with other HIPC investigators and non-HIPC investigators (including potential industry partners) if they obtain permission to do so in advance from all the parties who generated the data. Prior HIPC Steering Committee approval is not required, but NIAID strongly recommends that a confidentiality disclosure agreement (CDA) be in place prior to such data sharing. The terms of these CDAs should also cover the confidentiality and distribution of any post-meeting documents (e.g. transcripts, meeting notes, or slide presentations). In particular, HIPC investigators are strongly encouraged to create CDAs to share private data for the purpose of building the HIPC database and creating data standards. Sharing private data under this type of agreement would be independent of the timeline for public data sharing established for each complete data set in the Data Set Completion Plan.
  4. Quality Control: The HIPC investigators will work with the central database team to validate, curate, and verify submitted data, ensuring their correctness, quality, and biological soundness.

Protecting Human Subject Data

It is the responsibility of HIPC investigators to protect the rights of human subjects, the privacy of human subject information, and the confidentiality of such data at all times. Prior to sharing, data shall be stripped of all identifiers to eliminate risks of unauthorized disclosure of personal identifiers. Accordingly, NIH policy will be adhered to, as outlined in the following:

"NIH Data Sharing Policy and Implementation Guidance: Human Subjects and Privacy Issues":
http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm#hs

"Policy for Sharing of Data Obtained in NIH Supported or Conducted Genome-Wide Association Studies (GWAS)": http://grants.nih.gov/grants/guide/notice-files/NOT-OD-07-088.html
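As a rough illustration of stripping identifiers before sharing, the Python sketch below drops identifier fields from a study record. The field names and the record layout are hypothetical; they are not taken from this plan, from ImmPort, or from NIH guidance. Real de-identification typically requires more than removing named fields and must follow the policies cited above.

```python
# Hypothetical identifier field names, for illustration only.
DIRECT_IDENTIFIERS = frozenset(
    {"name", "date_of_birth", "address", "medical_record_number"}
)


def strip_identifiers(record: dict, identifiers=DIRECT_IDENTIFIERS) -> dict:
    """Return a copy of the record without the listed identifier fields."""
    return {key: value for key, value in record.items() if key not in identifiers}


if __name__ == "__main__":
    # Made-up example record.
    raw = {
        "subject_id": "S-001",
        "name": "Jane Doe",
        "date_of_birth": "1970-01-01",
        "assay": "flow cytometry",
    }
    print(strip_identifiers(raw))
```

The original record is left unchanged, so the identified copy can stay in the investigator's own controlled storage while only the stripped copy is shared.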

Non-Shareable Data

In cases where some data are not amenable to sharing (participant confidentiality concerns, third-party licensing or use agreements, etc.), the issue must be raised immediately with NIAID program staff in order to identify non-shareable data. NIAID staff may approve a limited exemption from data sharing in some circumstances.
