What WCM-supported data storage tools integrate with WIDRR?
The following data storage tools are WCM-supported and/or WCM-implemented. They comply with the new data retention policies and integrate with WIDRR.
What does the Cornell University Research Data Retention policy mean for WCM researchers?
Section 1.3.8 details the responsibilities of WCM faculty regarding collection and retention of research data and should be read carefully by WCM researchers.
Your main responsibilities include:
Data management plan: WCM faculty must create, abide by, and fund a data management plan that specifies where they will deposit data at the close-out of research.
What is close-out of research? For funded research, close-out is whichever comes first of these two events:
The end of the grant or contract agreement OR
60 days before the faculty member leaves the institution
Data retention: Faculty must enter the required metadata information and method description into the WCM Institutional Data Repository for Research (WIDRR). Please see the FAQ, “What kind of data do I need to deposit?” for further details.
When does the data retention need to occur? After publication or within three years after final project close-out of all funded or unfunded research.
How long does the data need to be retained? Primary data and supporting images must be available for at least six years after publication. If data and images are used in a subsequent publication, or the original publication is cited in another publication or grant application by the same faculty member(s), the data must be available for an additional six years from the date of the most recent citation.
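As an illustration of the close-out rule for funded research described above (whichever comes first: the end of the grant or contract agreement, or 60 days before the faculty member leaves), here is a minimal Python sketch. The function name and parameters are hypothetical and are not part of WIDRR or any WCM system.

```python
from datetime import date, timedelta
from typing import Optional


def close_out_date(agreement_end: date, departure: Optional[date] = None) -> date:
    """Return the close-out date for funded research.

    Assumption: close-out is the earlier of the agreement end date and
    the date 60 days before the faculty member's departure, if one is planned.
    """
    if departure is None:
        # No planned departure: only the agreement end date applies.
        return agreement_end
    return min(agreement_end, departure - timedelta(days=60))


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    print(close_out_date(date(2025, 6, 30), departure=date(2025, 3, 1)))
```

When a departure date is supplied, the 60-day rule takes precedence only if it falls before the agreement end date.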
What kind of data do I need to deposit?
Faculty must enter a data catalog record (marked public or private) into the WCM Institutional Data Repository for Research (WIDRR). That record should include the location of the datasets (including raw data) and provide a methods file sufficient for replication and audit of the research. It is not mandatory to deposit the datasets into WIDRR if they are available in an existing data repository.
WIDRR provides a drop-down list of data repositories, but you can also enter a repository manually. Please contact the library (https://library.weill.cornell.edu/ask-us) to add other data repositories to the drop-down options.
If the datasets are stored in a WCM secure location (e.g., Box, OneDrive, network fileshare), please supply the accessible share link.
Associate each dataset with a project and one or more milestones.
Provide the methods file describing all the steps of the analysis, starting from the raw data input file up to the published data output file. Faculty must specify all the software, parameters, and code used in the methods file. Intermediate experiments and data that are not necessary for replication of the research do not need to be preserved.
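The deposit requirements above can be read as a checklist. The Python sketch below models a catalog record and reports which required items are still missing. The class, field names, and example values are hypothetical; they do not reflect the actual WIDRR data model or any WIDRR API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CatalogRecord:
    """Hypothetical model of a data catalog record (not the WIDRR schema)."""
    title: str
    visibility: str  # "public" or "private"
    dataset_locations: List[str] = field(default_factory=list)  # repository URLs, share links, or paths
    methods_file: str = ""
    project: str = ""
    milestones: List[str] = field(default_factory=list)


def missing_items(record: CatalogRecord) -> List[str]:
    """Return a description of each required item that is absent from the record."""
    problems = []
    if record.visibility not in ("public", "private"):
        problems.append("visibility must be 'public' or 'private'")
    if not record.dataset_locations:
        problems.append("no dataset location")
    if not record.methods_file:
        problems.append("no methods file")
    if not record.project:
        problems.append("no associated project")
    if not record.milestones:
        problems.append("no associated milestone")
    return problems


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    draft = CatalogRecord(
        title="Example proteomics study",
        visibility="private",
        dataset_locations=["https://example.org/dataset"],
    )
    for problem in missing_items(draft):
        print(problem)
```

An empty list from `missing_items` means every item in this checklist is present; it does not check that a link is reachable or that the methods file is sufficient for replication.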
Examples of what types of data to provide:
Basic and translational science:
URL to dbGaP for BAM file(s)
Path to the WCM file share with proteomics data set
File of mass spectrometry output
Microscopy image files
Original and post-processed animal radiology images
Python script used to perform analysis
Text file describing steps for experiments
Copy of paper lab notebooks
Link to an electronic lab notebook (e.g., LabArchives, OneNote) containing the raw data
Clinical science:
Path to WCM OneDrive location containing clinical trials data
URL to NIH All of Us Research Program Researcher Workbench data set
File containing research-ready EHR data with protected health information (PHI)
REDCap de-identified data set, codebook, and/or case report form
SAS, STATA, and R files containing code (plus data files)
Word document detailing how to transform raw EHR data into research-ready data
How long does data need to be retained?
Primary data and supporting images must be available for at least six years after publication. If data and images are used in a subsequent publication, or cited in a subsequent publication or grant application by faculty, then the data must be available for an additional six years.
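The retention arithmetic can be sketched in Python as follows. This sketch assumes that the additional six years are counted from the date of the most recent reuse or citation, and that a February 29 date maps to February 28 in a non-leap year; both are assumptions of the example, and the function names are hypothetical.

```python
from datetime import date
from typing import Iterable

RETENTION_YEARS = 6  # minimum retention period stated in the policy


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Assumption: Feb 29 falls back to Feb 28 when the target year is not a leap year.
        return d.replace(year=d.year + years, day=28)


def retention_end(publication: date, later_uses: Iterable[date] = ()) -> date:
    """Return the earliest date on which the retention period can end.

    `later_uses` holds the dates of subsequent publications or grant
    applications that reuse or cite the data (assumed interpretation).
    """
    latest = max([publication, *later_uses])
    return add_years(latest, RETENTION_YEARS)


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    print(retention_end(date(2020, 3, 15)))
    print(retention_end(date(2020, 3, 15), [date(2023, 9, 1)]))
```

Each new reuse or citation pushes the end date out, so the function should be rerun whenever a new date is added.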
What is raw data?
The WCM definition of raw data for retention purposes is as follows:
“Data considered as raw data are any final file generated by instruments used to collect raw data prior to any additional filtration, data cleaning, or analysis work.”
WCM leaves it to the PI to determine the appropriate format for their raw data according to the standards in their respective field (i.e., the raw data format usually required by journals in the field).
Should I archive the raw data of a paper I authored but not as first or last author (co-author)?
If you are collaborating with a study team external to WCM, including institutions located abroad, and this collaboration results in publication(s), you are responsible for archiving the raw data that pertains to your contributions in the publication(s) (e.g., if your contribution was a figure, then provide the raw data & methods file for that figure). The same logic applies to grant applications.