Adequate documentation and organization of research data are key for reliable and reproducible research, enabling researchers to track their progress, share their insights with others, and build upon existing discoveries. Understanding and prioritizing the systematic organization and documentation of research outputs (including data and code) is more than a practical need. It is also a fundamental responsibility for those dedicated to advancing scientific knowledge.
File Management & Organization
Clear, structured file organization and consistent file naming make files easier to retrieve, improve productivity and collaboration, and help guard against data loss. Although individual projects may have distinct requirements, we recommend that researchers follow these established conventions and best practices:
- Create a logical folder structure
- Give preference to open formats
- Develop and adopt a consistent file naming convention
- Consider using version control software such as Git to track changes and collaborate
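As a minimal sketch of the folder structure and file naming recommendations above, the Python snippet below creates a project skeleton and builds file names from a fixed pattern. The folder names and the `project_description_YYYYMMDD_vNN.ext` pattern are illustrative assumptions, not a prescribed standard, and the function names are hypothetical; adapt all of them to your project.

```python
import re
from datetime import date
from pathlib import Path

# Hypothetical layout; adjust the folder names to your project's needs.
FOLDERS = ["data/raw", "data/processed", "code", "docs", "output"]


def create_project_skeleton(root):
    """Create the folders listed in FOLDERS under root and return their paths."""
    root = Path(root)
    created = []
    for name in FOLDERS:
        folder = root / name
        # exist_ok=True makes the function safe to run more than once
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


def build_file_name(project, description, day, version, extension):
    """Build a name following the assumed pattern project_description_YYYYMMDD_vNN.ext."""

    def clean(text):
        # Lowercase, then replace every run of non-alphanumeric characters with a hyphen
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    suffix = extension.lstrip(".")
    return f"{clean(project)}_{clean(description)}_{day:%Y%m%d}_v{version:02d}.{suffix}"
```

For example, `build_file_name("Kelp", "Site Survey", date(2024, 3, 15), 2, ".csv")` returns `kelp_site-survey_20240315_v02.csv`. Putting the date in `YYYYMMDD` form and zero-padding the version keeps files in chronological order when sorted by name.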
Documentation
READMEs are a common way to document the contents and structure of a project folder and/or a dataset. A README is a file that serves as a brief and informative document to help users, developers, and contributors understand your project's purpose, usage, and setup. It should provide essential information about your research project and files (including data, code, and supplementary materials). README files are typically written in plain text, Markdown, or another lightweight markup language, making them easy to read and maintain. We suggest you use one of the templates below and customize it to your project needs:
README Template - Markdown (.md)
Research has become increasingly reliant on computing, and many projects generate code to clean and analyze data, run simulations, or automate processes. Properly documenting code is not merely a good practice; it's an essential step in ensuring integrity and transparency in research.
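To illustrate what documented code can look like, here is a minimal sketch of a Python function whose docstring states its purpose, inputs, units, return value, and failure conditions. The function and its scenario are hypothetical, and the docstring layout is only one common convention; follow whatever style your field or team already uses.

```python
def celsius_to_kelvin(temperatures_c):
    """Convert temperatures from degrees Celsius to kelvin.

    Parameters
    ----------
    temperatures_c : list of float
        Temperatures in degrees Celsius.

    Returns
    -------
    list of float
        Temperatures in kelvin, in the same order as the input.

    Raises
    ------
    ValueError
        If any value is below absolute zero (-273.15 degrees Celsius).
    """
    absolute_zero_c = -273.15
    converted = []
    for value in temperatures_c:
        # Reject physically impossible readings instead of silently converting them
        if value < absolute_zero_c:
            raise ValueError(f"{value} is below absolute zero")
        converted.append(value - absolute_zero_c)
    return converted
```

Stating the units in both the function name and the docstring spares a later reader from guessing how the values were recorded.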
Metadata
Rows and columns of numbers and characters have little to no meaning unless they are documented in some fashion. Metadata (the details about what, where, when, why, and how the data were collected, processed, and interpreted) provides the information that enables data and files to be discovered, used, and properly cited. Metadata is an important part of the Data Life Cycle because it enables data reuse long after the original collection. We often refer to metadata as "a love note to your future self": you and potential reusers will greatly appreciate having it available to understand and interpret what's inside your data files.
The goal is to provide enough information for a researcher to understand, interpret, and reuse the data. Therefore, the following questions should be addressed in the documentation:
- What was measured?
- What were the units of measure?
- Who measured it?
- When was it measured?
- Where was it measured?
- How was it measured?
- How is the data structured?
- Why was the data collected?
- Who should get credit for this data?
- How can this data be reused?
- What license governs the data?
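One way to make sure these questions are answered is to store the answers in a small machine-readable file next to the data. The sketch below assumes a JSON file and uses hypothetical field names, one per question above; they are not drawn from any metadata standard.

```python
import json

# Hypothetical field names, one per question in the list above
REQUIRED_FIELDS = [
    "what_was_measured",
    "units",
    "measured_by",
    "date_measured",
    "location",
    "method",
    "data_structure",
    "purpose",
    "credit",
    "reuse_conditions",
    "license",
]


def missing_fields(record):
    """Return the required fields that are absent or blank in record."""
    return [
        field for field in REQUIRED_FIELDS
        if not str(record.get(field, "")).strip()
    ]


def write_metadata(record, path):
    """Write record to path as JSON, refusing records with unanswered questions."""
    missing = missing_fields(record)
    if missing:
        raise ValueError("Missing metadata fields: " + ", ".join(missing))
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2, ensure_ascii=False)
```

Calling `missing_fields` before sharing a dataset gives a quick checklist of the questions that still need an answer.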
Metadata Standards
A metadata standard serves as a specialized manual for your metadata, specifying a common structure and language for describing and managing data or information. It organizes metadata components into collections tailored for particular objectives, assigning them consistent names and definitions so that data can be more easily interpreted by both humans and computers. Additionally, a metadata standard may incorporate guidelines regarding obligatory content, required syntax, and the use of a regulated vocabulary.
Some standards include Dublin Core, the Data Documentation Initiative (DDI), Ecological Metadata Language (EML), and Darwin Core. You may browse the Research Data Alliance's metadata standards inventory at: https://rdamsc.bath.ac.uk
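To show how a standard fixes names and structure, the sketch below serializes a record using the fifteen elements of the Dublin Core Metadata Element Set and its `http://purl.org/dc/elements/1.1/` namespace. The wrapper element `metadata` and the function name are assumptions made for illustration; repositories typically define their own enclosing schema and may require additional fields.

```python
import xml.etree.ElementTree as ET

DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"

# The fifteen elements of the Dublin Core Metadata Element Set
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def to_dublin_core_xml(record):
    """Serialize a dict of Dublin Core element names and values to an XML string."""
    ET.register_namespace("dc", DC_NAMESPACE)
    # "metadata" is a hypothetical wrapper element, not part of Dublin Core
    root = ET.Element("metadata")
    for name, value in record.items():
        # Reject names outside the element set so typos are caught early
        if name not in DC_ELEMENTS:
            raise ValueError(f"Not a Dublin Core element: {name}")
        child = ET.SubElement(root, f"{{{DC_NAMESPACE}}}{name}")
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")
```

Because the element names come from a shared vocabulary, software that understands Dublin Core can read the `dc:title` or `dc:creator` of a record without knowing anything else about the project.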
Before selecting a metadata standard, ask yourself:
- Is there a particular standard built into the system or repository where you plan to share your data?
- What standards are similar projects and others in your field using?
- How big is the user community around this metadata standard?
Need help with your project? Contact: rds@library.ucsb.edu
Recommended Resources
- DataONE Tutorials
- Version Control
- File Naming
- Project Folder Organization
- Code Documentation
- Metadata