DATA MANAGEMENT
Comprehensive Data Management Planning

Overview

One of the challenges of the South Dakota Biofilm Science and Engineering Center is integrating large-scale, diverse datasets and analytic tools into a comprehensive framework that supports a system-level understanding of 2D materials–biofilm interactions. To facilitate sharing of data among participants and with the broader scientific community, we will adopt open-source data formats. Importantly, we will apply the FAIR Data Principles to ensure that the data generated through this project are findable, accessible, interoperable, and reusable. We will follow the research community's Minimum Information About a Biofilm Experiment (MIABiE) standards for storing, sharing, and publishing our biofilm experimental data. Other minimum information standards will be used to standardize the description of other generated research data.

The 2D-BEST team has developed a workflow to facilitate data management, sharing, and publishing. Once it is determined that experimental data should be shared with the project team, researchers follow the requisite protocol to move the data, along with its standard description, to shared storage (see Figure 1). This data management plan leverages the University of South Dakota Data Store (SDDS) and Globus. The SDDS provides high-reliability, high-availability, network-accessible storage for South Dakota researchers and collaborators. Globus is a non-profit service for secure, reliable research data management.

Sharing Research Data

Figure 1 - Workflow to share data with project team

Determine the types and sources of data that will be shared (e.g. samples, physical collections, code, software, curriculum materials and other materials).

For experimental data, determine if there is a minimum information standard for your data type.

Minimum information standards are sets of guidelines and formats for reporting data derived by specific high-throughput methods. Their purpose is to ensure the data generated by these methods can be easily verified, analysed, and interpreted by the wider scientific community. Ultimately, they facilitate the transfer of data from journal articles (unstructured data) into databases (structured data) in a form that enables data to be mined across multiple data sets. Minimum information standards are available for a wide variety of experiment types, including microarray (MIAME), RNAseq (MINSEQE), metabolomics (MSI), and proteomics (MIAPE). You can explore the data standards available for your field at FAIRsharing.org. (EMBL-EBI Training)
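As a rough illustration of the lookup step, the standards named in this plan can be kept in a small table keyed by experiment type. This is only a sketch: the dictionary name, the keys, and the find_standard helper are hypothetical, and FAIRsharing.org remains the place to check for types not listed here.

```python
# Lookup of the minimum information standards named in this plan.
# The dictionary keys and the helper name are illustrative assumptions.
MINIMUM_INFORMATION_STANDARDS = {
    "biofilm": "MIABiE",
    "microarray": "MIAME",
    "rnaseq": "MINSEQE",
    "metabolomics": "MSI",
    "proteomics": "MIAPE",
}


def find_standard(experiment_type):
    """Return the standard for an experiment type, or None if none is listed."""
    # Normalize spelling variants such as "RNA-seq" or "RNA seq".
    key = experiment_type.strip().lower().replace("-", "").replace(" ", "")
    return MINIMUM_INFORMATION_STANDARDS.get(key)


if __name__ == "__main__":
    for name in ("RNA-seq", "Proteomics", "imaging"):
        standard = find_standard(name)
        print(name, "->", standard or "not listed here; check FAIRsharing.org")
```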

For experimental, qualitative, raw, and processed data:

  • Identify whether there is a minimum information standard for your data type.
  • Download the readme template to help you describe your data. This data readme template was provided by the Cornell University Research Data Service Group.
  • Using the minimum information standard and the readme template, describe your data. At this point, the readme file will likely be incomplete.
  • The readme file will be transferred and stored with the data.
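The readme step above could be sketched as follows. The field names are hypothetical placeholders and do not reproduce the Cornell template, which defines its own sections; the point is only that unknown fields can be marked and tracked until the description is complete.

```python
from datetime import date

# Hypothetical field names; the actual readme template defines its own sections.
README_FIELDS = [
    "Title of dataset",
    "Contact name and email",
    "Date of data collection",
    "Minimum information standard",
    "File list and descriptions",
    "Methods and processing steps",
]


def build_readme(known_values):
    """Return readme text, marking any field without a value as TODO."""
    lines = ["README generated " + date.today().isoformat(), ""]
    for field in README_FIELDS:
        value = known_values.get(field, "TODO")
        lines.append(f"{field}: {value}")
    return "\n".join(lines) + "\n"


def missing_fields(readme_text):
    """List the fields still marked TODO so they can be completed later."""
    return [
        line.split(":", 1)[0]
        for line in readme_text.splitlines()
        if line.endswith(": TODO")
    ]
```

Calling missing_fields on a draft readme gives a quick checklist of what still needs to be filled in before or after the data is transferred.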

The South Dakota Data Store will be leveraged to enable 2D-BEST researchers to share their project data with collaborators. This data storage will be made available to researchers through Globus. Globus is a non-profit service for secure, reliable research data management. With Globus, researchers can move, share, and discover data via a single interface through a web browser.

To Register with Globus:  

  • Go to http://globus.org.
  • Click the Log In option in the upper right corner.
  • Find your organization (e.g. USD, SDSU, MSU, UNO); optionally, sign in with an ORCID iD. Note: USD, SDSU, MSU, and UNO will allow you to log in using your institutional credentials. If you cannot locate your institution in the list, you can register for an ORCID iD.

Organizing the 2D-BEST data is important for making it easily accessible to collaborators. Figure 2 offers an example of how the dataset collections could be categorized. After you have determined what data you will share and have completed the readme file that describes the data, the next step is to transfer the data to a shared collection using the Globus platform.

To transfer data to shared storage:  

  • Identify where in the data collection organization you will place your data. The best way to approach this task is to contact one of the project data administrators. Below is a list of the 2D-BEST Data Administrators:
    Institution                                 Contact               Email
    South Dakota School of Mines & Technology   Shankarachary Ragi    Shankarachary.Ragi@sdsmt.edu
    South Dakota State University               Sen Subramanian       Senthil.Subramanian@sdstate.edu
    University of Nebraska, Omaha               Parvathi Chundi       pchundi@unomaha.edu
    University of South Dakota                  Carol Lushbough       Carol.Lushbough@usd.edu
    University of South Dakota                  Etienne Gnimpieba     EtienneGnimpieba.usd.edu

  • With help from a data administrator, create a data collection folder if one does not already exist.
  • Upload your data into the identified collection.
    Figure 2 - 2D-BEST Data Organization Example
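Before uploading, it can help to check locally that the readme travels with the data and to see where each file would land in the collection. The sketch below does only that and performs no transfer. The plan_upload helper is hypothetical, and the /Omics/Genomics/ folder in the usage note is just the example folder name used elsewhere in this document.

```python
from pathlib import Path, PurePosixPath


def plan_upload(local_dir, collection_folder):
    """Map each local file to its destination path in the shared collection.

    Raises ValueError if no readme file accompanies the data.
    """
    local_dir = Path(local_dir)
    files = sorted(p for p in local_dir.rglob("*") if p.is_file())
    # The readme must be stored with the data, so refuse to plan without one.
    if not any(p.name.lower().startswith("readme") for p in files):
        raise ValueError(f"No readme file found in {local_dir}")
    destination = PurePosixPath(collection_folder)
    return {
        str(p): str(destination / p.relative_to(local_dir).as_posix())
        for p in files
    }
```

For example, plan_upload("my_experiment", "/Omics/Genomics/") would return a dictionary from local file paths to paths under /Omics/Genomics/, which can be reviewed with a data administrator before the files are uploaded through Globus.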

Globus allows you to create and manage groups of Globus users, which makes it easier to share collections, folders, and files. If the user group you need does not exist, the following steps describe how to create one.

  • Click on the Groups link in the left-hand panel in Globus.
    Figure 3 - Accessing Globus Group Functionality
  • Click on the Create new group link in the upper right corner.
    Figure 4 - Creating Globus User Group
  • Enter the details describing your user group. The "group members only" option indicates that all members of this group will be specifically invited by you.
    Figure 5 - Entering User Group Details
  • Once the group has been created, you have the option of inviting members.
    Figure 6 - Inviting Members to Join a User Group
  • A user must be registered in Globus before you can invite them to become a member of your group. Users are located by their email address. The system sends an email to the person being invited, who then has the option of accepting.
    Figure 7 - Searching for a Member to Add
  • To access a user's details, click on the name of the member. This lets you modify the member's role, status, and notes.
    Figure 8 - Accessing User Details
  • The default role for members added to a group is "Member". This allows the user to access group resources. You can assign the "Manager" or "Administrator" role to a member, which allows them to access, add, and remove resources. Additionally, the "Administrator" role provides the ability to manage groups.
    Figure 9 - Defining User Roles
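The role descriptions above can be summarized in a small model. This sketch only restates the three roles as described in this document; the capability labels and the can helper are illustrative and are not taken from any Globus interface.

```python
from enum import Enum


class Role(Enum):
    MEMBER = "Member"
    MANAGER = "Manager"
    ADMINISTRATOR = "Administrator"


# Capability names are illustrative labels for the behavior described above.
CAPABILITIES = {
    Role.MEMBER: {"access_resources"},
    Role.MANAGER: {"access_resources", "add_resources", "remove_resources"},
    Role.ADMINISTRATOR: {
        "access_resources",
        "add_resources",
        "remove_resources",
        "manage_groups",
    },
}


def can(role, capability):
    """Return True if the given role includes the given capability."""
    return capability in CAPABILITIES[role]
```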

Once data has been moved to the data store and a group has been created, you can share your data with the group. The following steps outline this process.

  • To share a file or folder with a user or group, select the file or folder and then click the user group icon in the right-hand panel. Then click the Permissions option in the upper right corner.
    Figure 10 - Sharing Data with Users
  • The "Add Permissions - Share With" interface (Figure 11) lets you associate a file or folder with a user or group. In our example, we verify that we are sharing the /Omics/Genomics/ folder and that we want to share it with the Lushbough Lab group, so we click the group radio button. We also specify whether group members will have read, write, or read and write permissions. Click the select group button and select your user group.
    Figure 11 - Associating Group with Folders or Files
  • After the "Add Permissions - Share With" form is completed, click the "Add Permissions" button.
    Figure 12 - Adding Permissions
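The information entered in the "Add Permissions - Share With" form can be thought of as a small record: what is shared, with whom, and at what permission level. The sketch below models that record with basic validation. The ShareRequest class and its field names are hypothetical and do not represent the Globus API; the permission values follow the options described in the steps above.

```python
from dataclasses import dataclass

# Values taken from the options described in the sharing steps above.
ALLOWED_PRINCIPAL_TYPES = {"user", "group"}
ALLOWED_PERMISSIONS = {"read", "write", "read and write"}


@dataclass(frozen=True)
class ShareRequest:
    """Hypothetical record of one sharing decision (not a Globus API object)."""

    path: str
    principal_type: str
    principal: str
    permissions: str

    def __post_init__(self):
        if not self.path.startswith("/"):
            raise ValueError("path must start with '/' inside the collection")
        if self.principal_type not in ALLOWED_PRINCIPAL_TYPES:
            raise ValueError(f"unknown principal type: {self.principal_type}")
        if self.permissions not in ALLOWED_PERMISSIONS:
            raise ValueError(f"unknown permissions: {self.permissions}")


if __name__ == "__main__":
    # The example from the text: share the genomics folder with a lab group.
    request = ShareRequest("/Omics/Genomics/", "group", "Lushbough Lab", "read")
    print(request)
```

Writing the decision down this way before filling in the form makes it easier to confirm the folder, group, and permission level with a data administrator.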

Publishing Research Data

We are still in the early days of publishing data. We plan to leverage USD's Research, Engage, Design (RED) services to facilitate project data publishing. RED is a service of the University of South Dakota University Libraries that promotes and shares the scholarship, creative works, and data created by South Dakota faculty, students, and institutional partners.

Figure 13 - Workflow for Publishing Research Data