
Data, Volume 6, Issue 9 (September 2021) – 6 articles

Cover Story: Trends in the sciences are indicative of data management becoming a feature of the mainstream research process. In this context, the European Commission introduced an Open Research Data (ORD) pilot at the start of its Horizon 2020 Research and Innovation programme. With Horizon 2020 gradually coming to an end and Horizon Europe having recently started, an important facet of the new EU research cycle is to support research data management and open access to research data according to the principle “as open as possible, as closed as necessary”. With this in mind, a review of projects that participated in the Horizon 2020 ORD pilot has been undertaken in anticipation of identifying best practices and providing insights into the formulation and implementation of effective data management plans.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view a paper in PDF format, click on the "PDF Full-text" link and open it with the free Adobe Reader.
13 pages, 3072 KiB  
Data Descriptor
Dataset of Flow-Induced Vibrations on a Pipe Conveying Cold Water
by Francisco Villa, Cherlly Sánchez, Marcela Vallejo, Juan S. Botero-Valencia and Edilson Delgado-Trejos
Data 2021, 6(9), 100; https://doi.org/10.3390/data6090100 - 17 Sep 2021
Cited by 3 | Viewed by 3039
Abstract
Analysis of flow-induced pipe vibrations has been used in a variety of applications, such as flowrate inference and leak detection. These applications are based on a functional relationship between the vibration features estimated in the pipe walls and the dynamics related to the flow of the substance. The dataset described in this document comprises signals acquired using an accelerometer attached to a pipe conveying cold water at specific flowrate values. Tests were carried out under clauses of the ISO 4064-1/2:2016 standard and were performed in two measurement benches designed for flowmeter calibration; a total of 80 flowrate values, from 25 L/h to 20,000 L/h, were considered. For each flowrate value, 3 to 6 samples were taken, so the resulting dataset has a total of 382 signals that contain acceleration values in three axes and a timestamp in microseconds.
(This article belongs to the Section Information Systems and Data Management)
Figure 1. Accelerometer axis: red indicates x-axis, green indicates y-axis and blue indicates z-axis.
Figure 2. Pump system: (a) Centrifugal pumps and accumulator tank; (b) Control panel.
Figure 3. Pump system diagram.
Figure 4. Instrumentation diagram of the micro measurement bench.
Figure 5. Micro measurement bench: (a) Measurement lines with flowmeters as configured in a calibration test; (b) Measurement lines as configured in the test performed for this dataset.
Figure 6. Instrumentation diagram of the macro measurement bench.
Figure 7. Macro measurement bench: (a) measurement line; (b) flowmeters.
Figure 8. Verification of the IMU positioning.
13 pages, 2860 KiB  
Data Descriptor
Technical Data of Heterologous Expression and Purification of SARS-CoV-2 Proteases Using Escherichia coli System
by Rafida Razali, Vijay Kumar Subbiah and Cahyo Budiman
Data 2021, 6(9), 99; https://doi.org/10.3390/data6090099 - 16 Sep 2021
Cited by 7 | Viewed by 3023
Abstract
The SARS-CoV-2 coronavirus expresses two essential proteases: firstly, the 3Chymotrypsin-like protease (3CLpro) or main protease (Mpro), and secondly, the papain-like protease (PLpro), both of which are considered viable drug targets for the inhibition of viral replication. In order to perform drug discovery assays for SARS-CoV-2, it is imperative that efficient methods are established for the production and purification of 3CLpro and PLpro of SARS-CoV-2, designated as 3CLpro-CoV2 and PLpro-CoV2, respectively. This article expands the data collected in the attempts to express SARS-CoV-2 proteases under different conditions and purify them by single-step chromatography. Data showed that the use of the E. coli BL21(DE3) strain was sufficient to express 3CLpro-CoV2 in a fully soluble form. Nevertheless, the single affinity chromatography step was only applicable for 3CLpro-CoV2 expressed at 18 °C, with a yield and purification fold of 92% and 49, respectively. Meanwhile, PLpro-CoV2 was successfully expressed in a fully soluble form in either BL21(DE3) or BL21-CodonPlus(DE3) strains. In contrast, the single affinity chromatography step was only applicable for PLpro-CoV2 expressed using E. coli BL21-CodonPlus(DE3) at 18 or 37 °C, with a yield and purification fold of 86% (18 °C) or 83.36% (37 °C) and 112 (18 °C) or 71 (37 °C), respectively. The findings provide a guide for optimizing the production of SARS-CoV-2 proteases in E. coli host cells.
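The yield and purification fold quoted in this abstract are the usual figures of merit in a purification table. The following is a minimal sketch of that arithmetic, assuming the conventional definitions (yield as the fraction of total activity recovered, fold as the ratio of specific activities); the abstract does not state which definitions the authors used. The function names and all input values are hypothetical and are not taken from the paper.

```python
def specific_activity(total_activity_units, total_protein_mg):
    """Enzyme activity per milligram of protein (U/mg)."""
    return total_activity_units / total_protein_mg


def purification_summary(crude_activity, crude_protein, pure_activity, pure_protein):
    """Return (yield in percent, purification fold) for one purification step.

    Assumed definitions (not confirmed by the source):
      yield = total activity after the step / total activity before it
      fold  = specific activity after the step / specific activity before it
    """
    yield_percent = 100.0 * pure_activity / crude_activity
    fold = (specific_activity(pure_activity, pure_protein)
            / specific_activity(crude_activity, crude_protein))
    return yield_percent, fold


# Hypothetical values chosen only to illustrate the arithmetic.
yield_percent, fold = purification_summary(
    crude_activity=1000.0, crude_protein=200.0,
    pure_activity=900.0, pure_protein=4.0,
)
print(f"yield = {yield_percent:.1f} %, purification fold = {fold:.1f}")
```

A high yield with a high fold, as reported for the 18 °C conditions, means that most of the activity was retained while most of the contaminating protein was removed.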
Figure 1. The three-dimensional model structures of (a) 3CLpro-CoV2 (PDB ID: 6WTM) and (b) PLpro-CoV2 (PDB ID: 6W9C). The domain organization and catalytic residues of both proteases were also indicated for clarity.
Figure 2. The primary structure of SARS-CoV-2 proteases: (a) 3CLpro-CoV2 and (b) PLpro-CoV2. The linker sequence for connecting MBP and 3CLpro is LINGDGAGLEVLSAVLQ. The 6His-tag sequences for 3CLpro-CoV2 and PLpro-CoV2 are GPHHHHHH and HHHHHH, respectively. The figures are not drawn to scale.
Figure 3. Expression profile of 3CLpro-CoV2 in E. coli BL21 (DE3) under 15% SDS-PAGE. Lane 1: The cell before IPTG induction; Lane 2: The cell after IPTG induction; Lane 3: Soluble fraction of the cell obtained after the sonication; Lane 4: Insoluble fraction of the cell obtained after the sonication. The area that corresponds to the 3CLpro-CoV2 band is indicated by a red box: (a) The expression profile under condition 1; (b) The expression profile under condition 2. Details of the conditions are shown in Table 2.
Figure 4. Expression check of PLpro-CoV2 under 15% SDS-PAGE. Lane 1: The cell before IPTG induction; Lane 2: The cell after IPTG induction; Lane 3: Soluble fraction of the cell obtained after the sonication; Lane 4: Insoluble fraction of the cell obtained after the sonication. The area that corresponds to the PLpro-CoV2 band is indicated by a red box: (a) The expression profile under condition 5; (b) The expression profile under condition 6; (c) The expression profile under condition 9; (d) The expression profile under condition 10. Details of the conditions are shown in Table 2.
Figure 5. The 15% SDS-PAGE analysis of purified 3CLpro-CoV2. Lane M: Protein marker; Lane 1: Purified protein after Ni²⁺-NTA chromatography: (a) Purified 3CLpro-CoV2 expressed under condition 1; (b) Purified 3CLpro-CoV2 expressed under condition 2. The band that corresponds to the 3CLpro-CoV2 is indicated by an arrow. Details of the conditions are shown in Table 2.
Figure 6. The 15% SDS-PAGE analysis of purified PLpro-CoV2. Lane M: Protein marker; Lane 1: Purified protein after Ni²⁺-NTA chromatography: (a) Purified PLpro-CoV2 expressed under condition 5; (b) Purified PLpro-CoV2 expressed under condition 6; (c) Purified PLpro-CoV2 expressed under condition 9; (d) Purified PLpro-CoV2 expressed under condition 10. Details of the conditions are shown in Table 2.
Figure 7. The formation of yellow color in the reaction cocktails of (a) 3CLpro-CoV2 and (b) PLpro-CoV2.
6 pages, 1817 KiB  
Data Descriptor
Seismic Envelopes of Coda Decay for Q-coda Attenuation Studies of the Gargano Promontory (Southern Italy) and Surrounding Regions
by Marilena Filippucci, Salvatore Lucente, Salvatore de Lorenzo, Edoardo Del Pezzo, Giacomo Prosser and Andrea Tallarico
Data 2021, 6(9), 98; https://doi.org/10.3390/data6090098 - 13 Sep 2021
Cited by 1 | Viewed by 1824
Abstract
Here, we describe the dataset of seismic envelopes used to study the S-wave Q-coda attenuation quality factor Qc of the Gargano Promontory (Southern Italy). With this dataset, we investigated the crustal seismic attenuation by the Qc parameter. We collected this dataset starting from two different earthquake catalogues: the first covering the period from April 2013 to July 2014; the second covering the period from July 2015 to August 2018. Visual inspection of the envelopes was carried out on recordings filtered with a two-pole Butterworth filter with central frequency fc = 6 Hz. The obtained seismic envelopes of coda decay can be linearly fitted in a bilogarithmic diagram in order to obtain a series of single source-receiver measurements of Qc for each seismogram component at different frequencies fc. The analysis of the trend Qc(fc) gives important insights into the heterogeneity and the anelasticity of the sampled Earth medium.
(This article belongs to the Section Spatial Data Science and Digital Earth)
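The abstract above describes obtaining Qc from a linear fit to the coda envelope. As a hedged illustration, the sketch below assumes the widely used single-scattering coda model, A(t) = S · t^(−α) · exp(−π fc t / Qc), so that ln(A · t^α) is linear in lapse time t with slope −π fc / Qc. The authors' exact fitting procedure may differ, and the geometrical spreading exponent α = 1, the function names, and the synthetic envelope are all assumptions made for this example.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_qc(lapse_times, envelope, fc, alpha=1.0):
    """Estimate Qc from a coda envelope under the assumed decay model.

    Fits ln(A * t**alpha) against t and converts the slope to
    Qc = -pi * fc / slope.
    """
    y = [math.log(a * t ** alpha) for t, a in zip(lapse_times, envelope)]
    slope, _ = fit_line(lapse_times, y)
    return -math.pi * fc / slope


# Synthetic, noise-free envelope with a known Qc, used only to check the fit.
fc, true_qc = 6.0, 150.0
times = [20.0 + 0.5 * i for i in range(60)]  # lapse times in seconds
envelope = [1e4 / t * math.exp(-math.pi * fc * t / true_qc) for t in times]
qc = estimate_qc(times, envelope, fc)
print(f"estimated Qc at fc = {fc} Hz: {qc:.1f}")
```

Repeating the fit for envelopes filtered at several central frequencies would give the trend Qc(fc) that the abstract refers to.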
Figure 1. Plot of the first envelope file in Table 3, as an example.
Figure 2. Plot of the first envelope file in Table 6, as an example.
Figure 3. Three-component seismograms at station OT01, as an example. Over each record, the origin time in absolute time is overwritten; the X-axis is time (s), the Y-axis is amplitude (counts/s). The P-wave marker (IPU0) and S-wave marker (IS) are overwritten.
Figure 4. Three-component seismograms at station OT01, filtered with fc = 6 Hz and band-width [4.24; 8.48] Hz, as an example. Over each record, the origin time in absolute time is overwritten; the X-axis is time (s), the Y-axis is amplitude (counts/s).
Figure 5. Envelopes of the filtered seismograms in Figure 4. Over the first record, the T3 and T4 markers are overwritten; the X-axis is time (s), the Y-axis is amplitude (counts/s).
11 pages, 2352 KiB  
Article
BioCPR–A Tool for Correlation Plots
by Vidal Fey, Dhanaprakash Jambulingam, Henri Sara, Samuel Heron, Csilla Sipeky and Johanna Schleutker
Data 2021, 6(9), 97; https://doi.org/10.3390/data6090097 - 8 Sep 2021
Cited by 7 | Viewed by 5121
Abstract
A gene is a sequence of DNA bases through which genetic information is passed on to the next generation. Most genes encode proteins that ultimately control cellular function. Understanding the interrelation between genes without the application of statistical methods can be a daunting task. Correlation analysis is a powerful approach to determine the strength of association between two variables (e.g., gene-wise expression). Moreover, it becomes essential to visualize these data to establish patterns and derive insight. The most common method for gene expression visualization is to use correlation heatmaps in which the colors of the plot represent the strength of co-expression. In order to address this requirement, we developed a visualization tool called BioCPR: Biological Correlation Plots in R. This tool performs both correlation analysis and subsequent visualization in the form of an interactive heatmap, improving both usability and interpretation of the data. BioCPR is an R Shiny-based application and can be run locally in RStudio or a web browser.
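To make the correlation step concrete, here is a minimal Python sketch of a gene-by-gene Pearson correlation matrix, the kind of table a correlation heatmap is drawn from. It is not BioCPR code (BioCPR is an R Shiny application), it assumes Pearson correlation, and the gene names and expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_matrix(expression):
    """Map each gene pair to its correlation across samples."""
    genes = list(expression)
    return {g: {h: pearson(expression[g], expression[h]) for h in genes}
            for g in genes}


# Hypothetical expression values for three genes across five samples.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0, 10.0],
    "GENE_C": [5.0, 4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(expression)
for gene, row in matrix.items():
    print(gene, [round(r, 2) for r in row.values()])
```

In a heatmap, each cell of this matrix would be mapped to a color, with positive and negative coefficients at opposite ends of the color scale.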
Figure 1. Correlation heatmap of the subset of the TCGA dataset. The smaller panel on the top left of the plot shows the color key and a histogram of correlation coefficients. Positive correlations are indicated in shades of red, whereas negative correlations are indicated in shades of blue. Significant correlations, depending on the p-value, are indicated by asterisks.
Figure 2. Correlation matrix of the subset of the TCGA dataset.
Figure 3. Correlation heatmap of the subset of the TCGA dataset that has been filtered using the BioCPR tool. See Figure 1 for a description of the color scheme.
Figure 4. Dendrogram that has been filtered from the correlation heatmap in Figure 3. See Figure 1 for a description of the color scheme.
19 pages, 508 KiB  
Article
Lessons Learnt from Engineering Science Projects Participating in the Horizon 2020 Open Research Data Pilot
by Timothy Austin, Kyriaki Bei, Theodoros Efthymiadis and Elias P. Koumoulos
Data 2021, 6(9), 96; https://doi.org/10.3390/data6090096 - 6 Sep 2021
Cited by 3 | Viewed by 3369
Abstract
Trends in the sciences are indicative of data management becoming established as a feature of the mainstream research process. In this context, the European Commission introduced an Open Research Data pilot at the start of the Horizon 2020 research programme. This initiative followed the success of the Open Access pilot implemented in the prior (FP7) research programme, which thereafter became an integral component of Horizon 2020. While the Open Access phenomenon can reasonably be argued to be one of many instances of web technologies disrupting established business models (namely publication practices and workflows established over several centuries in the case of Open Access), initiatives designed to promote research data management have no established foundation on which to build. For Open Data to become a reality and, more importantly, to contribute to the scientific process, data management best practices and workflows are required. Furthermore, with the scientific community having operated to good effect in the absence of data management, there is a need to demonstrate the merits of data management. This circumstance is complicated by the lack of the necessary ICT infrastructures, especially interoperability standards, required to facilitate the seamless transfer, aggregation and analysis of research data. Any activity aiming to promote Open Data thus needs to overcome a number of cultural and technological challenges. It is in this context that this paper examines the data management activities and outcomes of a number of projects participating in the Horizon 2020 Open Research Data pilot. The result has been to identify a number of commonly encountered benefits and issues; to assess the utilisation of data management plans; and through the close examination of specific cases, to gain insights into obstacles to data management and potential solutions. 
Although primarily anecdotal and difficult to quantify, the experiences reported in this paper tend to favour developing data management best practices rather than doggedly pursuing the Open Data mantra. While Open Data may prove valuable in certain circumstances, there is good reason to claim that managed access to scientific data of high inherent intellectual and financial value will prove more effective in driving knowledge discovery and innovation.
(This article belongs to the Section Information Systems and Data Management)
Figure 1. Data management platform workflow for the DECOAT and REPAIR3D projects.
Figure 2. Data access model in IRES-led activities.
19 pages, 2945 KiB  
Article
TRIPOD—A Treadmill Walking Dataset with IMU, Pressure-Distribution and Photoelectric Data for Gait Analysis
by Justin Trautmann, Lin Zhou, Clemens Markus Brahms, Can Tunca, Cem Ersoy, Urs Granacher and Bert Arnrich
Data 2021, 6(9), 95; https://doi.org/10.3390/data6090095 - 26 Aug 2021
Cited by 10 | Viewed by 5871
Abstract
Inertial measurement units (IMUs) enable easy-to-operate, low-cost data recording for gait analysis. When combined with treadmill walking, a large number of steps can be collected in a controlled environment without the need for a dedicated gait analysis laboratory. In order to evaluate existing and novel IMU-based gait analysis algorithms for treadmill walking, a reference dataset that includes IMU data as well as reliable ground truth measurements for multiple participants and walking speeds is needed. This article provides a reference dataset consisting of 15 healthy young adults who walked on a treadmill at three different speeds. Data were acquired using seven IMUs placed on the lower body, two different reference systems (Zebris FDMT-HQ and OptoGait), and two RGB cameras. Additionally, in order to validate an existing IMU-based gait analysis algorithm using the dataset, an adaptable modular data analysis pipeline was built. Our results show agreement between the pressure-sensitive Zebris and the photoelectric OptoGait system (r = 0.99), demonstrating the quality of our reference data. As a use case, the performance of an algorithm originally designed for overground walking was tested on treadmill data using the data pipeline. The accuracy of stride length and stride time estimations was comparable to that reported in other studies with overground data, indicating that the algorithm is equally applicable to treadmill data. The Python source code of the data pipeline is publicly available, and the dataset will be provided by the authors upon request, enabling future evaluations of IMU gait analysis algorithms without the need to record new data.
Figure 1. Experimental setup. IMU positions are highlighted in orange.
Figure 2. Folder structure of the dataset.
Figure 3. Data pipeline components and data flow. The data loader, reference loader, gait event detector and trajectory estimator were implemented as exchangeable classes, allowing the future implementation of multiple different algorithms and compatibility with different data formats.
Figure 4. The estimation of gait parameters based on the combination of gait events and estimated 3D trajectory.
Figure 5. The evolution of Zebris readings over time and final heel detection. Green and blue lines: heel positions detected for the left and right foot, respectively.
Figure 6. A comparison of stride length (left) and stride time (right) measured with the OptoGait and Zebris systems. The different colors represent different participants.
Figure 7. A comparison of stance time (left) and swing time (right) measured with the OptoGait and Zebris systems. The two measurement systems differ substantially due to different sensor placement, one above and one below the treadmill belt. The different colors represent different participants.
Figure 8. A comparison of the estimated and measured stride length (left) and stride time (right) from IMU data and the Zebris system. The different colors represent different trials.
Figure 9. A comparison of the estimated and measured stride length (left) and stride time (right) from IMU data and the OptoGait system. The different colors represent different trials.
Figure 10. A comparison of the estimated and measured stance time (left) and swing time (right) from IMU data and the Zebris system. The different colors represent different trials.
Figure 11. A comparison of the estimated and measured stance time (left) and swing time (right) from IMU data and the OptoGait system. The different colors represent different trials.
Figure A1. Bland–Altman plots of the estimated and measured stride length (left) and stride time (right) from IMU data and the Zebris system. The different colors represent different trials.
Figure A2. Bland–Altman plots of the estimated and measured stride length (left) and stride time (right) from IMU data and the OptoGait system. The different colors represent different trials.
Figure A3. An example file structure of a .json.gz file with Zebris data.
Figure A4. Illustration of pedobarographic data format for two particular data samples.
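Bland–Altman plots such as those in Figures A1 and A2 summarize agreement between an estimate and a reference through the mean difference (bias) and the limits of agreement. The sketch below shows that calculation under the common convention of bias ± 1.96 sample standard deviations of the paired differences; it is not taken from the authors' published pipeline, and the function name and stride-length values are hypothetical.

```python
import statistics


def bland_altman(estimated, reference):
    """Return (bias, lower limit, upper limit) of agreement for paired series.

    Assumes the common convention: limits = bias +/- 1.96 * sample SD
    of the paired differences.
    """
    diffs = [e - r for e, r in zip(estimated, reference)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical stride lengths in metres, for illustration only.
imu_stride = [1.32, 1.28, 1.35, 1.30, 1.27]
ref_stride = [1.30, 1.29, 1.33, 1.31, 1.25]
bias, lower, upper = bland_altman(imu_stride, ref_stride)
print(f"bias = {bias:.3f} m, limits of agreement = [{lower:.3f}, {upper:.3f}] m")
```

A bias close to zero with narrow limits indicates that the IMU-based estimate tracks the reference system closely across strides.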