Many research funders mandate researchers to create and maintain data management plans (DMPs) for research projects that describe how research data is managed to ensure its reusability. A DMP, being a static textual document, is difficult to act upon and can quickly become obsolete and impractical to maintain. A new generation of machine-actionable DMPs (maDMPs) was therefore proposed by the Research Data Alliance to enable automated integration of information and updates. maDMPs open up a variety of use cases enabling interoperability of research systems and automation of data management tasks.
In this article, we describe a system for machine-actionable data management planning in an institutional context. We identify common use cases within research that can be automated to benefit from machine-actionability of DMPs. We propose a reference architecture of an maDMP support system that can be embedded into an institutional research data management infrastructure. The system semi-automates the creation and maintenance of DMPs, and thus eases the burden on the stakeholders responsible for various DMP elements. We evaluate the proposed system in a case study conducted at the largest technical university in Austria and quantify to what extent the DMP templates provided by the European Commission and a national funding body can be pre-filled. The proof-of-concept implementation shows that maDMP workflows can be semi-automated, reducing the workload on involved parties and increasing the quality of information. The results are especially relevant to decision makers and infrastructure operators who want to design information systems in a systematic way that can utilize the full potential of maDMPs.
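To illustrate what machine-actionability means in practice, the sketch below builds a minimal maDMP as structured JSON. The field names loosely follow the RDA DMP Common Standard, and the values are illustrative placeholders, not output of the system described above.

```python
import json

# Minimal machine-actionable DMP sketch. Field names loosely follow the
# RDA DMP Common Standard; values are hypothetical placeholders.
madmp = {
    "dmp": {
        "title": "Example project DMP",
        "language": "eng",
        "contact": {"name": "Jane Doe", "mbox": "jane.doe@example.org"},
        "dataset": [
            {
                "title": "Survey responses",
                "personal_data": "no",
                "sensitive_data": "no",
                "distribution": [
                    {
                        "title": "Public CSV export",
                        "data_access": "open",
                        "license": [
                            {"license_ref": "https://creativecommons.org/licenses/by/4.0/"}
                        ],
                    }
                ],
            }
        ],
    }
}

# Because the plan is structured data rather than free text, other systems
# can read, validate, and update individual fields automatically.
serialized = json.dumps(madmp, indent=2)
print(serialized)
```

The point of the structured form is that an institutional system can, for example, pre-fill the `contact` field from a staff directory or flag datasets whose `data_access` value conflicts with a funder mandate, without parsing prose.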
In this reflective methodological paper we focus on the affordances and challenges of video data. We compare and analyze two research settings that use the latest video technology to capture classroom interactions in mathematics education, namely, The Social Unit of Learning (SUL) project of the University of Melbourne and the MathTrack project of the University of Helsinki. While using these two settings as examples, we have structured our reflections around themes pertinent to video research in general, namely, research methods, data management, and research ethics. SUL and MathTrack share an understanding of mathematics learning as social multimodal practice, and make it possible to zoom into the situational micro-interactions that construct collaborative problem-solving learning. Both settings provide rich data for in-depth analyses of peer interactions and learning processes. The settings share special needs for technical support and data management, as well as attention to ethical aspects from the perspective of the participants' security and discretion. SUL data are especially suitable for investigating interactions on a broad scope, addressing how multiple interactional processes intertwine. MathTrack, on the other hand, enables detailed exploration of participants' visual attention and its role in learning. Both settings could provide tools for teachers' professional development by showing them aspects of classroom interactions that would otherwise remain hidden.
Radio telemetry, one of the most widely used techniques for tracking wildlife and fisheries populations, has a false-positive problem. Bias from false-positive detections can affect many important derived metrics, such as home range estimation, site occupation, survival, and migration timing. False-positive removal processes have relied upon simple filters and personal opinion. To overcome these shortcomings, we have developed BIOTAS (BIOTelemetry Analysis Software) to assist with false-positive identification, removal, and data management for large-scale radio telemetry projects.
BIOTAS uses a naïve Bayes classifier to identify and remove false-positive detections from radio telemetry data. The semi-supervised classifier uses spurious detections from unknown tags and study tags as training data. We tested BIOTAS on four scenarios: a wide-band receiver with a single Yagi antenna, a wide-band receiver that switched between two Yagi antennas, a wide-band receiver with a single dipole antenna, and a single-band receiver that switched between five frequencies. BIOTAS has built-in k-fold cross-validation and assesses model quality with sensitivity, specificity, positive and negative predictive value, false-positive rate, and precision-recall area under the curve. BIOTAS also assesses concordance with a traditional consecutive-detection filter using Cohen's $\kappa$.
Overall, BIOTAS performed equally well in all scenarios and was able to discriminate between known false-positive detections and valid study-tag detections with low false-positive rates (< 0.001), as determined through cross-validation, even as receivers switched between antennas and frequencies. BIOTAS classified between 94% and 99% of study-tag detections as valid.
As part of a robust data management plan, BIOTAS is able to discriminate between detections from study tags and known false positives. BIOTAS works with multiple manufacturers and accounts for receivers that switch between antennas and frequencies. BIOTAS provides the framework for transparent, objective, and repeatable telemetry projects for wildlife conservation surveys, and increases the efficiency of data processing.
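The idea behind the classifier described above can be sketched as a Bernoulli naïve Bayes over binary detection features, trained on known false positives (unknown-tag detections) and known-valid study-tag detections. The feature names and data below are hypothetical; BIOTAS itself uses its own feature set and training scheme.

```python
import math

def train_nb(rows, labels):
    """Train a Bernoulli naive Bayes model.
    rows: list of binary feature tuples; labels: 1 = valid, 0 = false positive."""
    n_features = len(rows[0])
    counts = {0: [0] * n_features, 1: [0] * n_features}
    totals = {0: 0, 1: 0}
    for row, y in zip(rows, labels):
        totals[y] += 1
        for j, v in enumerate(row):
            counts[y][j] += v
    model = {}
    for y in (0, 1):
        # Laplace smoothing avoids zero probabilities for unseen feature values.
        probs = [(counts[y][j] + 1) / (totals[y] + 2) for j in range(n_features)]
        model[y] = (totals[y] / len(rows), probs)
    return model

def classify(model, row):
    """Return the class with the highest log-posterior score."""
    scores = {}
    for y, (prior, probs) in model.items():
        s = math.log(prior)
        for v, p in zip(row, probs):
            s += math.log(p if v else 1.0 - p)
        scores[y] = s
    return max(scores, key=scores.get)

# Hypothetical binary features per detection:
# (consecutive hit within pulse interval, power above noise floor, high hit ratio)
train_rows = [
    (1, 1, 1), (1, 1, 0), (1, 0, 1), (1, 1, 1),   # study-tag detections
    (0, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, 0),   # known false positives (unknown tags)
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

model = train_nb(train_rows, train_labels)
label = classify(model, (1, 1, 1))
print("predicted:", "valid" if label == 1 else "false positive")  # → valid
```

A consecutive-detection filter keeps only detections confirmed by a neighbor in time; the probabilistic approach instead weighs several weak signals at once, which is why the software reports Cohen's $\kappa$ against the traditional filter as a concordance check.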
The increased utilization of metrology resources and the expanded application of its approaches in the development of internationally agreed-upon measurements can lay the basis for regulatory harmonization, support reproducible research, and advance scientific understanding, especially of dietary supplements and herbal medicines. Yet metrology is often underappreciated and underutilized in dealing with the many challenges presented by these chemically complex preparations. This article discusses the utility of applying rigorous analytical techniques and adopting metrological principles more widely in studying dietary supplement products and ingredients, particularly medicinal plants and other botanicals. An assessment of current and emerging dietary supplement characterization methods is provided, including targeted and non-targeted techniques, as well as data analysis and evaluation approaches, with a focus on chemometrics, toxicity, dosage form performance, and data management. Quality assessment, statistical methods, and optimized methods for data management are also discussed. Case studies provide examples of applying metrological principles in the thorough analytical characterization of supplement composition to clarify health effects. A new frontier for metrology in dietary supplement science is described, including opportunities to improve methods for analysis and data management, to develop relevant standards and good practices, and to communicate these developments to researchers and analysts, as well as to regulatory and policy decision makers in the public and private sectors. Promoting closer interactions between the analytical, clinical, and pharmaceutical scientists involved in research and product development and the metrologists who develop standards and methodological guidelines is critical to advancing research on dietary supplement characterization and health effects.
Ethiopia Population-based HIV Impact Assessment findings showed that in Addis Ababa, only 65.2% of people living with HIV (PLHIV) know their status. We present the enhanced HIV/AIDS data management and systematic monitoring experience in Addis Ababa City Administration Health Bureau (AACAHB).
AACAHB established a command post with leadership and technical team members from the health bureau, 10 sub-city health offices, and non-governmental stakeholders. The command post improved governance, standardized HIV program implementation, and established an accountability mechanism. A web-based database was established at each health facility, at each sub-city, and at the AACAHB level. Performance was scored (green, ≥75%; yellow, 50–74%; red, <50%), and the command post reviewed performance on a weekly basis. A mentorship team provided weekly site-level support at underperforming public and private health facilities. At the facility level, the quality of data on recording tools such as registers and individual medical records was maintained through continued review, feedback mechanisms, and regular consistency checks. Percentages and 95% confidence intervals were computed to compare improvements in program performance over time.
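The scoring bands and interval computations described above can be sketched as follows. The color thresholds match the abstract; the Wald normal-approximation interval is an assumption, since the abstract does not state which confidence-interval method was used.

```python
import math

def score_color(pct):
    """Traffic-light score per the abstract: green >=75%, yellow 50-74%, red <50%."""
    if pct >= 75:
        return "green"
    if pct >= 50:
        return "yellow"
    return "red"

def proportion_ci(successes, n, z=1.96):
    """95% Wald confidence interval for a proportion, returned in percent.
    (Assumed method; the original analysis may have used a different interval.)"""
    p = successes / n
    half = z * math.sqrt(p * (1.0 - p) / n)
    return 100 * (p - half), 100 * (p + half)

# Hypothetical monthly indicator: 616 of 985 eligible clients initiated treatment.
pct = 100 * 616 / 985
print(score_color(pct))           # → yellow
print(proportion_ci(616, 985))    # interval around ~62.5%
```

Flagging each indicator with a color makes the weekly command-post review a simple triage: red indicators route a mentorship team to the facility, while the confidence intervals show whether month-to-month changes are statistically meaningful.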
After the 6-month intervention period, monthly new HIV case finding in 47 health facilities increased from 422 to 734 (1.7-fold) and treatment initiation increased from 302 to 616 (2-fold). After 6 months, the aggregate score for HIV testing at the city level improved from yellow to green, HIV case finding improved from red to green, and treatment initiation improved from red to yellow. An increasing trend in HIV-positive case finding was noted, with a statistically significant improvement from 43.4% [95% CI: 40.23–46.59%] in May 2019 to 74.9% [95% CI: 72.03–77.6%] in September 2019. Similarly, a significant improvement was recorded for new HIV treatment initiation, from 30.9% [95% CI: 28.01–33.94%] in May 2019 to 62.5% [95% CI: 59.38–65.6%] in September 2019.
Regular, data-driven HIV program review was institutionalized at the city, sub-city, and health-facility levels, which further improved HIV program monitoring and performance. HIV case finding and treatment initiation improved significantly through intensified monitoring, data-driven performance review, targeted site-level support based on identified gaps, and standardized approaches.
As the influence of information technology on economic growth increases, enterprise economic information management becomes increasingly important for the survival and development of enterprises. This paper designs an enterprise economic information management system for the complex internal economic information management business and processes of enterprises. The system provides daily office functions, information access, document preview, and transmission. The proposed design (i) copes with the inconsistency and irregularity of enterprise economic information data, (ii) quickly extracts valuable information from massive, high-frequency data, and (iii) improves the economic benefit of data assets and the efficiency of data management. A printing function systematizes information management for departments such as enterprise economic information, personnel, and production. The research focuses on the mode, framework, and functions of the system software, and on the use of Internet-platform big data technology to realize the practicality, stability, and security of the system database algorithms; the system has been used in practice by enterprises to improve office efficiency and meet their daily management needs. Based on an analysis of the current state of enterprise big data applications, the paper constructs an enterprise economic information management system based on big data and describes its key technologies in detail from three aspects: NoSQL-based big data storage management, Hadoop-based processing of economic information big data, and big data analysis and mining algorithms. The work provides a theoretical basis and basic technical support for online decision analysis.
Purpose: Big data has posed problems for businesses, the information technology (IT) sector, and the science community. These problems can be effectively addressed using cloud computing and associated distributed computing technology. Cloud computing and big data are two significant recent developments that allow high-efficiency, competitive computing tools to be delivered as IT services. The paper aims to examine the role of the cloud as a tool for managing big data in various aspects to help businesses.
Design/methodology/approach: This paper surveys cloud-based solutions for storing, compressing, analyzing, and processing big data. Articles were divided into four categories: big data storage, big data processing, big data analysis, and data compression in cloud computing. The article is based on a systematic literature review of 19 published papers on big data.
Findings: The results suggest that cloud computing technology has features that are useful for big data management. Challenging issues are raised in each section; for example, in storing big data, privacy and security are challenging.
Research limitations/implications: The systematic review has limitations: only English-language articles were reviewed, only articles matching the keywords were used, and only authoritative articles were included, while slides and tutorials were excluded.
Practical implications: The research presents new insight into the business value of cloud computing in interfirm collaborations.
Originality/value: Previous research has often examined other aspects of big data in the cloud. This article takes a new approach that allows big data researchers to comprehend the various aspects of big data management in the cloud. In addition, setting an agenda for future research saves time and effort for readers searching for topics within big data.