data repository
Recently Published Documents


TOTAL DOCUMENTS

1326
(FIVE YEARS 686)

H-INDEX

35
(FIVE YEARS 11)

2022 ◽  
Vol 16 (3) ◽  
pp. 1-37
Author(s):  
Robert A. Sowah ◽  
Bernard Kuditchar ◽  
Godfrey A. Mills ◽  
Amevi Acakpovi ◽  
Raphael A. Twum ◽  
...  

The class imbalance problem is prevalent in many real-world domains and has become an active area of research. In binary classification problems, imbalance learning refers to learning from a dataset that is highly skewed toward the negative class. This skew causes classification algorithms to perform poorly when predicting the positive class for new examples. Data resampling, which involves manipulating the training data before applying standard classification techniques, is among the most commonly used techniques to deal with the class imbalance problem. This article presents a new hybrid sampling technique that significantly improves the overall performance of classification algorithms on the class imbalance problem. The proposed method, called the Hybrid Cluster-Based Undersampling Technique (HCBST), combines a cluster undersampling technique, which undersamples the majority instances, with an oversampling technique derived from Sigma Nearest Oversampling based on Convex Combination, which oversamples the minority instances, to solve the class imbalance problem with a high degree of accuracy and reliability. The performance of the proposed algorithm was tested using 11 datasets, with varying degrees of imbalance, from the National Aeronautics and Space Administration Metric Data Program data repository and the University of California Irvine Machine Learning data repository. Results were compared with classification algorithms such as K-nearest neighbours, support vector machines, decision tree, random forest, neural network, AdaBoost, naïve Bayes, and quadratic discriminant analysis. Test results revealed that, for the same datasets, the HCBST performed better, with average performances of 0.73, 0.67, and 0.35 in terms of area under the curve, geometric mean, and Matthews Correlation Coefficient, respectively, across all the classifiers used in this study. The HCBST has the potential to improve performance on the class imbalance problem, which, by extension, will benefit the various applications that rely on the concept for a solution.
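
As a rough illustration of why the abstract above reports geometric mean and Matthews Correlation Coefficient rather than plain accuracy, the following minimal sketch computes all three from a binary confusion matrix using their standard textbook definitions. The function name `imbalance_metrics` and the example counts are hypothetical and are not taken from the article.

```python
import math


def imbalance_metrics(tp, fp, tn, fn):
    """Return accuracy, geometric mean and MCC for a binary confusion matrix.

    tp, fp, tn, fn are counts of true positives, false positives,
    true negatives and false negatives (hypothetical inputs).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0

    # Sensitivity (recall on the positive class) and specificity
    # (recall on the negative class).
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # Geometric mean of the two per-class recalls.
    g_mean = math.sqrt(sensitivity * specificity)

    # Matthews Correlation Coefficient; conventionally set to 0
    # when any marginal count is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "g_mean": g_mean, "mcc": mcc}


# A classifier that always predicts the majority (negative) class
# on a 90/10 split: high accuracy, but g_mean and MCC are both zero.
majority_only = imbalance_metrics(tp=0, fp=0, tn=90, fn=10)
```

In this hypothetical 90/10 split, the majority-only classifier scores well on accuracy while both imbalance-aware measures collapse to zero, which is the failure mode that resampling methods aim to avoid.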


2022 ◽  
Vol 9 (1) ◽  
pp. 174-186
Author(s):  
Febry Khunto Sasongko ◽  
Diah Kristina ◽  
Abdul Asib

This article discusses the strategies used by five non-millennial teachers (aged 54-59 years old) of a junior high school in Ngawi, East Java, Indonesia, in coping with online teaching during the COVID-19 pandemic. The teachers were interviewed, and the data were transcribed and analyzed by creating a data repository, expanding the codes, describing the coded data, and drawing conclusions. The results revealed that the teachers used several strategies: increasing students’ interest in learning, providing students with knowledge and attention, creating efficient learning resources, and using SIMPEL (Sistem Informasi Manajemen Pembelajaran or Learning Management Information System), which is available only in Ngawi. SIMPEL was specially developed by the Ngawi district education office to ensure that the learning processes in Ngawi Regency continued to run optimally during the COVID-19 outbreak. SIMPEL substituted for online YouTube videos and materials because the materials were already provided by the system, decreasing the teachers' need to depend on other resources. Although these teachers also used other online platforms, issues such as slow internet connections, running out of quotas, and blackouts hindered their efforts to use these platforms at times. Hence, WAG was the most used medium for conducting their online learning due to its simplicity and availability. These teachers have continued striving to learn digital technologies ever since they changed from their previous face-to-face teaching strategies.


2022 ◽  
Vol 9 (1) ◽  
Author(s):  
Marie C. Henniges ◽  
Robyn F. Powell ◽  
Sahr Mian ◽  
Clive A. Stace ◽  
Kevin J. Walker ◽  
...  

Abstract The vascular flora of Britain and Ireland is among the most extensively studied in the world, but the current knowledge base is fragmentary, with taxonomic, ecological and genetic information scattered across different resources. Here we present the first comprehensive data repository of native and alien species optimized for fast and easy online access for ecological, evolutionary and conservation analyses. The inventory is based on the most recent reference flora of Britain and Ireland, with taxon names linked to unique Kew taxon identifiers and DNA barcode data. Our data resource for 3,227 species and 26 traits includes existing and unpublished genome sizes, chromosome numbers and life strategy and life-form assessments, along with existing data on functional traits, species distribution metrics, hybrid propensity, associated biomes, realized niche description, native status and geographic origin of alien species. This resource will facilitate both fundamental and applied research and enhance our understanding of the flora’s composition and temporal changes to inform conservation efforts in the face of ongoing climate change and biodiversity loss.


JAMIA Open ◽  
2022 ◽  
Vol 5 (1) ◽  
Author(s):  
Arnaud Serret-Larmande ◽  
Jonathan R Kaltman ◽  
Paul Avillach

Abstract Reproducibility in medical research has been a long-standing issue. More recently, the COVID-19 pandemic has publicly underlined this fact as the retraction of several studies reached general media audiences. A significant number of these retractions occurred after in-depth scrutiny of the methodology and results by the scientific community. Consequently, these retractions have undermined confidence in the peer-review process, which is not considered sufficiently reliable to generate trust in the published results. This partly stems from opacity in published results, with the practical implementation of the statistical analysis often remaining undisclosed. We present a workflow that uses a combination of informatics tools to foster statistical reproducibility: an open-source programming language, Jupyter Notebook, a cloud-based data repository, and an application programming interface can streamline an analysis and help to kick-start new analyses. We illustrate this principle by (1) reproducing the results of the ORCHID clinical trial, which evaluated the efficacy of hydroxychloroquine in COVID-19 patients, and (2) expanding on the analyses conducted in the original trial by investigating the association of premedication with biological laboratory results. Such workflows will be encouraged for future publications from National Heart, Lung, and Blood Institute-funded studies.


2022 ◽  
Author(s):  
Christophe Genthon ◽  
Dana E. Veron ◽  
Etienne Vignon ◽  
Jean-Baptiste Madeleine ◽  
Luc Piard

Abstract. The air at the surface of the high Antarctic Plateau is very cold, dry and clean. In such conditions the atmospheric moisture can significantly deviate from thermodynamic equilibrium conditions, and supersaturation with respect to ice can occur. Most conventional humidity sensors for meteorological applications cannot report supersaturation in this environment. A simple approach for measuring supersaturation using conventional instruments, one being operated in a heated airflow, is presented. Since 2018, this instrumental setup has been deployed at 3 levels in the lower ~40 m above the surface at Dome C on the high Antarctic Plateau. The 3-year 2018–2020 record (Genthon et al., 2021) is presented and analyzed for features such as the frequency of supersaturation with respect to ice, diurnal and seasonal variability, and vertical distribution. As supercooled liquid water droplets are frequently observed in clouds at the temperatures met on the high Antarctic Plateau, the distribution of relative humidity with respect to liquid water at Dome C is also discussed. It is suggested that, while not strictly mimicking the conditions of the high troposphere, the surface atmosphere on the Antarctic Plateau is a convenient natural laboratory to test parametrizations of cold microphysics predominantly developed to handle the genesis of high tropospheric clouds. Data are distributed on the PANGAEA data repository at https://doi.pangaea.de/10.1594/PANGAEA.939425 (Genthon et al., 2021).


2022 ◽  
Vol 2159 (1) ◽  
pp. 012010
Author(s):  
L Uribe ◽  
J Villamizar ◽  
G Morantes ◽  
A Cerquera ◽  
E Prada ◽  
...  

Abstract Human beings can suffer from several coronary diseases, which in themselves cause health deterioration and can lead to the development of other diseases that diminish the quality of life. Ischemic diseases are distinctive in that they are evidenced by blockages generated by the accumulation of fat that impedes circulation, triggering heart- and brain-related problems. Using fractional Brownian motion in relation to Hurst’s parameter, an analysis is performed on data from 137 patients aged between 30 and 71 years who present some type of ischemic disease, such as mixed, restricted, effort angina and angina pectoris. The data used are European and are found in the PhysioNet open-access medical research data repository, managed by the Massachusetts Institute of Technology Computational Physiology Laboratory. The analysis shows the Hurst coefficient calculations associated with each type of ischemic heart disease.


2022 ◽  
pp. 83-109
Author(s):  
K. S. Sastry Musti ◽  
Marcio Van der Merwe

The application of multi-criteria decision analysis (MCDA) methods to various aspects of energy systems is of significant interest. This chapter first proposes a simple, user-friendly MS-Excel tool with four popular MCDA methods. The tool can be used to apply MCDA techniques and to determine the rankings of the alternatives. This MS-Excel tool is made available on the Mendeley data repository. The chapter explains the overall MCDA computational processes and algorithms, and provides details on using the tool itself with the help of two case studies that demonstrate its effectiveness and applicability.
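
The abstract above does not name its four MCDA methods, so the following is only a minimal sketch of one widely used approach, the weighted sum model, to show what "determining the rankings for the alternatives" can look like in practice. The function name, the alternatives, the criteria and the weights are all hypothetical and are not taken from the chapter or its Excel tool.

```python
def weighted_sum_ranking(alternatives, weights, is_benefit):
    """Rank alternatives with a simple weighted sum model.

    alternatives: dict mapping a name to a list of positive criterion values.
    weights:      list of criterion weights (assumed to sum to 1).
    is_benefit:   list of booleans; True if higher is better for that
                  criterion, False if lower is better (a cost criterion).
    Returns a list of (name, score) pairs, best first.
    """
    names = list(alternatives)
    n_criteria = len(weights)
    scores = {name: 0.0 for name in names}

    for j in range(n_criteria):
        column = [alternatives[name][j] for name in names]
        best = max(column) if is_benefit[j] else min(column)
        for name in names:
            value = alternatives[name][j]
            # Linear normalization to (0, 1]: value/max for benefit
            # criteria, min/value for cost criteria.
            normalized = value / best if is_benefit[j] else best / value
            scores[name] += weights[j] * normalized

    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical example: two energy options scored on cost (lower is
# better) and efficiency (higher is better), equally weighted.
ranking = weighted_sum_ranking(
    {"A": [100.0, 0.8], "B": [200.0, 0.9]},
    weights=[0.5, 0.5],
    is_benefit=[False, True],
)
```

Other MCDA methods differ mainly in how they normalize the criteria and aggregate the scores, so the ranking from this sketch should not be expected to match every method.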


Author(s):  
P. Navaraja ◽  
E. Deepika ◽  
R. Birundha ◽  
S. Aarthi

The healthcare industry has seen revolutionary developments with the advancement of technology over the past years. Health records have been widely digitized into electronic health records. The Internet of Things, cloud computing, blockchain technology, lab-on-chip devices, non-invasive and minimally invasive surgeries, and similar advances have simplified the management of several dreadful diseases. Both research and the healthcare industry have been greatly impacted by these new technologies. Even so, making health data accessible from one provider to another at the right time remains a major challenge, especially when patients are in a critical condition and their fragmented health records from multiple sources need to be brought together into a single chain. The proposed system aims to exchange health information on a blockchain platform to build a smart e-health system. In this system, the blockchain is a clinical data repository that provides patients with a complete, distributed ledger containing records of all events, along with seamless access to their electronic health records through healthcare providers. As an important feature, this system provides high security and integrity through cryptographic hash functions.
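
The abstract above gives no implementation details, so the following is only a minimal sketch of the general idea behind integrity through cryptographic hash functions: each record stores the hash of the previous one, so altering an earlier record is detectable. The function names, block layout and example records are hypothetical, and the sketch omits everything that makes a real blockchain distributed (consensus, networking, access control).

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, record):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "record": record},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, record):
    """Append a record, linking it to the hash of the previous block."""
    index = len(chain)
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "record": record,
        "hash": block_hash(index, prev_hash, record),
    })


def verify_chain(chain):
    """Return True only if every block's links and hashes are intact."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["record"]):
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical health events for one patient.
ledger = []
append_block(ledger, {"patient": "P-001", "event": "admission"})
append_block(ledger, {"patient": "P-001", "event": "lab result"})
intact = verify_chain(ledger)
```

Changing any stored record after the fact makes `verify_chain` return `False`, because the recomputed hash no longer matches the stored one.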


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
SiZhe Xiao ◽  
Tsz Yan Ng ◽  
Tao T. Yang

Purpose: The purpose of this paper is to look at the journey and experience of the University of Hong Kong (HKU) Research Data Management (RDM) practice in responding to the needs of researchers in an academic library. Design/methodology/approach: The research data services (RDS) practice is based on the FAIR data principles. The authors designed the RDM Stewardship framework to implement the RDS step by step. Findings: The HKU Libraries developed and implemented a set of RDS under a research data stewardship framework in response to the recent evolving research needs for RDM amongst the academic communities. The services cover policy and procedure settings for research data planning, research data infrastructure establishment, data curation services and provision of online resources and instructional guidelines. Originality/value: This study provides an example of how academic libraries can start offering RDS, including data policy, data repository, data librarianship and data curation.

