COVID-19 County Level Severity Classification with Imbalanced Class: A NearMiss Under-sampling Approach

Author(s):  
Timothy Oladunni ◽  
Sourou Tossou ◽  
Yayehyrad Haile ◽  
Adonias Kidane

The COVID-19 pandemic that broke out in late 2019 has spread across the globe. The disease has infected millions of people, and thousands of lives have been lost. The introduction of vaccines has slowed the momentum of the disease; however, some countries are still recording high numbers of casualties. The focus of this work is to design, develop, and evaluate a machine learning classifier of county-level COVID-19 severity. The proposed model predicts the severity of the disease in a county as low, moderate, or high. Policy makers will find the work useful in the distribution of vaccines. Four learning algorithms (two ensembles and two non-ensembles) were trained and evaluated. Class imbalance was addressed using NearMiss under-sampling of the majority classes. The results of our experiment show that the ensemble models outperformed the non-ensemble models by a considerable margin.
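
The NearMiss under-sampling named in this abstract can be illustrated with a minimal sketch. The abstract does not state which NearMiss variant or parameters were used, so the code assumes NearMiss-1 (keep the majority-class samples whose mean distance to their k nearest minority-class samples is smallest), Euclidean distance, and k = 3. The function name `near_miss_1` and the toy two-feature data are hypothetical.

```python
import math
from collections import Counter


def near_miss_1(X, y, k=3):
    """Under-sample every non-minority class down to the minority class size.

    Assumption: NearMiss-1 selection rule with Euclidean distance.
    """
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    n_keep = counts[minority]
    minority_pts = [x for x, label in zip(X, y) if label == minority]

    # All minority samples are kept unchanged.
    keep = [i for i, label in enumerate(y) if label == minority]

    for cls in counts:
        if cls == minority:
            continue
        scored = []
        for i, (x, label) in enumerate(zip(X, y)):
            if label != cls:
                continue
            # Mean distance to the k nearest minority samples.
            nearest = sorted(math.dist(x, m) for m in minority_pts)[:k]
            scored.append((sum(nearest) / len(nearest), i))
        scored.sort()
        keep.extend(i for _, i in scored[:n_keep])

    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data (hypothetical): three severity levels with unequal counts.
X = [(0.0, 0.0), (0.0, 1.0),                          # high (minority)
     (1.0, 0.0), (2.0, 0.0), (5.0, 0.0), (9.0, 0.0),  # low
     (0.0, 2.0), (0.0, 3.0), (0.0, 8.0)]              # moderate
y = ["high"] * 2 + ["low"] * 4 + ["moderate"] * 3
X_res, y_res = near_miss_1(X, y, k=2)
```

After resampling, each class has as many samples as the smallest class. The discarded majority samples are those farthest from the minority class, which is the trade-off of this variant: it keeps the samples near the class boundary and drops the rest.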

2020 ◽  
Vol 8 (2) ◽  
pp. 89-93 ◽  
Author(s):  
Hairani Hairani ◽  
Khurniawan Eko Saputro ◽  
Sofiansyah Fadli

An imbalanced class distribution in a dataset biases classification results toward the class with the most data (the majority class). A sampling method is needed to balance the minority class (positive class) so that the class distribution becomes balanced, leading to better classification results. This study addresses the class imbalance problem on the Pima Indians diabetes dataset using k-means-SMOTE. The dataset has 268 instances of the positive class (minority class) and 500 instances of the negative class (majority class). Classification was performed by comparing C4.5, SVM, and naive Bayes, with k-means-SMOTE applied in data sampling. With k-means-SMOTE, SVM achieved the highest accuracy and sensitivity, 82% and 77% respectively, while naive Bayes produced the highest specificity, 89%.
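
The three metrics reported here (accuracy, sensitivity, specificity) all derive from the binary confusion matrix. The sketch below shows the standard definitions. The function name `binary_metrics` and the counts passed to it are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,    # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),    # recall on the positive (minority) class
        "specificity": tn / (tn + fp),    # recall on the negative (majority) class
    }


# Hypothetical counts, for illustration only.
scores = binary_metrics(tp=8, fn=2, tn=15, fp=5)
```

On imbalanced data, accuracy alone can look good while sensitivity is poor, which is why sensitivity and specificity are reported separately.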


2021 ◽  
Author(s):  
Timothy Oladunni ◽  
Justin Stephan ◽  
Lala Aicha Coulibaly

SARS-CoV-2 needs no introduction. The global pandemic that originated more than a year ago in Wuhan, China, has claimed thousands of lives. Since the arrival of this plague, face masks have become part of our dress code. The focus of this study is to design, develop, and evaluate a COVID-19 fatality rate classifier at the county level. The proposed model predicts the fatality rate as low, moderate, or high. This will help governments and decision makers to improve mitigation strategies and provide measures to reduce the spread of the disease. Tourists and travelers will also find the work useful in planning trips. The dataset used in the experiment contained imbalanced fatality levels, so class imbalance was offset using SMOTE. Evaluation of the proposed model was based on precision, F1 score, accuracy, and the ROC curve. Five learning algorithms were trained and evaluated. Experimental results showed that the Bagging model had the best performance.
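
The core step of SMOTE, which this abstract uses to offset class imbalance, is interpolation between a minority sample and one of its nearest minority neighbours. The sketch below is a minimal version of that step, assuming Euclidean distance and k = 2. The function name `smote`, its parameters, and the toy points are hypothetical, and the study's actual settings are not given in the abstract.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # The k nearest minority neighbours of the chosen base sample.
        others = [p for j, p in enumerate(minority) if j != base_idx]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy minority class (hypothetical feature vectors).
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote(minority_points, n_new=5, k=2, seed=1)
```

Every synthetic point lies on a line segment between two real minority samples, so the method adds new points rather than duplicating existing ones.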


Author(s):  
Muhammad Shoaib Farooq

Purpose: Although entrepreneurial behaviour is considered a key element of economic development, very little is known about the factors that determine entrepreneurial intention and behaviour. To bridge this gap, this paper investigates the role of social support and entrepreneurial skills in determining the entrepreneurial behaviour of individuals. Building on the theory of planned behaviour (TPB), the study investigates the relationship between social support, entrepreneurial skills and entrepreneurial behaviour along with the existing constructs of the TPB (i.e. attitude, subjective norms, perceived behavioural control and entrepreneurial intention).
Design/methodology/approach: Data were collected from 281 respondents using a simple random sampling method, and the variance-based partial least-squares structural equation modelling (PLS-SEM) approach was used to test the proposed conceptual model.
Findings: The findings validate the proposed model, which has an explanatory power of 68.3 per cent. They also reveal that social support and entrepreneurial skills have a significant impact on the entrepreneurial intention of individuals. However, an unanticipated, non-significant relation between subjective norms and entrepreneurial intention is also found.
Research limitations/implications: Owing to the limited scope of this study, a multi-group analysis was not possible, which is a limitation. In addition, owing to time constraints, the study was conducted within a specified time-frame; a longitudinal study over a period of three to six years could overcome this limitation.
Practical implications: The findings are expected to have substantial implications for policy makers, future researchers and academicians. The outcomes can help to better understand the cognitive phenomenon of nascent entrepreneurs. The study is also expected to guide policy makers in developing better entrepreneurial development programmes, policies and initiatives for promoting self-employment behaviour.
Originality/value: The findings offer new insights towards a better understanding of the determinants of entrepreneurial behaviour. The study extends Ajzen’s (1991) TPB in the context of entrepreneurial behaviour. By introducing and investigating the impact of two new variables in the TPB, social support and entrepreneurial skills, and by validating the proposed model with the PLS-SEM approach, it makes a theoretical, methodological and contextual contribution to the body of knowledge.


Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 429
Author(s):  
Linhui Li ◽  
Xin Sui ◽  
Jing Lian ◽  
Fengning Yu ◽  
Yafu Zhou

The structured road is a scene with high interaction between vehicles, but because vehicle behavior is highly uncertain, predicting vehicle interaction behavior remains a challenge. This prediction is significant for controlling the ego-vehicle. We propose an interaction behavior prediction model based on a vehicle cluster (VC) with self-attention (VC-Attention) to improve prediction performance. First, a five-vehicle cluster structure is designed to extract the interactive features between the ego-vehicle and the target vehicle, such as Deceleration Rate to Avoid a Crash (DRAC) and the lane gap. In addition, the proposed model uses a sliding window algorithm to extract VC behavior information. Next, the temporal characteristics of the three interactive features are captured by two self-attention encoder layers, each with six heads. Finally, the target vehicle’s future behavior is predicted by a sub-network consisting of a fully connected layer and a SoftMax module. The experimental results show that this method achieves accuracy, precision, recall, and F1 score of more than 92% and a time to event of 2.9 s on the Next Generation Simulation (NGSIM) dataset. It accurately predicts interactive behaviors under class imbalance and adapts to various driving scenarios.
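
Two building blocks named in this abstract, the sliding window and self-attention, can be sketched in a few lines. This is a deliberately reduced illustration, not the VC-Attention model: it assumes a single attention head with no learned projections (queries, keys, and values are all the raw window), whereas the paper describes two encoder layers with six heads. The function names and the toy feature values are hypothetical.

```python
import math


def sliding_windows(seq, size, step=1):
    """Split a time series into overlapping windows of fixed length."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, step)]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(window):
    """Scaled dot-product self-attention with Q = K = V = window (simplification)."""
    d = len(window[0])
    output = []
    for q in window:
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d) for key in window]
        weights = softmax(scores)
        # Each output step is a weighted average of all time steps in the window.
        output.append([sum(w * v[j] for w, v in zip(weights, window)) for j in range(d)])
    return output


# Toy series of two-dimensional feature vectors per time step (hypothetical values).
series = [[0.1, 1.0], [0.2, 0.9], [0.4, 0.7], [0.8, 0.4], [1.0, 0.2]]
windows = sliding_windows(series, size=3)
attended = [self_attention(w) for w in windows]
```

Each output vector mixes information from every time step in its window, which is how attention captures temporal dependencies without recurrence.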


Author(s):  
Noviyanti Santoso ◽  
Wahyu Wibowo ◽  
Hilda Hikmawati

In data mining, class imbalance is a problematic issue. This is probably because machine learning algorithms are built on the assumption that the number of instances in each class is balanced, so when they are applied to imbalanced classes the prediction results may be inappropriate. Solutions offered for class imbalance include oversampling, undersampling, and the synthetic minority oversampling technique (SMOTE). Both oversampling and undersampling have their disadvantages, so SMOTE is an alternative that overcomes them. Integrating SMOTE into data mining classification methods such as Naive Bayes, Support Vector Machine (SVM), and Random Forest (RF) is expected to improve accuracy. In this research, the SMOTE-processed data gave better accuracy than the original data. Among the three classification methods used, RF gave the highest average AUC, F-measure, and G-mean scores.
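
The two baseline strategies this abstract contrasts with SMOTE, random oversampling and random undersampling, can be sketched as index selection over the label list. The function names and toy labels are hypothetical, and the sketch assumes uniform random selection.

```python
import random
from collections import Counter


def random_oversample(y, seed=0):
    """Return indices that duplicate smaller-class samples up to the largest class size."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    indices = list(range(len(y)))
    for label, n in counts.items():
        pool = [i for i, l in enumerate(y) if l == label]
        indices.extend(rng.choice(pool) for _ in range(target - n))
    return indices


def random_undersample(y, seed=0):
    """Return indices that keep only as many samples per class as the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    indices = []
    for label in counts:
        pool = [i for i, l in enumerate(y) if l == label]
        indices.extend(rng.sample(pool, target))
    return sorted(indices)


labels = ["neg"] * 6 + ["pos"] * 2  # toy imbalanced labels
over_idx = random_oversample(labels)
under_idx = random_undersample(labels)
```

The sketch makes the disadvantages concrete: oversampling repeats the same minority indices, which can encourage overfitting, while undersampling discards majority samples and the information they carry.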


Humanomics ◽  
2016 ◽  
Vol 32 (3) ◽  
pp. 230-247 ◽  
Author(s):  
Permata Wulandari ◽  
Salina Kassim ◽  
Liyu Adhi Kasari Sulung ◽  
Niken Iwani Surya Putri

Purpose: This paper aims to highlight the unique aspects of Islamic microfinance based on the experience of Baitul Maal Wa Tamwil (BMT) in Indonesia.
Design/methodology/approach: The paper adopts the content analysis approach and focuses on three phases of financing, namely pre-financing, financing and post-financing, using coding and model building. Data are collected through in-depth interviews with a sample of representatives of BMTs that offer products based on Islamic principles for the poor, located in Jakarta, Bogor, Depok, Tangerang and Bekasi (JABODETABEK), Sulawesi Selatan, Yogyakarta and Nusa Tenggara Barat (the sample was chosen from the areas with the highest concentration of Islamic microfinance offering products based on Islamic principles). Ultimately, a model based on the unique features of Islamic microfinance is developed from the findings of the content analysis.
Findings: The proposed model incorporates the peculiarities of poor people in the pre-financing, financing and post-financing activities of micro-financing products, to serve as a reference for policy makers. The paper also finds that each region has unique product preferences depending on the characteristics of the poor.
Research limitations/implications: This study is conducted in only four areas of Indonesia with BMT representation, namely Jakarta, Bogor, Depok, Tangerang and Bekasi (often abbreviated as JABODETABEK), Sulawesi Selatan, Yogyakarta and Nusa Tenggara Barat. Despite the limited scope, the findings have wide application to Islamic microfinancing in general.
Originality/value: The paper adds value to the literature on Islamic microfinance by enabling researchers and practitioners to understand the three-step financing model (pre-financing, financing and post-financing) in Islamic microfinance in Indonesia. Although the issue is not new, the paper documents pre-financing, financing and post-financing practices that may differ from those of Islamic microfinance in other settings because of cultural influences unique to each region.


Author(s):  
Changxu Dong ◽  
Yanna Zhao ◽  
Gaobo Zhang ◽  
Mingrui Xue ◽  
Dengyu Chu ◽  
...  

Epilepsy is a chronic brain disease resulting from lesions of the central nervous system, which cause repeated seizures in patients. Automatic seizure detection with the electroencephalogram (EEG) has made great progress. However, existing methods pay little attention to the topological relationships between EEG electrodes, even though recent neuroscience research has demonstrated connectivity between different brain regions. In addition, class imbalance is a common problem in EEG-based seizure detection, because epileptic EEG signals are much shorter in duration than normal signals. To address these two challenges, we propose to model multi-channel EEG data using the Attention-based Graph ResNet (AGRN). In particular, each channel of the EEG signal represents a node of the graph, and the inter-channel relations are modeled via the adjacency matrix of the graph. The loss function of the AGRN model is redesigned using focal loss to cope with the class-imbalance problem. The proposed AGRN with focal loss can learn discriminative features from raw EEG data. Experiments are carried out on the CHB-MIT dataset. The proposed model achieves an average accuracy of 98.70%, a sensitivity of 97.94%, a specificity of 98.66%, and a precision of 98.62%. The Area Under the ROC Curve (AUC) is 98.69%.
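
The focal loss used here for class imbalance down-weights easy examples so that training concentrates on hard ones. A minimal sketch of the binary form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), follows. The values gamma = 2 and alpha = 0.25 are commonly used defaults and are an assumption, since the abstract does not report the settings; the function name is hypothetical.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Focal loss for one example.

    p: predicted probability of the positive class; y: true label (0 or 1).
    gamma and alpha are assumed defaults, not values from the paper.
    """
    p_t = p if y == 1 else 1.0 - p              # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)               # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.9, 1)  # confidently correct: heavily down-weighted
hard = binary_focal_loss(0.1, 1)  # confidently wrong: dominates the loss
```

With gamma = 0 and alpha_t = 1 the expression reduces to ordinary cross-entropy, so gamma controls how strongly well-classified examples are suppressed.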


2021 ◽  
Vol 9 (1) ◽  
pp. 52-68
Author(s):  
Lipika Goel ◽  
Mayank Sharma ◽  
Sunil Kumar Khatri ◽  
D. Damodaran

Often, prior defect data for the same project are unavailable, so researchers have asked whether defect data from other projects can be used for prediction. This has made cross-project defect prediction an open research issue. In this approach, the training data often suffer from the class imbalance problem. This work focuses on homogeneous cross-project defect prediction. A novel ensemble model that works in two stages is proposed. First, it handles the class imbalance problem of the dataset. Second, it predicts the target class. To handle the imbalance problem, the training dataset is divided into data frames, and each data frame is balanced. An ensemble model using the maximum voting of all random forest classifiers is implemented. The proposed model shows better performance than the baseline models. The Wilcoxon signed-rank test is performed to validate the proposed model.
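
The two-stage idea in this abstract (balanced data frames, then voting across per-frame classifiers) can be sketched as follows. The abstract does not say how the frames are built, so the sketch assumes each frame pairs one chunk of majority samples with all minority samples. To stay dependency-free, a nearest-centroid classifier stands in for the random forests used in the paper. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def balanced_frames(X, y, majority, minority):
    """Pair each minority-sized chunk of majority samples with all minority samples.

    Assumption: the last chunk may be smaller if the sizes do not divide evenly.
    """
    maj = [x for x, label in zip(X, y) if label == majority]
    mino = [x for x, label in zip(X, y) if label == minority]
    frames = []
    for start in range(0, len(maj), len(mino)):
        chunk = maj[start:start + len(mino)]
        frames.append((chunk + mino, [majority] * len(chunk) + [minority] * len(mino)))
    return frames


def fit_centroids(X, y):
    """Stand-in base learner (NOT a random forest): one centroid per class."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(x)
    return {label: [sum(col) / len(pts) for col in zip(*pts)]
            for label, pts in groups.items()}


def predict_centroid(model, x):
    return min(model, key=lambda label: math.dist(model[label], x))


def ensemble_predict(models, x):
    """Majority (maximum) voting over the per-frame models."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): six clean modules, two defective ones.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (0.2, 0.8),
     (5.0, 5.0), (6.0, 5.0)]
y = ["clean"] * 6 + ["defect"] * 2
models = [fit_centroids(fx, fy) for fx, fy in balanced_frames(X, y, "clean", "defect")]
prediction = ensemble_predict(models, (4.5, 5.0))
```

Because every frame reuses all minority samples, no minority data are discarded, while each base learner still trains on a balanced set.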

