class weight
Recently Published Documents



2022 ◽  
Author(s):  
Bens Pardamean ◽  
Arif Budiarto ◽  
Bharuno Mahesworo ◽  
Alam Ahmad Hidayat ◽  
Digdo Sudigyo

Abstract Background: Sleep is commonly associated with physical and mental health status. Sleep quality can be determined from the dynamics of sleep stages during the night. Data from wearable devices can potentially be used as predictors to classify the sleep stage. A robust Machine Learning (ML) model is needed to learn the patterns in wearable data associated with sleep-wake classification, especially to handle the imbalanced proportion between wake and sleep stages. In this study, we used a publicly available dataset consisting of three features captured from a consumer wearable device and the labelled sleep stages from a polysomnogram. We implemented Random Forest, Support Vector Machine, Extreme Gradient Boosting Tree (XGB), Dense Neural Network (DNN), and Long Short-Term Memory (LSTM), complemented by three strategies to handle the imbalanced data problem. Results: In total, we included more than 24,815 rows of preprocessed data from 31 samples. The proportion of minority to majority data is 1:10. In classifying these extremely imbalanced data, the DNN model performed best, outperforming the previous best model, which is based on a basic Multi-Layer Perceptron. By including all features, our best model achieved a 12% higher specificity score (prediction score for the minority class) and a 1% improvement in the sensitivity score (prediction score for the majority class). This improvement is attributed to the custom class weight and oversampling strategy. In contrast, when we used only two features, XGB improved specificity by only 1% while keeping the sensitivity at the same level. Conclusions: The non-linear operations within the DNN model could successfully learn the hidden pattern from the combination of three features. Additionally, the class weight parameter prevented the model from ignoring the minority class by giving this class more weight in the loss function. The feature engineering process seemed to obscure the time-series characteristics within the data. This is why LSTM, as one of the best methods for time-series data, failed to perform well in this classification task.
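The role of the class weight in the loss function can be illustrated with a minimal sketch. It is not the authors' implementation: the inverse-frequency weighting rule, the function names, and the toy labels (1 = wake, 0 = sleep, at the 1:10 ratio reported above) are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    This is one common heuristic; the abstract does not say which
    rule the authors used for their custom class weight.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def weighted_cross_entropy(labels, probs, weights):
    """Weighted mean negative log-likelihood for binary labels.

    probs[i] is the predicted probability that sample i is class 1.
    """
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(labels, probs):
        p_true = p if y == 1 else 1.0 - p  # probability of the true class
        w = weights[y]
        total += -w * math.log(p_true)
        weight_sum += w
    return total / weight_sum


# Hypothetical toy data: 10 wake epochs (minority), 100 sleep epochs.
labels = [1] * 10 + [0] * 100
weights = balanced_class_weights(labels)

# A lazy model that always says "probably sleep" (P(wake) = 0.1).
lazy_probs = [0.1] * len(labels)
plain_loss = weighted_cross_entropy(labels, lazy_probs, {0: 1.0, 1: 1.0})
weighted_loss = weighted_cross_entropy(labels, lazy_probs, weights)
print(weights, plain_loss, weighted_loss)
```

With the weights applied, the lazy model that ignores the wake class is penalised far more heavily than under the unweighted loss, which is the effect the conclusion describes.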


Author(s):  
Hilmy Bahy Hakim ◽  
Fitri Utaminingrum ◽  
Agung Setia Budi

SARS-CoV-2, a novel coronavirus, causes the infection known as COVID-19. One of the symptoms most dangerous to patients is pneumonia developing in the lungs. One of the newest methods for detecting pneumonia is the convolutional neural network (CNN). However, even when pneumonia can be detected, the patient's likelihood of survival, which would help decide the priority for each patient, remains in question. This research uses CNNs to classify the patient's future condition, but faces two major problems: the dataset is very small and imbalanced. Image augmentation was used to enlarge the dataset, and class weight was applied to prevent misclassification of the minority class. Six CNN architectures were compared to find the best model. VGG19 had the best overall accuracy: 80% in training, 89% in validation, and an 82% F1 score on the testing dataset. This makes it the best model when prediction accuracy is the priority, but at the cost of the longest prediction time of all the CNN architectures compared. MobileNet is the fastest, but its prediction accuracy is much worse, at only 55%. The ResNet50 model balances prediction accuracy and time, with a 77% F1 score and a prediction time of 8.49 seconds, 9 seconds less than VGG19.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Nouar AlDahoul ◽  
Hezerul Abdul Karim ◽  
Abdulaziz Saleh Ba Wazir

Abstract Network anomaly detection remains an open and challenging task that aims to detect anomalous network traffic for security purposes. Network traffic data are usually large-scale and imbalanced, and they also have noisy labels. This paper addresses these challenges using ZYELL's million-scale, highly imbalanced dataset. We propose to train deep neural networks with class weight optimization to learn complex patterns from rare anomalies observed in the traffic data. This paper proposes a novel model fusion that combines two deep neural networks: a binary normal/attack classifier and a multi-attack classifier. The proposed solution can detect various network attacks such as Distributed Denial of Service (DDOS), IP probing, PORT probing, and Network Mapper (NMAP) probing. The experiments conducted on ZYELL's real-world dataset show promising performance. The proposed approach outperformed the baseline model in terms of average macro Fβ score and false alarm rate by 17% and 5.3%, respectively.
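The two evaluation metrics named above can be sketched as follows. This is a generic illustration, not the paper's evaluation code: the value of β, the class names, and the toy predictions are all assumptions, and the false alarm rate is taken here as the share of normal traffic flagged as any attack.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta from counts: (1 + b^2) * TP / ((1 + b^2) * TP + b^2 * FN + FP)."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def macro_f_beta(y_true, y_pred, beta=1.0):
    """Unweighted mean of the per-class F-beta scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        scores.append(f_beta(tp, fp, fn, beta))
    return sum(scores) / len(scores)


def false_alarm_rate(y_true, y_pred, normal="normal"):
    """Fraction of truly normal samples predicted as an attack."""
    normal_preds = [p for t, p in zip(y_true, y_pred) if t == normal]
    if not normal_preds:
        return 0.0
    return sum(1 for p in normal_preds if p != normal) / len(normal_preds)


# Hypothetical labels, for illustration only.
y_true = ["normal"] * 6 + ["ddos"] * 2 + ["probe"] * 2
y_pred = ["normal"] * 5 + ["ddos"] + ["ddos", "ddos"] + ["probe", "normal"]

macro_score = macro_f_beta(y_true, y_pred, beta=1.0)
far = false_alarm_rate(y_true, y_pred)
print(macro_score, far)
```

Because the macro average gives every class the same weight, rare attack classes count as much as the dominant normal class, which is why it suits highly imbalanced traffic data.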


2021 ◽  
pp. 016555152199102
Author(s):  
Naif Radi Aljohani ◽  
Ayman Fayoumi ◽  
Saeed-Ul Hassan

We argue that citations, as they have different reasons and functions, should not all be treated in the same way. Using a large dataset of about 10K citation contexts annotated by human experts, extracted from the Association for Computational Linguistics repository, we present a deep learning–based citation context classification architecture. Unlike all existing state-of-the-art feature-based citation classification models, our proposed convolutional neural network (CNN) with fastText-based pre-trained embedding vectors uses only the citation context as its input to outperform them in both binary- (important and non-important) and multi-class (Use, Extends, CompareOrContrast, Motivation, Background, Other) citation classification tasks. Furthermore, we propose using focal-loss and class-weight functions in the CNN model to overcome the inherent class imbalance issues in citation classification datasets. We show that using the focal-loss function with CNN adds a factor of [Formula: see text] to the cross-entropy function. Our model improves on the baseline results by achieving an encouraging 90.6 F1 score with 90.7% accuracy and a 72.3 F1 score with a 72.1% accuracy score, respectively, for binary- and multi-class citation classification tasks.
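For orientation, the widely used focal-loss formulation multiplies the cross-entropy term by a modulating factor (1 - p_t)^γ, optionally scaled by a per-class weight α. The sketch below shows that general formulation only. Whether it matches the factor elided in the abstract, and the values of γ and α used here, are assumptions.

```python
import math


def cross_entropy(p_t):
    """Negative log-likelihood, where p_t is the probability of the true class."""
    return -math.log(p_t)


def focal_loss(p_t, gamma=2.0, alpha=1.0):
    """Cross-entropy scaled by alpha * (1 - p_t) ** gamma.

    gamma down-weights easy, well-classified examples;
    alpha can act as a per-class weight (hypothetical values here).
    """
    return alpha * (1.0 - p_t) ** gamma * cross_entropy(p_t)


# An easy example (p_t = 0.9) contributes far less than a hard one (p_t = 0.1).
easy = focal_loss(0.9)
hard = focal_loss(0.1)
print(easy, hard)
```

Setting gamma to 0 and alpha to 1 recovers plain cross-entropy, so the focal term can be read as a per-example adjustment applied on top of any class weighting.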


2020 ◽  
Vol 22 (12) ◽  
pp. 2188-2195
Author(s):  
Xiaozhao Yousef Yang

Abstract Introduction There is growing attention to social mobility’s impact on tobacco use, but few studies have differentiated the two conceptually distinct mechanisms through which changes in social class can affect tobacco smoking: the class status effect and the mobility effect. Aims and Methods I applied Diagonal Reference Modeling to smoking and heavy smoking among respondents of the 1991 China Health and Nutrition Survey who were revisited two decades later in 2011 (n = 3841, 49% male, baseline mean age was 38 years). I divided the sample into six social classes (non-employment, self-employed, owners, workers, farmers, and retirees) and measured social mobility by changes in income and occupational prestige. Results About 61.7% of men were smokers, and those from the classes of workers, owners, and the self-employed consumed more cigarettes than the unemployed, but women smokers (3.7%) tended to be from the lower classes (unemployed and farmers). Controlling for social class, each 1000 Yuan increase in annual income led to smoking 0.03 more cigarettes (p < .05) and a 1% increase (p < .05) in the likelihood of heavy smoking among men, but the income effect is null for women. Upwardly mobile men (a 10-point surge in occupational prestige) smoked like their destination class (weight = 78%), whereas men with downward mobility were more similar to peers in the original class (weight = 60%). Conclusions Contrary to the social gradient in smoking in other industrial countries, higher class status and upward mobility are each associated with more smoking among Chinese men, but not among women. Implications Tobacco control policies should prioritize male smoking at workplaces and the instrumental purposes of using tobacco as gifts and social lubricant. Taxation may counter the surge in smoking brought by individuals’ income increase after upward mobility. Attention should be paid to women joining a similar social gradient in smoking as they gain a foothold in the labor market.
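The class weights reported above can be read through the core equation of a diagonal reference model, in which a mobile person's expected outcome is a weighted average of the means of non-mobile members of the origin and destination classes. The sketch below shows only that weighted average, without covariates. The class means are hypothetical numbers, not results from the study.

```python
def drm_prediction(mu_origin, mu_destination, w_destination):
    """Expected outcome: w * mu_destination + (1 - w) * mu_origin."""
    if not 0.0 <= w_destination <= 1.0:
        raise ValueError("w_destination must lie in [0, 1]")
    return w_destination * mu_destination + (1.0 - w_destination) * mu_origin


# Hypothetical mean cigarettes per day for non-mobile men in two classes.
mu_farmers = 10.0
mu_owners = 14.0

# Upward mobility: destination class weighted 0.78, as reported above.
upward = drm_prediction(mu_farmers, mu_owners, w_destination=0.78)

# Downward mobility: origin class weighted 0.60, so destination gets 0.40.
downward = drm_prediction(mu_owners, mu_farmers, w_destination=0.40)
print(upward, downward)
```

In both directions the prediction is pulled toward the heavier-smoking class in this toy example, which matches the pattern the abstract reports for men.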


Modern society accesses various network services through different devices. However, the services offered by service providers face various challenges and threats that aim to degrade the performance of the service or of the entire network. A number of earlier approaches restrict illegal access by malicious users using service-level, packet-level, and user-level features. However, they struggle to achieve high performance in intrusion detection. To improve intrusion detection performance, this paper proposes a novel tree-based ensemble learner algorithm. The method incorporates Random Forest and Random Trees, which are identified as NP complete. The method maintains a list of ensembles indexed under trees. At classification time, the Tabu Search algorithm is used to measure the ensemble class weight (ECW), which is then used to perform classification. Based on the intrusion detection result, an alert is generated for the administrator. The proposed algorithm improves the performance of intrusion detection.


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 67059-67074
Author(s):  
Wei Zhang ◽  
Jun Huang ◽  
Hao Nan Chen ◽  
Md. Fazla Elahe ◽  
Min Jin

2019 ◽  
Vol 8 (4) ◽  
pp. 7318-7322

The problem of bipolar disorder (BD) has been well studied and analyzed. A number of approaches are available for detecting the presence of BD, and the detection results have been used in several ways. To improve BD detection performance and to use the results in gauging the performance of students, this paper presents a behavioral-pattern-based psychotic analysis model. The method maintains the behaviors, habits, and interests of different students over different periods of time. The student behaviors include mood change, depression, sudden laughter, lack of interest, short temper, lack of concentration, stubbornness, frustration, energy, sleep, and so on. These behaviors are tracked for a number of students over a prolonged period and stored in the behavior set. By reading the behavior set together with the identified samples of BD, the method generates a set of behavioral patterns. The behavioral patterns are generated for three classes: low, medium, and high. For each class of behavioral pattern, the method generates a set of fuzzy rules. Using the fuzzy rules, each student's behavioral pattern is analyzed in different time windows. Based on the patterns, the method estimates the BDCW (Bipolar Disorder Class Weight). Based on this weight measure, the presence of BD is identified and classified under different classes. The identified results are used to generate academic patterns and analysis results that help improve student performance. The proposed approach improves the performance of student development, monitoring, and health development.


The problem of web document clustering has been well studied. Web documents have been grouped based on various features, such as textual, topical, and semantic features. A number of approaches to clustering web documents have been discussed earlier. However, these methods do not produce promising results for web document clustering. To overcome this, this paper presents an efficient approach based on hierarchical semantic relational coverage. The method extracts the features of a web document by preprocessing the document. The extracted features are used to compute the semantic relational coverage measure at different levels. As the documents are grouped in a hierarchical manner, the method estimates the relational coverage measure at each level of the cluster. Based on the semantic relational measure at the different levels, the method estimates the topical semantic support measure. Using these two measures, the method computes the class weight. The estimated class weight is used to perform document clustering. The proposed method improves the performance of document clustering and reduces the false classification ratio.

