Classification of Medical Text Data Using Convolutional Neural Network-Support Vector Machine Method

2020 ◽  
Vol 10 (7) ◽  
pp. 1746-1753
Author(s):  
Lan Liu ◽  
Xiankun Sun ◽  
Chengfan Li ◽  
Yongmei Lei

Conventional methods of medical text data classification neglect the context among words and their semantic information, which leads to poor text description, classification performance, generalization capability, and robustness. To tackle the inefficiency and low precision in the classification of medical text data, this paper presents a new classification method that combines an improved convolutional neural network (CNN) with a support vector machine (SVM), i.e., the CNN-SVM method. In the method, the convolution kernel filters that contribute greatly to the CNN model are first selected by their average response energy (ARE) value and then used to simplify and reconstruct the CNN model. Next, the SVM classifier is optimized by the firefly algorithm (FA) and context information to overcome the over-saturation and over-training that affect SVM classification. Finally, the CNN-SVM method is tested in a simulation experiment and on the classification of real medical text data. The experimental results show that the CNN-SVM method significantly reduces complexity and the amount of computation compared with conventional methods, and further improves the computational efficiency and classification accuracy for medical text data.
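The abstract does not define ARE, so the following is only a minimal sketch of the filter-selection step under a stated assumption: a filter's ARE is taken to be the mean squared activation of its feature map over a batch. The function names, array shapes, and synthetic activations are hypothetical and are not taken from the paper.

```python
import numpy as np

def average_response_energy(feature_maps):
    """feature_maps has shape (n_samples, n_filters, length).

    Assumed definition (not from the paper): the ARE of a filter is the
    mean of its squared activations over all samples and positions.
    """
    return np.mean(feature_maps ** 2, axis=(0, 2))

def select_filters(feature_maps, keep):
    """Return the indices of the `keep` filters with the largest ARE, highest first."""
    are = average_response_energy(feature_maps)
    return np.argsort(are)[::-1][:keep]

rng = np.random.default_rng(0)
# Hypothetical activations: 8 samples, 6 filters, 20 positions each.
maps = rng.normal(size=(8, 6, 20))
maps[:, 2] *= 5.0  # filter 2 responds strongly
maps[:, 4] *= 3.0  # filter 4 responds moderately
kept = select_filters(maps, keep=2)
print(kept)
```

Only the retained filters would then be used to rebuild a smaller CNN; that reconstruction step is omitted here.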

2021 ◽  
pp. 102568
Author(s):  
Mesut Ersin Sonmez ◽  
Numan Eczacıoglu ◽  
Numan Emre Gumuş ◽  
Muhammet Fatih Aslan ◽  
Kadir Sabanci ◽  
...  

Author(s):  
Sumit S. Lad ◽  
Amol C. Adamuthe

Malware is a threat to people in the cyber world. It steals personal information and harms computer systems. Developers and information security specialists around the globe continuously work on strategies for detecting malware. Over the last few years, many researchers have investigated machine learning for malware classification. The existing solutions require substantial computing resources and are not efficient for datasets with large numbers of samples, and using existing feature extractors to extract image features consumes further resources. This paper presents a Convolutional Neural Network model with pre-processing and augmentation techniques for the classification of malware gray-scale images. An investigation is conducted on the Malimg dataset, which contains 9339 gray-scale images created from malware binaries belonging to 25 different families. To create a precise approach, and considering the success of deep learning techniques and the rising volume of newly created malware, we propose a CNN model and a hybrid CNN+SVM model. The CNN is used as an automatic feature extractor that uses fewer resources and less time than the existing methods. The proposed CNN model achieves 98.03% accuracy, which is better than other existing CNN models, namely VGG16 (96.96%), ResNet50 (97.11%), InceptionV3 (97.22%), and Xception (97.56%). The execution time of the proposed CNN model is also significantly lower than that of the other existing CNN models. The proposed CNN model is then hybridized with a support vector machine: instead of using Softmax as the activation function, an SVM classifies the malware based on the features extracted by the CNN model. The fine-tuned CNN produces a well-selected feature vector from an FC layer of 256 neurons, which is the input to the SVM. The linear SVC extends the binary SVM classifier to a multi-class SVM, which classifies the malware samples using the one-against-one method and delivers an accuracy of 99.59%.
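As a rough illustration of the one-against-one scheme, the sketch below counts the pairwise classifiers needed for 25 families and combines their decisions by majority vote. The binary decider is a hypothetical nearest-centroid stand-in, not a trained SVM, and the 1-D centroids are invented for the example.

```python
from collections import Counter
from itertools import combinations

def n_pairwise_classifiers(n_classes):
    """One-against-one trains one binary classifier per unordered class pair."""
    return n_classes * (n_classes - 1) // 2

def one_vs_one_predict(x, classes, binary_decide):
    """Let every class pair vote and return the class with the most votes."""
    votes = Counter(binary_decide(a, b, x) for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]

# Hypothetical stand-in for a trained binary SVM: choose the nearer centroid.
centroids = {0: 0.0, 1: 5.0, 2: 10.0}

def nearest_centroid(a, b, x):
    return a if abs(x - centroids[a]) <= abs(x - centroids[b]) else b

n_models = n_pairwise_classifiers(25)  # the 25 Malimg families
predicted = one_vs_one_predict(4.2, sorted(centroids), nearest_centroid)
print(n_models, predicted)
```

With 25 families this gives 300 binary classifiers, each trained on the samples of only two families.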


2021 ◽  
Vol 9 ◽  
Author(s):  
Ashwini K ◽  
P. M. Durai Raj Vincent ◽  
Kathiravan Srinivasan ◽  
Chuan-Yu Chang

Neonatal infants communicate with us through cries. Infant cry signals have distinct patterns depending on the purpose of the cry. For audio signals, preprocessing, feature extraction, and feature selection need expert attention and take much effort. Deep learning techniques automatically extract and select the most important features, but they require an enormous amount of data for effective classification. This work mainly discriminates neonatal cries into pain, hunger, and sleepiness. The neonatal cry auditory signals are transformed into spectrogram images using the short-time Fourier transform (STFT). The deep convolutional neural network (DCNN) takes the spectrogram images as input. The features obtained from the convolutional neural network are passed to a support vector machine (SVM) classifier, so that a machine learning technique classifies the neonatal cries. This work combines the advantages of machine learning and deep learning techniques to get the best results even with a moderate number of data samples. The experimental results show that CNN-based feature extraction with an SVM classifier provides promising results. Comparing the SVM kernels, namely radial basis function (RBF), linear, and polynomial, SVM-RBF provides the highest accuracy for the kernel-based infant cry classification system, at 88.89%.
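The signal-to-spectrogram step can be sketched as follows. This is a minimal magnitude STFT in NumPy; the frame length, hop size, Hann window, sampling rate, and the synthetic 500 Hz tone are all assumptions chosen for illustration and are not the settings used in the study.

```python
import numpy as np

def stft_spectrogram(signal, frame_len=256, hop=128):
    """Magnitude STFT: rows are frequency bins, columns are time frames."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1)).T

sr = 8000                              # assumed sampling rate in Hz
t = np.arange(sr) / sr                 # one second of samples
tone = np.sin(2 * np.pi * 500 * t)     # synthetic stand-in for a cry recording
spec = stft_spectrogram(tone)
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * sr / 256          # bin index converted to frequency
print(spec.shape, peak_hz)
```

In a pipeline like the one described, the resulting 2-D array would be rendered or rescaled as an image before being passed to the CNN.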


2020 ◽  
Vol 17 (4) ◽  
pp. 572-578
Author(s):  
Mohammad Parseh ◽  
Mohammad Rahmanimanesh ◽  
Parviz Keshavarzi

Persian handwritten digit recognition is an important topic in image processing that has received considerable attention from researchers due to its many applications. The most important challenge in Persian handwritten digit recognition is the variety of patterns in Persian digit writing, which complicates the feature extraction step. Since handcrafted feature extraction methods are complicated processes and their performance is not stable, most recent studies have concentrated on proposing a suitable method for automatic feature extraction. In this paper, an automatic method based on machine learning is proposed for high-level feature extraction from Persian digit images using a Convolutional Neural Network (CNN). A non-linear multi-class Support Vector Machine (SVM) classifier is then used for classification in place of the fully connected layer in the final layer of the CNN. The proposed method has been applied to the HODA dataset and obtained a recognition rate of 99.56%. Experimental results are comparable with previous state-of-the-art methods.


Energies ◽  
2020 ◽  
Vol 13 (2) ◽  
pp. 460 ◽  
Author(s):  
Zuojun Liu ◽  
Cheng Xiao ◽  
Tieling Zhang ◽  
Xu Zhang

In wind power generation, one aim of wind turbine control is to keep the turbine in a safe operational status while achieving cost-effective operation. The purpose of this paper is to investigate new techniques for wind turbine fault detection based on supervisory control and data acquisition (SCADA) system data in order to avoid unscheduled shutdowns. The proposed method starts with analyzing and determining the fault indicators corresponding to a failure mode. Three main system failures, namely generator failure, converter failure and pitch system failure, are studied. First, the indicator data corresponding to each of the three key failures are extracted from the SCADA system, and radar charts are generated. Secondly, a convolutional neural network with ResNet50 as the backbone network is selected, and the fault model is trained on the radar charts to detect the fault and calculate the detection evaluation indices. Thirdly, a support vector machine classifier is trained to achieve fault detection. In order to show the effectiveness of the proposed radar chart-based methods, support vector regression analysis is also employed to build the fault detection model. By analyzing and comparing the fault detection accuracy among these three methods, it is found that the fault detection accuracy of the models developed using the convolutional neural network is clearly higher than that of the other two methods given the same data. Therefore, the newly proposed method for wind turbine fault detection is shown to be more effective.
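The abstract does not describe how the radar charts are constructed. One plausible construction, given here only as a sketch, scales each indicator to [0, 1] against an assumed operating range and places it on its own equally spaced axis; the indicator values and ranges below are hypothetical.

```python
import math

def radar_vertices(values, lows, highs):
    """Map indicator values to (x, y) vertices of a radar-chart polygon.

    Each indicator gets its own axis at angle 2*pi*i/n, and its radius is
    the min-max scaled value, clipped to [0, 1].
    """
    n = len(values)
    points = []
    for i, (v, lo, hi) in enumerate(zip(values, lows, highs)):
        r = min(max((v - lo) / (hi - lo), 0.0), 1.0)
        theta = 2 * math.pi * i / n
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points

# Hypothetical readings of four SCADA indicators and their assumed ranges.
values = [50.0, 100.0, 0.0, 25.0]
lows = [0.0, 0.0, 0.0, 0.0]
highs = [100.0, 100.0, 100.0, 100.0]
pts = radar_vertices(values, lows, highs)
print(pts)
```

Drawing the polygon through these vertices and saving it as an image would give one training sample for the CNN; the rendering step is left out.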


2021 ◽  
Vol 8 ◽  
Author(s):  
Nicolas Schneider ◽  
Keywan Sohrabi ◽  
Henning Schneider ◽  
Klaus-Peter Zimmer ◽  
Patrick Fischer ◽  
...  

Introduction: The rising incidence of pediatric inflammatory bowel diseases (PIBD) creates a need for new methods of improving diagnosis latency, quality of care and documentation. Machine learning models have been shown to be applicable to classifying PIBD when using histological data or extensive serology. This study aims to evaluate the performance of algorithms based on promptly available data more suited to clinical applications. Methods: Data on inflammatory locations of the bowel from initial and follow-up visits are extracted from the CEDATA-GPGE registry, and two follow-up sets are split off containing only input from 2017 and 2018. Pre-processing excludes patients in remission and encodes the categorical data numerically. For classification of PIBD diagnosis, a support vector machine (SVM), a random forest algorithm (RF), extreme gradient boosting (XGBoost), a dense neural network (DNN) and a convolutional neural network (CNN) are employed. As the best performer, the convolutional neural network is further improved using grid optimization. Results: The accuracy of the optimized neural network reaches up to 90.57% on data entered into the registry in 2018. The less performant methods range from 88.78% for the DNN down to 83.94% for XGBoost. The accuracy of prediction for the 2018 follow-up dataset is higher than that for older datasets. Neural networks yield a higher standard deviation, with 3.45 for the CNN compared with 0.83–0.86 for the support vector machine and ensemble methods. Discussion: The accuracy of the convolutional neural network demonstrates the viability of machine learning classification in PIBD diagnostics using only timely available data.
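The abstract says only that categorical data are encoded numerically, without naming the scheme. As a hedged illustration, the sketch below uses multi-hot encoding, one common choice for a set of reported locations; the location vocabulary and the visit records are hypothetical and do not reflect the registry's actual fields.

```python
def encode_locations(record, vocabulary):
    """Multi-hot encode a visit: 1 where a location is reported, else 0."""
    present = set(record)
    return [1 if location in present else 0 for location in vocabulary]

# Hypothetical vocabulary of inflammatory locations.
LOCATIONS = ["ileum", "colon", "rectum", "upper_gi"]

# Hypothetical visit records, each listing the affected locations.
visits = [["ileum", "colon"], ["rectum"], []]
encoded = [encode_locations(v, LOCATIONS) for v in visits]
print(encoded)
```

Fixed-length vectors like these can be fed to any of the listed classifiers without further conversion.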


2018 ◽  
Author(s):  
Youshan Zhang ◽  
Jon-Patrick Allem ◽  
Jennifer Beth Unger ◽  
Tess Boley Cruz

BACKGROUND Instagram, with millions of posts per day, can be used to inform public health surveillance targets and policies. However, current research relying on image-based data often relies on hand coding of images, which is time-consuming and costly, ultimately limiting the scope of the study. Current best practices in automated image classification (eg, support vector machine (SVM), backpropagation neural network, and artificial neural network) are limited in their capacity to accurately distinguish between objects within images. OBJECTIVE This study aimed to demonstrate how a convolutional neural network (CNN) can be used to extract unique features within an image and how SVM can then be used to classify the image. METHODS Images of waterpipes or hookah (an emerging tobacco product possessing similar harms to that of cigarettes) were collected from Instagram and used in the analyses (N=840). A CNN was used to extract unique features from images identified to contain waterpipes. An SVM classifier was built to distinguish between images with and without waterpipes. Methods for image classification were then compared to show how a CNN+SVM classifier could improve accuracy. RESULTS As the number of validated training images increased, the total number of extracted features increased. In addition, as the number of features learned by the SVM classifier increased, the average level of accuracy increased. Overall, 99.5% (418/420) of images classified were correctly identified as either hookah or nonhookah images. This level of accuracy was an improvement over earlier methods that used SVM, CNN, or bag-of-features alone. CONCLUSIONS A CNN extracts more features of images, allowing an SVM classifier to be better informed, resulting in higher accuracy compared with methods that extract fewer features. Future research can use this method to grow the scope of image-based studies. 
The methods presented here might help detect increases in the popularity of certain tobacco products over time on social media. By taking images of waterpipes from Instagram, we place our methods in a context that can be utilized to inform health researchers analyzing social media to understand user experience with emerging tobacco products and inform public health surveillance targets and policies.
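The reported accuracy follows directly from the counts given in the results (418 correct out of 420 classified images). A short check, with a hypothetical helper name:

```python
def accuracy(n_correct, n_total):
    """Fraction of classified images whose predicted label was correct."""
    return n_correct / n_total

acc = accuracy(418, 420)
print(f"{acc:.1%}")  # prints 99.5%
```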


Author(s):  
Hafizatul Hanin Hamzah ◽  
Nurbaity Sabri ◽  
Zaidah Ibrahim ◽  
Dino Isa

This paper investigates bambara groundnut leaf disease recognition using two popular techniques: a Convolutional Neural Network (CNN), and Bag of Features (BOF) with Speeded-Up Robust Features (SURF) and a Support Vector Machine (SVM) classifier. Leaf disease recognition has attracted many researchers because the outcome is useful for farmers. Bambara groundnut is one of the crops that provide high income for farmers, but its leaves are easily infected with diseases, especially after rain, which can affect crop productivity. Thus, automatic disease recognition is crucial. A new dataset consisting of 400 images of infected and non-infected bambara groundnut leaves has been constructed. The experimental results indicate that both techniques produce excellent leaf disease recognition accuracy.
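The Bag of Features representation can be sketched as follows: each local descriptor is assigned to its nearest visual word, and the image is summarized by the normalized histogram of word counts, which then serves as the SVM input. The sketch assumes the descriptors are already extracted and the codebook already learned (typically by k-means). The 2-D toy vectors stand in for real SURF descriptors, and all names are hypothetical.

```python
import numpy as np

def bof_histogram(descriptors, codebook):
    """Normalized histogram of nearest-codeword assignments for one image."""
    # Squared Euclidean distance from every descriptor to every codeword.
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    distances = (diffs ** 2).sum(axis=2)
    words = distances.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# Toy codebook of three visual words (real codebooks are much larger).
codebook = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
# Toy descriptors from one image.
descriptors = np.array([[1.0, 1.0], [9.0, 1.0], [0.5, 0.0], [1.0, 9.0]])
hist = bof_histogram(descriptors, codebook)
print(hist)
```

Because every image maps to a histogram of the same length regardless of how many keypoints it has, the histograms can be passed directly to a classifier.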

