Exploring Significant Characteristics and Models for Classification of Structure Function of Academic Documents

2020 ◽  
Vol 5 (1) ◽  
pp. 65-74
Author(s):  
Bowen Ma ◽  
Chengzhi Zhang ◽  
Yuzhuo Wang

Abstract: With the increasing abundance of literature resources, acquiring knowledge elements efficiently and accurately is the key to accurate literature retrieval and to making use of available literature resources. Identifying the structure function of academic documents is fundamental to meeting these requirements. In this study, the proceedings of the Association for Computational Linguistics (ACL) conferences are used as the primitive corpus, and the training corpus of chapter categories is obtained by manual annotation. Based on the chapter titles and the in-chapter texts, both traditional machine learning and deep learning models are used for classifier training. Our results show that the title of a chapter is more beneficial than the in-chapter text for identifying the structure function of academic documents. The highest F1 value in our experiments is 0.9249, obtained with the traditional logistic regression (LR) and support vector machine (SVM) models (slightly higher than with the convolutional neural network [CNN]). By adding other chapter characteristics to the traditional models, we find that including the relative position of chapters can effectively improve classification performance. Finally, this study compares the results of experimental groups using different methods, analyzes the misclassification of the structure function of academic documents, and points out the main direction for improving classification performance in the future.
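The relative-position characteristic mentioned in this abstract can be illustrated with a small sketch. The abstract does not say how the position is encoded, so the scaling to the range 0 to 1 and the helper names `relative_positions` and `build_features` are assumptions made only for illustration.

```python
def relative_positions(chapter_titles):
    """Return each chapter's position scaled to [0, 1] within its paper.

    Assumption: position is index / (n - 1); the abstract does not
    specify the exact encoding.
    """
    n = len(chapter_titles)
    if n == 1:
        return [0.0]
    return [i / (n - 1) for i in range(n)]


def build_features(chapter_titles):
    """Pair lowercase title tokens with the relative-position feature."""
    positions = relative_positions(chapter_titles)
    return [
        {"tokens": title.lower().split(), "rel_pos": pos}
        for title, pos in zip(chapter_titles, positions)
    ]


# Hypothetical chapter titles of one paper, in document order.
titles = ["Introduction", "Related Work", "Method", "Experiments", "Conclusion"]
features = build_features(titles)
```

A classifier such as LR or SVM would then consume the title tokens (for example as a bag of words) together with `rel_pos` as one extra numeric column.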

Author(s):  
Yuejun Liu ◽  
Yifei Xu ◽  
Xiangzheng Meng ◽  
Xuguang Wang ◽  
Tianxu Bai

Background: Medical imaging plays an important role in the diagnosis of thyroid diseases. In the field of machine learning, multiple dimensional deep learning algorithms are widely used in image classification and recognition and have achieved great success. Objective: A method based on multiple dimensional deep learning is employed for the auxiliary diagnosis of thyroid diseases from SPECT images. The performance of different deep learning models is evaluated and compared. Methods: Thyroid SPECT images of three types are collected: hyperthyroidism, normal and hypothyroidism. In pre-processing, the thyroid region of interest is segmented and the number of data samples is expanded. Four deep learning models (a standard CNN, Inception, VGG16 and an RNN) are evaluated. Results: The deep learning based methods show good classification performance, with accuracy of 92.9%-96.2% and AUC of 97.8%-99.6%. The VGG16 model performs best, with an accuracy of 96.2% and an AUC of 99.6%; in particular, the VGG16 model with a changing learning rate works best. Conclusion: The four deep learning models (standard CNN, Inception, VGG16 and RNN) are effective for the classification of thyroid diseases with SPECT images. The accuracy of the assisted diagnostic method based on deep learning is higher than that of other methods reported in the literature.
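The two metrics reported here, accuracy and AUC, can be computed as in the sketch below. It covers the binary case only; how the authors computed AUC across their three classes is not stated, so a one-vs-rest reading is an assumption, and the labels and scores are made-up toy values.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def binary_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 1 = "hyperthyroidism", 0 = "rest" in a one-vs-rest reading.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = binary_auc(toy_labels, toy_scores)
```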


2019 ◽  
Vol 2019 ◽  
pp. 1-9
Author(s):  
Yizhe Wang ◽  
Cunqian Feng ◽  
Yongshun Zhang ◽  
Sisan He

Precession is a common micromotion form of space targets, introducing additional micro-Doppler (m-D) modulation into the radar echo. Effective classification of space targets is of great significance for further micromotion parameter extraction and identification. Feature extraction is a key step during the classification process, largely influencing the final classification performance. This paper presents two methods for classifying different types of space precession targets from high-resolution range profiles (HRRPs). We first establish the precession model of space targets and analyze the scattering characteristics, and then compute electromagnetic data of the cone target, cone-cylinder target, and cone-cylinder-flare target. Experimental results demonstrate that the support vector machine (SVM) using histograms of oriented gradient (HOG) features achieves a good result, whereas the deep convolutional neural network (DCNN) obtains a higher classification accuracy. The DCNN combines the feature extractor and the classifier, automatically mining the high-level signatures of HRRPs through a training process. In addition, the efficiency of the two classification processes is compared using the same dataset.


Sensors ◽  
2021 ◽  
Vol 21 (21) ◽  
pp. 7417
Author(s):  
Alex J. Hope ◽  
Utkarsh Vashisth ◽  
Matthew J. Parker ◽  
Andreas B. Ralston ◽  
Joshua M. Roper ◽  
...  

Concussion injuries remain a significant public health challenge. A significant unmet clinical need remains for tools that allow related physiological impairments and longer-term health risks to be identified earlier, better quantified, and more easily monitored over time. We address this challenge by combining a head-mounted wearable inertial motion unit (IMU)-based physiological vibration acceleration (“phybrata”) sensor and several candidate machine learning (ML) models. The performance of this solution is assessed for both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments. Results are compared with previously reported approaches to ML-based concussion diagnostics. Using phybrata data from a previously reported concussion study population, four different machine learning models (Support Vector Machine, Random Forest Classifier, Extreme Gradient Boost, and Convolutional Neural Network) are first investigated for binary classification of the test population as healthy vs. concussion (Use Case 1). Results are compared for two different data preprocessing pipelines, Time-Series Averaging (TSA) and Non-Time-Series Feature Extraction (NTS). Next, the three best-performing NTS models are compared in terms of their multiclass prediction performance for specific concussion-related impairments: vestibular, neurological, both (Use Case 2). For Use Case 1, the NTS model approach outperformed the TSA approach, with the two best algorithms achieving an F1 score of 0.94. For Use Case 2, the NTS Random Forest model achieved the best performance in the testing set, with an F1 score of 0.90, and identified a wider range of relevant phybrata signal features that contributed to impairment classification compared with manual feature inspection and statistical data analysis. 
The overall classification performance achieved in the present work exceeds previously reported approaches to ML-based concussion diagnostics using other data sources and ML models. This study also demonstrates the first combination of a wearable IMU-based sensor and ML model that enables both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments.
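The F1 scores quoted for the two use cases can be computed per class and then averaged, as sketched below. The abstract does not say whether the multiclass F1 is macro- or weighted-averaged, so macro averaging is an assumption here, and the labels are toy values.

```python
def f1_per_class(y_true, y_pred):
    """Return a dict mapping each class to its F1 score."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    per_class = f1_per_class(y_true, y_pred)
    return sum(per_class.values()) / len(per_class)


# Toy impairment labels mirroring the three classes named in Use Case 2.
true_labels = ["vestibular", "vestibular", "neurological", "both"]
pred_labels = ["vestibular", "neurological", "neurological", "both"]
toy_macro_f1 = macro_f1(true_labels, pred_labels)
```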


2019 ◽  
Vol 38 (1) ◽  
pp. 155-169
Author(s):  
Chihli Hung ◽  
You-Xin Cao

Purpose: This paper aims to propose a novel approach that integrates collocations and domain concepts for sentiment classification of Chinese cosmetic word of mouth (WOM). Most sentiment analysis works by collecting sentiment scores from each unigram or bigram. However, not every unigram or bigram in a WOM document carries sentiment. Chinese collocations carry the main sentiments of WOM. This paper reduces the document dimensionality and improves sentiment classification. Design/methodology/approach: This paper builds two contextual lexicons, for feature words and sentiment words respectively. Based on these contextual lexicons, this paper uses association rules and mutual information to build possible Chinese collocation sets. This paper applies preference vector modelling as the vector representation approach to capture the relationship between Chinese collocations and their associated concepts. Findings: This paper compares the proposed preference vector models with benchmarks, using three classification techniques (i.e. support vector machine, J48 decision tree and multilayer perceptron). According to the experimental results, the proposed models outperform all benchmarks when evaluated by accuracy. Originality/value: This paper focuses on Chinese collocations and proposes a novel research approach for sentiment classification. The Chinese collocations used in this paper are adaptable to the content and domains. Finally, this paper integrates collocations with the preference vector modelling approach, which not only achieves better sentiment classification performance for Chinese WOM documents but also avoids the curse of dimensionality.
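One common way to score candidate collocations with mutual information is pointwise mutual information (PMI) over document-level co-occurrence. The sketch below assumes that variant; the abstract does not specify the exact formula, and the English tokens stand in for segmented Chinese words purely for readability.

```python
import math
from collections import Counter
from itertools import product


def pmi_scores(documents, feature_words, sentiment_words):
    """Score (feature word, sentiment word) pairs by document-level PMI.

    documents: list of token lists. Pairs that never co-occur are omitted.
    """
    n_docs = len(documents)
    word_df = Counter()   # number of documents containing each word
    pair_df = Counter()   # number of documents containing both words
    for doc in documents:
        tokens = set(doc)
        word_df.update(tokens)
        for f, s in product(feature_words, sentiment_words):
            if f in tokens and s in tokens:
                pair_df[(f, s)] += 1
    scores = {}
    for (f, s), joint in pair_df.items():
        p_joint = joint / n_docs
        p_f = word_df[f] / n_docs
        p_s = word_df[s] / n_docs
        scores[(f, s)] = math.log2(p_joint / (p_f * p_s))
    return scores


# Toy corpus; the lexicons below are hypothetical.
docs = [
    ["lipstick", "smooth"],
    ["lipstick", "smooth"],
    ["lipstick", "dry"],
    ["lotion", "dry"],
]
collocation_scores = pmi_scores(docs, ["lipstick", "lotion"], ["smooth", "dry"])
```

Pairs with a positive score co-occur more often than chance would predict and are natural collocation candidates; a threshold on the score would be a further design choice.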


2021 ◽  
Vol 7 ◽  
pp. e680
Author(s):  
Muhammad Amirul Abdullah ◽  
Muhammad Ar Rahim Ibrahim ◽  
Muhammad Nur Aiman Shapiee ◽  
Muhammad Aizzat Zakaria ◽  
Mohd Azraai Mohd Razman ◽  
...  

This study aims to classify flat ground tricks, namely Ollie, Kickflip, Shove-it, Nollie and Frontside 180, by identifying significant input image transformations on different transfer learning models with an optimized Support Vector Machine (SVM) classifier. A total of six amateur skateboarders (20 ± 7 years of age with at least 5.0 years of experience) executed each type of trick five times on a customized ORY skateboard (IMU sensor fused) on cemented ground. From the IMU data, a total of six raw signals were extracted. Two input image types, namely raw data (RAW) and Continuous Wavelet Transform (CWT), as well as six transfer learning models from three different families, along with a grid-search-optimized SVM, were investigated for their efficacy in classifying the skateboarding tricks. The study showed that RAW and CWT input images on the MobileNet, MobileNetV2 and ResNet101 transfer learning models achieved the best accuracy, at 100% on the test dataset. Nonetheless, evaluating the computational time amongst the best models established that the CWT-MobileNet-Optimized SVM pipeline was the best. It can be concluded that the proposed method is able to assist judges as well as coaches in identifying the execution of skateboarding tricks.


2018 ◽  
Vol 21 (62) ◽  
pp. 1
Author(s):  
Jorge E. Camargo ◽  
Vladimir Vargas-Calderon ◽  
Nelson Vargas ◽  
Liliana Calderón-Benavides

With the purpose of classifying text based on its sentiment polarity (positive or negative), we proposed an extension of a 68,000-tweet corpus through the inclusion of word definitions from a dictionary of the Real Academia Española de la Lengua (RAE). A set of 28,000 combinations of 6 Word2Vec and support vector machine parameters was considered in order to evaluate how the inclusion of the RAE dictionary definitions would affect classification performance. We found that such a corpus extension significantly improves the classification accuracy. Therefore, we conclude that the inclusion of the RAE dictionary increases the semantic relations learned by Word2Vec, allowing a better classification accuracy.


Electrocardiogram (ECG) examination via computer techniques that involve feature extraction, pre-processing and post-processing has been adopted because of its significant advantages. Extracting standard ECG signal features, which requires a high level of processing, has been the main focus of many studies. In this paper, up to 6 different ECG signal classes are accurately predicted in the absence of ECG feature extraction. The cornerstone of the proposed technique is linear predictive coding (LPC), which regresses and normalizes the signal during the pre-processing phase. Prior to feature extraction using wavelet energy (WE), a discrete wavelet transform (DWT) is applied to convert the ECG signal to the frequency domain. In addition, the dataset was divided into two parts, one for training and the other for testing, which are classified in the proposed algorithm using a support vector machine (SVM). Moreover, using MIT AI2 Companion, developed by the MIT Center for Mobile Learning, the classification result is shared with the patient's mobile phone, which can call an ambulance and send the location in case of a serious emergency. Finally, the confusion matrix values are used to measure the performance of the proposed classification. For 6 different ECG classes, an accuracy of about 98.15% was recorded. This figure rose to 100% for 3 ECG signal classes and decreased to 97.95% when the number of ECG signal classes was increased to 7.
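Wavelet energy features of the kind described here are typically the sum of squared coefficients at each decomposition level. The sketch below assumes the Haar wavelet because it is the simplest to write out; the abstract does not name the wavelet or the number of levels, and the input signal is a toy sequence rather than ECG data.

```python
import math


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energies(signal, levels):
    """Detail-coefficient energy per level, then the final approximation energy."""
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies


# Toy signal whose length is divisible by 2**levels.
toy_signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
toy_energies = wavelet_energies(toy_signal, levels=2)
```

Because this Haar transform is orthonormal, the energies sum to the energy of the original signal, which gives a quick sanity check on any implementation.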


Author(s):  
G. Jayagopi ◽  
S. Pushpa

Heart diseases have become potential threats to human lives, especially for elderly people, in recent years owing to changing food habits. However, these diseases can be readily detected by proper analysis of electrocardiogram (ECG) signals acquired from individuals. This paper proposes a method to detect and classify arrhythmia using 15 features, which include 4 R-R interval features, 3 statistical features and 6 chaotic features estimated from ECG signals. Additionally, entropy and energy features are obtained by converting the one-dimensional ECG signals to two-dimensional data and applying Tetrolet transforms to them. In total, 15 features are used to classify heartbeats from the benchmark MIT-Arrhythmia database using support vector machines (SVM). The classification performance was analyzed under various kernel functions and different Tetrolet decomposition levels. The radial basis function (RBF) kernel was found to perform better than the linear and polynomial kernels. This work yielded an accuracy of 99.35% when compared against existing works. Moreover, the addition of two more features introduced a negligible time overhead. Hence, this method is well suited to detecting and classifying arrhythmia both online and offline.
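R-R interval features are derived from the times of successive R peaks. The abstract does not list which four features were used, so the set below (previous interval, next interval, mean interval and their ratio) is a hypothetical choice for illustration, and the peak times are toy values in seconds.

```python
def rr_intervals(r_peak_times):
    """Differences between consecutive R-peak times."""
    return [b - a for a, b in zip(r_peak_times, r_peak_times[1:])]


def rr_features(r_peak_times, beat_index):
    """Four illustrative R-R features for an interior beat.

    beat_index must have a neighbour on each side, so the first and
    last beats are excluded.
    """
    if not 1 <= beat_index <= len(r_peak_times) - 2:
        raise ValueError("beat_index must refer to an interior beat")
    rr = rr_intervals(r_peak_times)
    pre_rr = rr[beat_index - 1]   # interval ending at this beat
    post_rr = rr[beat_index]      # interval starting at this beat
    mean_rr = sum(rr) / len(rr)
    return {
        "pre_rr": pre_rr,
        "post_rr": post_rr,
        "mean_rr": mean_rr,
        "pre_post_ratio": pre_rr / post_rr,
    }


# Toy R-peak times: the fourth beat arrives early, then a long pause follows.
toy_peaks = [0.0, 0.8, 1.6, 2.0, 3.2]
toy_beat = rr_features(toy_peaks, beat_index=3)
```

A short interval before a beat followed by a long one after it is the kind of pattern such features are meant to expose to the classifier.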


2021 ◽  
Vol 3 (1) ◽  
pp. 6
Author(s):  
Eren Can Seyrek ◽  
Murat Uysal

Hyperspectral images (HSI) offer detailed spectral reflectance information about sensed objects by providing information on hundreds of narrow spectral bands. HSI have a leading role in a broad range of applications, such as forestry, agriculture, geology, and environmental sciences. The monitoring and management of agricultural lands is of great importance for meeting the nutritional and other needs of a rapidly and continuously increasing world population. In this context, classification of HSI is an effective way of creating land use and land cover maps quickly and accurately. In recent years, classification of HSI using convolutional neural networks (CNN), a class of deep learning models, has become a very popular research topic, and several CNN architectures have been developed by researchers. The aim of this study was to investigate the classification performance of a CNN model on agricultural HSI scenes. For this purpose, a 3D-2D CNN framework and a well-known support vector machine (SVM) model were compared using the Indian Pines and Salinas Scene datasets, which contain crop and mixed vegetation classes. The study confirmed that the 3D-2D CNN offers superior performance for classifying agricultural HSI datasets.


Entropy ◽  
2019 ◽  
Vol 21 (8) ◽  
pp. 745
Author(s):  
Yangjie Wei ◽  
Shiliang Fang ◽  
Xiaoyan Wang

Since digital communication signals are widely used in radio and underwater acoustic systems, the modulation classification of these signals has become increasingly significant in various military and civilian applications. However, due to adverse channel transmission characteristics and low signal-to-noise ratio (SNR), the modulation classification of communication signals is extremely challenging. In this paper, a novel method for automatic modulation classification of digital communication signals using a support vector machine (SVM) based on hybrid features, cyclostationary, and information entropy is proposed. In the proposed method, by combining cyclostationary theory and entropy theory with the existing signal features, we propose three new features to assist the classification of digital communication signals: the maximum value of the normalized cyclic spectrum when the cyclic frequency is not zero, the Shannon entropy of the cyclic spectrum, and the Rényi entropy of the cyclic spectrum. Because these new features do not require any prior information and have a strong anti-noise ability, they are well suited to the identification of communication signals. Finally, a one-against-one SVM is designed as the classifier. Simulation results show that the proposed method outperforms the existing methods in terms of classification performance and noise tolerance.
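The two entropy features named here follow the standard Shannon and Rényi definitions once the spectrum is normalized to sum to one. The sketch below assumes non-negative spectrum magnitudes, base-2 logarithms and a Rényi order of 2; the abstract fixes none of these, and the input values are toy numbers rather than a real cyclic spectrum.

```python
import math


def normalize(magnitudes):
    """Scale non-negative magnitudes so that they sum to 1."""
    total = sum(magnitudes)
    if total <= 0:
        raise ValueError("magnitudes must have a positive sum")
    return [m / total for m in magnitudes]


def shannon_entropy(probs):
    """H = -sum(p * log2 p), skipping zero-probability bins."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def renyi_entropy(probs, alpha):
    """H_alpha = log2(sum(p ** alpha)) / (1 - alpha), for alpha > 0, alpha != 1."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    return math.log2(sum(p ** alpha for p in probs if p > 0)) / (1 - alpha)


# A flat toy spectrum has maximal entropy; a single spike has none.
flat = normalize([2.0, 2.0, 2.0, 2.0])
spike = normalize([5.0, 0.0, 0.0, 0.0])
flat_shannon = shannon_entropy(flat)
flat_renyi = renyi_entropy(flat, alpha=2)
```

Both entropies are largest for a flat spectrum and smallest for a single spike, which is what makes them usable as features for separating spectra of different shapes.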

