The Northern Extragalactic WISE × Pan-STARRS (NEWS) catalogue

This study involves two photometric catalogues, AllWISE and Pan-STARRS Data Release 1 (PS1), which were cross-matched to identify extragalactic objects among their common sources. To separate galaxies and quasars from stars, we created a machine-learning model trained on photometric (in fact, colour-based) information from the optical and infrared wavelength ranges. The model is based on three procedures: construction of an autoencoder artificial neural network, separation of galaxies and quasars from stars with a support vector machine (SVM) classifier, and cleaning of the AllWISE × PS1 sample to remove sources with abnormal colour indices using a one-class SVM. As a training sample, we employed a set of spectroscopically confirmed sources from the Sloan Digital Sky Survey Data Release 14. Applying the classification model to the cross-matched AllWISE × PS1 sample, we created the Northern Extragalactic WISE × Pan-STARRS (NEWS) catalogue, containing 40 million extragalactic objects and covering 3/4 of the celestial sphere up to g = 23 mag. Several independent classification quality tests, namely an astrometric test and others based on data from spectroscopic surveys, show similar results and indicate high purity (∼98.0%) and completeness (> 98%) for the NEWS catalogue within the magnitude range 19.0 mag < g < 22.5 mag. The classification quality still retains acceptable levels of 70% for purity and 97% for completeness for the brightest and faintest objects from this magnitude range. In addition, validation with external data sets has demonstrated the need to use only those NEWS sources that lie outside the zone of enhanced extinction. We show that the number of quasars from the NEWS catalogue identified in Gaia DR2 exceeds the number of quasars previously identified in Gaia DR2 using the AllWISEAGN catalogue. These quasars may be used in future as an additional sample for testing and anchoring the Gaia Celestial Reference Frame.
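
The purity and completeness figures quoted in this abstract can be read as precision and recall for the extragalactic class. The following is a minimal sketch assuming those standard definitions, which the abstract does not spell out; the function name and the counts are hypothetical and are not taken from the paper.

```python
def purity_completeness(true_pos, false_pos, false_neg):
    """Return (purity, completeness) for the 'extragalactic' class.

    purity       = TP / (TP + FP): share of catalogue entries that are truly extragalactic
    completeness = TP / (TP + FN): share of true extragalactic sources that were recovered
    """
    purity = true_pos / (true_pos + false_pos)
    completeness = true_pos / (true_pos + false_neg)
    return purity, completeness


# Hypothetical validation counts for illustration only.
purity, completeness = purity_completeness(true_pos=980, false_pos=20, false_neg=10)
```

With these made-up counts the purity is 0.98, and the completeness is 980/990.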

Research on a Fast Human-Detection Algorithm for Unmanned Surveillance Area in Bulk Ports

Mathematical Problems in Engineering ◽

10.1155/2014/386764 ◽

2014 ◽

Vol 2014 ◽

pp. 1-17 ◽

Cited By ~ 2

Author(s):

Chao Mi ◽

Xin He ◽

Haiwei Liu ◽

Youfang Huang ◽

Weijian Mi

Keyword(s):

Real Time ◽

Computing Time ◽

Training Sample ◽

Detection Algorithm ◽

Human Detection ◽

Security Requirements ◽

Support Vector ◽

Svm Classifier ◽

Histograms Of Oriented Gradients ◽

Heavy Equipment

With the development of port automation, most operational fields that use heavy equipment have gradually become unmanned. It is therefore imperative to monitor these fields effectively and in real time. In this paper, a fast human-detection algorithm based on image processing is proposed. To speed up detection, an optimized histograms of oriented gradients (HOG) algorithm, which avoids the many duplicated calculations of the original HOG and ignores insignificant features, is used to describe the contour of the human body in real time. Based on the HOG features, and using a training sample set of scene images from a bulk port, a support vector machine (SVM) classifier combined with the AdaBoost classifier is trained to detect humans. Finally, human-detection experiments at Tianjin Port show that the proposed optimized algorithm has roughly the same accuracy as a traditional algorithm while taking only 1/7 of the time. The accuracy and computing time of the proposed fast human-detection algorithm were verified to meet the security requirements of unmanned port areas.
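
For orientation, the core HOG step (a magnitude-weighted histogram of gradient orientations over one cell) can be sketched as below. This is the plain textbook step under simplifying assumptions, not the paper's optimized variant: it uses central differences, hard bin assignment without vote interpolation, and no block normalization. The function name and the test image are hypothetical.

```python
import math


def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0-180 degrees).

    `cell` is a list of rows of grey-level values. Border pixels are skipped
    because central differences need both neighbours.
    """
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal central difference
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # fold to unsigned orientation
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# Horizontal intensity ramp: every interior gradient points along x (angle 0).
ramp = [[float(x) for x in range(6)] for _ in range(6)]
hist = cell_orientation_histogram(ramp)
```

On the ramp image all votes land in the first bin, which is the expected behaviour for a purely horizontal gradient.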

Data Mining Technology Application in False Text Information Recognition

Mobile Information Systems ◽

10.1155/2021/4206424 ◽

2021 ◽

Vol 2021 ◽

pp. 1-13

Author(s):

Jie Wan ◽

Xue Cao ◽

Kun Yao ◽

Donghui Yang ◽

E. Peng ◽

...

Keyword(s):

Data Mining ◽

Classification Model ◽

Support Vector ◽

Svm Classifier ◽

Characteristic Matrix ◽

Mining Technology ◽

Technology Application ◽

Text Information ◽

The Government ◽

Effect Of The Support

False information on the Internet is widely regarded as a serious social harm. To recognize false text information, this paper proposes an effective method for mining text features in the field of false drug advertisements. Firstly, false and real drug advertisements were collected from official websites to build a database of false and real drug advertisements. Secondly, by performing feature extraction on the text of drug advertisements, this work built a characteristic matrix based on the effective features and assigned a positive or negative label to each feature vector in the matrix according to whether it came from a false drug advertisement. Thirdly, this study trained and tested several different classifiers, selected the classification model with the best performance in identifying false drug advertisements, and found the key characteristics that determine the classification. Finally, the model with the best performance was used to predict new false drug advertisements collected from Sina Weibo. In identifying false drug advertisements, the support vector machine (SVM) classifier built on the feature set after feature selection gave the best classification performance. The findings of this study can provide an effective method for the government to identify and combat false advertisements. This study also serves as a reference for the use of text data mining technology to identify and detect information fraud.

A Study of Supplier Selection Method Based on SVM for Weighting Expert Evaluation

Discrete Dynamics in Nature and Society ◽

10.1155/2021/8056209 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Li Zhao ◽

Wenjing Qi ◽

Meihong Zhu

Keyword(s):

Supplier Selection ◽

Strategic Decision ◽

Classification Model ◽

Support Vector ◽

Svm Classifier ◽

Preference Order ◽

Expert Evaluation ◽

Learning To Learn ◽

Evaluation Data ◽

Evaluation Information

How to choose suppliers scientifically is an important part of the strategic decision-making of enterprises. Expert evaluation is subjective and uncontrollable; biased evaluations sometimes occur, which lead to controversial or unfair results in supplier selection. To tackle this problem, this paper proposes a novel method that employs machine learning to learn the credibility of experts from historical data, which is then converted to weights in the evaluation process. We first use the Support Vector Machine (SVM) classifier to classify the historical evaluation data of experts and calculate the experts’ evaluation credibility, then determine the weights of the evaluation experts, and finally aggregate the weighted evaluation results to obtain a preference order of suppliers. The main contribution of this method is that it overcomes the shortcomings of multiple conversions and large losses of evaluation information, preserves the initial evaluation information to the maximum extent, and improves the credibility of the evaluation results and the fairness and rigour of supplier selection. The results show that it is feasible to classify the past evaluation data of the evaluation experts with the SVM classification model, and that the expert weights determined from the experts’ evaluation credibility are adjustable.
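
The last two steps of the pipeline (turning credibility into weights and aggregating weighted scores into a preference order) can be sketched as follows. The abstract does not give the weighting formula, so simple proportional normalization is assumed here; the SVM step that produces the credibility values is not shown, and all names and numbers are hypothetical.

```python
def expert_weights(credibility):
    """Normalise per-expert credibility scores so that the weights sum to 1."""
    total = sum(credibility.values())
    return {expert: score / total for expert, score in credibility.items()}


def rank_suppliers(scores, weights):
    """Rank suppliers by the credibility-weighted sum of expert scores.

    `scores[expert][supplier]` is that expert's score for the supplier.
    """
    suppliers = next(iter(scores.values())).keys()
    totals = {
        s: sum(weights[e] * scores[e][s] for e in scores)
        for s in suppliers
    }
    return sorted(totals, key=totals.get, reverse=True)


# Hypothetical credibility values (assumed to come from an upstream classifier).
credibility = {"expert_a": 0.9, "expert_b": 0.6, "expert_c": 0.5}
scores = {
    "expert_a": {"s1": 8, "s2": 6},
    "expert_b": {"s1": 5, "s2": 9},
    "expert_c": {"s1": 7, "s2": 7},
}
weights = expert_weights(credibility)
ranking = rank_suppliers(scores, weights)
```

In this toy case supplier s2 ranks first (7.15 against 6.85), even though the most credible expert prefers s1, because the other two experts jointly carry more weight.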

Classification of Rice and Starch Flours by Using Multiple Hyperspectral Imaging Systems and Chemometric Methods

Applied Sciences ◽

10.3390/app10196724 ◽

2020 ◽

Vol 10 (19) ◽

pp. 6724

Author(s):

Youngwook Seo ◽

Ahyeong Lee ◽

Balgeum Kim ◽

Jongguk Lim

Keyword(s):

Discriminant Analysis ◽

Hyperspectral Imaging ◽

Partial Least Square ◽

Hyperspectral Images ◽

Classification Model ◽

Support Vector ◽

Chemometric Methods ◽

Infrared Wavelength ◽

Linear Discriminant

(1) Background: The general use of food-processing facilities in the agro-food industry has increased the risk of unexpected material contamination. For instance, grain flours have similar colors and shapes, making it difficult to detect them and isolate them from each other. Therefore, this study aims to verify the feasibility of detecting and isolating grain flours using hyperspectral imaging technology and to develop a classification model of grain flours. (2) Methods: Multiple hyperspectral images were acquired through line scanning methods from reflectance of visible and near-infrared wavelength (400–1000 nm), reflectance of shortwave infrared wavelength (900–1700 nm), and fluorescence (400–700 nm) by 365 nm ultraviolet (UV) excitation. Eight varieties of grain flours were prepared (rice: 4, starch: 4), and the particle size and starch damage content were measured. To develop the classification model, four multivariate analysis methods (linear discriminant analysis (LDA), partial least-square discriminant analysis, support vector machine, and classification and regression tree) were implemented with several pre-processing methods, and their classification results were compared with respect to accuracy and Cohen’s kappa coefficient obtained from confusion matrices. (3) Results: The highest accuracy, 97.43%, was achieved with short-wavelength infrared with normalization in the spectral domain. Applying the developed classification model to the hyperspectral images showed that the fluorescence method achieves the highest accuracy of 81% using LDA. (4) Conclusions: This study demonstrated the potential of non-destructive classification of rice and starch flours using multiple hyperspectral modalities and chemometric methods.
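
The two comparison metrics named in the Methods, accuracy and Cohen's kappa, are both derived from a confusion matrix. A minimal sketch using the standard definitions follows; the function name and the example matrix are hypothetical and unrelated to the paper's data.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix.
confusion = [[20, 5], [10, 15]]
accuracy, kappa = accuracy_and_kappa(confusion)
```

Here the accuracy is 0.7 but kappa is only about 0.4, because half of the agreement would be expected by chance given the marginal totals.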

Morphological Neuroimaging Biomarkers for Tinnitus: Evidence Obtained by Applying Machine Learning

Neural Plasticity ◽

10.1155/2019/1712342 ◽

2019 ◽

Vol 2019 ◽

pp. 1-11

Author(s):

Yawen Liu ◽

Haijun Niu ◽

Jianming Zhu ◽

Pengfei Zhao ◽

Hongxia Yin ◽

...

Keyword(s):

Machine Learning ◽

Healthy Subjects ◽

Morphological Changes ◽

Gray Matter Volume ◽

Brain Regions ◽

Classification Model ◽

Middle Temporal Gyrus ◽

Support Vector ◽

Svm Classifier ◽

Neuroimaging Biomarkers

According to previous studies, many neuroanatomical alterations have been detected in patients with tinnitus. However, the results of these studies have been inconsistent. The objective of this study was to explore the cortical/subcortical morphological neuroimaging biomarkers that may characterize idiopathic tinnitus using machine learning methods. Forty-six patients with idiopathic tinnitus and fifty-six healthy subjects were included in this study. For each subject, the gray matter volume of 61 brain regions was extracted as an original feature pool. From this feature pool, a hybrid feature selection algorithm combining the F-score and sequential forward floating selection (SFFS) methods was performed to select features. Then, the selected features were used to train a support vector machine (SVM) model. The area under the curve (AUC) and accuracy were used to assess the performance of the classification model. As a result, a combination of 13 cortical/subcortical brain regions was found to have the highest classification accuracy for effectively differentiating patients with tinnitus from healthy subjects. These brain regions include the bilateral hypothalamus, right insula, bilateral superior temporal gyrus, left rostral middle frontal gyrus, bilateral inferior temporal gyrus, right inferior parietal lobule, right transverse temporal gyrus, right middle temporal gyrus, right cingulate gyrus, and left superior frontal gyrus. The accuracy in the training and test datasets was 80.49% and 80.00%, respectively, and the AUC was 0.8586. To the best of our knowledge, this is the first study to elucidate brain morphological changes in patients with tinnitus by applying an SVM classifier. This study provides validated cortical/subcortical morphological neuroimaging biomarkers to differentiate patients with tinnitus from healthy subjects and contributes to the understanding of neuroanatomical alterations in patients with tinnitus.
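
The F-score ranking half of the hybrid feature selection can be sketched as below. The abstract does not define the F-score, so this assumes the form commonly used for SVM feature ranking (between-class separation of the means divided by the sum of within-class sample variances); the SFFS wrapper is not shown, and the data and names are hypothetical.

```python
from statistics import mean, variance


def f_score(pos_values, neg_values):
    """F-score of one feature: between-class separation over within-class variance."""
    all_mean = mean(pos_values + neg_values)
    between = (mean(pos_values) - all_mean) ** 2 + (mean(neg_values) - all_mean) ** 2
    within = variance(pos_values) + variance(neg_values)  # sample variances (n - 1)
    return between / within


def rank_features(pos_rows, neg_rows):
    """Return (feature indices sorted by descending F-score, list of F-scores)."""
    n_features = len(pos_rows[0])
    scores = [
        f_score([r[j] for r in pos_rows], [r[j] for r in neg_rows])
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical values for two features (rows are subjects).
patients = [[1.0, 4.0], [2.0, 6.0], [3.0, 5.0]]
controls = [[5.0, 5.0], [6.0, 4.0], [7.0, 6.0]]
order, scores = rank_features(patients, controls)
```

The first feature separates the two groups and scores 4.0, while the second has identical group means and scores 0, so a filter of this kind would pass only the first feature on to the wrapper stage.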

KEYWORD SPOTTING FROM ONLINE CHINESE HANDWRITTEN DOCUMENTS USING ONE-VERSUS-ALL CHARACTER CLASSIFICATION MODEL

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001413530017 ◽

2013 ◽

Vol 27 (03) ◽

pp. 1353001 ◽

Cited By ~ 3

Author(s):

HENG ZHANG ◽

DA-HAN WANG ◽

CHENG-LIN LIU ◽

HORST BUNKE

Keyword(s):

Classification Model ◽

Experimental Comparison ◽

Support Vector ◽

Svm Classifier ◽

Keyword Spotting ◽

Handwritten Documents ◽

Handwritten Text Recognition ◽

Adaptive Thresholds ◽

Query Word ◽

Character Classification

In this paper, we propose a method for text-query-based keyword spotting from online Chinese handwritten documents using a character classification model. The similarity between the query word and handwriting is obtained by combining the character classification scores. The classifier is trained with a one-versus-all strategy so that it gives high similarity to the target class and low scores to the others. Using character classification-based word similarity also helps overcome the out-of-vocabulary (OOV) problem. We use a character-synchronous dynamic search algorithm to efficiently spot the query word in a large database. The retrieval performance is further improved by using competing character confusion and writer-adaptive thresholds. Our experimental results on the large handwriting database CASIA-OLHWDB justify the superiority of one-versus-all trained classifiers and the benefits of confidence transformation, character confusion and adaptive thresholds. In particular, a one-versus-all trained prototype classifier performs as well as a linear support vector machine (SVM) classifier but requires much less storage for the index file. The experimental comparison with keyword spotting based on handwritten text recognition also demonstrates the effectiveness of the proposed method.

Categorization of Common Pigmented Skin Lesions (CPSL) using Multi-Deep Features and Support Vector Machine

10.21203/rs.3.rs-136988/v1 ◽

2021 ◽

Author(s):

SANTI BEHERA ◽

PRABIRA SETHY

Keyword(s):

Support Vector Machine ◽

Skin Cancer ◽

Principal Component ◽

Skin Lesions ◽

Classification Model ◽

Healthcare Sector ◽

Support Vector ◽

Svm Classifier ◽

Deep Feature ◽

Pigmented Skin Lesions

The skin is the main organ and weighs approximately 8 pounds in the average adult. It isolates us and shields our bodies from hazards. However, the skin is also vulnerable to damage and to changes from its original appearance: brown, black, or blue areas, or combinations of those colors, known as pigmented skin lesions. These common pigmented skin lesions (CPSL) are the primary cause of skin cancer. In the healthcare sector, the categorization of CPSL is a major problem because of inaccurate outputs, overfitting, and high computational costs. Hence, we propose a classification model based on multi-deep features and a support vector machine (SVM) for the classification of CPSL. The proposed system comprises two phases. First, the performance of 11 CNN models is evaluated in a deep feature extraction approach with SVM. Then, the deep features of the three top-performing CNN models are concatenated and passed to an SVM to categorize the CPSL. In the second step, 8192 and 12288 features are obtained by combining two and three networks, respectively, each contributing 4096 features from the top-performing CNN models. These features are also given to the SVM classifiers. The SVM results are also evaluated after applying principal component analysis (PCA) to the combined feature sets of 8192 and 12288 features. The highest results are obtained with 12288 features. In the experiments, the combination of the deep features of AlexNet, VGG16, and VGG19 achieved the highest accuracy of 91.7% using the SVM classifier. The results show that the proposed methods are a useful tool for CPSL classification.

Power Transformer Insulation Assessment Based on Oil-Paper Measurement Data Using SVM-Classifier

10.20944/preprints201806.0002.v1 ◽

2018 ◽

Author(s):

Suwarno ◽

Rahman A. Prasojo

Keyword(s):

Power Transformer ◽

Measurement Data ◽

Classification Model ◽

Support Vector ◽

Svm Classifier ◽

Classification Analysis ◽

Dielectric Characteristics ◽

Paper Condition ◽

Paper Insulation ◽

Insulation Condition

The condition of oil-immersed paper insulation is a crucial aspect of power transformer life condition diagnostics. The measurement and testing database collected over the years has made it possible for researchers to apply classification analysis to in-service power transformers. This article presents a classification analysis of the condition of transformer oil-immersed paper insulation. The measurement data (dielectric characteristics, dissolved gas analysis, and furanic compounds) of 149 transformers with a primary voltage of 150 kV were gathered and analyzed. The algorithm used to develop the classification model is the Support Vector Machine (SVM). The model was trained and tested using different datasets. Several models were created and the best one was chosen, giving 90.63% accuracy in predicting the oil-immersed paper insulation condition. The model was then applied to classify the oil-paper condition of 19 transformers for which furan data are not available. The classification results were combined, reviewed, and compared with conventional assessment methods and standards, confirming that the developed model can classify the current oil-paper condition based on dissolved gases and dielectric characteristics.

Multiclass patent document classification

Artificial Intelligence Research ◽

10.5430/air.v7n1p1 ◽

2017 ◽

Vol 7 (1) ◽

pp. 1 ◽

Cited By ~ 8

Author(s):

Chaitanya Anne ◽

Avdesh Mishra ◽

Md Tamjidul Hoque ◽

Shengru Tu

Keyword(s):

Text Classification ◽

Information Gain ◽

Document Classification ◽

Classification Model ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Practical Reasons ◽

Patent Document ◽

Vast Number

Text classification is used in information extraction and retrieval from a given text, and it is considered an important step in managing the vast and growing number of records held in digital form. This article addresses the problem of classifying patent documents into fifteen different categories or classes, where some classes overlap with each other for practical reasons. To develop the classification model using machine learning techniques, useful features were extracted from the given documents. The features are used to classify patent documents as well as to generate useful tag-words. The overall objective of this work is to systematize NASA’s patent management by developing a set of automated tools that can help NASA manage and market its portfolio of intellectual properties (IP) and enable easier discovery of relevant IP by users. We have identified an array of applicable methods, such as k-Nearest Neighbors (kNN), two variations of the Support Vector Machine (SVM) algorithm, and two tree-based classification algorithms: Random Forest and J48. The major research steps in this paper consist of filtering techniques for variable selection, information gain and feature correlation analysis, and training and testing potential models using effective classifiers. Further, the obstacles associated with imbalanced data were mitigated by adding pseudo-synthetic data wherever appropriate, which resulted in a superior SVM classifier-based model.
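
Information gain, one of the filtering criteria mentioned above, measures how much knowing a feature reduces the entropy of the class labels. The sketch below uses the standard definition for a binary "term present" feature; the class names and toy data are hypothetical and are not drawn from the NASA patent set.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_present, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    n = len(labels)
    gain = entropy(labels)
    for value in (True, False):
        subset = [lab for f, lab in zip(feature_present, labels) if f == value]
        if subset:  # skip empty branches
            gain -= len(subset) / n * entropy(subset)
    return gain


# Hypothetical toy corpus: four documents in two classes.
labels = ["propulsion", "propulsion", "optics", "optics"]
has_term = [True, True, False, False]    # term that tracks the class exactly
noise_term = [True, False, True, False]  # term unrelated to the class
gain_informative = information_gain(has_term, labels)
gain_noise = information_gain(noise_term, labels)
```

The class-aligned term recovers the full 1 bit of label entropy, while the unrelated term gains nothing, which is the ordering a variable-selection filter relies on.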

Spectral DWT Multilevel Decomposition with Spatial Filtering Enhancement Preprocessing-Based Approaches for Hyperspectral Imagery Classification

Remote Sensing ◽

10.3390/rs11242906 ◽

2019 ◽

Vol 11 (24) ◽

pp. 2906 ◽

Cited By ~ 2

Author(s):

Razika Bazine ◽

Huayi Wu ◽

Kamel Boukhechba

Keyword(s):

Hyperspectral Image ◽

Hyperspectral Imagery ◽

Wiener Filter ◽

Spatial Filtering ◽

Training Sample ◽

Spatial Filter ◽

Support Vector ◽

Svm Classifier ◽

Discrete Wavelet ◽

Two Dimensional

In this paper, spectral–spatial preprocessing using discrete wavelet transform (DWT) multilevel decomposition and spatial filtering is proposed for improving the accuracy of hyperspectral imagery classification. Specifically, spectral DWT multilevel decomposition (SDWT) is performed on the hyperspectral image to separate the approximation coefficients from the detail coefficients. For each level of decomposition, the detail coefficients are spatially filtered instead of being discarded, as wavelet-based approaches often do. Three different spatial filters are explored: two-dimensional DWT (2D-DWT), adaptive Wiener filter (AWF), and two-dimensional discrete cosine transform (2D-DCT). After the spectral information has been enhanced by applying the spatial filter to the detail coefficients, DWT reconstruction is carried out on both the approximation and the filtered detail coefficients. The final preprocessed image is fed into a linear support vector machine (SVM) classifier. Evaluation results on three widely used real hyperspectral datasets show that the proposed framework using spectral DWT multilevel decomposition with the 2D-DCT filter (SDWT-2DCT_SVM) outperforms many state-of-the-art methods in terms of classification accuracy, even with a small training sample size, and in terms of execution time.
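
The decomposition and reconstruction steps can be illustrated on a single pixel's spectrum. The sketch below assumes the Haar wavelet and one decomposition level, neither of which is stated in the abstract, and it omits the spatial filtering of the detail coefficients that the method applies between the two steps. Function names and the sample spectrum are hypothetical.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT on an even-length signal.

    Returns (approximation, detail) coefficient lists, each half as long.
    """
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one level of the Haar DWT."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal


# Hypothetical 8-band spectrum of one pixel.
spectrum = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 8.0]
approx, detail = haar_dwt(spectrum)
restored = haar_idwt(approx, detail)  # unfiltered details give back the input
```

Because the detail coefficients are left untouched here, reconstruction returns the original spectrum up to rounding; in the described method the filtered detail coefficients would be used instead.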
