Improving the classification of neuropsychiatric conditions using gene ontology terms as features

Abstract: Although neuropsychiatric disorders have a well-established genetic background, their specific molecular foundations remain elusive. This has prompted many investigators to design studies that identify explanatory biomarkers and then use these biomarkers to predict clinical outcomes. One approach uses machine learning algorithms to classify patients based on blood mRNA expression from high-throughput transcriptomic assays. However, these endeavours typically fail to achieve the high level of performance, stability, and generalizability required for clinical translation. Moreover, these classifiers can lack interpretability because informative genes do not necessarily have relevance to researchers. For this study, we hypothesized that annotation-based classifiers can improve classification performance, stability, generalizability, and interpretability. To this end, we evaluated the performance of four classification algorithms on six neuropsychiatric data sets using four annotation databases. Our results suggest that the Gene Ontology Biological Process database can transform gene expression into an annotation-based feature space that improves the performance and stability of blood-based classifiers for neuropsychiatric conditions. We also show how annotation features can improve the interpretability of classifiers: since annotation databases are often used to assign biological importance to genes, annotation-based classifiers are easy to interpret because the features themselves carry the biological importance. We found that using annotations as features improves the performance and stability of classifiers. We also noted that the top-ranked annotations tend to contain the top-ranked genes, suggesting that the most predictive annotations are a superset of the most predictive genes. Based on this, and the fact that annotations are used routinely to assign biological importance to genetic data, we recommend transforming gene-level expression into annotation-level expression prior to the classification of neuropsychiatric conditions.
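
The gene-to-annotation transformation recommended above can be sketched as follows. The abstract does not state how member genes are aggregated, so the mean used here, the function name `to_annotation_space`, and the toy gene and term names are all illustrative assumptions rather than the authors' method.

```python
from statistics import mean


def to_annotation_space(expression, annotations):
    """Map one sample's gene-level expression to annotation-level features.

    expression: dict of gene name -> expression value for one sample.
    annotations: dict of annotation term -> set of member gene names.
    Aggregating by the mean is an assumption, not the paper's stated rule.
    """
    features = {}
    for term, genes in annotations.items():
        # Use only the member genes that were actually measured in this sample.
        measured = [expression[g] for g in genes if g in expression]
        if measured:  # terms with no measured member genes are skipped
            features[term] = mean(measured)
    return features


# Hypothetical toy data: three measured genes, three annotation terms.
sample = {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 9.0}
terms = {
    "term_1": {"GENE_A", "GENE_B"},
    "term_2": {"GENE_C", "GENE_D"},  # GENE_D is not measured
    "term_3": {"GENE_E"},            # no measured genes, so the term is dropped
}
annotation_features = to_annotation_space(sample, terms)
print(annotation_features)
```

Each resulting feature is named by its annotation term, which is what makes a classifier trained on this space directly interpretable in terms of the annotation database.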

Space Precession Target Classification Based on Radar High-Resolution Range Profiles

International Journal of Antennas and Propagation ◽

10.1155/2019/8151620 ◽

2019 ◽

Vol 2019 ◽

pp. 1-9

Author(s):

Yizhe Wang ◽

Cunqian Feng ◽

Yongshun Zhang ◽

Sisan He

Keyword(s):

Parameter Extraction ◽

Classification Performance ◽

Support Vector ◽

Electromagnetic Data ◽

Feature Extractor ◽

Different Types ◽

Radar Echo ◽

High Level ◽

Cone Target

Precession is a common micromotion form of space targets, introducing additional micro-Doppler (m-D) modulation into the radar echo. Effective classification of space targets is of great significance for further micromotion parameter extraction and identification. Feature extraction is a key step in the classification process and largely determines the final classification performance. This paper presents two methods for classifying different types of space precession targets from high-resolution range profiles (HRRPs). We first establish the precession model of space targets and analyze the scattering characteristics, and then compute electromagnetic data for the cone target, cone-cylinder target, and cone-cylinder-flare target. Experimental results demonstrate that the support vector machine (SVM) using histogram of oriented gradients (HOG) features achieves a good result, whereas the deep convolutional neural network (DCNN) obtains a higher classification accuracy. The DCNN combines the feature extractor and the classifier to automatically mine the high-level signatures of HRRPs through a training process. In addition, the efficiency of the two classification processes is compared using the same dataset.

Big Data and Machine Learning

Advances in Business Information Systems and Analytics - Protocols and Applications for the Industrial Internet of Things ◽

10.4018/978-1-5225-3805-9.ch008 ◽

2018 ◽

pp. 225-239

Author(s):

Fernando Enrique Lopez Martinez ◽

Edward Rolando Núñez-Valdez

Keyword(s):

Public Health ◽

Artificial Intelligence ◽

Machine Learning ◽

Big Data ◽

Complete Solution ◽

Big Data Analytics ◽

Machine Learning Algorithms ◽

Data Sets ◽

Healthcare Organizations ◽

High Level

IoT, big data, and artificial intelligence are currently three of the most relevant and trending pieces for innovation and predictive analysis in healthcare. Many healthcare organizations are already working on developing their own home-centric data collection networks and intelligent big data analytics systems based on machine-learning principles. The benefit of using IoT, big data, and artificial intelligence for community and population health is better health outcomes for the population and communities. The new generation of machine-learning algorithms can use large standardized data sets generated in healthcare to improve the effectiveness of public health interventions. A lot of these data come from sensors, devices, electronic health records (EHR), data generated by public health nurses, mobile data, social media, and the internet. This chapter shows a high-level implementation of a complete solution of IoT, big data, and machine learning implemented in the city of Cartagena, Colombia for hypertensive patients by using an eHealth sensor and Amazon Web Services components.

Improving the classification of neuropsychiatric conditions using gene ontology terms as features

American Journal of Medical Genetics Part B Neuropsychiatric Genetics ◽

10.1002/ajmg.b.32727 ◽

2019 ◽

Vol 180 (7) ◽

pp. 508-518 ◽

Cited By ~ 1

Author(s):

Thomas P. Quinn ◽

Samuel C. Lee ◽

Svetha Venkatesh ◽

Thin Nguyen

Keyword(s):

Gene Ontology ◽

Neuropsychiatric Conditions

A systematical approach to classification problems with feature space heterogeneity

Kybernetes ◽

10.1108/k-06-2018-0313 ◽

2019 ◽

Vol 48 (9) ◽

pp. 2006-2029

Author(s):

Hongshan Xiao ◽

Yu Wang

Keyword(s):

Factor Analysis ◽

Meta Analysis ◽

Feature Space ◽

Classification Performance ◽

Classification Algorithm ◽

Significant Feature ◽

Data Sets ◽

Data Set ◽

Classification Techniques ◽

Content Type

Purpose: Feature space heterogeneity exists widely in various application fields of classification techniques, such as customs inspection decisions, credit scoring and medical diagnosis. This paper aims to study the relationship between feature space heterogeneity and classification performance.
Design/methodology/approach: A measurement is first developed for measuring and identifying any significant heterogeneity that exists in the feature space of a data set. The main idea of this measurement is derived from meta-analysis. For a data set with significant feature space heterogeneity, a classification algorithm based on factor analysis and clustering is proposed to learn the data patterns, which, in turn, are used for data classification.
Findings: The proposed approach has two main advantages over previous methods. The first advantage lies in the feature transform using orthogonal factor analysis, which results in new features without redundancy and irrelevance. The second advantage rests on sample partitioning to capture the feature space heterogeneity reflected by differences in factor scores. The validity and effectiveness of the proposed approach are verified on a number of benchmark data sets.
Research limitations/implications: The measurement should be used to guide the heterogeneity elimination process, which is an interesting topic for future research. In addition, developing a classification algorithm that enables scalable and incremental learning for large data sets with significant feature space heterogeneity is also an important issue.
Practical implications: Measuring and eliminating any feature space heterogeneity in the data is important for accurate classification. This study provides a systematic approach to feature space heterogeneity measurement and elimination for better classification performance, which is favorable for applications of classification techniques to real-world problems.
Originality/value: A measurement based on meta-analysis for measuring and identifying any significant feature space heterogeneity in a classification problem is developed, and an ensemble classification framework is proposed to deal with the feature space heterogeneity and improve the classification accuracy.

Kernel Joint Sparse Representation Based on Self-Paced Learning for Hyperspectral Image Classification

Remote Sensing ◽

10.3390/rs11091114 ◽

2019 ◽

Vol 11 (9) ◽

pp. 1114

Author(s):

Sixiu Hu ◽

Jiangtao Peng ◽

Yingxiong Fu ◽

Luoqing Li

Keyword(s):

Sparse Representation ◽

Hyperspectral Image ◽

Feature Space ◽

Classification Performance ◽

Hyperspectral Data ◽

Data Sets ◽

Neighborhood Structure ◽

Joint Sparse Representation ◽

Negative Effect ◽

Sparse Coefficient

By means of joint sparse representation (JSR) and kernel representation, kernel joint sparse representation (KJSR) models can effectively model the intrinsic nonlinear relations of hyperspectral data and better exploit spatial neighborhood structure to improve the classification performance of hyperspectral images. However, the performance of KJSR is greatly affected by noisy or inhomogeneous pixels around the central testing pixel in the spatial domain. Motivated by the idea of self-paced learning (SPL), this paper proposes a self-paced KJSR (SPKJSR) model to adaptively learn weights and sparse coefficient vectors for different neighboring pixels in the kernel-based feature space. SPL strategies can learn a weight that indicates the difficulty of feature pixels within a spatial neighborhood. By assigning small weights to unimportant or complex pixels, the negative effect of inhomogeneous or noisy neighboring pixels can be suppressed. Hence, SPKJSR is usually much more robust. Experimental results on the Indian Pines and Salinas hyperspectral data sets demonstrate that SPKJSR is much more effective than traditional JSR and KJSR models.
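
The self-paced weighting idea can be illustrated with the classic hard SPL rule, in which a sample whose loss is below a threshold receives weight 1 and every other sample receives weight 0. This is a generic sketch, not the SPKJSR model: the abstract does not specify its weighting scheme, and the function names, threshold, and per-pixel loss values below are hypothetical.

```python
def self_paced_weights(losses, threshold):
    """Hard self-paced weights: 1.0 for easy samples (loss < threshold), else 0.0."""
    return [1.0 if loss < threshold else 0.0 for loss in losses]


def weighted_mean_loss(losses, weights):
    """Average loss over the samples that currently carry weight."""
    total = sum(weights)
    if total == 0:
        return 0.0  # no sample selected yet
    return sum(w * l for w, l in zip(weights, losses)) / total


# Hypothetical reconstruction losses for four neighboring pixels;
# the third one behaves like a noisy or inhomogeneous pixel.
neighbor_losses = [0.1, 0.2, 1.5, 0.3]
weights = self_paced_weights(neighbor_losses, threshold=0.5)
robust_loss = weighted_mean_loss(neighbor_losses, weights)
print(weights, robust_loss)
```

In self-paced learning the threshold is typically raised over successive iterations so that harder samples are admitted gradually; soft variants replace the binary weights with values between 0 and 1.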

Classification of coronary artery disease data sets by using a deep neural network

The EuroBiotech Journal ◽

10.24190/issn2564-615x/2017/04.03 ◽

2017 ◽

Vol 1 (4) ◽

pp. 271-277 ◽

Cited By ~ 8

Author(s):

Abdullah Caliskan ◽

Mehmet Emin Yuksel

Keyword(s):

Neural Network ◽

Coronary Artery Disease ◽

Coronary Artery ◽

Deep Neural Network ◽

Cost Effective ◽

Classification Performance ◽

Data Sets ◽

Neural Network Classifier ◽

Artery Disease

Abstract: In this study, a deep neural network classifier is proposed for the classification of coronary artery disease (CAD) medical data sets. The proposed classifier is tested on reference CAD data sets from the literature and compared with popular representative classification methods in terms of classification performance. Experimental results show that the deep neural network classifier offers much better accuracy, sensitivity and specificity rates than the other methods. The proposed method presents itself as an easily accessible and cost-effective alternative to existing methods used for the diagnosis of CAD, and it can be used to easily check whether a given subject under examination has at least one occluded coronary artery.

Fingerprint Intramodal Biometric System Based on ABC Feature Fusion

Asian Journal of Research in Computer Science ◽

10.9734/ajrcos/2021/v11i230256 ◽

2021 ◽

pp. 1-10

Author(s):

J. O. Jooda ◽

A. O. Oke ◽

E. O. Omidiora ◽

O. T. Adedeji

Keyword(s):

Feature Selection ◽

Feature Fusion ◽

Feature Space ◽

Classification Performance ◽

High Dimensionality ◽

Biometric System ◽

Bee Colony ◽

Optimizing Algorithm ◽

Level Fusion

The drawbacks of a unimodal biometric system (UBS) include noisy data, intra-class variance, inter-class similarities, and non-universality, all of which affect the system's classification performance. Intramodal fingerprint fusion can overcome the limitations imposed by UBS when features are fused at the feature level, as it is a good approach to boost the performance of the biometric system. However, feature-level fusion leads to high dimensionality of the feature space, which can be overcome by Feature Selection (FS). FS, which is an optimization problem, improves classification performance by selecting only relevant and useful information from the extracted feature sets. Artificial Bee Colony (ABC) is an optimization algorithm that has been frequently used in solving FS problems because of its simple concept, use of few control parameters, easy implementation and good exploration characteristics. ABC was proposed for optimized feature selection prior to the classification of the Fingerprint Intramodal Biometric System (FIBS). Performance evaluation of the ABC-based FIBS showed that the system had a sensitivity of 97.69% and a recognition accuracy (RA) of 96.76%. The developed ABC-optimized feature selection reduced the high dimensionality of the feature space prior to classification tasks, thereby increasing the sensitivity and recognition accuracy of FIBS.
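
For readers unfamiliar with the two reported metrics, the sketch below shows how sensitivity and accuracy are conventionally computed from confusion-matrix counts. It assumes that "recognition accuracy" means overall accuracy, which the abstract does not confirm, and the counts are made up for illustration; they are not the paper's data.

```python
def sensitivity(tp, fn):
    """True positive rate: share of genuine positives that were recognized."""
    return tp / (tp + fn)


def recognition_accuracy(tp, tn, fp, fn):
    """Overall accuracy: share of all decisions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts, not taken from the paper.
tp, fn, tn, fp = 90, 10, 85, 15
print(f"sensitivity = {sensitivity(tp, fn):.2%}")
print(f"recognition accuracy = {recognition_accuracy(tp, tn, fp, fn):.2%}")
```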

Classification of Marine Vessels with Multi-Feature Structure Fusion

Applied Sciences ◽

10.3390/app9102153 ◽

2019 ◽

Vol 9 (10) ◽

pp. 2153 ◽

Cited By ~ 6

Author(s):

Erhu Zhang ◽

Kelu Wang ◽

Guangfeng Lin

Keyword(s):

Promising Result ◽

Infrared Image ◽

Feature Space ◽

Classification Performance ◽

Feature Structure ◽

Redundant Feature ◽

Feature Dimension ◽

Spectral Regression ◽

Marine Vessels

The classification of marine vessels is one of the important problems of maritime traffic. To fully exploit the complementarity between different features and to identify marine vessels more effectively, a novel feature structure fusion method based on spectral regression discriminant analysis (SF-SRDA) was proposed. First, we selected the convolutional neural network features that better describe the characteristics of ships, and constructed graph-based feature structures using a similarity metric. Then we weighted the concatenated multi-features and fused their structures according to the linear relationship assumption. Finally, we constructed the optimization formula to solve for the fused features and structure by using spectral regression discriminant analysis. Experiments on the VAIS dataset show that the proposed SF-SRDA method can reduce the feature dimension from the original 102,400 dimensions to 5 dimensions, that the classification accuracy on visible images can reach 87.60%, and that the accuracy on daytime infrared images can reach 74.68%. The experimental results demonstrate that the proposed method can not only extract the optimal features from the original redundant feature space, but also greatly reduce the feature dimension. Furthermore, the classification performance of SF-SRDA is promising.

Pattern Classification Based on Self-Organizing Feature Mapping Neural Network

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.448-453.3645 ◽

2013 ◽

Vol 448-453 ◽

pp. 3645-3649 ◽

Cited By ~ 2

Author(s):

Shuo Ding ◽

Xiao Heng Chang ◽

Qing Hui Wu

Keyword(s):

Neural Network ◽

Pattern Classification ◽

Classification Performance ◽

Adjustment Process ◽

Data Sets ◽

Two Dimensional ◽

Sample Data ◽

Sofm Network ◽

Traditional Pattern

Traditional pattern classification methods are not always efficient because sample data sets are sometimes incomplete and contain exceptions and counterexamples. In this paper, a self-organizing feature mapping (SOFM) neural network is applied to the pattern classification of two-dimensional vectors after an analysis of its structure and algorithm. The method for establishing a SOFM network in MATLAB 7.0 is introduced before the network is applied to classify two-dimensional vectors. The adjustment process of the weight vectors and the classification performance of the SOFM model are also tested under different numbers of training steps. The simulation results show that the classification approach based on the SOFM model is effective because of its fast speed, high accuracy and strong generalization ability.
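
The weight adjustment process of a SOFM on two-dimensional vectors can be sketched in a few lines. The paper used MATLAB 7.0; the Python version below is an independent, minimal illustration with a one-dimensional map, and its function names, decay schedules, parameter values, and toy data are assumptions rather than the authors' configuration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((wk - xk) ** 2 for wk, xk in zip(weights[j], x)),
    )


def train_sofm(data, n_units=4, n_steps=500, lr0=0.5, radius0=1.0, seed=0):
    """Train a 1-D self-organizing map on the given vectors."""
    rng = random.Random(seed)
    # Initialise each weight vector at a randomly chosen training sample.
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    for step in range(n_steps):
        frac = step / n_steps
        lr = lr0 * (1.0 - frac)                      # decaying learning rate
        radius = max(radius0 * (1.0 - frac), 1e-3)   # shrinking neighbourhood
        x = rng.choice(data)
        bmu = best_matching_unit(weights, x)
        for j, w in enumerate(weights):
            # Gaussian neighbourhood over unit positions on the 1-D map.
            h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
            for k in range(len(w)):
                w[k] += lr * h * (x[k] - w[k])       # pull unit toward sample
    return weights


# Hypothetical two-dimensional samples forming two loose groups.
group_a = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3)]
group_b = [(5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
data = group_a + group_b
weights = train_sofm(data)
labels = [best_matching_unit(weights, x) for x in data]
print(labels)
```

Varying `n_steps` in this sketch mirrors the kind of experiment the abstract describes, where classification behavior is compared across different numbers of training steps.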

Application of Statistical Machine Learning Algorithms for Classification of Bridge Deformation Data Sets

2021 IEEE International Systems Conference (SysCon) ◽

10.1109/syscon48628.2021.9447056 ◽

2021 ◽

Author(s):

Juan C. Avendano ◽

Luis Daniel Otero ◽

Carlos Otero

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Data Sets ◽

Statistical Machine Learning
