Transparent Classification with Multilayer Logical Perceptrons and Random Binarization

Zhuo Wang; Wei Zhang; Ning LIU; Jianyong Wang

doi:10.1609/aaai.v34i04.6102

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6102 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6331-6339

Author(s):

Zhuo Wang ◽

Wei Zhang ◽

Ning LIU ◽

Jianyong Wang

Keyword(s):

Network Architecture ◽

Gradient Descent ◽

Classification Performance ◽

Data Sets ◽

Neural Network Architecture ◽

Continuous Version ◽

Continuous Space ◽

Public Data ◽

Rule Sets ◽

Classification Tasks

Models with a transparent inner structure and high classification performance are required to reduce potential risk and build user trust in domains such as health care, finance, and security. However, existing models struggle to satisfy both properties simultaneously. In this paper, we propose a new hierarchical rule-based model for classification tasks, named Concept Rule Sets (CRS), which has both strong expressive ability and a transparent inner structure. To address the challenge of efficiently learning the non-differentiable CRS model, we propose a novel neural network architecture, the Multilayer Logical Perceptron (MLLP), which is a continuous version of CRS. Using MLLP and our proposed Random Binarization (RB) method, we can search for the discrete solution of CRS in continuous space using gradient descent and ensure that the discrete CRS behaves almost the same as the corresponding continuous MLLP. Experiments on 12 public data sets show that CRS outperforms the state-of-the-art approaches and that the complexity of the learned CRS is close to that of a simple decision tree.
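The idea of relaxing a logical rule layer into continuous space and then rounding it back to a discrete rule can be sketched as follows. This is a minimal illustration, not the authors' implementation: the product-based relaxation, the function names, and the 0.5 threshold are assumptions, and the Random Binarization training procedure itself is not reproduced.

```python
from math import prod

def soft_and(inputs, weights):
    """Continuous conjunction: input i only matters as far as weights[i] is near 1."""
    return prod(1.0 - w * (1.0 - x) for x, w in zip(inputs, weights))

def soft_or(inputs, weights):
    """Continuous disjunction, the De Morgan dual of soft_and."""
    return 1.0 - prod(1.0 - w * x for x, w in zip(inputs, weights))

def binarize(weights, threshold=0.5):
    """Round continuous weights to {0, 1}; the threshold is an assumption."""
    return [1.0 if w > threshold else 0.0 for w in weights]

# Hypothetical learned weights over three binary features.
weights = [0.9, 0.1, 0.8]
rule = binarize(weights)           # the discrete rule keeps features 0 and 2
x = [1.0, 0.0, 1.0]
discrete_and = soft_and(x, rule)   # with 0/1 weights and inputs this is an exact AND
```

With binary weights and binary inputs, both functions reduce to ordinary Boolean AND and OR over the selected features, which is what makes the rounded model readable as a rule.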

NeuRank: learning to rank with neural networks for drug–target interaction prediction

BMC Bioinformatics ◽

10.1186/s12859-021-04476-y ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Xiujin Wu ◽

Wenhua Zeng ◽

Fan Lin ◽

Xiuze Zhou

Keyword(s):

Network Architecture ◽

Drug Target ◽

Learning To Rank ◽

Data Sets ◽

Neural Network Architecture ◽

Ranking Problem ◽

Target Interaction ◽

Similar Drug ◽

Public Data ◽

G Protein Coupled

Background: Experimental verification of a drug discovery process is expensive and time-consuming. Therefore, the demand to identify drug–target interactions (DTIs) more efficiently and effectively has recently intensified. Results: We treat the prediction of DTIs as a ranking problem and propose a neural network architecture, NeuRank, to address it. We also assume that similar drug compounds are likely to interact with similar target proteins. Thus, we add drug and target similarities to our model, which are very effective at improving the prediction of DTIs. We then extend NeuRank from a point-wise model to a pair-wise model, and further to a list-wise model. Conclusion: Results from extensive experiments on five public data sets (DrugBank, Enzymes, Ion Channels, G-Protein-Coupled Receptors, and Nuclear Receptors) show that, in identifying DTIs, our models achieve better performance than other state-of-the-art methods.
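The difference between point-wise, pair-wise, and list-wise ranking objectives can be illustrated with generic textbook losses. The sketch below is an assumption for illustration only: the abstract does not state which loss functions NeuRank uses, and the function names are hypothetical.

```python
import math

def pointwise_loss(score, label):
    """Squared error between one predicted score and its 0/1 interaction label."""
    return (score - label) ** 2

def pairwise_loss(score_pos, score_neg):
    """Logistic loss that is small when the interacting pair outranks the other."""
    return math.log(1.0 + math.exp(-(score_pos - score_neg)))

def listwise_loss(scores, labels):
    """Cross-entropy between softmax(scores) and the normalized labels."""
    m = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    total = sum(labels)
    return -sum((y / total) * math.log(e / z)
                for y, e in zip(labels, exps) if y > 0)
```

The point-wise loss judges each drug–target score in isolation, the pair-wise loss only looks at the relative order of two candidates, and the list-wise loss evaluates a whole candidate list at once.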

A data-driven neural network architecture for sentiment analysis

Data Technologies and Applications ◽

10.1108/dta-03-2018-0017 ◽

2019 ◽

Vol 53 (1) ◽

pp. 2-19 ◽

Cited By ~ 1

Author(s):

Erion Çano ◽

Maurizio Morisio

Keyword(s):

Neural Network ◽

Sentiment Analysis ◽

Network Architecture ◽

Network Models ◽

Data Sets ◽

Feature Maps ◽

Neural Network Architecture ◽

Neural Network Models ◽

Content Type ◽

Max Pooling

Purpose: The fabulous results of convolutional neural networks in image-related tasks have attracted the attention of researchers in text mining, sentiment analysis and other areas of text analysis. It is, however, difficult to find enough data to feed such networks, optimize their parameters, and make the right design choices when constructing network architectures. The purpose of this paper is to present the creation steps of two big data sets of song emotions. The authors also explore the use of convolution and max-pooling neural layers on song lyrics, product review and movie review text data sets. Three variants of a simple and flexible neural network architecture are also compared. Design/methodology/approach: The intention was to spot any important patterns that can serve as guidelines for parameter optimization of similar models. The authors also wanted to identify architecture design choices that lead to high-performing sentiment analysis models. To this end, the authors conducted a series of experiments with neural architectures of various configurations. Findings: The results indicate that parallel convolutions of filter lengths up to 3 are usually enough for capturing relevant text features. Also, the max-pooling region size should be adapted to the length of text documents to produce the best feature maps. Originality/value: The authors obtained their top results with feature maps of lengths 6–18. An improvement for future neural network models for sentiment analysis could be to generate the sentiment polarity prediction of a document by aggregating predictions on smaller excerpts of the entire text.
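How parallel convolutions of lengths 1 to 3 and a max-pooling region size together determine the length of a feature map can be sketched with a toy example. This is a simplified illustration under stated assumptions: each token is a single scalar rather than an embedding vector, and the filter weights and region size are hypothetical rather than taken from the paper.

```python
def conv1d(sequence, kernel):
    """Valid 1-D convolution (cross-correlation) of a scalar sequence."""
    k = len(kernel)
    return [sum(sequence[i + j] * kernel[j] for j in range(k))
            for i in range(len(sequence) - k + 1)]

def max_pool(values, region):
    """Non-overlapping max-pooling; a trailing partial region is kept."""
    return [max(values[i:i + region]) for i in range(0, len(values), region)]

# Toy "document": one scalar feature per token (real models use embedding vectors).
tokens = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.6]

# Hypothetical filters of lengths 1, 2 and 3 applied in parallel.
kernels = {1: [1.0], 2: [0.5, 0.5], 3: [1 / 3, 1 / 3, 1 / 3]}
feature_maps = {k: max_pool(conv1d(tokens, w), region=2)
                for k, w in kernels.items()}
```

A larger pooling region shortens the feature map, so longer documents need a larger region to land in a given target length.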

A Neighborhood Rough Sets-Based Attribute Reduction Method Using Lebesgue and Entropy Measures

Entropy ◽

10.3390/e21020138 ◽

2019 ◽

Vol 21 (2) ◽

pp. 138 ◽

Cited By ~ 4

Author(s):

Lin Sun ◽

Lanying Wang ◽

Jiucheng Xu ◽

Shiguang Zhang

Keyword(s):

Rough Sets ◽

Reduction Method ◽

Attribute Reduction ◽

Numerical Data ◽

Classification Performance ◽

Data Sets ◽

Decision Systems ◽

Public Data ◽

Entropy Measures ◽

Neighborhood Rough Sets

For continuous numerical data sets, neighborhood rough sets-based attribute reduction is an important step for improving classification performance. However, most traditional reduction algorithms can only handle finite sets, and yield low accuracy and high cardinality. In this paper, a novel attribute reduction method using Lebesgue and entropy measures in neighborhood rough sets is proposed, which can handle continuous numerical data whilst maintaining the original classification information. First, the Fisher score method is employed to eliminate irrelevant attributes and significantly reduce computational complexity for high-dimensional data sets. Then, the Lebesgue measure is introduced into neighborhood rough sets to investigate uncertainty measures. In order to analyze the uncertainty and noise of neighborhood decision systems well, some neighborhood entropy-based uncertainty measures are presented based on Lebesgue and entropy measures, and by combining the algebra view with the information view in neighborhood rough sets, a neighborhood roughness joint entropy is developed for neighborhood decision systems. Moreover, some of their properties are derived and the relationships between them are established, which helps to understand the essence of knowledge and the uncertainty of neighborhood decision systems. Finally, a heuristic attribute reduction algorithm is designed to improve the classification performance of large-scale complex data. The experimental results on an example instance and several public data sets show that the proposed method is very effective for selecting the most relevant attributes with high classification accuracy.
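The Fisher score pre-filtering step can be sketched for a single attribute as the ratio of between-class scatter to within-class scatter. The sketch below uses one common formulation and hypothetical toy data; the exact variant used in the paper is not given in the abstract.

```python
from statistics import fmean, pvariance

def fisher_score(values, labels):
    """Between-class scatter over within-class scatter for one attribute."""
    overall = fmean(values)
    between = within = 0.0
    for c in set(labels):
        group = [v for v, y in zip(values, labels) if y == c]
        between += len(group) * (fmean(group) - overall) ** 2
        within += len(group) * pvariance(group)
    return between / within if within > 0 else float("inf")

# Hypothetical toy data: two classes, one informative and one noisy attribute.
labels = [0, 0, 1, 1]
informative = [1.0, 1.2, 3.0, 3.2]   # class means far apart, small spread
noisy = [1.0, 3.0, 1.2, 3.2]         # class means nearly equal, large spread
```

Attributes with a low score, such as `noisy` here, would be dropped before the more expensive rough-set reduction runs.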

Designing deep neural networks for continual learning in an open world

10.21248/gups.62487 ◽

2021 ◽

Author(s):

Martin Mundt

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Network Architecture ◽

Neural Network Training ◽

Neural Network Architecture ◽

Neural Architecture ◽

Network Training ◽

Classification Tasks ◽

Continual Learning

Deep learning with neural networks seems to have largely replaced traditional design of computer vision systems. Automated methods to learn a plethora of parameters are now used in favor of previously practiced selection of explicit mathematical operators for a specific task. The entailed promise is that practitioners no longer need to take care of every individual step, but rather focus on gathering big amounts of data for neural network training. As a consequence, both a shift in mindset towards a focus on big datasets, as well as a wave of conceivable applications based exclusively on deep learning can be observed. This PhD dissertation aims to uncover some of the only implicitly mentioned or overlooked deep learning aspects, highlight unmentioned assumptions, and finally introduce methods to address respective immediate weaknesses. In the author’s humble opinion, these prevalent shortcomings can be tied to the fact that the involved steps in the machine learning workflow are frequently decoupled. Success is predominantly measured based on accuracy measures designed for evaluation with static benchmark test sets. Individual machine learning workflow components are assessed in isolation with respect to available data, choice of neural network architecture, and a particular learning algorithm, rather than viewing the machine learning system as a whole in context of a particular application. Correspondingly, in this dissertation, three key challenges have been identified: 1. Choice and flexibility of a neural network architecture. 2. Identification and rejection of unseen unknown data to avoid false predictions. 3. Continual learning without forgetting of already learned information. These latter challenges have already been crucial topics in older literature, alas, seem to require a renaissance in modern deep learning literature. 
Initially, it may appear that they pose independent research questions; however, the thesis posits that the aspects are intertwined and require a joint perspective in machine learning based systems. In summary, the essential question is thus how to pick a suitable neural network architecture for a specific task, how to recognize which data inputs belong to this context, which ones originate from potential other tasks, and ultimately how to continuously include such identified novel data in neural network training over time without overwriting existing knowledge. Thus, the central emphasis of this dissertation is to build on top of existing deep learning strengths, yet also acknowledge mentioned weaknesses, in an effort to establish a deeper understanding of interdependencies and synergies towards the development of unified solution mechanisms. For this purpose, the main portion of the thesis is in cumulative form. The respective publications can be grouped according to the three challenges outlined above. Correspondingly, chapter 1 is focused on choice and extendability of neural network architectures, analyzed in context of popular image classification tasks. An algorithm to automatically determine neural network layer width is introduced and is first contrasted with static architectures found in the literature. The importance of neural architecture design is then further showcased on a real-world application of defect detection in concrete bridges. Chapter 2 comprises the complementary ensuing questions of how to identify unknown concepts and subsequently incorporate them into continual learning. A joint central mechanism to distinguish unseen concepts from what is known in classification tasks, while enabling consecutive training without forgetting or revisiting older classes, is proposed. Once more, the role of the chosen neural network architecture is quantitatively reassessed. 
Finally, chapter 3 culminates in an overarching view, where developed parts are connected. Here, an extensive survey further serves the purpose to embed the gained insights in the broader literature landscape and emphasizes the importance of a common frame of thought. The ultimately presented approach thus reflects the overall thesis’ contribution to advance neural network based machine learning towards a unified solution that ties together choice of neural architecture with the ability to learn continually and the capability to automatically separate known from unknown data.

Detection of Suicide Ideation in Social Media Forums Using Deep Learning

Algorithms ◽

10.3390/a13010007 ◽

2019 ◽

Vol 13 (1) ◽

pp. 7 ◽

Cited By ~ 4

Author(s):

Michael Mesfin Tadesse ◽

Hongfei Lin ◽

Bo Xu ◽

Liang Yang

Keyword(s):

Social Media ◽

Deep Learning ◽

Network Architecture ◽

Suicide Ideation ◽

Suicide Risk Assessment ◽

Neural Network Architecture ◽

Combined Model ◽

Ongoing Work ◽

Classification Tasks ◽

Learning Architectures

Suicide ideation expressed in social media has an impact on language usage. Many at-risk individuals use social forum platforms to discuss their problems or get access to information on similar tasks. The key objective of our study is to present ongoing work on automatic recognition of suicidal posts. We address the early detection of suicide ideation through deep learning and machine learning-based classification approaches applied to Reddit social media. For this purpose, we employ a combined LSTM-CNN model and compare it with other classification models. Our experiment shows that the combined neural network architecture with word embedding techniques can achieve the best relevance classification results. Additionally, our results support the strength and ability of deep learning architectures to build an effective model for suicide risk assessment in various text classification tasks.

CORENup: a combination of convolutional and recurrent deep neural networks for nucleosome positioning identification

BMC Bioinformatics ◽

10.1186/s12859-020-03627-x ◽

2020 ◽

Vol 21 (S8) ◽

Author(s):

Domenico Amato ◽

Giosuè Lo Bosco ◽

Riccardo Rizzo

Keyword(s):

Neural Network ◽

DNA Sequence ◽

Network Architecture ◽

Computation Time ◽

Nucleosome Positioning ◽

Dense Layer ◽

Data Sets ◽

Sequence Organization ◽

Public Data ◽

Genomic Scale

Background: Nucleosomes wrap the DNA into the nucleus of the eukaryotic cell and regulate its transcription phase. Several studies indicate that nucleosomes are determined by the combined effects of several factors, including DNA sequence organization. Interestingly, the identification of nucleosomes on a genomic scale has been successfully performed by computational methods using DNA sequence as input data. Results: In this work, we propose CORENup, a deep learning model for nucleosome identification. CORENup processes a DNA sequence as input using one-hot representation and combines, in a parallel fashion, a fully convolutional neural network and a recurrent layer. These two parallel levels are devoted to capturing both non-periodic and periodic DNA string features. A dense layer is devoted to combining them to give a final classification. Conclusions: Results computed on public data sets of different organisms show that CORENup is a state-of-the-art methodology for nucleosome positioning identification based on a deep neural network architecture. The comparisons have been carried out using two groups of datasets, currently adopted by the best-performing methods, and CORENup has shown top performance both in terms of classification metrics and elapsed computation time.
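The one-hot input representation mentioned above can be sketched as follows. The column order `ACGT` and the all-zero handling of unknown symbols are assumptions for illustration; the abstract does not specify either.

```python
ALPHABET = "ACGT"  # assumed column order; not specified in the abstract

def one_hot(sequence):
    """Map each nucleotide to a 4-element indicator row."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    rows = []
    for base in sequence.upper():
        row = [0] * len(ALPHABET)
        if base in index:          # unknown symbols such as 'N' stay all-zero
            row[index[base]] = 1
        rows.append(row)
    return rows

encoded = one_hot("ACGTN")
```

Each sequence of length n becomes an n-by-4 matrix, which is the shape that both a convolutional branch and a recurrent branch can consume.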

Multiattentive Recurrent Neural Network Architecture for Multilingual Readability Assessment

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00278 ◽

2019 ◽

Vol 7 ◽

pp. 421-436 ◽

Cited By ~ 2

Author(s):

Ion Madrazo Azpiazu ◽

Maria Soledad Pera

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Network Architecture ◽

Text Structure ◽

Reading Level ◽

Data Sets ◽

Neural Network Architecture ◽

Using Data ◽

Word Attention ◽

Readability Assessment

We present a multiattentive recurrent neural network architecture for automatic multilingual readability assessment. This architecture considers raw words as its main input, but internally captures text structure and informs its word attention process using other syntax- and morphology-related datapoints, known to be of great importance to readability. This is achieved by a multiattentive strategy that allows the neural network to focus on specific parts of a text for predicting its reading level. We conducted an exhaustive evaluation using data sets targeting multiple languages and prediction task types, to compare the proposed model with traditional, state-of-the-art, and other neural network strategies.

FUZZY RULE EXTRACTION FROM SIMPLE EVOLVING CONNECTIONIST SYSTEMS

International Journal of Computational Intelligence and Applications ◽

10.1142/s146902680400132x ◽

2004 ◽

Vol 04 (03) ◽

pp. 299-308 ◽

Cited By ~ 2

Author(s):

MICHAEL J. WATTS

Keyword(s):

Neural Network ◽

Network Architecture ◽

Learning Algorithm ◽

Fuzzy Rule ◽

Rule Extraction ◽

Data Sets ◽

Neural Network Architecture ◽

The Neural Network ◽

Extraction Algorithm ◽

Training Examples

A method for extracting Zadeh–Mamdani fuzzy rules from a minimalist constructive neural network model is described. The network contains no embedded fuzzy logic elements. The rule extraction algorithm needs no modification of the neural network architecture. No modification of the network learning algorithm is required, nor is it necessary to retain any training examples. The algorithm is illustrated on two well known benchmark data sets and compared with a relevant existing rule extraction algorithm.

Cross-Domain Reuse of Extracted Knowledge in Genetic Programming for Image Classification

10.26686/wgtn.13152290 ◽

2020 ◽

Author(s):

M Iqbal ◽

Bing Xue ◽

Harith Al-Sahaf ◽

Mengjie Zhang

Keyword(s):

Genetic Programming ◽

Image Classification ◽

Transfer Learning ◽

Evolutionary Process ◽

Classification Performance ◽

Data Sets ◽

Classification Problems ◽

Novel Approach ◽

Complex Image ◽

Classification Tasks

© 2017 IEEE. Genetic programming (GP) is a well-known evolutionary computation technique, which has been successfully used to solve various problems, such as optimization, image analysis, and classification. Transfer learning is a type of machine learning approach that can be used to solve complex tasks. Transfer learning has been introduced to GP to solve complex Boolean and symbolic regression problems with some promise. However, the use of transfer learning with GP has not been investigated to address complex image classification tasks with noise and rotations, where GP cannot achieve satisfactory performance, but GP with transfer learning may improve the performance. In this paper, we propose a novel approach based on transfer learning and GP to solve complex image classification problems by extracting and reusing blocks of knowledge/information, which are automatically discovered from similar as well as different image classification tasks during the evolutionary process. The proposed approach is evaluated on three texture data sets and three office data sets of image classification benchmarks, and achieves better classification performance than the state-of-the-art image classification algorithm. Further analysis on the evolved solutions/trees shows that the proposed approach with transfer learning can successfully discover and reuse knowledge/information extracted from similar or different problems to improve its performance on complex image classification problems.

Automated scoring of pre-REM sleep in mice with deep learning

Scientific Reports ◽

10.1038/s41598-021-91286-0 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Niklas Grieger ◽

Justus T. C. Schwabedal ◽

Stefanie Wendel ◽

Yvonne Ritze ◽

Stephan Bialonski

Keyword(s):

Deep Learning ◽

REM Sleep ◽

Network Architecture ◽

Classification Performance ◽

Sleep Stages ◽

Data Sets ◽

Manual Task ◽

Typical Data ◽

Out Of Sample ◽

Simple Neural Network

Reliable automation of the labor-intensive manual task of scoring animal sleep can facilitate the analysis of long-term sleep studies. In recent years, deep-learning-based systems, which learn optimal features from the data, increased scoring accuracies for the classical sleep stages of Wake, REM, and Non-REM. Meanwhile, it has been recognized that the statistics of transitional stages such as pre-REM, found between Non-REM and REM, may hold additional insight into the physiology of sleep and are now under vivid investigation. We propose a classification system based on a simple neural network architecture that scores the classical stages as well as pre-REM sleep in mice. When restricted to the classical stages, the optimized network showed state-of-the-art classification performance with an out-of-sample F1 score of 0.95 in male C57BL/6J mice. When unrestricted, the network showed lower F1 scores on pre-REM (0.5) compared to the classical stages. The result is comparable to previous attempts to score transitional stages in other species such as transition sleep in rats or N1 sleep in humans. Nevertheless, we observed that the sequence of predictions including pre-REM typically transitioned from Non-REM to REM reflecting sleep dynamics observed by human scorers. Our findings provide further evidence for the difficulty of scoring transitional sleep stages, likely because such stages of sleep are under-represented in typical data sets or show large inter-scorer variability. We further provide our source code and an online platform to run predictions with our trained network.
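The per-stage F1 scores reported above are the harmonic mean of precision and recall for one class. A minimal sketch of that computation follows; the stage labels and predictions are hypothetical toy data, not results from the paper.

```python
def f1_score(true_labels, predicted_labels, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical scoring of six epochs; one pre-REM epoch is missed.
true_stages = ["NREM", "NREM", "preREM", "REM", "REM", "preREM"]
pred_stages = ["NREM", "NREM", "NREM", "REM", "REM", "preREM"]
```

A rare class such as pre-REM is penalized heavily by every missed epoch, which is one reason its F1 score can sit well below those of the classical stages.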
