Topic Modelling Meets Deep Neural Networks: A Survey

Author(s):  
He Zhao ◽  
Dinh Phung ◽  
Viet Huynh ◽  
Yuan Jin ◽  
Lan Du ◽  
...  

Topic modelling has been a successful technique for text analysis for almost twenty years. When topic modelling met deep neural networks, a new and increasingly popular research area emerged: neural topic models, with nearly a hundred models developed and a wide range of applications in natural language understanding such as text generation, summarisation and language modelling. There is a need to summarise research developments and discuss open problems and future directions. In this paper, we provide a focused yet comprehensive overview of neural topic models for interested researchers in the AI community, so as to help them navigate and innovate in this fast-growing research area. To the best of our knowledge, ours is the first review on this specific topic.

2021 ◽  
Vol 23 (2) ◽  
pp. 13-22
Author(s):  
Debmalya Mandal ◽  
Sourav Medya ◽  
Brian Uzzi ◽  
Charu Aggarwal

Graph Neural Networks (GNNs), a generalization of deep neural networks to graph data, have been widely used in various domains, ranging from drug discovery to recommender systems. However, GNNs in such applications are limited when only few samples are available. Meta-learning has been an important framework for addressing the lack of samples in machine learning, and in recent years researchers have started to apply meta-learning to GNNs. In this work, we provide a comprehensive survey of different meta-learning approaches involving GNNs on various graph problems, showing the power of using these two approaches together. We categorize the literature based on proposed architectures, shared representations, and applications. Finally, we discuss several exciting future research directions and open problems.
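As background for the models this survey covers, the core GNN operation — one message-passing step that aggregates neighbour features — can be sketched in a few lines of plain Python. This is a minimal mean-aggregation step without learned weights; the function name and toy graph are illustrative, not from the survey:

```python
def message_pass(features, edges):
    """One mean-aggregation message-passing step (GCN-style, no learned weights)."""
    neighbors = {v: [] for v in features}
    for u, v in edges:  # treat edges as undirected
        neighbors[u].append(v)
        neighbors[v].append(u)
    out = {}
    for v, feat in features.items():
        # each node averages its neighbours' feature vectors plus its own (self-loop)
        msgs = [features[u] for u in neighbors[v]] + [feat]
        out[v] = [sum(vals) / len(msgs) for vals in zip(*msgs)]
    return out

# toy graph: two connected nodes with one-dimensional features
h = message_pass({0: [1.0], 1: [3.0]}, [(0, 1)])
```

Meta-learning approaches then train such layers so that they adapt to a new graph task from only a handful of labelled examples.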


2022 ◽  
Vol 18 (2) ◽  
pp. 1-25
Author(s):  
Saransh Gupta ◽  
Mohsen Imani ◽  
Joonseop Sim ◽  
Andrew Huang ◽  
Fan Wu ◽  
...  

Stochastic computing (SC) reduces the complexity of computation by representing numbers with long streams of independent bits. However, increasing performance in SC comes with either an increase in area or a loss in accuracy. Processing in memory (PIM) computes data in place while having high memory density and supporting bit-parallel operations with low energy consumption. In this article, we propose COSMO, an architecture for computing with stochastic numbers in memory, which enables SC in memory. The proposed architecture is general and can be used for a wide range of applications. It is a highly dense and parallel architecture that supports most SC encodings and operations in memory. It maximizes the performance and energy efficiency of SC by introducing several innovations: (i) in-memory parallel stochastic number generation, (ii) efficient implication-based logic in memory, (iii) novel memory bit line segmenting, (iv) a new memory-compatible SC addition operation, and (v) flexible block allocation. To show the generality and efficiency of our stochastic architecture, we implement image processing, deep neural networks (DNNs), and hyperdimensional (HD) computing on the proposed hardware. Our evaluations show that running DNN inference on COSMO is 141× faster and 80× more energy efficient than on a GPU.
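The key SC idea — that multiplying two unipolar bitstreams reduces to a bitwise AND — can be illustrated with a short software sketch. COSMO performs these operations inside memory arrays; the pure-Python version below (names illustrative) only demonstrates the encoding and the accuracy/stream-length trade-off:

```python
import random

def to_stream(p, n, rng):
    # unipolar encoding: each bit is 1 with probability p, so the mean encodes p
    return [1 if rng.random() < p else 0 for _ in range(n)]

def from_stream(bits):
    # decode by counting the fraction of 1s
    return sum(bits) / len(bits)

def sc_multiply(a_bits, b_bits):
    # for independent unipolar streams, multiplication is just a bitwise AND
    return [a & b for a, b in zip(a_bits, b_bits)]

rng = random.Random(42)
n = 20000  # longer streams -> lower variance, but more area/latency in hardware
a_bits = to_stream(0.8, n, rng)
b_bits = to_stream(0.5, n, rng)
prod = from_stream(sc_multiply(a_bits, b_bits))  # ≈ 0.8 * 0.5
```

Shortening `n` saves hardware but increases the decoding error, which is exactly the area-versus-accuracy tension the abstract describes.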


2021 ◽  
Vol 3 (4) ◽  
pp. 966-989
Author(s):  
Vanessa Buhrmester ◽  
David Münch ◽  
Michael Arens

Deep Learning is a state-of-the-art technique for making inferences on extensive or complex data. Due to their multilayer nonlinear structure, Deep Neural Networks are often criticized as black-box models whose predictions are non-transparent and not traceable by humans. Furthermore, the models learn from artificially generated datasets, which often do not reflect reality. When decision-making algorithms are based on Deep Neural Networks, prejudice and unfairness may be promoted unknowingly due to this lack of transparency. Hence, several so-called explanators, or explainers, have been developed. Explainers try to give insight into the inner structure of machine learning black boxes by analyzing the connection between the input and output. In this survey, we present the mechanisms and properties of explaining systems for Deep Neural Networks for Computer Vision tasks. We give a comprehensive overview of the taxonomy of related studies and compare several survey papers that deal with explainability in general. We work out the drawbacks and gaps and summarize further research ideas.
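One of the simplest input–output explainer mechanisms of the kind surveyed here is occlusion: perturb each input feature in turn and score it by how much the model's output changes. A minimal sketch (the linear "model" and feature vector are illustrative stand-ins for an image classifier and its pixels):

```python
def occlusion_saliency(model, x, baseline=0.0):
    """Score each feature by how much occluding it changes the model output."""
    base = model(x)
    scores = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline  # replace one feature with a neutral baseline value
        scores.append(abs(base - model(occluded)))
    return scores

# toy linear model: occluding feature i changes the output by |w_i * x_i|,
# so the saliency scores recover the weights' influence
model = lambda x: 2.0 * x[0] + 0.5 * x[1]
scores = occlusion_saliency(model, [1.0, 1.0])
```

For images, the same idea slides an occluding patch over the input and produces a saliency heat map, treating the network purely as a black box.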


Author(s):  
Ulas Isildak ◽  
Alessandro Stella ◽  
Matteo Fumagalli

Abstract Balancing selection is an important adaptive mechanism underpinning a wide range of phenotypes. Despite its relevance, the detection of recent balancing selection from genomic data is challenging, as its signatures are qualitatively similar to those left by ongoing positive selection. In this study we developed and implemented two deep neural networks and tested their performance in predicting loci under recent selection, either due to balancing selection or an incomplete sweep, from population genomic data. Specifically, we generated forward-in-time simulations to train and test an artificial neural network (ANN) and a convolutional neural network (CNN). The ANN received as input multiple summary statistics calculated on the locus of interest, while the CNN was applied directly to the matrix of haplotypes. We found that both architectures have high accuracy in identifying loci under recent selection. The CNN generally outperformed the ANN in distinguishing between signals of balancing selection and incomplete sweeps and was less affected by incorrect training data. We deployed both trained networks on neutral genomic regions in European populations and demonstrated a lower false positive rate for the CNN than for the ANN. We finally deployed the CNN within the MEFV gene region and identified several common variants predicted to be under an incomplete sweep in a European population. Notably, two of these variants are functional changes and could modulate susceptibility to Familial Mediterranean Fever, possibly as a consequence of past adaptation to pathogens. In conclusion, deep neural networks were able to characterise signals of selection on intermediate-frequency variants, an analysis currently inaccessible to commonly used strategies.
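The summary statistics fed to the ANN can be made concrete with a small sketch. Below, a haplotype matrix (rows as sampled chromosomes, columns as sites, with a hypothetical '0'/'1' ancestral/derived encoding) yields two standard population-genetic quantities; the exact statistics used in the paper may differ:

```python
def summary_stats(haplotypes):
    """Basic statistics from a haplotype matrix of '0'/'1' strings."""
    n = len(haplotypes)          # number of sampled chromosomes
    n_sites = len(haplotypes[0])
    # derived-allele frequency at each site
    freqs = [sum(h[j] == "1" for h in haplotypes) / n for j in range(n_sites)]
    # a site segregates if both alleles are present in the sample
    segregating = sum(0.0 < f < 1.0 for f in freqs)
    # nucleotide diversity (pi): expected pairwise differences, summed over sites
    pi = sum(2 * f * (1 - f) * n / (n - 1) for f in freqs)
    return {"segregating_sites": segregating, "pi": pi}

stats = summary_stats(["0011", "0011", "1100", "0000"])
```

The CNN, by contrast, skips this hand-crafted step and convolves over the raw 0/1 matrix itself, which is one reason it can pick up patterns the summary statistics discard.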


Author(s):  
Qiqing Wang ◽  
Cunbin Li

The surge of renewable energy systems can lead to increasing incidents that negatively impact the economy and society, rendering incident detection paramount to understanding the mechanism and range of those impacts. In this paper, a deep learning framework is proposed to detect renewable energy incidents from news articles covering accidents in various renewable energy systems. Pre-trained language representations such as Bidirectional Encoder Representations from Transformers (BERT) and word2vec are utilized to represent textual inputs, which are then fed to Text Convolutional Neural Networks (TCNNs) and Text Recurrent Neural Networks. Two types of classifiers for incident detection are trained and tested in this paper: one is a binary classifier for detecting the existence of an incident; the other is a multi-label classifier for identifying different incident attributes such as causal effects and consequences. The proposed incident detection framework is implemented on a hand-annotated dataset with 5190 records. The results show that the proposed framework performs well on both the incident existence detection task (F1-score 91.4%) and the incident attribute identification task (micro F1-score 81.7%). It is also shown that the BERT-based TCNNs are effective and robust in detecting renewable energy incidents from large-scale textual materials.
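Since the multi-label results are reported as micro F1, it is worth spelling out how that metric pools counts across documents and labels. A minimal sketch over per-document label sets (the attribute labels are illustrative, not the paper's annotation scheme):

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN across all documents and labels."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# two documents: the second one misses the "consequence" label
y_true = [{"cause"}, {"cause", "consequence"}]
y_pred = [{"cause"}, {"cause"}]
score = micro_f1(y_true, y_pred)
```

Unlike macro averaging, this weights frequent labels more heavily, which matters when attribute labels are imbalanced across incident reports.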


2020 ◽  
pp. 107754632092914
Author(s):  
Mohammed Alabsi ◽  
Yabin Liao ◽  
Ala-Addin Nabulsi

Deep learning has seen tremendous growth over the past decade. It has set new performance limits for a wide range of applications, including computer vision, speech recognition, and machinery health monitoring. With the abundance of instrumentation data and the availability of high computational power, deep learning continues to prove itself as an efficient tool for the extraction of micropatterns from machinery big data repositories. This paper presents a comparative study of feature extraction capabilities using stacked autoencoders, considering the use of expert domain knowledge. The Case Western Reserve University bearing dataset was used for the study, and a classifier was trained and tested to extract and visualize features from 12 different failure classes. Based on the raw data preprocessing, four different deep neural network structures were studied. Results indicated that integrating domain knowledge with deep learning techniques improved feature extraction capabilities and reduced the size and computational requirements of the deep neural networks without the need for exhaustive architecture tuning and modification.
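The domain knowledge in question is typically spectral: bearing faults produce characteristic frequencies in the vibration signal, so a frequency-domain representation is a common preprocessing step before the autoencoder. A naive DFT sketch of that idea (illustrative only; a real pipeline would use an FFT and fault-frequency band features):

```python
import cmath
import math

def dft_magnitudes(signal):
    """Naive DFT magnitude spectrum (first half of the bins, normalised by N)."""
    N = len(signal)
    return [abs(sum(signal[n] * cmath.exp(-2j * math.pi * k * n / N)
                    for n in range(N))) / N
            for k in range(N // 2)]

# a pure tone at 3 cycles per window stands out as a single spectral peak,
# the way a bearing defect frequency would in a vibration snapshot
signal = [math.sin(2 * math.pi * 3 * n / 32) for n in range(32)]
mags = dft_magnitudes(signal)
```

Feeding such spectra (rather than raw time samples) to the stacked autoencoder is one way domain knowledge shrinks the network: the representation already concentrates the diagnostic information.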


2019 ◽  
Vol 28 (06) ◽  
pp. 1960008 ◽  
Author(s):  
Grega Vrbančič ◽  
Iztok Fister ◽  
Vili Podgorelec

Over the past years, the application of deep neural networks in a wide range of areas has been noticeably increasing. While many state-of-the-art deep neural networks provide performance comparable to, or in some cases even superior to, that of humans, major challenges such as parameter settings for learning deep neural networks and the construction of deep learning architectures still exist. These challenges have a significant impact on how a deep neural network will perform on a specific task. The method proposed in this paper addresses the problem of parameter setting for a deep neural network by utilizing swarm intelligence algorithms. In our experiments, we applied the proposed method variants to the classification task of distinguishing between phishing and legitimate websites. The performance of the proposed method was evaluated and compared on four different phishing datasets, two of which we prepared ourselves. The results obtained from the conducted empirical experiments show the proposed approach to be very promising. By utilizing the proposed swarm intelligence based methods, we were able to achieve a statistically significant improvement in predictive performance compared to a manually tuned deep neural network. In general, the improvement of classification accuracy ranges from 2.5% to 3.8%, while the improvement of F1-score reached as much as 24% on one of the datasets.
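As an illustration of the swarm-intelligence idea (the paper's actual variants and objective differ), a minimal particle swarm optimiser over box-bounded parameters; in hyperparameter tuning, `objective` would wrap training and validating a network at the candidate settings:

```python
import random

def pso(objective, bounds, n_particles=20, iters=60, seed=0):
    """Minimal particle swarm optimisation minimising `objective` within `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                     # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]    # swarm-wide best
    w, c1, c2 = 0.7, 1.5, 1.5                       # inertia, cognitive, social
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # clamp each coordinate back into its bounds
                pos[i][d] = min(max(pos[i][d] + vel[i][d], bounds[d][0]),
                                bounds[d][1])
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# toy "validation loss" with its optimum at (0.3, 0.7)
best, best_val = pso(lambda p: (p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2,
                     [(0.0, 1.0), (0.0, 1.0)])
```

The appeal over grid search is that the swarm spends its evaluation budget near promising regions, which matters when each objective call means training a full network.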


Author(s):  
Kosuke Takagi

Abstract Despite the recent success of deep learning models in solving various problems, their ability is still limited compared with human intelligence, which has the flexibility to adapt to a changing environment. Obtaining a model that achieves adaptability to a wide range of problems and tasks is a challenging problem. To achieve this, one issue that must be addressed is the identification of the similarities and differences between the human brain and deep neural networks. In this article, inspired by human flexibility, which might suggest the existence of a common mechanism allowing the solution of different kinds of tasks, we consider a general learning process in neural networks on which no specific conditions or constraints are imposed. Subsequently, we show theoretically that, as learning progresses, the network structure converges to a state characterized by a unique distribution model with respect to network quantities such as connection weight and node strength. Noting that empirical data indicate this state emerges in the large-scale network of the human brain, we show that the same state can be reproduced in a simple example of a deep learning model. Although further research is needed, our findings provide insight into the common inherent mechanism underlying the human brain and deep learning, and suggest directions for designing efficient learning algorithms for solving a wide variety of tasks in the future.


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2658
Author(s):  
Myunghoon Lee ◽  
Hyeonho Shin ◽  
Dabin Lee ◽  
Sung-Pil Choi

Grammatical Error Correction (GEC) is the task of detecting and correcting various grammatical errors in texts. Many previous approaches to GEC have used various mechanisms including rules, statistics, and their combinations. Recently, the performance of English GEC has been drastically enhanced by the vigorous application of deep neural networks and pretrained language models. Following the promising results of the English GEC tasks, we apply a Transformer with a copying mechanism to the Korean GEC task by introducing novel and effective noising methods for constructing Korean GEC datasets. Our comparative experiments show that the proposed system outperforms two commercial grammar checkers and other NMT-based models.
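The noising idea — corrupting clean sentences to synthesise (noisy, clean) training pairs — can be sketched generically. The operations below (drop, duplicate, swap adjacent characters) are illustrative placeholders, not the paper's Korean-specific rules, which exploit the language's orthography:

```python
import random

def add_noise(sentence, rng, p=0.1):
    """Corrupt a clean sentence with random character-level edits (illustrative)."""
    chars = list(sentence)
    out = []
    i = 0
    while i < len(chars):
        r = rng.random()
        if r < p and len(chars) > 1:            # drop this character
            i += 1
        elif r < 2 * p:                          # duplicate it
            out.append(chars[i])
            out.append(chars[i])
            i += 1
        elif r < 3 * p and i + 1 < len(chars):   # swap it with the next one
            out.append(chars[i + 1])
            out.append(chars[i])
            i += 2
        else:                                    # keep it unchanged
            out.append(chars[i])
            i += 1
    return "".join(out)

clean = "deep neural networks learn representations"
noisy = add_noise(clean, random.Random(0), p=0.15)
```

The corrupted sentence becomes the source side and the original the target side of a sequence-to-sequence training pair, so unlimited pseudo-parallel GEC data can be generated from monolingual text.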


Author(s):  
Ziyuan Zhong ◽  
Yuchi Tian ◽  
Baishakhi Ray

Abstract Deep Neural Networks (DNNs) are being deployed in a wide range of settings today, from safety-critical applications like autonomous driving to commercial applications involving image classification. However, recent research has shown that DNNs can be brittle to even slight variations of the input data. Therefore, rigorous testing of DNNs has gained widespread attention. While DNN robustness under norm-bound perturbations has received significant attention over the past few years, our knowledge is still limited when it comes to natural variants of the input images. These natural variants, e.g., a rotated or a rainy version of the original input, are especially concerning as they can occur naturally in the field without any active adversary and may lead to undesirable consequences. Thus, it is important to identify the inputs whose small variations may lead to erroneous DNN behaviors. The very few studies that have looked at DNN robustness under natural variants, however, focus on estimating the overall robustness of DNNs across all the test data rather than localizing such error-producing points. This work aims to bridge that gap. To this end, we study the local per-input robustness properties of DNNs and leverage those properties to build a white-box (DeepRobust-W) and a black-box (DeepRobust-B) tool to automatically identify non-robust points. Our evaluation of these methods on three DNN models spanning three widely used image classification datasets shows that they are effective in flagging points of poor robustness. In particular, DeepRobust-W and DeepRobust-B achieve an F1 score of up to 91.4% and 99.1%, respectively. We further show that DeepRobust-W can be applied to a regression problem in a domain beyond image classification. Our evaluation on three self-driving car models demonstrates that DeepRobust-W is effective in identifying points of poor robustness, with an F1 score of up to 78.9%.
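The per-input robustness notion used here can be made concrete: an input is flagged non-robust if any natural variant flips the model's prediction. A toy sketch, where the classifier and the two "variants" are illustrative stand-ins for a DNN and transformations like rotation or rain:

```python
def is_robust(model, x, perturbations):
    """Return True iff every perturbed variant keeps the same predicted label."""
    base = model(x)
    return all(model(p(x)) == base for p in perturbations)

# toy classifier and two "natural variants" (small shifts of the feature values)
model = lambda x: int(sum(x) > 0)
variants = [lambda x: [v + 0.1 for v in x],
            lambda x: [v - 0.1 for v in x]]
```

Inputs near the decision boundary, like `[0.05, 0.0]` below, fail this check; localizing such points, rather than averaging robustness over a whole test set, is exactly what DeepRobust-W and DeepRobust-B automate.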

