Topic Modelling Meets Deep Neural Networks: A Survey

Author(s):  
He Zhao ◽  
Dinh Phung ◽  
Viet Huynh ◽  
Yuan Jin ◽  
Lan Du ◽  
...  

Topic modelling has been a successful technique for text analysis for almost twenty years. When topic modelling met deep neural networks, a new and increasingly popular research area emerged: neural topic models, with nearly a hundred models developed and a wide range of applications in natural language understanding such as text generation, summarisation and language modelling. There is a need to summarise research developments and discuss open problems and future directions. In this paper, we provide a focused yet comprehensive overview of neural topic models for interested researchers in the AI community, so as to help them navigate and innovate in this fast-growing research area. To the best of our knowledge, ours is the first review on this specific topic.

2021 ◽  
Vol 23 (2) ◽  
pp. 13-22
Author(s):  
Debmalya Mandal ◽  
Sourav Medya ◽  
Brian Uzzi ◽  
Charu Aggarwal

Graph Neural Networks (GNNs), a generalization of deep neural networks to graph data, have been widely used in various domains, ranging from drug discovery to recommender systems. However, GNNs in such applications are limited when only few samples are available. Meta-learning has been an important framework for addressing the lack of samples in machine learning, and in recent years researchers have started to apply meta-learning to GNNs. In this work, we provide a comprehensive survey of different meta-learning approaches involving GNNs on various graph problems, showing the power of using these two approaches together. We categorize the literature based on proposed architectures, shared representations, and applications. Finally, we discuss several exciting future research directions and open problems.
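As background for the models this survey covers, the core GNN operation — one message-passing step that aggregates neighbour features — can be sketched in a few lines of plain Python. This is a minimal mean-aggregation step without learned weights; the function name and toy graph are illustrative, not from the survey:

```python
def message_pass(features, edges):
    """One mean-aggregation message-passing step (GCN-style, no learned weights)."""
    neighbors = {v: [] for v in features}
    for u, v in edges:  # treat edges as undirected
        neighbors[u].append(v)
        neighbors[v].append(u)
    out = {}
    for v, feat in features.items():
        # each node averages its neighbours' feature vectors plus its own (self-loop)
        msgs = [features[u] for u in neighbors[v]] + [feat]
        out[v] = [sum(vals) / len(msgs) for vals in zip(*msgs)]
    return out

# toy graph: two connected nodes with one-dimensional features
h = message_pass({0: [1.0], 1: [3.0]}, [(0, 1)])
```

Meta-learning approaches then train such layers so that they adapt to a new graph task from only a handful of labelled examples.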


2022 ◽  
Vol 18 (2) ◽  
pp. 1-25
Author(s):  
Saransh Gupta ◽  
Mohsen Imani ◽  
Joonseop Sim ◽  
Andrew Huang ◽  
Fan Wu ◽  
...  

Stochastic computing (SC) reduces the complexity of computation by representing numbers with long streams of independent bits. However, increasing performance in SC comes with either an increase in area or a loss in accuracy. Processing in memory (PIM) computes data in place while having high memory density and supporting bit-parallel operations with low energy consumption. In this article, we propose COSMO, an architecture for computing with stochastic numbers in memory, which enables SC in memory. The proposed architecture is general and can be used for a wide range of applications. It is a highly dense and parallel architecture that supports most SC encodings and operations in memory. It maximizes the performance and energy efficiency of SC by introducing several innovations: (i) in-memory parallel stochastic number generation, (ii) efficient implication-based logic in memory, (iii) novel memory bit line segmenting, (iv) a new memory-compatible SC addition operation, and (v) flexible block allocation. To show the generality and efficiency of our stochastic architecture, we implement image processing, deep neural networks (DNNs), and hyperdimensional (HD) computing on the proposed hardware. Our evaluations show that running DNN inference on COSMO is 141× faster and 80× more energy efficient than on a GPU.
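The key SC idea — that multiplying two unipolar bitstreams reduces to a bitwise AND — can be illustrated with a short software sketch. COSMO performs these operations inside memory arrays; the pure-Python version below (names illustrative) only demonstrates the encoding and the accuracy/stream-length trade-off:

```python
import random

def to_stream(p, n, rng):
    # unipolar encoding: each bit is 1 with probability p, so the mean encodes p
    return [1 if rng.random() < p else 0 for _ in range(n)]

def from_stream(bits):
    # decode by counting the fraction of 1s
    return sum(bits) / len(bits)

def sc_multiply(a_bits, b_bits):
    # for independent unipolar streams, multiplication is just a bitwise AND
    return [a & b for a, b in zip(a_bits, b_bits)]

rng = random.Random(42)
n = 20000  # longer streams -> lower variance, but more area/latency in hardware
a_bits = to_stream(0.8, n, rng)
b_bits = to_stream(0.5, n, rng)
prod = from_stream(sc_multiply(a_bits, b_bits))  # ≈ 0.8 * 0.5
```

Shortening `n` saves hardware but increases the decoding error, which is exactly the area-versus-accuracy tension the abstract describes.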


2021 ◽  
Vol 3 (4) ◽  
pp. 966-989
Author(s):  
Vanessa Buhrmester ◽  
David Münch ◽  
Michael Arens

Deep Learning is a state-of-the-art technique for making inferences on extensive or complex data. Due to their multilayer nonlinear structure, Deep Neural Networks are often criticized as black-box models whose predictions are non-transparent and not traceable by humans. Furthermore, the models learn from artificially generated datasets, which often do not reflect reality. When decision-making algorithms are based on Deep Neural Networks, prejudice and unfairness may be promoted unknowingly due to this lack of transparency. Hence, several so-called explanators, or explainers, have been developed. Explainers try to give insight into the inner structure of machine learning black boxes by analyzing the connection between the input and output. In this survey, we present the mechanisms and properties of explaining systems for Deep Neural Networks for Computer Vision tasks. We give a comprehensive overview of the taxonomy of related studies and compare several survey papers that deal with explainability in general. We work out the drawbacks and gaps and summarize further research ideas.
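One of the simplest input–output explainer mechanisms of the kind surveyed here is occlusion: perturb each input feature in turn and score it by how much the model's output changes. A minimal sketch (the linear "model" and feature vector are illustrative stand-ins for an image classifier and its pixels):

```python
def occlusion_saliency(model, x, baseline=0.0):
    """Score each feature by how much occluding it changes the model output."""
    base = model(x)
    scores = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline  # replace one feature with a neutral baseline value
        scores.append(abs(base - model(occluded)))
    return scores

# toy linear model: occluding feature i changes the output by |w_i * x_i|,
# so the saliency scores recover the weights' influence
model = lambda x: 2.0 * x[0] + 0.5 * x[1]
scores = occlusion_saliency(model, [1.0, 1.0])
```

For images, the same idea slides an occluding patch over the input and produces a saliency heat map, treating the network purely as a black box.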


Author(s):  
Ulas Isildak ◽  
Alessandro Stella ◽  
Matteo Fumagalli

Abstract Balancing selection is an important adaptive mechanism underpinning a wide range of phenotypes. Despite its relevance, the detection of recent balancing selection from genomic data is challenging, as its signatures are qualitatively similar to those left by ongoing positive selection. In this study we developed and implemented two deep neural networks and tested their performance in predicting loci under recent selection, either due to balancing selection or an incomplete sweep, from population genomic data. Specifically, we generated forward-in-time simulations to train and test an artificial neural network (ANN) and a convolutional neural network (CNN). The ANN received as input multiple summary statistics calculated on the locus of interest, while the CNN was applied directly to the matrix of haplotypes. We found that both architectures have high accuracy in identifying loci under recent selection. The CNN generally outperformed the ANN in distinguishing between signals of balancing selection and incomplete sweeps and was less affected by incorrect training data. We deployed both trained networks on neutral genomic regions in European populations and demonstrated a lower false positive rate for the CNN than for the ANN. We finally deployed the CNN within the MEFV gene region and identified several common variants predicted to be under an incomplete sweep in a European population. Notably, two of these variants are functional changes and could modulate susceptibility to Familial Mediterranean Fever, possibly as a consequence of past adaptation to pathogens. In conclusion, deep neural networks were able to characterise signals of selection on intermediate-frequency variants, an analysis currently inaccessible to commonly used strategies.
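The summary statistics fed to the ANN can be made concrete with a small sketch. Below, a haplotype matrix (rows as sampled chromosomes, columns as sites, with a hypothetical '0'/'1' ancestral/derived encoding) yields two standard population-genetic quantities; the exact statistics used in the paper may differ:

```python
def summary_stats(haplotypes):
    """Basic statistics from a haplotype matrix of '0'/'1' strings."""
    n = len(haplotypes)          # number of sampled chromosomes
    n_sites = len(haplotypes[0])
    # derived-allele frequency at each site
    freqs = [sum(h[j] == "1" for h in haplotypes) / n for j in range(n_sites)]
    # a site segregates if both alleles are present in the sample
    segregating = sum(0.0 < f < 1.0 for f in freqs)
    # nucleotide diversity (pi): expected pairwise differences, summed over sites
    pi = sum(2 * f * (1 - f) * n / (n - 1) for f in freqs)
    return {"segregating_sites": segregating, "pi": pi}

stats = summary_stats(["0011", "0011", "1100", "0000"])
```

The CNN, by contrast, skips this hand-crafted step and convolves over the raw 0/1 matrix itself, which is one reason it can pick up patterns the summary statistics discard.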


Author(s):  
Qiqing Wang ◽  
Cunbin Li

The surge of renewable energy systems can lead to increasing incidents that negatively impact the economy and society, rendering incident detection paramount to understanding the mechanism and range of those impacts. In this paper, a deep learning framework is proposed to detect renewable energy incidents from news articles covering accidents in various renewable energy systems. Pre-trained language representations such as Bidirectional Encoder Representations from Transformers (BERT) and word2vec are utilized to represent textual inputs, which are then fed to Text Convolutional Neural Networks (TCNNs) and Text Recurrent Neural Networks. Two types of classifiers for incident detection are trained and tested in this paper: one is a binary classifier for detecting the existence of an incident; the other is a multi-label classifier for identifying different incident attributes such as causal effects and consequences. The proposed incident detection framework is implemented on a hand-annotated dataset with 5190 records. The results show that the proposed framework performs well on both the incident existence detection task (F1-score 91.4%) and the incident attribute identification task (micro F1-score 81.7%). It is also shown that the BERT-based TCNNs are effective and robust in detecting renewable energy incidents from large-scale textual materials.
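Since the multi-label results are reported as micro F1, it is worth spelling out how that metric pools counts across documents and labels. A minimal sketch over per-document label sets (the attribute labels are illustrative, not the paper's annotation scheme):

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN across all documents and labels."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# two documents: the second one misses the "consequence" label
y_true = [{"cause"}, {"cause", "consequence"}]
y_pred = [{"cause"}, {"cause"}]
score = micro_f1(y_true, y_pred)
```

Unlike macro averaging, this weights frequent labels more heavily, which matters when attribute labels are imbalanced across incident reports.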


2020 ◽  
pp. 107754632092914
Author(s):  
Mohammed Alabsi ◽  
Yabin Liao ◽  
Ala-Addin Nabulsi

Deep learning has seen tremendous growth over the past decade. It has set new performance limits for a wide range of applications, including computer vision, speech recognition, and machinery health monitoring. With the abundance of instrumentation data and the availability of high computational power, deep learning continues to prove itself as an efficient tool for the extraction of micropatterns from machinery big data repositories. This paper presents a comparative study of feature extraction capabilities using stacked autoencoders, considering the use of expert domain knowledge. The Case Western Reserve University bearing dataset was used for the study, and a classifier was trained and tested to extract and visualize features from 12 different failure classes. Based on the raw data preprocessing, four different deep neural network structures were studied. Results indicated that integrating domain knowledge with deep learning techniques improved feature extraction capabilities and reduced the size and computational requirements of the deep neural networks without the need for exhaustive architecture tuning and modification.
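The domain knowledge in question is typically spectral: bearing faults produce characteristic frequencies in the vibration signal, so a frequency-domain representation is a common preprocessing step before the autoencoder. A naive DFT sketch of that idea (illustrative only; a real pipeline would use an FFT and fault-frequency band features):

```python
import cmath
import math

def dft_magnitudes(signal):
    """Naive DFT magnitude spectrum (first half of the bins, normalised by N)."""
    N = len(signal)
    return [abs(sum(signal[n] * cmath.exp(-2j * math.pi * k * n / N)
                    for n in range(N))) / N
            for k in range(N // 2)]

# a pure tone at 3 cycles per window stands out as a single spectral peak,
# the way a bearing defect frequency would in a vibration snapshot
signal = [math.sin(2 * math.pi * 3 * n / 32) for n in range(32)]
mags = dft_magnitudes(signal)
```

Feeding such spectra (rather than raw time samples) to the stacked autoencoder is one way domain knowledge shrinks the network: the representation already concentrates the diagnostic information.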


2019 ◽  
Vol 28 (06) ◽  
pp. 1960008 ◽  
Author(s):  
Grega Vrbančič ◽  
Iztok Fister ◽  
Vili Podgorelec

Over the past years, the application of deep neural networks in a wide range of areas has been noticeably increasing. While many state-of-the-art deep neural networks provide performance comparable to, or in some cases even superior to, that of humans, major challenges such as parameter settings for learning deep neural networks and the construction of deep learning architectures still exist. These challenges have a significant impact on how a deep neural network will perform on a specific task. The method proposed in this paper addresses the problem of parameter setting for a deep neural network by utilizing swarm intelligence algorithms. In our experiments, we applied the proposed method variants to the classification task of distinguishing between phishing and legitimate websites. The performance of the proposed method was evaluated and compared on four different phishing datasets, two of which we prepared ourselves. The results obtained from the conducted empirical experiments show the proposed approach to be very promising. By utilizing the proposed swarm intelligence based methods, we were able to achieve a statistically significant improvement in predictive performance compared to a manually tuned deep neural network. In general, the improvement of classification accuracy ranges from 2.5% to 3.8%, while the improvement of F1-score reached as much as 24% on one of the datasets.
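As an illustration of the swarm-intelligence idea (the paper's actual variants and objective differ), a minimal particle swarm optimiser over box-bounded parameters; in hyperparameter tuning, `objective` would wrap training and validating a network at the candidate settings:

```python
import random

def pso(objective, bounds, n_particles=20, iters=60, seed=0):
    """Minimal particle swarm optimisation minimising `objective` within `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                     # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]    # swarm-wide best
    w, c1, c2 = 0.7, 1.5, 1.5                       # inertia, cognitive, social
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # clamp each coordinate back into its bounds
                pos[i][d] = min(max(pos[i][d] + vel[i][d], bounds[d][0]),
                                bounds[d][1])
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# toy "validation loss" with its optimum at (0.3, 0.7)
best, best_val = pso(lambda p: (p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2,
                     [(0.0, 1.0), (0.0, 1.0)])
```

The appeal over grid search is that the swarm spends its evaluation budget near promising regions, which matters when each objective call means training a full network.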


Author(s):  
Kosuke Takagi

Abstract Despite the recent success of deep learning models in solving various problems, their ability is still limited compared with human intelligence, which has the flexibility to adapt to a changing environment. Obtaining a model that achieves adaptability to a wide range of problems and tasks is a challenging problem. To achieve this, one issue that must be addressed is the identification of the similarities and differences between the human brain and deep neural networks. In this article, inspired by human flexibility, which might suggest the existence of a common mechanism allowing the solution of different kinds of tasks, we consider a general learning process in neural networks on which no specific conditions or constraints are imposed. Subsequently, we show theoretically that, as learning progresses, the network structure converges to a state characterized by a unique distribution model with respect to network quantities such as connection weight and node strength. Noting that empirical data indicate this state emerges in the large-scale network of the human brain, we show that the same state can be reproduced in a simple example of a deep learning model. Although further research is needed, our findings provide insight into the common inherent mechanism underlying the human brain and deep learning, and suggest directions for designing efficient learning algorithms for solving a wide variety of tasks in the future.


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2658
Author(s):  
Myunghoon Lee ◽  
Hyeonho Shin ◽  
Dabin Lee ◽  
Sung-Pil Choi

Grammatical Error Correction (GEC) is the task of detecting and correcting various grammatical errors in texts. Many previous approaches to GEC have used various mechanisms including rules, statistics, and their combinations. Recently, the performance of English GEC has been drastically enhanced by the vigorous application of deep neural networks and pretrained language models. Following the promising results of the English GEC tasks, we apply a Transformer with a copying mechanism to the Korean GEC task by introducing novel and effective noising methods for constructing Korean GEC datasets. Our comparative experiments show that the proposed system outperforms two commercial grammar checkers and other NMT-based models.
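The noising idea — corrupting clean sentences to synthesise (noisy, clean) training pairs — can be sketched generically. The operations below (drop, duplicate, swap adjacent characters) are illustrative placeholders, not the paper's Korean-specific rules, which exploit the language's orthography:

```python
import random

def add_noise(sentence, rng, p=0.1):
    """Corrupt a clean sentence with random character-level edits (illustrative)."""
    chars = list(sentence)
    out = []
    i = 0
    while i < len(chars):
        r = rng.random()
        if r < p and len(chars) > 1:            # drop this character
            i += 1
        elif r < 2 * p:                          # duplicate it
            out.append(chars[i])
            out.append(chars[i])
            i += 1
        elif r < 3 * p and i + 1 < len(chars):   # swap it with the next one
            out.append(chars[i + 1])
            out.append(chars[i])
            i += 2
        else:                                    # keep it unchanged
            out.append(chars[i])
            i += 1
    return "".join(out)

clean = "deep neural networks learn representations"
noisy = add_noise(clean, random.Random(0), p=0.15)
```

The corrupted sentence becomes the source side and the original the target side of a sequence-to-sequence training pair, so unlimited pseudo-parallel GEC data can be generated from monolingual text.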


Author(s):  
Ziyuan Zhong ◽  
Yuchi Tian ◽  
Baishakhi Ray

Abstract Deep Neural Networks (DNNs) are being deployed in a wide range of settings today, from safety-critical applications like autonomous driving to commercial applications involving image classification. However, recent research has shown that DNNs can be brittle to even slight variations of the input data. Therefore, rigorous testing of DNNs has gained widespread attention. While DNN robustness under norm-bound perturbations has received significant attention over the past few years, our knowledge is still limited when it comes to natural variants of the input images. These natural variants, e.g., a rotated or a rainy version of the original input, are especially concerning as they can occur naturally in the field without any active adversary and may lead to undesirable consequences. Thus, it is important to identify the inputs whose small variations may lead to erroneous DNN behaviors. The very few studies that have looked at DNN robustness under natural variants, however, focus on estimating the overall robustness of DNNs across all the test data rather than localizing such error-producing points. This work aims to bridge that gap. To this end, we study the local per-input robustness properties of DNNs and leverage those properties to build a white-box (DeepRobust-W) and a black-box (DeepRobust-B) tool to automatically identify non-robust points. Our evaluation of these methods on three DNN models spanning three widely used image classification datasets shows that they are effective in flagging points of poor robustness. In particular, DeepRobust-W and DeepRobust-B achieve an F1 score of up to 91.4% and 99.1%, respectively. We further show that DeepRobust-W can be applied to a regression problem in a domain beyond image classification. Our evaluation on three self-driving car models demonstrates that DeepRobust-W is effective in identifying points of poor robustness, with an F1 score of up to 78.9%.
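The per-input robustness notion used here can be made concrete: an input is flagged non-robust if any natural variant flips the model's prediction. A toy sketch, where the classifier and the two "variants" are illustrative stand-ins for a DNN and transformations like rotation or rain:

```python
def is_robust(model, x, perturbations):
    """Return True iff every perturbed variant keeps the same predicted label."""
    base = model(x)
    return all(model(p(x)) == base for p in perturbations)

# toy classifier and two "natural variants" (small shifts of the feature values)
model = lambda x: int(sum(x) > 0)
variants = [lambda x: [v + 0.1 for v in x],
            lambda x: [v - 0.1 for v in x]]
```

Inputs near the decision boundary, like `[0.05, 0.0]` below, fail this check; localizing such points, rather than averaging robustness over a whole test set, is exactly what DeepRobust-W and DeepRobust-B automate.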

