Deep Learning
Recently Published Documents





2022 ◽  
Vol 16 (4) ◽  
pp. 1-19
Hanrui Wu ◽  
Michael K. Ng

Hypergraphs have shown great power in representing high-order relations among entities, and many hypergraph-based deep learning methods have been proposed to learn informative data representations for the node classification problem. However, most of these approaches do not fully consider either the hyperedge information or the original relationships among nodes and hyperedges. In this article, we present a simple yet effective semi-supervised node classification method named Hypergraph Convolution on Nodes-Hyperedges network, which performs filtering on both nodes and hyperedges and recovers the original hypergraph with the least information loss. Instead of only reducing the cross-entropy loss over the labeled samples, as most previous approaches do, we additionally use the hypergraph reconstruction loss as prior information to improve prediction accuracy. By taking both the cross-entropy loss on the labeled samples and the hypergraph reconstruction loss into consideration, we obtain discriminative latent data representations for training a classifier. We perform extensive experiments on the semi-supervised node classification problem and compare the proposed method with state-of-the-art algorithms. The promising results demonstrate the effectiveness of the proposed method.
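The combined objective described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the form of the reconstruction loss or how the two terms are weighted, so the mean-squared error on the node-hyperedge incidence matrix, the weight `lam`, and all function names below are assumptions made purely for illustration.

```python
import math

def cross_entropy(probs, labels, labeled_idx):
    """Mean negative log-likelihood over the labeled nodes only."""
    return -sum(math.log(probs[i][labels[i]]) for i in labeled_idx) / len(labeled_idx)

def reconstruction_loss(H, H_hat):
    """Mean squared error between an incidence matrix and its reconstruction.

    H[i][j] is 1 if node i belongs to hyperedge j, else 0.
    Using MSE here is an assumption, not taken from the abstract.
    """
    n, m = len(H), len(H[0])
    return sum((H[i][j] - H_hat[i][j]) ** 2 for i in range(n) for j in range(m)) / (n * m)

def total_loss(probs, labels, labeled_idx, H, H_hat, lam=0.5):
    """Supervised term plus a weighted reconstruction term (lam is hypothetical)."""
    return cross_entropy(probs, labels, labeled_idx) + lam * reconstruction_loss(H, H_hat)

# Two nodes, two classes; only node 0 is labeled.
probs = [[0.5, 0.5], [0.9, 0.1]]
labels = [0, 0]
H = [[1, 0], [1, 1]]
H_hat = [[1, 0], [0, 1]]
loss = total_loss(probs, labels, [0], H, H_hat)
```

In a real model both terms would be computed from learned node and hyperedge embeddings and minimized jointly; the sketch only shows how the unlabeled structure contributes to the objective alongside the labeled samples.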

2022 ◽  
Vol 16 (4) ◽  
pp. 1-22
Mu Yuan ◽  
Lan Zhang ◽  
Xiang-Yang Li ◽  
Lin-Zhuo Yang ◽  
Hui Xiong

Labeling data (e.g., labeling the people, objects, actions, and scene in images) comprehensively and efficiently is a widely needed but challenging task. Numerous models have been proposed to label various data, and many approaches have been designed to enhance the ability of deep learning models or accelerate them. Unfortunately, a single machine-learning model is not powerful enough to extract various semantic information from data. Applications such as image retrieval platforms and photo album management apps often need to execute a collection of models to obtain sufficient labels. With limited computing resources and stringent delay requirements, given a data stream and a collection of applicable resource-hungry deep-learning models, we design a novel approach to adaptively schedule a subset of these models to execute on each data item, aiming to maximize the value of the model output (e.g., the number of high-confidence labels). Achieving this goal is nontrivial since a model’s output on any data item is content-dependent and unknown until we execute it. To tackle this, we propose an Adaptive Model Scheduling framework, consisting of (1) a deep reinforcement learning-based approach to predict the value of unexecuted models by mining semantic relationships among diverse models, and (2) two heuristic algorithms to adaptively schedule the model execution order under a deadline or deadline-memory constraints, respectively. The proposed framework does not require any prior knowledge of the data and works as a powerful complement to existing model optimization technologies. We conduct extensive evaluations on five diverse image datasets and 30 popular image labeling models to demonstrate the effectiveness of our design, which can save around 53% of execution time without losing any valuable labels.
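To make the deadline-constrained scheduling problem concrete, here is a minimal sketch of a generic greedy heuristic. It is not either of the paper's two algorithms, whose details the abstract does not give. It assumes that a predicted value and an execution time are already available for each model; the function name, the tuple layout, and the value-per-time ordering are all illustrative assumptions.

```python
def schedule_under_deadline(models, deadline):
    """Pick models greedily by predicted value per unit of execution time.

    models: list of (name, predicted_value, exec_time) with exec_time > 0.
    Returns the chosen model names in execution order and the total time used.
    """
    order = sorted(models, key=lambda m: m[1] / m[2], reverse=True)
    chosen, elapsed = [], 0.0
    for name, _value, cost in order:
        # Skip any model that would push the total past the deadline.
        if elapsed + cost <= deadline:
            chosen.append(name)
            elapsed += cost
    return chosen, elapsed

# Hypothetical models: (name, predicted number of useful labels, seconds).
candidates = [("face", 3.0, 2.0), ("scene", 2.0, 1.0), ("action", 4.0, 4.0)]
plan, used = schedule_under_deadline(candidates, deadline=3.0)
```

The sketch is static: it ranks once and never revises its estimates. The framework in the abstract is adaptive, meaning the predicted values of unexecuted models are updated as earlier models return their outputs, which a full version would handle by re-ranking after each execution.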

2022 ◽  
Vol 16 (4) ◽  
pp. 1-55
Manish Gupta ◽  
Puneet Agrawal

In recent years, the fields of natural language processing (NLP) and information retrieval (IR) have made tremendous progress thanks to deep learning models like Recurrent Neural Networks (RNNs), Gated Recurrent Units (GRUs), and Long Short-Term Memory (LSTM) networks, and Transformer [121] based models like Bidirectional Encoder Representations from Transformers (BERT) [24], Generative Pre-training Transformer (GPT-2) [95], Multi-task Deep Neural Network (MT-DNN) [74], Extra-Long Network (XLNet) [135], Text-to-text transfer transformer (T5) [96], T-NLG [99], and GShard [64]. But these models are enormous in size, whereas real-world applications demand small model size, low response times, and low power consumption. In this survey, we discuss six different types of methods (Pruning, Quantization, Knowledge Distillation (KD), Parameter Sharing, Tensor Decomposition, and Sub-quadratic Transformer-based methods) for compression of such models to enable their deployment in real industry NLP projects. Given the critical need to build applications with efficient and small models, and the large amount of recently published work in this area, we believe that this survey organizes the plethora of work done by the “deep learning for NLP” community in the past few years and presents it as a coherent story.
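Two of the compression families named in this abstract, pruning and quantization, can be illustrated on a flat list of weights. The sketch below shows textbook magnitude pruning and symmetric uniform quantization; it is not drawn from the survey itself, and the function names and the 8-bit default are assumptions chosen for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value."""
    k = int(len(weights) * sparsity)
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(by_magnitude[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def quantize_uniform(weights, num_bits=8):
    """Symmetric uniform quantization to signed integers.

    Returns (integer codes, scale); codes[i] * scale approximates weights[i].
    """
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale

weights = [0.1, -0.5, 0.03, 0.8]
pruned = magnitude_prune(weights, sparsity=0.5)
codes, scale = quantize_uniform(weights, num_bits=8)
restored = [c * scale for c in codes]
```

Pruning shrinks a model by making weights exactly zero so they can be skipped or stored sparsely, while quantization keeps every weight but stores it in fewer bits. In practice both are usually followed by fine-tuning to recover accuracy, which this sketch omits.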

2022 ◽  
Vol 8 ◽  
pp. 1568-1577
Qin Xin ◽  
Mamoun Alazab ◽  
Vicente García Díaz ◽  
Carlos Enrique Montenegro-Marin ◽  
Rubén González Crespo

2022 ◽  
Vol 206 ◽  
pp. 107776
Bangru Xiong ◽  
Lu Lou ◽  
Xinyu Meng ◽  
Xin Wang ◽  
Hui Ma

2022 ◽  
Vol 211 ◽  
pp. 114478
Michiel Larmuseau ◽  
Koenraad Theuwissen ◽  
Kurt Lejaeghere ◽  
Lode Duprez ◽  
Tom Dhaene
