scholarly journals TextNAS: A Neural Architecture Search Space Tailored for Text Representation

2020 ◽  
Vol 34 (05) ◽  
pp. 9242-9249
Author(s):  
Yujing Wang ◽  
Yaming Yang ◽  
Yiren Chen ◽  
Jing Bai ◽  
Ce Zhang ◽  
...  

Learning text representation is crucial for text classification and other language related tasks. There are a diverse set of text representation networks in the literature, and how to find the optimal one is a non-trivial problem. Recently, the emerging Neural Architecture Search (NAS) techniques have demonstrated good potential to solve the problem. Nevertheless, most of the existing works of NAS focus on the search algorithms and pay little attention to the search space. In this paper, we argue that the search space is also an important human prior to the success of NAS in different applications. Thus, we propose a novel search space tailored for text representation. Through automatic search, the discovered network architecture outperforms state-of-the-art models on various public datasets on text classification and natural language inference tasks. Furthermore, some of the design principles found in the automatic network agree well with human intuition.

Author(s):  
Ming Hao ◽  
Weijing Wang ◽  
Fang Zhou

Short text classification is an important foundation for natural language processing (NLP) tasks. Though, the text classification based on deep language models (DLMs) has made a significant headway, in practical applications however, some texts are ambiguous and hard to classify in multi-class classification especially, for short texts whose context length is limited. The mainstream method improves the distinction of ambiguous text by adding context information. However, these methods rely only the text representation, and ignore that the categories overlap and are not completely independent of each other. In this paper, we establish a new general method to solve the problem of ambiguous text classification by introducing label embedding to represent each category, which makes measurable difference between the categories. Further, a new compositional loss function is proposed to train the model, which makes the text representation closer to the ground-truth label and farther away from others. Finally, a constraint is obtained by calculating the similarity between the text representation and label embedding. Errors caused by ambiguous text can be corrected by adding constraints to the output layer of the model. We apply the method to three classical models and conduct experiments on six public datasets. Experiments show that our method can effectively improve the classification accuracy of the ambiguous texts. In addition, combining our method with BERT, we obtain the state-of-the-art results on the CNT dataset.


2020 ◽  
Vol 10 (11) ◽  
pp. 3712
Author(s):  
Dongjing Shan ◽  
Xiongwei Zhang ◽  
Wenhua Shi ◽  
Li Li

Regarding the sequence learning of neural networks, there exists a problem of how to capture long-term dependencies and alleviate the gradient vanishing phenomenon. To manage this problem, we proposed a neural network with random connections via a scheme of a neural architecture search. First, a dense network was designed and trained to construct a search space, and then another network was generated by random sampling in the space, whose skip connections could transmit information directly over multiple periods and capture long-term dependencies more efficiently. Moreover, we devised a novel cell structure that required less memory and computational power than the structures of long short-term memories (LSTMs), and finally, we performed a special initialization scheme on the cell parameters, which could permit unhindered gradient propagation on the time axis at the beginning of training. In the experiments, we evaluated four sequential tasks: adding, copying, frequency discrimination, and image classification; we also adopted several state-of-the-art methods for comparison. The experimental results demonstrated that our proposed model achieved the best performance.


2020 ◽  
Vol 34 (07) ◽  
pp. 10443-10450
Author(s):  
Jie An ◽  
Haoyi Xiong ◽  
Jun Huan ◽  
Jiebo Luo

The key challenge in photorealistic style transfer is that an algorithm should faithfully transfer the style of a reference photo to a content photo while the generated image should look like one captured by a camera. Although several photorealistic style transfer algorithms have been proposed, they need to rely on post- and/or pre-processing to make the generated images look photorealistic. If we disable the additional processing, these algorithms would fail to produce plausible photorealistic stylization in terms of detail preservation and photorealism. In this work, we propose an effective solution to these issues. Our method consists of a construction step (C-step) to build a photorealistic stylization network and a pruning step (P-step) for acceleration. In the C-step, we propose a dense auto-encoder named PhotoNet based on a carefully designed pre-analysis. PhotoNet integrates a feature aggregation module (BFA) and instance normalized skip links (INSL). To generate faithful stylization, we introduce multiple style transfer modules in the decoder and INSLs. PhotoNet significantly outperforms existing algorithms in terms of both efficiency and effectiveness. In the P-step, we adopt a neural architecture search method to accelerate PhotoNet. We propose an automatic network pruning framework in the manner of teacher-student learning for photorealistic stylization. The network architecture named PhotoNAS resulted from the search achieves significant acceleration over PhotoNet while keeping the stylization effects almost intact. We conduct extensive experiments on both image and video transfer. The results show that our method can produce favorable results while achieving 20-30 times acceleration in comparison with the existing state-of-the-art approaches. It is worth noting that the proposed algorithm accomplishes better performance without any pre- or post-processing.


2020 ◽  
Vol 32 (23) ◽  
pp. 17309-17320
Author(s):  
Rolandos Alexandros Potamias ◽  
Georgios Siolas ◽  
Andreas - Georgios Stafylopatis

AbstractFigurative language (FL) seems ubiquitous in all social media discussion forums and chats, posing extra challenges to sentiment analysis endeavors. Identification of FL schemas in short texts remains largely an unresolved issue in the broader field of natural language processing, mainly due to their contradictory and metaphorical meaning content. The main FL expression forms are sarcasm, irony and metaphor. In the present paper, we employ advanced deep learning methodologies to tackle the problem of identifying the aforementioned FL forms. Significantly extending our previous work (Potamias et al., in: International conference on engineering applications of neural networks, Springer, Berlin, pp 164–175, 2019), we propose a neural network methodology that builds on a recently proposed pre-trained transformer-based network architecture which is further enhanced with the employment and devise of a recurrent convolutional neural network. With this setup, data preprocessing is kept in minimum. The performance of the devised hybrid neural architecture is tested on four benchmark datasets, and contrasted with other relevant state-of-the-art methodologies and systems. Results demonstrate that the proposed methodology achieves state-of-the-art performance under all benchmark datasets, outperforming, even by a large margin, all other methodologies and published studies.


Author(s):  
Carlos Hernandez ◽  
Adi Botea ◽  
Jorge A. Baier ◽  
Vadim Bulitko

Real-time search algorithms are relevant to time-sensitive decision-making domains such as video games and robotics. In such settings, the agent is required to decide on each action under a constant time bound, regardless of the search space size. Despite recent progress, poor-quality solutions can be produced mainly due to state re-visitation. Different techniques have been developed to reduce such a re-visitation with state pruning showing promise. In this paper, we propose a novel pruning approach applicable to the wide class of real-time search algorithms. Given a local search space of arbitrary size, our technique aggressively prunes away all states in its interior, possibly adding new edges to maintain the connectivity of the search space frontier. An experimental evaluation shows that our pruning often improves the performance of a base real-time search algorithm by over an order of magnitude. This allows our implemented system to outperform state-of-the-art real-time search algorithms used in the evaluation.


2022 ◽  
Vol 73 ◽  
Author(s):  
Maximilian Fickert ◽  
Jörg Hoffmann

In classical AI planning, heuristic functions typically base their estimates on a relaxation of the input task. Such relaxations can be more or less precise, and many heuristic functions have a refinement procedure that can be iteratively applied until the desired degree of precision is reached. Traditionally, such refinement is performed offline to instantiate the heuristic for the search. However, a natural idea is to perform such refinement online instead, in situations where the heuristic is not sufficiently accurate. We introduce several online-refinement search algorithms, based on hill-climbing and greedy best-first search. Our hill-climbing algorithms perform a bounded lookahead, proceeding to a state with lower heuristic value than the root state of the lookahead if such a state exists, or refining the heuristic otherwise to remove such a local minimum from the search space surface. These algorithms are complete if the refinement procedure satisfies a suitable convergence property. We transfer the idea of bounded lookaheads to greedy best-first search with a lightweight lookahead after each expansion, serving both as a method to boost search progress and to detect when the heuristic is inaccurate, identifying an opportunity for online refinement. We evaluate our algorithms with the partial delete relaxation heuristic hCFF, which can be refined by treating additional conjunctions of facts as atomic, and whose refinement operation satisfies the convergence property required for completeness. On both the IPC domains as well as on the recently published Autoscale benchmarks, our online-refinement search algorithms significantly beat state-of-the-art satisficing planners, and are competitive even with complex portfolios.


2021 ◽  
Vol 11 (23) ◽  
pp. 11436
Author(s):  
Ha Yoon Song

The current evolution of deep learning requires further optimization in terms of accuracy and time. From the perspective of new requirements, AutoML is an area that could provide possible solutions. AutoML has a neural architecture search (NAS) field. DARTS is a widely used approach in NAS and is based on gradient descent; however, it has some drawbacks. In this study, we attempted to overcome some of the drawbacks of DARTS by improving the accuracy and decreasing the search cost. The DARTS algorithm uses a mixed operation that combines all operations in the search space. The architecture parameter of each operation comprising a mixed operation is trained using gradient descent, and the operation with the largest architecture parameter is selected. The use of a mixed operation causes a problem called vote dispersion: similar operations share architecture parameters during gradient descent; thus, there are cases where the most important operation is disregarded. In this selection process, vote dispersion causes DARTS performance to degrade. To cope with this problem, we propose a new algorithm based on DARTS called DG-DARTS. Two search stages are introduced, and the clustering of operations is applied in DG-DARTS. In summary, DG-DARTS achieves an error rate of 2.51% on the CIFAR10 dataset, and its search cost is 0.2 GPU days because the search space of the second stage is reduced by half. The speed-up factor of DG-DARTS to DARTS is 6.82, which indicates that the search cost of DG-DARTS is only 13% that of DARTS.


2020 ◽  
Vol 34 (07) ◽  
pp. 10526-10533 ◽  
Author(s):  
Hanlin Chen ◽  
Li'an Zhuo ◽  
Baochang Zhang ◽  
Xiawu Zheng ◽  
Jianzhuang Liu ◽  
...  

Neural architecture search (NAS) can have a significant impact in computer vision by automatically designing optimal neural network architectures for various tasks. A variant, binarized neural architecture search (BNAS), with a search space of binarized convolutions, can produce extremely compressed models. Unfortunately, this area remains largely unexplored. BNAS is more challenging than NAS due to the learning inefficiency caused by optimization requirements and the huge architecture space. To address these issues, we introduce channel sampling and operation space reduction into a differentiable NAS to significantly reduce the cost of searching. This is accomplished through a performance-based strategy used to abandon less potential operations. Two optimization methods for binarized neural networks are used to validate the effectiveness of our BNAS. Extensive experiments demonstrate that the proposed BNAS achieves a performance comparable to NAS on both CIFAR and ImageNet databases. An accuracy of 96.53% vs. 97.22% is achieved on the CIFAR-10 dataset, but with a significantly compressed model, and a 40% faster search than the state-of-the-art PC-DARTS.


2012 ◽  
Vol 20 (4) ◽  
pp. 509-541 ◽  
Author(s):  
Petr Pošík ◽  
Waltraud Huyer ◽  
László Pál

Four methods for global numerical black box optimization with origins in the mathematical programming community are described and experimentally compared with the state of the art evolutionary method, BIPOP-CMA-ES. The methods chosen for the comparison exhibit various features that are potentially interesting for the evolutionary computation community: systematic sampling of the search space (DIRECT, MCS) possibly combined with a local search method (MCS), or a multi-start approach (NEWUOA, GLOBAL) possibly equipped with a careful selection of points to run a local optimizer from (GLOBAL). The recently proposed “comparing continuous optimizers” (COCO) methodology was adopted as the basis for the comparison. Based on the results, we draw suggestions about which algorithm should be used depending on the available budget of function evaluations, and we propose several possibilities for hybridizing evolutionary algorithms (EAs) with features of the other compared algorithms.


Author(s):  
Jianwei Zhang ◽  
Dong Li ◽  
Lituan Wang ◽  
Lei Zhang

Neural Architecture Search (NAS), which aims at automatically designing neural architectures, recently draw a growing research interest. Different from conventional NAS methods, in which a large number of neural architectures need to be trained for evaluation, the one-shot NAS methods only have to train one supernet which synthesizes all the possible candidate architectures. As a result, the search efficiency could be significantly improved by sharing the supernet’s weights during the candidate architectures’ evaluation. This strategy could greatly speed up the search process but suffer a challenge that the evaluation based on sharing weights is not predictive enough. Recently, pruning the supernet during the search has been proven to be an efficient way to alleviate this problem. However, the pruning direction in complex-structured search space remains unexplored. In this paper, we revisited the role of path dropout strategy, which drops the neural operations instead of the neurons, in supernet training, and several interesting characters of the supernet trained with dropout are found. Based on the observations, a Hierarchically-Ordered Pruning Neural Architecture Search (HOPNAS) algorithm is proposed by dynamically pruning the supernet with a proper pruning direction. Experimental results indicate that our method is competitive with state-of-the-art approaches on CIFAR10 and ImageNet.


Sign in / Sign up

Export Citation Format

Share Document