scholarly journals Towards Generating Summaries for Lexically Confusing Code through Code Erosion

Author(s):  
Fan Yan ◽  
Ming Li

Code summarization aims to summarize code functionality as high-level nature language descriptions to assist in code comprehension. Recent approaches in this field mainly focus on generating summaries for code with precise identifier names, in which meaningful words can be found indicating code functionality. When faced with lexically confusing code, current approaches are likely to fail since the correlation between code lexical tokens and summaries is scarce. To tackle this problem, we propose a novel summarization framework named VECOS. VECOS introduces an erosion mechanism to conquer the model's reliance on precisely defined lexical information. To facilitate learning the eroded code's functionality, we force the representation of the eroded code to align with the representation of its original counterpart via variational inference. Experimental results show that our approach outperforms the state-of-the-art approaches to generate coherent and reliable summaries for various lexically confusing code.

2021 ◽  
Vol 11 (23) ◽  
pp. 11344
Author(s):  
Wei Ke ◽  
Ka-Hou Chan

Paragraph-based datasets are hard to analyze by a simple RNN, because a long sequence always contains lengthy problems of long-term dependencies. In this work, we propose a Multilayer Content-Adaptive Recurrent Unit (CARU) network for paragraph information extraction. In addition, we present a type of CNN-based model as an extractor to explore and capture useful features in the hidden state, which represent the content of the entire paragraph. In particular, we introduce the Chebyshev pooling to connect to the end of the CNN-based extractor instead of using the maximum pooling. This can project the features into a probability distribution so as to provide an interpretable evaluation for the final analysis. Experimental results demonstrate the superiority of the proposed approach, being compared to the state-of-the-art models.


Author(s):  
Nicolas Bougie ◽  
Ryutaro Ichise

Deep reinforcement learning (DRL) methods traditionally struggle with tasks where environment rewards are sparse or delayed, which entails that exploration remains one of the key challenges of DRL. Instead of solely relying on extrinsic rewards, many state-of-the-art methods use intrinsic curiosity as exploration signal. While they hold promise of better local exploration, discovering global exploration strategies is beyond the reach of current methods. We propose a novel end-to-end intrinsic reward formulation that introduces high-level exploration in reinforcement learning. Our curiosity signal is driven by a fast reward that deals with local exploration and a slow reward that incentivizes long-time horizon exploration strategies. We formulate curiosity as the error in an agent’s ability to reconstruct the observations given their contexts. Experimental results show that this high-level exploration enables our agents to outperform prior work in several Atari games.


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Jiaxi Ye ◽  
Ruilin Li ◽  
Bin Zhang

Directed fuzzing is a practical technique, which concentrates its testing energy on the process toward the target code areas, while costing little on other unconcerned components. It is a promising way to make better use of available resources, especially in testing large-scale programs. However, by observing the state-of-the-art-directed fuzzing engine (AFLGo), we argue that there are two universal limitations, the balance problem between the exploration and the exploitation and the blindness in mutation toward the target code areas. In this paper, we present a new prototype RDFuzz to address these two limitations. In RDFuzz, we first introduce the frequency-guided strategy in the exploration and improve its accuracy by adopting the branch-level instead of the path-level frequency. Then, we introduce the input-distance-based evaluation strategy in the exploitation stage and present an optimized mutation to distinguish and protect the distance sensitive input content. Moreover, an intertwined testing schedule is leveraged to perform the exploration and exploitation in turn. We test RDFuzz on 7 benchmarks, and the experimental results demonstrate that RDFuzz is skilled at driving the program toward the target code areas, and it is not easily stuck by the balance problem of the exploration and the exploitation.


Author(s):  
Mirko Luca Lobina ◽  
Luigi Atzori ◽  
Davide Mula

Many audio watermarking techniques presented in the last years make use of masking and psychological models derived from signal processing. Such a basic idea is winning because it guarantees a high level of robustness and bandwidth of the watermark as well as fidelity of the watermarked signal. This chapter first describes the relationship between digital right management, intellectual property, and use of watermarking techniques. Then, the crossing use of watermarking and masking models is detailed, providing schemes, examples, and references. Finally, the authors present two strategies that make use of a masking model, applied to a classic watermarking technique. The joint use of classic frameworks and masking models seems to be one of the trends for the future of research in watermarking. Several tests on the proposed strategies with the state of the art are also offered to give an idea of how to assess the effectiveness of a watermarking technique.


2020 ◽  
Vol 10 (8) ◽  
pp. 2864 ◽  
Author(s):  
Muhammad Asad ◽  
Ahmed Moustafa ◽  
Takayuki Ito

Artificial Intelligence (AI) has been applied to solve various challenges of real-world problems in recent years. However, the emergence of new AI technologies has brought several problems, especially with regard to communication efficiency, security threats and privacy violations. Towards this end, Federated Learning (FL) has received widespread attention due to its ability to facilitate the collaborative training of local learning models without compromising the privacy of data. However, recent studies have shown that FL still consumes considerable amounts of communication resources. These communication resources are vital for updating the learning models. In addition, the privacy of data could still be compromised once sharing the parameters of the local learning models in order to update the global model. Towards this end, we propose a new approach, namely, Federated Optimisation (FedOpt) in order to promote communication efficiency and privacy preservation in FL. In order to implement FedOpt, we design a novel compression algorithm, namely, Sparse Compression Algorithm (SCA) for efficient communication, and then integrate the additively homomorphic encryption with differential privacy to prevent data from being leaked. Thus, the proposed FedOpt smoothly trade-offs communication efficiency and privacy preservation in order to adopt the learning task. The experimental results demonstrate that FedOpt outperforms the state-of-the-art FL approaches. In particular, we consider three different evaluation criteria; model accuracy, communication efficiency and computation overhead. Then, we compare the proposed FedOpt with the baseline configurations and the state-of-the-art approaches, i.e., Federated Averaging (FedAvg) and the paillier-encryption based privacy-preserving deep learning (PPDL) on all these three evaluation criteria. The experimental results show that FedOpt is able to converge within fewer training epochs and a smaller privacy budget.


Physics ◽  
2020 ◽  
Vol 2 (1) ◽  
pp. 49-66 ◽  
Author(s):  
Vyacheslav I. Yukalov

The article presents the state of the art and reviews the literature on the long-standing problem of the possibility for a sample to be at the same time solid and superfluid. Theoretical models, numerical simulations, and experimental results are discussed.


Author(s):  
Rung-Tzuo Liaw ◽  
Chuan-Kang Ting

Evolutionary multitasking is a significant emerging search paradigm that utilizes evolutionary algorithms to concurrently optimize multiple tasks. The multi-factorial evolutionary algorithm renders an effectual realization of evolutionary multitasking on two or three tasks. However, there remains room for improvement on the performance and capability of evolutionary multitasking. Beyond three tasks, this paper proposes a novel framework, called the symbiosis in biocoenosis optimization (SBO), to address evolutionary many-tasking optimization. The SBO leverages the notion of symbiosis in biocoenosis for transferring information and knowledge among different tasks through three major components: 1) transferring information through inter-task individual replacement, 2) measuring symbiosis through intertask paired evaluations, and 3) coordinating the frequency and quantity of transfer based on symbiosis in biocoenosis. The inter-task individual replacement with paired evaluations caters for estimation of symbiosis, while the symbiosis in biocoenosis provides a good estimator of transfer. This study examines the effectiveness and efficiency of the SBO on a suite of many-tasking benchmark problems, designed to deal with 30 tasks simultaneously. The experimental results show that SBO leads to better solutions and faster convergence than the state-of-the-art evolutionary multitasking algorithms. Moreover, the results indicate that SBO is highly capable of identifying the similarity between problems and transferring information appropriately.


Electronics ◽  
2018 ◽  
Vol 7 (10) ◽  
pp. 258 ◽  
Author(s):  
Abdus Hassan ◽  
Umar Afzaal ◽  
Tooba Arifeen ◽  
Jeong Lee

Recently, concurrent error detection enabled through invariant relationships between different wires in a circuit has been proposed. Because there are many such implications in a circuit, selection strategies have been developed to select the most valuable implications for inclusion in the checker hardware such that a sufficiently high probability of error detection ( P d e t e c t i o n ) is achieved. These algorithms, however, due to their heuristic nature cannot guarantee a lossless P d e t e c t i o n . In this paper, we develop a new input-aware implication selection algorithm with the help of ATPG which minimizes loss on P d e t e c t i o n . In our algorithm, the detectability of errors for each candidate implication is carefully evaluated using error prone vectors. The evaluation results are then utilized to select the most efficient candidates for achieving optimal P d e t e c t i o n . The experimental results on 15 representative combinatorial benchmark circuits from the MCNC benchmarks suite show that the implications selected from our algorithm achieve better P d e t e c t i o n in comparison to the state of the art. The proposed method also offers better performance, up to 41.10%, in terms of the proposed impact-level metric, which is the ratio of achieved P d e t e c t i o n to the implication count.


2020 ◽  
Vol 34 (4) ◽  
pp. 571-584
Author(s):  
Rajarshi Biswas ◽  
Michael Barz ◽  
Daniel Sonntag

AbstractImage captioning is a challenging multimodal task. Significant improvements could be obtained by deep learning. Yet, captions generated by humans are still considered better, which makes it an interesting application for interactive machine learning and explainable artificial intelligence methods. In this work, we aim at improving the performance and explainability of the state-of-the-art method Show, Attend and Tell by augmenting their attention mechanism using additional bottom-up features. We compute visual attention on the joint embedding space formed by the union of high-level features and the low-level features obtained from the object specific salient regions of the input image. We embed the content of bounding boxes from a pre-trained Mask R-CNN model. This delivers state-of-the-art performance, while it provides explanatory features. Further, we discuss how interactive model improvement can be realized through re-ranking caption candidates using beam search decoders and explanatory features. We show that interactive re-ranking of beam search candidates has the potential to outperform the state-of-the-art in image captioning.


Author(s):  
Tianxing Wu ◽  
Guilin Qi ◽  
Bin Luo ◽  
Lei Zhang ◽  
Haofen Wang

Extracting knowledge from Wikipedia has attracted much attention in recent ten years. One of the most valuable kinds of knowledge is type information, which refers to the axioms stating that an instance is of a certain type. Current approaches for inferring the types of instances from Wikipedia mainly rely on some language-specific rules. Since these rules cannot catch the semantic associations between instances and classes (i.e. candidate types), it may lead to mistakes and omissions in the process of type inference. The authors propose a new approach leveraging attributes to perform language-independent type inference of the instances from Wikipedia. The proposed approach is applied to the whole English and Chinese Wikipedia, which results in the first version of MulType (Multilingual Type Information), a knowledge base describing the types of instances from multilingual Wikipedia. Experimental results show that not only the proposed approach outperforms the state-of-the-art comparison methods, but also MulType contains lots of new and high-quality type information.


Sign in / Sign up

Export Citation Format

Share Document