DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning

Author(s):  
Xiaodong Gu ◽  
Hongyu Zhang ◽  
Dongmei Zhang ◽  
Sunghun Kim

Computer programs written in one language often need to be ported to other languages to support multiple devices and environments. When programs use language-specific APIs (Application Programming Interfaces), it is very challenging to migrate these APIs to the corresponding APIs in other languages. Existing approaches mine API mappings from projects that have corresponding versions in two languages. Because they rely on the sparse availability of bilingual projects, they produce only a limited number of API mappings. In this paper, we propose an intelligent system called DeepAM for automatically mining API mappings from a large-scale code corpus without bilingual projects. The key component of DeepAM is a multi-modal sequence-to-sequence learning architecture that learns joint semantic representations of bilingual API sequences from big source code data. Experimental results indicate that DeepAM significantly increases both the accuracy and the number of API mappings compared with state-of-the-art approaches.
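Once API sequences from two languages share a joint semantic space, mining a mapping reduces to nearest-neighbour search. The sketch below illustrates only that final matching step with hypothetical, hand-made embedding vectors; the learned seq2seq representations are the paper's actual contribution and are not reproduced here.

```python
import math

# Toy illustration (not DeepAM itself): given embeddings of API sequences
# from two languages in a shared space, mine mappings by cosine similarity.
# The API names and vectors below are hypothetical stand-ins.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def mine_mappings(java_embeddings, csharp_embeddings):
    """For each Java API, return the closest C# API in the shared space."""
    mappings = {}
    for j_api, j_vec in java_embeddings.items():
        best = max(csharp_embeddings,
                   key=lambda c: cosine(j_vec, csharp_embeddings[c]))
        mappings[j_api] = best
    return mappings

java = {"FileReader.read": [0.9, 0.1, 0.0],
        "HashMap.put":     [0.0, 0.8, 0.3]}
csharp = {"StreamReader.Read": [0.85, 0.15, 0.05],
          "Dictionary.Add":    [0.05, 0.90, 0.20]}

print(mine_mappings(java, csharp))
```

With these toy vectors, each Java API pairs with its semantically closest C# counterpart.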

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Mehdi Srifi ◽  
Ahmed Oussous ◽  
Ayoub Ait Lahcen ◽  
Salma Mouline

Various recommender systems (RSs) have been developed in recent years, and many of them concentrate on English content; thus, most RSs in the literature have been compared on English content. However, research on RSs for content in other languages, such as Arabic, remains minimal, and the field of Arabic RSs is still neglected. In this study we aim to fill this gap by leveraging recent advances in the English RS field. Our main goal is to investigate recent RSs in an Arabic context. To that end, we first selected five state-of-the-art RSs originally devoted to English content and then empirically evaluated their performance on Arabic content. As a result of this work, we first built four publicly available large-scale Arabic datasets for recommendation purposes. Second, we provide various text preprocessing techniques for preparing the constructed datasets. Third, our investigation draws well-argued conclusions about the use of modern RSs in the Arabic context. The experimental results show that these systems achieve high performance when applied to Arabic content.
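Preparing Arabic text typically involves normalization steps that have no English counterpart. The sketch below shows a few common ones (stripping diacritics, unifying alef variants, alef maqsura, and ta marbuta); it is an assumed, generic pipeline for illustration, not necessarily the exact preprocessing used in the paper.

```python
import re

# Common Arabic normalization steps (assumed generic pipeline):
DIACRITICS = re.compile(r'[\u064B-\u0652]')  # fathatan .. sukun

def normalize_arabic(text: str) -> str:
    text = DIACRITICS.sub('', text)                        # strip short vowels / tanwin / shadda
    text = re.sub('[\u0623\u0625\u0622]', '\u0627', text)  # alef variants -> bare alef
    text = text.replace('\u0649', '\u064A')                # alef maqsura -> ya
    text = text.replace('\u0629', '\u0647')                # ta marbuta -> ha
    return text

print(normalize_arabic('أُحِبُّ الكتابَ'))  # -> 'احب الكتاب'
```

Normalization like this collapses surface variants of the same word, which reduces vocabulary sparsity before feeding ratings or reviews to a recommender.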


2018 ◽  
Author(s):  
Jianfeng Li ◽  
Bowen Cui ◽  
Yuting Dai ◽  
Ling Bai ◽  
Jinyan Huang

The number of bioinformatics resources, such as tools, scripts, and databases, is growing exponentially. This poses a great challenge for users to access, manage, and integrate these resources. To address this need, we propose a comprehensive R package, BioInstaller, which includes R functions, a Shiny application, and HTTP representational state transfer (REST) application programming interfaces (APIs). We also established a community-based configuration pool to collect, access, and share bioinformatics resources. The source code of BioInstaller is freely available at our lab website http://bioinfo.rjh.com.cn/labs/jhuang/tools/bioinstaller and on GitHub at https://github.com/JhuangLab/BioInstaller. A Docker image can also be downloaded from DockerHub (https://hub.docker.com/r/bioinstaller).


Author(s):  
Jun Zhou ◽  
Longfei Li ◽  
Ziqi Liu ◽  
Chaochao Chen

Recently, the Factorization Machine (FM) has become increasingly popular in recommendation systems due to its effectiveness in finding informative interactions between features. Usually, the weights for the interactions are learned as a low-rank weight matrix, formulated as an inner product of two low-rank matrices. This low-rank structure helps improve the generalization ability of the Factorization Machine. However, choosing the rank properly usually requires running the algorithm many times with different ranks, which is clearly inefficient for large-scale datasets. To alleviate this issue, we propose an Adaptive Boosting framework for the Factorization Machine (AdaFM), which can adaptively search for the proper rank for different datasets without re-training. Instead of using a fixed rank, the proposed algorithm gradually increases its rank according to its performance until the performance stops improving. Extensive experiments on multiple large-scale datasets validate the proposed method. The experimental results demonstrate that the proposed method is more effective than state-of-the-art Factorization Machines.
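The stopping rule described above can be sketched as a simple control loop. Note this is an assumed simplification: AdaFM's boosting framework grows the rank within a single training run, whereas this sketch abstracts training behind a pluggable `train_and_eval` callback and only illustrates the grow-until-no-improvement criterion.

```python
# Sketch of the adaptive-rank idea (assumed form, not the authors' exact
# algorithm): grow the FM rank step by step and stop once validation
# performance no longer improves.

def adaptive_rank_search(train_and_eval, start_rank=2, step=2, max_rank=64):
    """train_and_eval(rank) -> validation score (higher is better)."""
    best_rank, best_score = start_rank, train_and_eval(start_rank)
    rank = start_rank + step
    while rank <= max_rank:
        score = train_and_eval(rank)
        if score <= best_score:        # performance stopped growing
            break
        best_rank, best_score = rank, score
        rank += step
    return best_rank, best_score

# Hypothetical validation curve that saturates at rank 8:
curve = {2: 0.70, 4: 0.78, 6: 0.81, 8: 0.83, 10: 0.83, 12: 0.82}
print(adaptive_rank_search(lambda r: curve[r]))  # -> (8, 0.83)
```

The search stops at the first rank whose score fails to beat the best seen so far, avoiding a full grid over candidate ranks.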


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Jiaxi Ye ◽  
Ruilin Li ◽  
Bin Zhang

Directed fuzzing is a practical technique that concentrates its testing energy on the path toward the target code areas while spending little on other, unconcerned components. It is a promising way to make better use of available resources, especially when testing large-scale programs. However, by observing the state-of-the-art directed fuzzing engine (AFLGo), we identify two universal limitations: the balance problem between exploration and exploitation, and the blindness of mutation toward the target code areas. In this paper, we present a new prototype, RDFuzz, to address these two limitations. In RDFuzz, we first introduce a frequency-guided strategy in the exploration stage and improve its accuracy by adopting branch-level instead of path-level frequency. Then, we introduce an input-distance-based evaluation strategy in the exploitation stage and present an optimized mutation to distinguish and protect distance-sensitive input content. Moreover, an intertwined testing schedule performs exploration and exploitation in turn. We test RDFuzz on 7 benchmarks, and the experimental results demonstrate that RDFuzz is skilled at driving the program toward the target code areas and is not easily stuck by the balance problem between exploration and exploitation.
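The branch-level frequency guidance mentioned above can be sketched as a seed-scoring rule: seeds that exercise rarely hit branches score higher, steering exploration away from over-tested code. This is an assumed simplification of the idea, not RDFuzz's exact scoring function.

```python
from collections import Counter

# Simplified frequency-guided seed selection (assumed form): score each
# seed by summing the inverse global hit count of the branches it covers,
# so rare branches dominate the score.

def seed_score(seed_branches, global_hits: Counter) -> float:
    return sum(1.0 / global_hits[b] for b in seed_branches)

def pick_next_seed(seeds, global_hits):
    """seeds: mapping of seed name -> list of covered branch ids."""
    return max(seeds, key=lambda s: seed_score(seeds[s], global_hits))

global_hits = Counter({"b1": 100, "b2": 3, "b3": 50})
seeds = {"seed_a": ["b1", "b3"],   # hot branches only
         "seed_b": ["b1", "b2"]}   # reaches the rare branch b2
print(pick_next_seed(seeds, global_hits))  # -> 'seed_b'
```

Tracking frequency at branch level rather than path level keeps the counters small and credits a seed for each individual rare edge it reaches.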


2021 ◽  
Vol 8 (2) ◽  
pp. 273-287
Author(s):  
Xuewei Bian ◽  
Chaoqun Wang ◽  
Weize Quan ◽  
Juntao Ye ◽  
Xiaopeng Zhang ◽  
...  

Recent learning-based approaches show promising performance improvements on the scene text removal task but usually leave behind remnants of text and produce visually unpleasant results. In this work, a novel end-to-end framework is proposed based on accurate text stroke detection. Specifically, the text removal problem is decoupled into text stroke detection and stroke removal; we design separate networks to solve these two subproblems, the latter being a generative network. These two networks are combined as a processing unit, which is cascaded to obtain our final model for text removal. Experimental results demonstrate that the proposed method substantially outperforms the state of the art in locating and erasing scene text. A new large-scale real-world dataset with 12,120 images has been constructed and is being made available to facilitate research, as current publicly available datasets are mainly synthetic and so cannot properly measure the performance of different methods.


2021 ◽  
Vol 7 ◽  
pp. e824
Author(s):  
Yiren Li ◽  
Tieke Li ◽  
Pei Shen ◽  
Liang Hao ◽  
Wenjing Liu ◽  
...  

Microservice-based Web Systems (MWS), which provide a fundamental infrastructure for constructing large-scale cloud-based Web applications, are designed as a set of independent, small, and modular microservices that implement individual tasks and communicate via messages. This microservice-based architecture offers great application scalability but incurs complex and reactive autoscaling actions that are performed dynamically and periodically based on current workloads; resource scheduling for such systems has thus far remained largely unexplored. In this paper, we formulate the problem of Dynamic Resource Scheduling for Microservice-based Web Systems (DRS-MWS) and propose a similarity-based heuristic scheduling algorithm that aims to quickly find viable scheduling schemes by utilizing solutions to similar problems. Experimental results obtained with a well-known microservice benchmark on disparate computing nodes in public clouds illustrate the superiority of the proposed scheduling solution over three state-of-the-art algorithms.
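The core of a similarity-based heuristic of this kind can be sketched as an archive lookup: keep previously solved workload-to-schedule pairs and seed the new search with the scheme of the most similar past workload. The workload features, service names, and distance metric below are assumptions for illustration, not the paper's actual formulation.

```python
import math

# Assumed sketch of similarity-based scheduling: reuse the scheduling
# scheme of the most similar previously solved workload as a warm start.

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_scheme(workload, archive):
    """archive: list of (workload_vector, scheduling_scheme) pairs."""
    _, scheme = min(archive, key=lambda e: euclidean(workload, e[0]))
    return scheme

# Hypothetical archive: workload vectors (req/s, CPU%, queue depth)
# mapped to per-service replica counts.
archive = [([100, 20, 5],  {"svcA": 2, "svcB": 1}),
           ([800, 90, 40], {"svcA": 8, "svcB": 6})]
print(closest_scheme([750, 85, 38], archive))  # -> {'svcA': 8, 'svcB': 6}
```

Starting from a near-neighbour's scheme narrows the search space, which is what lets such heuristics find viable schedules quickly.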


2013 ◽  
Vol 10 (4) ◽  
pp. 82-101 ◽  
Author(s):  
Buqing Cao ◽  
Jianxun Liu ◽  
Mingdong Tang ◽  
Zibin Zheng ◽  
Guangrong Wang

With the rapid development of Web 2.0 and its related technologies, Mashup services (i.e., Web applications created by combining two or more Web APIs) are becoming a hot research topic. The explosion of Mashup services, especially functionally similar or equivalent ones, however, makes service discovery more difficult than ever. In this paper, we present an approach to recommending Mashup services to users based on usage history and a service network. The approach first extracts users' interests from their Mashup service usage history and builds a service network based on social relationship information among Mashup services, Web application programming interfaces (APIs), and their tags. It then leverages the target user's interests and the service social relationships to perform Mashup service recommendation. Large-scale experiments on a real-world Mashup service dataset show that the proposed approach can effectively recommend Mashup services to users with excellent performance. Moreover, a Mashup service recommendation prototype system has been developed.
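A toy version of the interest-extraction-plus-matching step can be sketched with tag sets: derive a user's interest tags from the services they have used, then rank unused services by tag overlap. This is an assumed, far simpler scheme than the paper's service network, and the service names and tags are hypothetical.

```python
# Toy sketch (assumed): interests come from usage history; candidates are
# ranked by Jaccard similarity between interest tags and service tags.

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def recommend(history, catalog, top_k=1):
    """history: services the user already used; catalog: service -> tags."""
    interests = set().union(*(catalog[s] for s in history))
    candidates = [s for s in catalog if s not in history]
    ranked = sorted(candidates,
                    key=lambda s: jaccard(interests, catalog[s]),
                    reverse=True)
    return ranked[:top_k]

catalog = {"MapMash":   {"map", "geo"},
           "GeoPhotos": {"geo", "photo"},
           "NewsFeed":  {"news", "rss"}}
print(recommend({"MapMash"}, catalog))  # -> ['GeoPhotos']
```

The full approach additionally exploits relationships among services and APIs, which helps exactly where raw tag overlap ties or is too sparse.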


Author(s):  
Michael Adeyeye

The cloud is becoming an environment in which to store huge volumes of data and deploy massive applications. Using virtualization technologies, it is economical and feasible to provide testbeds in the cloud. The convergence of Next Generation (NG) networks and Internet-based applications may result in the deployment of future rich Internet applications and services in the cloud. This chapter shows the migration of mobility-enabled services to the cloud. It presents a SIP-based hybrid architecture for Web session mobility that offers content sharing and session handoff between Web browsers. The implemented system has recently evolved into a framework for developing different kinds of converged services over the Internet, similar to services offered by Google Wave and existing telephony Application Programming Interfaces (APIs). The work in this chapter is also compared with those similar technologies. Lastly, the authors show efforts to migrate the SIP/HTTP application server to the cloud, necessitated by the need to include more functionality (i.e., QoS and rich media support) and to provide large-scale deployment in a multi-domain scenario.


2013 ◽  
Vol 21 (1) ◽  
pp. 113-138 ◽  
Author(s):  
MUHUA ZHU ◽  
JINGBO ZHU ◽  
HUIZHEN WANG

Shift-reduce parsing has been studied extensively for diverse grammars due to its simplicity and running efficiency. However, in the field of constituency parsing, shift-reduce parsers lag behind state-of-the-art parsers. In this paper we propose a semi-supervised approach for advancing shift-reduce constituency parsing. First, we apply the uptraining approach (Petrov, S. et al. 2010. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (EMNLP), Cambridge, MA, USA, pp. 705–713) to improve part-of-speech taggers, which provide better part-of-speech tags to subsequent shift-reduce parsers. Second, we enhance shift-reduce parsing models with novel features defined on lexical dependency information. Both stages depend on the use of large-scale unlabeled data. Experimental results show that the approach achieves overall improvements of 1.5 percent and 2.1 percent on English and Chinese data, respectively. Moreover, the final parsing accuracies reach 90.9 percent and 82.2 percent, respectively, which are comparable with those of state-of-the-art parsers.
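The shift-reduce mechanism the paper builds on is a stack-and-buffer loop. The sketch below drives that loop with a hand-written gold action sequence instead of a learned classifier (the classifier is what the paper's features and uptraining improve); the action naming scheme is an assumption for illustration.

```python
# Bare-bones shift-reduce constituency building: SHIFT moves a token from
# the buffer to the stack; REDUCE-<LABEL>-<k> pops k items and pushes a
# bracketed constituent.

def shift_reduce(tokens, actions):
    stack, buf = [], list(tokens)
    for act in actions:
        if act == "SHIFT":
            stack.append(buf.pop(0))
        else:
            _, label, arity = act.split("-")
            k = int(arity)
            children = stack[-k:]
            del stack[-k:]
            stack.append(f"({label} {' '.join(children)})")
    return stack[0]

tree = shift_reduce(
    ["the", "cat", "sleeps"],
    ["SHIFT", "SHIFT", "REDUCE-NP-2", "SHIFT", "REDUCE-VP-1", "REDUCE-S-2"])
print(tree)  # -> '(S (NP the cat) (VP sleeps))'
```

In a real parser the action at each step is chosen by a classifier over stack/buffer features, which is why better part-of-speech tags and dependency-based features translate directly into better trees.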


2020 ◽  
Author(s):  
Huan Liu ◽  
Zhiliang Qiu ◽  
Weitao Pan ◽  
Jun Li ◽  
Ling Zheng ◽  
...  

Cyclic redundancy check (CRC) is a well-known error-detection code that is widely used in Ethernet, PCIe, and other transmission protocols. Existing FPGA-based implementations face the problem of excessive resource utilization in high-performance scenarios; the padding-zeros problem and the introduction of programmability further exacerbate it. In this brief, the stride-by-5 algorithm is proposed to achieve optimal utilization of FPGA resources, the pipelining go-back algorithm is proposed to solve the padding-zeros problem, and reprogramming via HWICAP is proposed to realize programmability with a small, constant resource utilization. The experimental results show that the resource utilization of the proposed non-segmented architecture is 80.7%–87.5% and 25.1%–46.2% lower than those of two state-of-the-art FPGA-based CRC implementations, and the proposed segmented architecture lowers resource utilization by 81.7%–85.9% and 2.9%–20.8% compared with the two state-of-the-art architectures, while throughput and programmability are guaranteed. The source code is available on GitHub.
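For readers unfamiliar with CRC itself, the computation being accelerated is the following bit-serial loop (here the Ethernet CRC-32 polynomial in reflected form). The paper's contribution is processing many bytes per clock cycle on FPGA, which this reference software loop makes no attempt to do.

```python
import zlib

# Bit-serial CRC-32 (Ethernet): reflected polynomial 0xEDB88320,
# initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF.

def crc32_bitwise(data: bytes) -> int:
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

msg = b"123456789"
print(hex(crc32_bitwise(msg)))                # -> '0xcbf43926' (standard CRC-32 check value)
print(crc32_bitwise(msg) == zlib.crc32(msg))  # -> True
```

Hardware implementations unroll this dependency chain; handling several input bytes per clock while keeping the feedback loop short is exactly where the resource-utilization trade-offs discussed in the abstract arise.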

