FACER: An API Usage-based Code-example Recommender for Opportunistic Reuse

To save time, developers often search for code examples that implement their desired software features. Existing code search techniques typically focus on finding code snippets for a single given query, which means that developers need to perform a separate search for each desired functionality. In this paper, we propose FACER (Feature-driven API usage-based Code Examples Recommender), a technique that avoids repeated searches through opportunistic reuse. Specifically, given the selected code snippet that matches the initial search query, FACER finds and suggests related code snippets that represent features that the developer may want to implement next. FACER first constructs a code fact repository by parsing the source code of open-source Java projects to obtain methods' textual information, call graphs, and Application Programming Interface (API) usages. It then detects unique features by clustering methods based on similar API usages, where each cluster represents a feature or functionality. Finally, it detects frequently co-occurring features across projects using frequent pattern mining and recommends related methods from the mined patterns. To evaluate FACER, we run it on 120 Java Android apps from GitHub. We first manually validate that the detected method clusters represent methods with similar functionality. We then perform an automated evaluation to determine the best parameters (e.g., similarity threshold) for FACER. We recruit 10 professional developers along with 39 experienced students to judge FACER's recommendation of related methods. Our results show that, on average, FACER's recommendations are 80% precise. We also survey a total of 20 professional Android and Java developers to understand their code search and reuse experiences, and also to obtain their feedback on the usability and usefulness of FACER. The survey results show that 95% of our surveyed professional developers find the idea of related method recommendations useful during code reuse.
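The pipeline described in the abstract above (cluster methods by similar API usage, then mine features that frequently co-occur across projects) can be illustrated with a small sketch. This is not FACER's implementation: the greedy Jaccard-threshold clustering, the pair-counting miner, the function names, and the toy API strings are all assumptions chosen to keep the example short.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of API call names."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_methods(methods, threshold=0.5):
    """Greedy clustering (an assumption, not FACER's algorithm): a method joins
    the first cluster whose seed API set is at least `threshold` similar."""
    seeds = []        # seed API set of each cluster
    assignment = {}   # method name -> cluster id
    for name, apis in methods.items():
        for cid, seed in enumerate(seeds):
            if jaccard(apis, seed) >= threshold:
                assignment[name] = cid
                break
        else:
            seeds.append(apis)
            assignment[name] = len(seeds) - 1
    return assignment


def frequent_feature_pairs(projects, assignment, min_support=2):
    """Count cluster pairs that co-occur in at least `min_support` projects."""
    counts = Counter()
    for method_names in projects.values():
        features = sorted({assignment[m] for m in method_names})
        counts.update(combinations(features, 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}


def related_features(selected_method, assignment, patterns):
    """Clusters that frequently co-occur with the selected method's cluster."""
    cid = assignment[selected_method]
    related = set()
    for a, b in patterns:
        if a == cid:
            related.add(b)
        elif b == cid:
            related.add(a)
    return sorted(related)


# Toy data: method names and API strings are illustrative only.
methods = {
    "p1.openCamera": frozenset({"Camera.open", "Camera.setParameters"}),
    "p1.savePhoto": frozenset({"FileOutputStream.write", "Bitmap.compress"}),
    "p2.startCamera": frozenset(
        {"Camera.open", "Camera.setParameters", "Camera.startPreview"}
    ),
    "p2.storeImage": frozenset({"FileOutputStream.write", "Bitmap.compress"}),
    "p3.playSound": frozenset({"MediaPlayer.start"}),
}
projects = {
    "p1": ["p1.openCamera", "p1.savePhoto"],
    "p2": ["p2.startCamera", "p2.storeImage"],
    "p3": ["p3.playSound"],
}

assignment = cluster_methods(methods)
patterns = frequent_feature_pairs(projects, assignment)
suggested = related_features("p1.openCamera", assignment, patterns)
```

In this toy run the two camera methods share a cluster, the two image-saving methods share another, and the pair of clusters co-occurs in two projects, so selecting a camera method suggests the image-saving feature. A real system would likely need a more robust clustering method and a full frequent-itemset miner rather than pair counting.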


Pattern Mining and Clustering on Image Databases

Successes and New Directions in Data Mining ◽ 10.4018/978-1-59904-645-7.ch009 ◽ 2008 ◽ pp. 187-212

Author(s): Marinette Bouet ◽ Pierre Gançarski ◽ Marie-Aude Aufaure ◽ Omar Boussaïd

Keyword(s): Pattern Mining ◽ Frequent Pattern Mining ◽ Image Data ◽ Research Direction ◽ Relevant Information ◽ Frequent Pattern ◽ Image Clustering ◽ Image Mining ◽ Clustering Methods ◽ New Research

Analysing and mining image data to derive potentially useful information is a very challenging task. Image mining concerns the extraction of implicit knowledge, image data relationships, associations between image data and other data or patterns not explicitly stored in the images. Another crucial task is to organize the large image volumes to extract relevant information. In fact, decision support systems are evolving to store and analyse these complex data. This paper presents a survey of the relevant research related to image data processing. We present data warehouse advances that organize large volumes of data linked with images and then, we focus on two techniques largely used in image mining. We present clustering methods applied to image analysis and we introduce the new research direction concerning pattern mining from large collections of images. While considerable advances have been made in image clustering, there is little research dealing with image frequent pattern mining. We shall try to understand why.


Pattern Mining and Clustering on Image Databases

Database Technologies ◽ 10.4018/978-1-60566-058-5.ch005 ◽ 2009 ◽ pp. 60-85

Author(s): Marinette Bouet ◽ Pierre Gançarski ◽ Marie-Aude Aufaure ◽ Omar Boussaïd

Keyword(s): Pattern Mining ◽ Frequent Pattern Mining ◽ Image Data ◽ Research Direction ◽ Relevant Information ◽ Frequent Pattern ◽ Image Clustering ◽ Image Mining ◽ Clustering Methods ◽ New Research

Analysing and mining image data to derive potentially useful information is a very challenging task. Image mining concerns the extraction of implicit knowledge, image data relationships, associations between image data and other data or patterns not explicitly stored in the images. Another crucial task is to organise the large image volumes to extract relevant information. In fact, decision support systems are evolving to store and analyse these complex data. This chapter presents a survey of the relevant research related to image data processing. We present data warehouse advances that organise large volumes of data linked with images, and then we focus on two techniques largely used in image mining. We present clustering methods applied to image analysis, and we introduce the new research direction concerning pattern mining from large collections of images. While considerable advances have been made in image clustering, there is little research dealing with image frequent pattern mining. We will try to understand why.


Pattern Mining and Clustering on Image Databases

Data Warehousing and Mining ◽ 10.4018/978-1-59904-951-9.ch018 ◽ 2008 ◽ pp. 254-279

Author(s): Marinette Bouet ◽ Pierre Gançarski ◽ Omar Boussaïd

Keyword(s): Pattern Mining ◽ Frequent Pattern Mining ◽ Image Data ◽ Research Direction ◽ Relevant Information ◽ Frequent Pattern ◽ Image Clustering ◽ Image Mining ◽ Clustering Methods ◽ New Research


An Adaptive Data Distribution Through Tree Rules in Frequent Pattern Mining

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽ 10.32628/cseit183894 ◽ 2018 ◽ pp. 300-305

Keyword(s): Information Sharing ◽ Pattern Mining ◽ Data Distribution ◽ Frequent Pattern Mining ◽ Frequent Pattern ◽ General Development ◽ Secure Information ◽ Evaluation Parameters ◽ Secure Information Sharing

Information sharing among organizations is a common practice in areas such as business development and marketing. Some of the sensitive rules that ought to be kept private may be exposed in the process, and such disclosure of sensitive patterns may harm the interests of the organization that owns the data. The sensitive rules must therefore be protected before the data is shared. In this paper, to provide secure information sharing, the sensitive rules, which are discovered with a frequent pattern tree, are first perturbed. The sensitive set of rules is perturbed by substitution. This kind of substitution reduces the risk and increases the utility of the dataset compared with other techniques. Experiments are done on a real-world dataset. Results show that the proposed work performs better than various previous approaches on the basis of the evaluation parameters.


Learning and Synchronized Privacy Preserving Frequent Pattern Mining

Journal of Software ◽ 10.3724/sp.j.1001.2011.04000 ◽ 2011 ◽ Vol 22 (8) ◽ pp. 1749-1760

Author(s): Yu-Hong GUO ◽ Yun-Hai TONG ◽ Shi-Wei TANG ◽ Leng-Dong WU

Keyword(s): Pattern Mining ◽ Frequent Pattern Mining ◽ Privacy Preserving ◽ Frequent Pattern


RAKING: An Efficient K-Maximal Frequent Pattern Mining Algorithm on Uncertain Graph Database

Chinese Journal of Computers ◽ 10.3724/sp.j.1016.2010.01387 ◽ 2010 ◽ Vol 33 (8) ◽ pp. 1387-1395 ◽ Cited By ~ 4

Author(s): Meng HAN ◽ Wei ZHANG ◽ Jian-Zhong LI

Keyword(s): Pattern Mining ◽ Frequent Pattern Mining ◽ Frequent Pattern ◽ Graph Database ◽ Uncertain Graph ◽ Mining Algorithm ◽ Maximal Frequent Pattern


Sliding window based weighted maximal frequent pattern mining over data streams

Expert Systems with Applications ◽ 10.1016/j.eswa.2013.07.094 ◽ 2014 ◽ Vol 41 (2) ◽ pp. 694-708 ◽ Cited By ~ 64

Author(s): Gangin Lee ◽ Unil Yun ◽ Keun Ho Ryu

Keyword(s): Data Streams ◽ Pattern Mining ◽ Frequent Pattern Mining ◽ Sliding Window ◽ Frequent Pattern ◽ Maximal Frequent Pattern


Guided pattern mining for API misuse detection by change-based code analysis

Automated Software Engineering ◽ 10.1007/s10515-021-00294-x ◽ 2021 ◽ Vol 28 (2)

Author(s): Sebastian Nielebock ◽ Robert Heumüller ◽ Kevin Michael Schott ◽ Frank Ortmeier

Keyword(s): Pattern Mining ◽ Third Party ◽ Just In Time ◽ False Alarms ◽ Misuse Detection ◽ Code Search ◽ Usage Patterns ◽ Development Processes ◽ Code Changes ◽ API Usage

Lack of experience, inadequate documentation, and sub-optimal API design frequently cause developers to make mistakes when re-using third-party implementations. Such API misuses can result in unintended behavior, performance losses, or software crashes. Therefore, current research aims to automatically detect such misuses by comparing the way a developer used an API to previously inferred patterns of the correct API usage. While research has made significant progress, these techniques have not yet been adopted in practice. In part, this is due to the lack of a process capable of seamlessly integrating with software development processes. Particularly, existing approaches do not consider how to collect relevant source code samples from which to infer patterns. In fact, an inadequate collection can cause API usage pattern miners to infer irrelevant patterns which leads to false alarms instead of finding true API misuses. In this paper, we target this problem (a) by providing a method that increases the likelihood of finding relevant and true-positive patterns concerning a given set of code changes and agnostic to a concrete static, intra-procedural mining technique and (b) by introducing a concept for just-in-time API misuse detection which analyzes changes at the time of commit. Particularly, we introduce different, lightweight code search and filtering strategies and evaluate them on two real-world API misuse datasets to determine their usefulness in finding relevant intra-procedural API usage patterns. Our main results are (1) commit-based search with subsequent filtering effectively decreases the amount of code to be analyzed, (2) in particular method-level filtering is superior to file-level filtering, (3) project-internal and project-external code search find solutions for different types of misuses and thus are complementary, (4) incorporating prior knowledge of the misused API into the search has a negligible effect.


Deep learning frequent pattern mining on static semi structured data streams for improving fast speed and complex data streams

2021 7th International Conference on Optimization and Applications (ICOA) ◽ 10.1109/icoa51614.2021.9442621 ◽ 2021

Author(s): G. Suseendran ◽ D. Balaganesh ◽ D. Akila ◽ Souvik Pal

Keyword(s): Deep Learning ◽ Data Streams ◽ Pattern Mining ◽ Frequent Pattern Mining ◽ Structured Data ◽ Frequent Pattern ◽ Complex Data ◽ Fast Speed


Genotype Pattern Mining for Pairs of Interacting Variants Underlying Digenic Traits

Genes ◽ 10.3390/genes12081160 ◽ 2021 ◽ Vol 12 (8) ◽ pp. 1160

Author(s): Atsuko Okazaki ◽ Sukanya Horpaopan ◽ Qingrun Zhang ◽ Matthew Randesi ◽ Jurg Ott

Keyword(s): Null Hypothesis ◽ Pattern Mining ◽ Genetic Diseases ◽ Frequent Pattern Mining ◽ Case Control ◽ Frequent Pattern ◽ Permutation Testing ◽ Case Control Studies ◽ P Values ◽ DNA Variants

Some genetic diseases (“digenic traits”) are due to the interaction between two DNA variants, which presumably reflects biochemical interactions. For example, certain forms of Retinitis Pigmentosa, a type of blindness, occur in the presence of two mutant variants, one each in the ROM1 and RDS genes, while the occurrence of only one such variant results in a normal phenotype. Detecting variant pairs underlying digenic traits by standard genetic methods is difficult and is downright impossible when individual variants alone have minimal effects. Frequent pattern mining (FPM) methods are known to detect patterns of items. We make use of FPM approaches to find pairs of genotypes (from different variants) that can discriminate between cases and controls. Our method is based on genotype patterns of length two, and permutation testing allows assigning p-values to genotype patterns, where the null hypothesis refers to equal pattern frequencies in cases and controls. We compare different interaction search approaches and their properties on the basis of published datasets. Our implementation of FPM to case-control studies is freely available.
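The permutation-testing idea in the abstract above (a length-two genotype pattern, with the null hypothesis of equal pattern frequency in cases and controls) can be sketched as follows. This is only an illustration under stated assumptions: the test statistic (absolute difference in pattern frequency), the function names, the variant identifiers `rs_a` and `rs_b`, and the toy data are hypothetical and are not taken from the authors' implementation.

```python
import random


def has_pattern(genotypes, pattern):
    """True if the individual carries every (variant, genotype) in the pattern."""
    return all(genotypes.get(variant) == g for variant, g in pattern)


def frequency_difference(individuals, labels, pattern):
    """Absolute difference in pattern frequency between cases (1) and controls (0).
    Assumes both groups are non-empty."""
    case_hits = control_hits = n_cases = n_controls = 0
    for genotypes, label in zip(individuals, labels):
        hit = has_pattern(genotypes, pattern)
        if label == 1:
            n_cases += 1
            case_hits += hit
        else:
            n_controls += 1
            control_hits += hit
    return abs(case_hits / n_cases - control_hits / n_controls)


def permutation_p_value(individuals, labels, pattern, n_perm=2000, seed=0):
    """Estimate a p-value by shuffling case/control labels.
    Uses the (1 + hits) / (1 + n_perm) estimator so the result is never zero."""
    rng = random.Random(seed)
    observed = frequency_difference(individuals, labels, pattern)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if frequency_difference(individuals, shuffled, pattern) >= observed:
            hits += 1
    return observed, (1 + hits) / (1 + n_perm)


# Toy data: genotypes coded 0/1/2; variant names are hypothetical.
cases = [{"rs_a": 1, "rs_b": 1}] * 8 + [{"rs_a": 1, "rs_b": 0}] * 2
controls = [{"rs_a": 1, "rs_b": 0}] * 5 + [{"rs_a": 0, "rs_b": 1}] * 5
individuals = cases + controls
labels = [1] * len(cases) + [0] * len(controls)

# A length-two genotype pattern: heterozygous at both variants.
pattern = (("rs_a", 1), ("rs_b", 1))
observed, p_value = permutation_p_value(individuals, labels, pattern)
```

In the toy data the pattern appears in 8 of 10 cases and in no controls, so few label shuffles reproduce a difference that large and the estimated p-value is small. When many candidate pairs are screened, some correction for multiple testing would also be needed, which this sketch omits.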
