Pattern Mining and Clustering on Image Databases

2009 ◽  
pp. 60-85
Author(s):  
Marinette Bouet ◽  
Pierre Gançarski ◽  
Marie-Aude Aufaure ◽  
Omar Boussaïd

Analysing and mining image data to derive potentially useful information is a very challenging task. Image mining concerns the extraction of implicit knowledge, image data relationships, associations between image data and other data or patterns not explicitly stored in the images. Another crucial task is to organise the large image volumes to extract relevant information. In fact, decision support systems are evolving to store and analyse these complex data. This chapter presents a survey of the relevant research related to image data processing. We present data warehouse advances that organise large volumes of data linked with images, and then we focus on two techniques largely used in image mining. We present clustering methods applied to image analysis, and we introduce the new research direction concerning pattern mining from large collections of images. While considerable advances have been made in image clustering, there is little research dealing with image frequent pattern mining. We will try to understand why.

Author(s):  
Marinette Bouet ◽  
Pierre Gançarski ◽  
Marie-Aude Aufaure ◽  
Omar Boussaïd

Analysing and mining image data to derive potentially useful information is a very challenging task. Image mining concerns the extraction of implicit knowledge, image data relationships, associations between image data and other data or patterns not explicitly stored in the images. Another crucial task is to organize the large image volumes to extract relevant information. In fact, decision support systems are evolving to store and analyse these complex data. This paper presents a survey of the relevant research related to image data processing. We present data warehouse advances that organize large volumes of data linked with images and then, we focus on two techniques largely used in image mining. We present clustering methods applied to image analysis and we introduce the new research direction concerning pattern mining from large collections of images. While considerable advances have been made in image clustering, there is little research dealing with image frequent pattern mining. We shall try to understand why.


2008 ◽  
pp. 254-279
Author(s):  
Marinette Bouet ◽  
Pierre Gançarski ◽  
Omar Boussaïd

Analysing and mining image data to derive potentially useful information is a very challenging task. Image mining concerns the extraction of implicit knowledge, image data relationships, associations between image data and other data or patterns not explicitly stored in the images. Another crucial task is to organize the large image volumes to extract relevant information. In fact, decision support systems are evolving to store and analyse these complex data. This paper presents a survey of the relevant research related to image data processing. We present data warehouse advances that organize large volumes of data linked with images and then, we focus on two techniques largely used in image mining. We present clustering methods applied to image analysis and we introduce the new research direction concerning pattern mining from large collections of images. While considerable advances have been made in image clustering, there is little research dealing with image frequent pattern mining. We shall try to understand why.


2014 ◽  
Vol 602-605 ◽  
pp. 3536-3539
Author(s):  
Yu Fu ◽  
Jun Rui Yang

Frequent pattern mining has been an important research direction in association rule mining. This paper preprocesses the original dataset using fuzzy clustering, which maps quantitative datasets into linguistic datasets. We then propose an algorithm based on a fuzzy frequent pattern tree for extracting fuzzy frequent itemsets from the mapped linguistic datasets. Experimental results show that our algorithm needs less computing time than F-Apriori on huge databases. For large databases, the algorithm presented in this paper appears promising.
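As an illustration of the quantitative-to-linguistic mapping described above, here is a minimal Python sketch. It is not the authors' method: hand-picked triangular membership functions stand in for the fuzzy clustering step, and the term boundaries, the item names, and the use of the minimum as the fuzzy intersection are all assumptions made for this example.

```python
# Hypothetical linguistic terms on a 0..10 scale: (left foot, peak, right foot).
# These boundaries are assumed; the paper derives its terms by fuzzy clustering.
TERMS = {"low": (-5, 0, 5), "medium": (0, 5, 10), "high": (5, 10, 15)}


def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set (a, b, c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(record):
    """Map {attribute: quantity} to {(attribute, term): degree}, dropping zeros."""
    fuzzy = {}
    for attr, value in record.items():
        for term, (a, b, c) in TERMS.items():
            degree = triangular(value, a, b, c)
            if degree > 0:
                fuzzy[(attr, term)] = degree
    return fuzzy


def fuzzy_support(itemset, fuzzy_records):
    """Average over records of the minimum membership among the itemset's items."""
    total = sum(min(rec.get(item, 0.0) for item in itemset) for rec in fuzzy_records)
    return total / len(fuzzy_records)


records = [{"bread": 2, "milk": 8}, {"bread": 0, "milk": 10}]
fuzzy_records = [fuzzify(r) for r in records]
support = fuzzy_support([("bread", "low"), ("milk", "high")], fuzzy_records)
```

An itemset would count as a fuzzy frequent itemset when `support` reaches a user-chosen minimum support threshold; a fuzzy frequent pattern tree is one way to find such itemsets without enumerating every candidate.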


2021 ◽  
Author(s):  
Shamsa Abid ◽  
Shafay Shamail ◽  
Hamid Abdul Basit ◽  
Sarah Nadi

To save time, developers often search for code examples that implement their desired software features. Existing code search techniques typically focus on finding code snippets for a single given query, which means that developers need to perform a separate search for each desired functionality. In this paper, we propose FACER (Feature-driven API usage-based Code Examples Recommender), a technique that avoids repeated searches through opportunistic reuse. Specifically, given the selected code snippet that matches the initial search query, FACER finds and suggests related code snippets that represent features that the developer may want to implement next. FACER first constructs a code fact repository by parsing the source code of open-source Java projects to obtain methods’ textual information, call graphs, and Application Programming Interface (API) usages. It then detects unique features by clustering methods based on similar API usages, where each cluster represents a feature or functionality. Finally, it detects frequently co-occurring features across projects using frequent pattern mining and recommends related methods from the mined patterns. To evaluate FACER, we run it on 120 Java Android apps from GitHub. We first manually validate that the detected method clusters represent methods with similar functionality. We then perform an automated evaluation to determine the best parameters (e.g., similarity threshold) for FACER. We recruit 10 professional developers along with 39 experienced students to judge FACER’s recommendation of related methods. Our results show that, on average, FACER’s recommendations are 80% precise. We also survey a total of 20 professional Android and Java developers to understand their code search and reuse experiences, and also to obtain their feedback on the usability and usefulness of FACER. The survey results show that 95% of our surveyed professional developers find the idea of related method recommendations useful during code reuse.
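The last step of the pipeline, finding features that frequently co-occur across projects and recommending the related ones, can be sketched in Python as follows. This is a simplified illustration and not FACER's implementation: it only counts feature pairs, and the project names, feature labels, and `min_support` value are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_feature_pairs(projects, min_support):
    """Count how many projects contain each feature pair; keep the frequent ones."""
    counts = Counter()
    for features in projects.values():
        # Sorting gives every pair one canonical (a, b) ordering.
        for pair in combinations(sorted(set(features)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


def recommend(selected, pairs):
    """Return features that frequently co-occur with `selected`, most frequent first."""
    related = Counter()
    for (a, b), n in pairs.items():
        if a == selected:
            related[b] = n
        elif b == selected:
            related[a] = n
    return [feature for feature, _ in related.most_common()]


# Hypothetical data: each project maps to the feature clusters detected in it.
projects = {
    "app1": {"bluetooth_scan", "bluetooth_connect", "show_toast"},
    "app2": {"bluetooth_scan", "bluetooth_connect"},
    "app3": {"bluetooth_scan", "show_toast", "play_audio"},
}
pairs = frequent_feature_pairs(projects, min_support=2)
suggestions = recommend("bluetooth_scan", pairs)
```

With these inputs, a developer who picked a `bluetooth_scan` snippet would be offered methods from the `bluetooth_connect` and `show_toast` clusters, because each appears alongside it in at least two projects.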


Information sharing among organizations is a general trend in several areas such as business development and marketing. As part of this sharing, sensitive rules that ought to be kept private may be disclosed, and such disclosure of sensitive patterns may affect the profits of the organization that owns the data. Hence the sensitive rules must be hidden before the data is shared. In this paper, to provide secure data sharing, the sensitive rules, which are found with a frequent pattern tree, are first perturbed. Here the sensitive set of rules is perturbed by substitution. This kind of substitution reduces the risk and increases the utility of the dataset compared with other techniques. Experiments are done on a real-world dataset. Results show that the proposed work performs better than various previous approaches on the basis of the evaluation parameters.


2011 ◽  
Vol 22 (8) ◽  
pp. 1749-1760
Author(s):  
Yu-Hong GUO ◽  
Yun-Hai TONG ◽  
Shi-Wei TANG ◽  
Leng-Dong WU

Genes ◽  
2021 ◽  
Vol 12 (8) ◽  
pp. 1160
Author(s):  
Atsuko Okazaki ◽  
Sukanya Horpaopan ◽  
Qingrun Zhang ◽  
Matthew Randesi ◽  
Jurg Ott

Some genetic diseases (“digenic traits”) are due to the interaction between two DNA variants, which presumably reflects biochemical interactions. For example, certain forms of Retinitis Pigmentosa, a type of blindness, occur in the presence of two mutant variants, one each in the ROM1 and RDS genes, while the occurrence of only one such variant results in a normal phenotype. Detecting variant pairs underlying digenic traits by standard genetic methods is difficult and is downright impossible when individual variants alone have minimal effects. Frequent pattern mining (FPM) methods are known to detect patterns of items. We make use of FPM approaches to find pairs of genotypes (from different variants) that can discriminate between cases and controls. Our method is based on genotype patterns of length two, and permutation testing allows assigning p-values to genotype patterns, where the null hypothesis refers to equal pattern frequencies in cases and controls. We compare different interaction search approaches and their properties on the basis of published datasets. Our implementation of FPM to case-control studies is freely available.
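The permutation test mentioned above can be sketched in Python for a single length-two genotype pattern. This is an illustration under stated assumptions, not the authors' freely available implementation: the test statistic (absolute difference in pattern frequency between cases and controls), the genotype labels, and the parameter names are all chosen for this example.

```python
import random


def pattern_frequency(genotypes, pattern):
    """Fraction of individuals whose (variant 1, variant 2) genotypes equal the pattern."""
    return sum(1 for g in genotypes if g == pattern) / len(genotypes)


def permutation_p_value(cases, controls, pattern, n_perm=1000, seed=0):
    """Estimate a p-value for H0: equal pattern frequency in cases and controls."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(pattern_frequency(cases, pattern) - pattern_frequency(controls, pattern))
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign case/control labels
        stat = abs(
            pattern_frequency(pooled[:n_cases], pattern)
            - pattern_frequency(pooled[n_cases:], pattern)
        )
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical genotype pairs, one tuple per individual.
cases = [("AA", "BB")] * 8 + [("Aa", "Bb")] * 2
controls = [("Aa", "Bb")] * 10
p_value = permutation_p_value(cases, controls, pattern=("AA", "BB"))
```

In a genome-wide search many patterns are tested at once, so a real analysis would also need to account for multiple testing, for example by comparing against the best statistic found in each permutation.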

