Scalable Fuzzy Algorithms for Data Management and Analysis
Latest Publications

Total documents: 16 (last five years: 0)
H-index: 2 (last five years: 0)
Published by IGI Global
ISBN: 9781605668581, 9781605668598

Author(s):  
François Deliège ◽  
Torben Bach Pedersen

The emergence of music recommendation systems calls for the development of new data management technologies able to query vast music collections. In this chapter, the authors present a music warehouse prototype able to perform efficient nearest neighbor searches in an arbitrary song similarity space. Using fuzzy song sets, the music warehouse offers a practical solution to three concrete musical data management scenarios: user musical preferences, user feedback, and song similarities. The authors investigate three practical approaches to tackle the storage issues of fuzzy song sets: tables, arrays, and compressed bitmaps. They confront theoretical estimates with practical implementation results and show that, from a storage point of view, arrays and compressed bitmaps are both effective data structure solutions. With respect to speed, the authors show that operations on compressed bitmaps offer a significant gain in performance for fuzzy song sets comprising a large number of songs. Finally, the authors argue that the presented results are not limited to music recommendation systems but can be applied to other domains.
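To illustrate why bitmaps suit fuzzy set operations, here is a minimal sketch of one well-known encoding: membership grades quantized into a few levels, with one bitmap per level cut. This is an illustrative scheme, not necessarily the chapter's exact implementation, and it omits compression; fuzzy union (max) and intersection (min) reduce to bitwise OR and AND per level.

```python
# Sketch: a fuzzy song set stored as level-cut bitmaps (illustrative encoding).
# Bitmap l holds the songs whose quantized membership grade exceeds level l,
# so the bitmaps are nested and a song's grade is the number of bits set for it.

LEVELS = 4  # quantization granularity for membership grades

def to_bitmaps(fuzzy_set):
    """fuzzy_set: dict song_id -> membership grade in [0, 1]."""
    maps = [0] * LEVELS
    for song_id, mu in fuzzy_set.items():
        grade = min(LEVELS, round(mu * LEVELS))
        for level in range(grade):
            maps[level] |= 1 << song_id
    return maps

def union(a, b):          # fuzzy max, level by level
    return [x | y for x, y in zip(a, b)]

def intersection(a, b):   # fuzzy min, level by level
    return [x & y for x, y in zip(a, b)]

def grade(maps, song_id):
    """Recover the (quantized) membership grade of a song."""
    return sum((maps[level] >> song_id) & 1 for level in range(LEVELS)) / LEVELS
```

With this nesting, OR/AND on the bitmaps implement exactly the max/min of the quantized grades, which is what makes bitwise operations on compressed bitmaps fast for large song sets.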


Author(s):  
Christophe Marsala ◽  
Marcin Detyniecki

In this chapter, the authors focus on the use of forests of fuzzy decision trees (FFDT) in a video mining application. They discuss how to learn from large-scale video data sets and how to use the trained FFDTs to detect concepts in a high number of video shots. Moreover, the authors study the effect on performance of the size of the forest, and of the use of fuzzy logic during the classification process. The experiments are performed on a well-known non-video dataset and on a real TV-quality video benchmark.
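A small sketch may clarify how a fuzzy decision tree differs from a crisp one during classification: a sample can descend both branches of a soft split with partial membership, and a forest averages the resulting class degrees. The tree encoding and the soft-threshold membership function below are illustrative assumptions, not the chapter's implementation.

```python
# Sketch: evaluating a (tiny) forest of fuzzy decision trees.
# A node is either ('leaf', {class: degree}) or
# ('split', feature, threshold, width, low_child, high_child).

def mu_high(x, threshold, width):
    """Degree to which x is 'high' w.r.t. a soft threshold (piecewise linear)."""
    if x <= threshold - width:
        return 0.0
    if x >= threshold + width:
        return 1.0
    return (x - (threshold - width)) / (2 * width)

def classify(node, sample):
    """Propagate the sample down both branches, weighted by membership."""
    if node[0] == 'leaf':
        return node[1]
    _, feature, threshold, width, low, high = node
    m_high = mu_high(sample[feature], threshold, width)
    out = {}
    for child, m in ((low, 1.0 - m_high), (high, m_high)):
        if m > 0:
            for cls, degree in classify(child, sample).items():
                out[cls] = out.get(cls, 0.0) + m * degree
    return out

def forest_classify(trees, sample):
    """Average the class degrees over all trees in the forest."""
    agg = {}
    for tree in trees:
        for cls, degree in classify(tree, sample).items():
            agg[cls] = agg.get(cls, 0.0) + degree / len(trees)
    return agg
```

A sample near a split threshold thus receives graded class degrees rather than a hard decision, which is the behavior whose effect on performance the chapter studies.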


Author(s):  
Christian Borgelt ◽  
Xiaomeng Wang

In this chapter, the authors introduce SaM, a split-and-merge algorithm for frequent item set mining. Its core advantages are its extremely simple data structure and processing scheme, which not only make it very easy to implement, but also fairly easy to execute on external storage, thus rendering it a highly useful method if the data to mine cannot be loaded into main memory. Furthermore, the authors present extensions of this algorithm, which allow for approximate or “fuzzy” frequent item set mining in the sense that missing items can be inserted into transactions with a user-specified penalty. Finally, they present experiments comparing their new method with classical frequent item set mining algorithms (like Apriori, Eclat and FP-growth) and with the approximate frequent item set mining version of RElim (an algorithm the authors proposed in an earlier paper and improved in the meantime).
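The split-and-merge scheme can be sketched compactly: the database is an array of (count, transaction) pairs with transactions as sorted tuples; each step splits off the transactions led by the smallest item (summing their counts gives that item's support), recurses on the stripped suffixes, then merges the suffixes back. The sketch below follows this scheme in memory only; it is a simplified exact variant and does not include the fuzzy extension with insertion penalties described in the chapter.

```python
# Sketch: a simplified in-memory split-and-merge frequent item set miner.
from collections import Counter

def make_db(transactions):
    """Canonical form: each transaction a sorted tuple; duplicates merged with counts."""
    counts = Counter(tuple(sorted(set(t))) for t in transactions if t)
    return sorted(((c, t) for t, c in counts.items()), key=lambda p: p[1])

def sam(db, min_supp, prefix=(), out=None):
    if out is None:
        out = {}
    while db:
        item = db[0][1][0]                  # smallest remaining item
        split, rest = [], []
        for cnt, items in db:               # split step
            if items[0] == item:
                split.append((cnt, items[1:]))
            else:
                rest.append((cnt, items))
        supp = sum(c for c, _ in split)     # support of prefix + (item,)
        if supp >= min_supp:
            out[prefix + (item,)] = supp
            sam([p for p in split if p[1]], min_supp, prefix + (item,), out)
        merged = Counter()                  # merge step: fold suffixes back in
        for cnt, items in rest + split:
            if items:
                merged[items] += cnt
        db = sorted(((c, t) for t, c in merged.items()), key=lambda p: p[1])
    return out
```

Because the only operations are a linear split and a sorted merge over an array, the same scheme maps naturally onto external storage, which is the property the chapter exploits.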


Author(s):  
Ronald R. Yager

The ordered weighted averaging (OWA) operator is introduced and the author discusses how it can provide a basis for generating summarizing statistics over large data sets. The author further notes how different forms of OWA operators, and hence different summarizing statistics, can be induced using weight-generating functions. The author shows how these weight-generating functions can provide a vehicle with which a data analyst can express desired summarizing statistics. Modern data analysis requires the use of more human-focused summarizing statistics than those classically used. The author’s goal here is to develop ideas that enable a human-focused approach to summarizing statistics. Using these ideas, one can envision a computer-aided construction of the weight-generating functions based upon a combination of graphical and linguistic specifications provided by a data analyst describing his desired summarization.
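The mechanics are compact enough to show directly: an OWA operator applies its weights to the values sorted in descending order, and a weight-generating function (a RIM quantifier Q) induces the weights as w_i = Q(i/n) − Q((i−1)/n). The sketch below shows how different quantifiers yield different summarizing statistics, from the mean to the maximum.

```python
# Sketch: OWA aggregation with weights induced by a RIM quantifier.

def rim_weights(quantifier, n):
    """w_i = Q(i/n) - Q((i-1)/n) for a regular increasing monotone quantifier Q."""
    return [quantifier(i / n) - quantifier((i - 1) / n) for i in range(1, n + 1)]

def owa(values, weights):
    """OWA: the weights are applied to the values sorted in descending order."""
    return sum(w * v for w, v in zip(weights, sorted(values, reverse=True)))
```

For example, Q(x) = x yields uniform weights (the arithmetic mean), while a quantifier that jumps to 1 immediately puts all weight on the largest value (the maximum); intermediate shapes give the softer, more human-focused statistics the chapter discusses.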


Author(s):  
Janusz Kacprzyk ◽  
Slawomir Zadrozny

The authors discuss aspects of the scalability of data mining tools understood differently from the usual question of whether a tool retains its intended functionality as the problem size increases. They introduce a new concept of cognitive (perceptual) scalability, understood as whether, as the problem size increases, the method remains fully functional in the sense of being able to provide intuitively appealing and comprehensible results to the human user. The authors argue that the use of natural language in linguistic data summaries provides high cognitive (perceptual) scalability because natural language is the only fully natural means of human communication and provides a common language for individuals and groups of different backgrounds, skills, and knowledge. They show that the use of Zadeh’s protoforms as general representations of linguistic data summaries, proposed by Kacprzyk and Zadrozny (2002; 2005a; 2005b), amplifies this advantage, leading to an ultimate cognitive (perceptual) scalability.
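A linguistic data summary instantiates a protoform such as "Q y's are S" (e.g., "most employees are young"), and its truth degree can be computed with Zadeh's calculus of linguistically quantified propositions: apply the fuzzy quantifier Q to the mean membership of the records in S. The predicates below (a "young" membership function and a piecewise-linear "most") are illustrative choices, not the authors' definitions.

```python
# Sketch: truth degree of the protoform "Q y's are S".

def truth_of_summary(records, mu_s, mu_q):
    """Zadeh's calculus: Q applied to the mean membership of the records in S."""
    proportion = sum(mu_s(r) for r in records) / len(records)
    return mu_q(proportion)

def mu_young(age):                     # S: "young" (illustrative)
    return max(0.0, min(1.0, (40 - age) / 10))

def mu_most(p):                        # Q: "most" (illustrative piecewise linear)
    return max(0.0, min(1.0, 2 * p - 0.6))
```

Because the output is a single short natural-language sentence with a truth degree, the result stays comprehensible no matter how many records were summarized, which is the cognitive-scalability point of the chapter.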


Author(s):  
Gloria Bordogna ◽  
Alessandro Campi ◽  
Stefania Ronchi ◽  
Giuseppe Psaila

In this chapter the authors consider the problem of defining a flexible approach for exploring huge amounts of results retrieved by several Internet search services (like search engines). The goal is to offer users a way to discover relevant hidden relationships between documents. The proposal is motivated by the observation that visualization paradigms, based on either the ranked list or clustered results, do not allow users to fully appreciate and understand the retrieved contents. In the case of long ranked lists, the user generally analyzes only the first few pages. On the other hand, when the documents are clustered, the user has no means of understanding their contents other than looking at the cluster labels. When the same query is submitted to distinct search services, they may produce partially overlapping clustered results, where clusters identified by distinct labels collect some common documents. Moreover, clusters with similar labels, but containing distinct documents, may be produced as well. In such a situation, it may be useful to compare, combine and rank the cluster contents to single out relevant documents. In this chapter the authors present a novel manipulation language, in which several operators (inspired by relational algebra) and distinct ranking methods can be exploited to analyze the clusters’ contents. New clusters can be generated and ranked based on distinct criteria, by combining (i.e., overlapping, refining and intersecting) clusters in a set-oriented fashion. Specifically, the chapter is focused on the ranking methods defined for each operator of the language.
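As a rough illustration of one such set-oriented operator, the sketch below intersects two clustered result sets (each a mapping from cluster label to a set of document ids) and ranks the new clusters by the overlap of their parent clusters. The Jaccard-based ranking is an illustrative choice, not one of the chapter's actual ranking methods.

```python
# Sketch: an intersection-style operator over two clustered result sets,
# with new clusters ranked by parent-cluster overlap (Jaccard, illustrative).

def intersect_and_rank(result_a, result_b):
    """result_a, result_b: dicts mapping cluster label -> set of document ids.
    Returns [(combined_label, common_docs, score)] sorted by descending score."""
    combined = []
    for label_a, docs_a in result_a.items():
        for label_b, docs_b in result_b.items():
            common = docs_a & docs_b
            if common:
                score = len(common) / len(docs_a | docs_b)
                combined.append(((label_a, label_b), common, score))
    return sorted(combined, key=lambda c: -c[2])
```

Operators like refinement or overlap would differ only in how the document sets are combined and how each new cluster is scored, which is why the chapter pairs each operator with its own ranking method.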


Author(s):  
Giorgos Stoilos ◽  
Jeff Z. Pan ◽  
Giorgos Stamou

In the last couple of years it has become widely acknowledged that uncertainty and fuzzy extensions to ontology languages, like description logics (DLs) and OWL, could play a significant role in the improvement of many Semantic Web (SW) applications like matching, merging and ranking. Unfortunately, existing fuzzy reasoners focus on very expressive fuzzy ontology languages, like OWL, and are thus not able to handle the scale of data that the Web provides. For those reasons much research effort has been focused on providing fuzzy extensions and algorithms for tractable ontology languages. In this chapter, the authors present some recent results about reasoning and fuzzy query answering over tractable/polynomial fuzzy ontology languages, namely Fuzzy DL-Lite and Fuzzy EL+. Fuzzy DL-Lite provides scalable algorithms for very expressive (extended) conjunctive queries, while Fuzzy EL+ provides polynomial algorithms for knowledge classification. For the Fuzzy DL-Lite case the authors also report on an implementation in the ONTOSEARCH2 system and preliminary, but encouraging, benchmarking results.
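The flavor of fuzzy conjunctive query answering can be conveyed with a toy evaluator: fuzzy assertions carry degrees, and under Gödel semantics the degree of an answer is the minimum of its atoms' degrees, with answers ranked by degree. This brute-force sketch enumerates all bindings and ignores everything that makes Fuzzy DL-Lite scalable (TBox rewriting, database-backed evaluation); the hotel example is invented for illustration.

```python
# Sketch: ranked answers to a fuzzy conjunctive query over fuzzy assertions.
from itertools import product

def fuzzy_answers(abox, query, threshold=0.0):
    """abox: list of (predicate, args, degree); query: list of (predicate, args)
    where arguments starting with '?' are variables. Degree of an answer is the
    minimum (Goedel t-norm) over its atoms; answers are sorted by degree."""
    facts = {(p, args): d for p, args, d in abox}
    individuals = sorted({c for _, args, _ in abox for c in args})
    variables = sorted({t for _, args in query for t in args if t.startswith('?')})
    answers = []
    for values in product(individuals, repeat=len(variables)):
        env = dict(zip(variables, values))
        degree = min(facts.get((p, tuple(env.get(t, t) for t in args)), 0.0)
                     for p, args in query)
        if degree > threshold:
            answers.append((env, degree))
    return sorted(answers, key=lambda a: -a[1])
```

The tractable languages in the chapter achieve the same semantics without enumeration, which is what makes them usable at Web scale.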


Author(s):  
Nicolás Marín ◽  
Carlos Molina ◽  
Daniel Sánchez ◽  
M. Amparo Vila

The use of online analytical processing (OLAP) systems as data sources for data mining techniques has been widely studied and has resulted in what is known as online analytical mining (OLAM). As a result of both the use of OLAP technology in new fields of knowledge and the merging of data from different sources, it has become necessary for models to support imprecision, and hence for OLAM methods that are able to deal with this imprecision. Association rules are one of the most used data mining techniques. Several proposals enable the extraction of association rules on DataCubes, but few of these deal with imprecision in the process, and those that do produce complex rule sets. In this chapter the authors present a method that manages the imprecision and reduces the complexity. They study the influence of the use of fuzzy logic on problems of different sizes, comparing the results with a crisp approach.
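A standard way to bring imprecision into association rule mining, shown here as a sketch rather than the chapter's specific method, is to let each row hold membership degrees for fuzzy items and to define support and confidence through a t-norm (minimum below): a row supports an itemset to the degree of its least-satisfied member.

```python
# Sketch: fuzzy support and confidence with the minimum t-norm.

def fuzzy_support(rows, itemset):
    """rows: list of dicts mapping fuzzy item -> membership degree in [0, 1]."""
    return sum(min(row.get(i, 0.0) for i in itemset) for row in rows) / len(rows)

def fuzzy_confidence(rows, antecedent, consequent):
    """Confidence of the fuzzy rule antecedent -> consequent."""
    return fuzzy_support(rows, antecedent + consequent) / fuzzy_support(rows, antecedent)
```

With crisp 0/1 memberships these definitions collapse to the classical support and confidence, which is what makes a direct fuzzy-versus-crisp comparison, as in the chapter's experiments, meaningful.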


Author(s):  
Lawrence O. Hall ◽  
Dmitry B. Goldgof ◽  
Juana Canul-Reich ◽  
Prodip Hore ◽  
Weijian Cheng ◽  
...  

This chapter examines how to scale algorithms which learn fuzzy models from the increasing amounts of labeled or unlabeled data that are becoming available. Large data repositories are increasingly available, such as records of network transmissions, customer transactions, medical data, and so on. A question arises about how to utilize the data effectively for both supervised and unsupervised fuzzy learning. This chapter focuses on ensemble approaches to learning fuzzy models for large data sets which may be labeled or unlabeled. Further, the authors examine ways of scaling fuzzy clustering to extremely large data sets. Examples from existing data repositories, some quite large, are given to show that the approaches discussed here are effective.
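Fuzzy c-means (FCM) is the usual starting point for scalable fuzzy clustering: it alternates a membership update u_ik = 1 / Σ_j (d_ik / d_jk)^(2/(m−1)) with a membership-weighted center update. The sketch below is a plain in-memory FCM; one simple scaling strategy, illustrative rather than the chapter's specific algorithms, is to run it on a random sample and then extend the memberships to the full data set.

```python
# Sketch: basic fuzzy c-means in pure Python.
import random

def fcm(points, c, m=2.0, iters=40, seed=0, init=None):
    """points: list of tuples; returns (centers, memberships)."""
    rng = random.Random(seed)
    centers = [list(p) for p in (init if init is not None else rng.sample(points, c))]
    dim = len(points[0])
    for _ in range(iters):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)).
        U = []
        for x in points:
            d = [max(sum((xk - vk) ** 2 for xk, vk in zip(x, v)) ** 0.5, 1e-12)
                 for v in centers]
            U.append([1.0 / sum((d[i] / d[j]) ** (2 / (m - 1)) for j in range(c))
                      for i in range(c)])
        # Center update: weighted mean with weights u_ik^m.
        for i in range(c):
            den = sum(row[i] ** m for row in U)
            centers[i] = [sum(row[i] ** m * x[k] for x, row in zip(points, U)) / den
                          for k in range(dim)]
    return centers, U
```

Each iteration touches every point, so the cost is linear in the data size per pass; sampling, single-pass chunking, or ensembles of runs on partitions, as discussed in the chapter, all aim to cut the number of such full passes.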


Author(s):  
Koldo Basterretxea ◽  
Inés del Campo

This chapter describes two decades of evolution of electronic hardware for fuzzy computing, and discusses the new trends and challenges that are currently being faced in this field. First, the authors analyze the two main design approaches followed from the publication of the first fuzzy chip designs until the consolidation of reconfigurable hardware: the digital approach and the analog approach. Second, the evolution of fuzzy hardware based on reconfigurable devices, from traditional field programmable gate arrays to complex system-on-programmable-chip solutions, is described and its relationship with the scalability issue is explained. The reconfigurable approach is completed by analyzing a cutting-edge design methodology known as dynamic partial reconfiguration and by reviewing some evolvable fuzzy hardware designs. Lastly, regarding fuzzy data-mining processing, the main proposals to speed up data-mining workloads are presented: multiprocessor architectures, reconfigurable hardware, and high-performance reconfigurable computing.

