The Hierarchies of Multivalued Attribute Domains and Corresponding Applications in Data Mining

In mobile computing, machine learning models for natural language processing (NLP) have become one of the most attractive research areas. Association rules among attributes are common knowledge patterns that can often reveal useful information, such as mobile users' interests. In practice, almost every attribute is associated with a hierarchy over its domain. Given a relation R=(U,A) and a cut α_a on the hierarchy of each attribute a, there is a corresponding rough relation R_Φ, where Φ=(α_a : a∈A). This paper establishes the connection between the functional dependencies in R and R_Φ, proposes a method for extracting reducts in R_Φ, and demonstrates the method on an application to mining association rules. The method for acquiring association rules consists of three steps: (1) translating natural-language texts into relations by NLP; (2) translating relations into rough relations by attribute analysis or fuzzy k-means (FKM) clustering; and (3) extracting association rules from concept lattices by formal concept analysis (FCA). Our experimental results show that the proposed methods, which can be applied directly to regular mobile data such as healthcare data, improve the quality and relevance of the rules.
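The idea of coarsening a relation through cuts on attribute hierarchies can be illustrated with a minimal sketch. It assumes that a cut can be represented as a mapping from each leaf value to its ancestor on the cut; the abstract does not give the paper's formal definition, so this representation, the helper names `apply_cuts` and `holds_fd`, and the toy data are all hypothetical.

```python
from typing import Dict, List, Sequence

Row = Dict[str, str]


def apply_cuts(rows: List[Row], cuts: Dict[str, Dict[str, str]]) -> List[Row]:
    """Replace each value by its ancestor on the chosen cut (assumed encoding).

    Attributes without a cut, and values missing from a cut, are left unchanged.
    """
    return [
        {attr: cuts.get(attr, {}).get(value, value) for attr, value in row.items()}
        for row in rows
    ]


def holds_fd(rows: List[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Check whether the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first value seen for a key is stored; any later mismatch breaks the FD.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical toy relation: every age is distinct, so age -> plan holds.
R = [
    {"age": "23", "plan": "basic"},
    {"age": "27", "plan": "premium"},
    {"age": "45", "plan": "premium"},
]
# Hypothetical cut on the "age" hierarchy: leaves mapped to coarser groups.
cuts = {"age": {"23": "young", "27": "young", "45": "middle"}}
R_phi = apply_cuts(R, cuts)

print(holds_fd(R, ["age"], ["plan"]))      # True
print(holds_fd(R_phi, ["age"], ["plan"]))  # False
```

In this toy case the dependency holds in the original relation but fails after coarsening, because two distinct ages collapse into the same group. This is one reason the relationship between dependencies in the two relations has to be worked out explicitly.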

Validity of association rules extracted by healthcare-data-mining

2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society ◽

10.1109/embc.2014.6944737 ◽

2014 ◽

Cited By ~ 2

Author(s):

Hiroshi Takeuchi ◽

Naoki Kodama

Keyword(s):

Data Mining ◽

Association Rules ◽

Healthcare Data

Present State-of-The-ART of Dynamic Association Rule Mining Algorithms

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.a4107.119119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 309-316

Keyword(s):

Data Mining ◽

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Rule Mining ◽

Healthcare Data ◽

Data Mining Approach ◽

Mining Community ◽

Pattern Growth ◽

Mining Algorithms

Association Rule Mining (ARM) is a data mining approach for discovering rules that reveal latent associations among persisted entity sets. ARM has many significant real-world applications, such as finding interesting incidents, analyzing stock market data, and discovering hidden relationships in healthcare data, to name a few. The existing literature offers many efficient algorithms for mining association rules, both Apriori-based and Pattern-Growth. A comprehensive understanding of them helps the data mining community and its stakeholders make expert decisions. Dynamically updating association rules that have already been discovered is very challenging because the changes are arbitrary and heterogeneous in the kinds of operations involved. When new instances are added to an existing dataset that has already been subjected to ARM, only those instances should be used for incremental mining of rules, rather than processing the whole dataset again. Researchers have recently developed algorithms specifically for incremental ARM; they are broadly grouped into Apriori-based and Pattern-Growth. This paper reviews Apriori-based and Pattern-Growth techniques that support incremental ARM.
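The core of incremental mining, touching only the newly added instances, can be sketched as follows. This is a deliberately simplified illustration and not one of the algorithms reviewed in the paper: it assumes insert-only updates and keeps counts for every itemset up to a small size, whereas real Apriori-based or Pattern-Growth incremental methods prune candidates. The class name `IncrementalCounts` and its methods are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List


class IncrementalCounts:
    """Running itemset counts that are updated from new transactions only."""

    def __init__(self, max_size: int = 2) -> None:
        self.max_size = max_size
        self.counts: Counter = Counter()
        self.n = 0  # number of transactions seen so far

    def update(self, transactions: Iterable[List[str]]) -> None:
        """Fold a batch of new transactions into the existing counts."""
        for transaction in transactions:
            items = sorted(set(transaction))
            self.n += 1
            for size in range(1, self.max_size + 1):
                for itemset in combinations(items, size):
                    self.counts[itemset] += 1

    def support(self, itemset: Iterable[str]) -> float:
        """Fraction of transactions containing every item in itemset."""
        if self.n == 0:
            return 0.0
        return self.counts[tuple(sorted(set(itemset)))] / self.n

    def confidence(self, antecedent: Iterable[str], consequent: Iterable[str]) -> float:
        """Estimate P(consequent | antecedent) from the running counts."""
        a = set(antecedent)
        base = self.counts[tuple(sorted(a))]
        both = self.counts[tuple(sorted(a | set(consequent)))]
        return both / base if base else 0.0


model = IncrementalCounts(max_size=2)
model.update([["a", "b"], ["a", "c"]])  # initial dataset
model.update([["a", "b"]])              # later batch: only the new rows are scanned
print(model.support(["a", "b"]), model.confidence(["a"], ["b"]))
```

Storing all small itemsets trades memory for simplicity. It also leaves deletions and modifications unhandled, which the abstract points to as a source of difficulty.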

Analyze and Enhance Sales in Lulu Supermarket using Data Mining Technology

Journal of Student Research ◽

10.47611/jsr.vi.926 ◽

2020 ◽

Author(s):

Ahmed Abdullah Awadh Koofan ◽

Mohammed Kaleem

Keyword(s):

Data Mining ◽

Data Analysis ◽

Association Rules ◽

Association Rule ◽

Relevant Information ◽

Research Association ◽

Mining Technology ◽

Time Consumption ◽

Huge Data ◽

Using Data

Data mining is a powerful technology for analyzing huge volumes of data. It includes many techniques, such as classification, clustering, prediction, and association rules. In this research, association rules are used to analyze data and extract information about combinations of items. Many customers purchase the same items regularly, and on each visit to the supermarket they must move from shelf to shelf to find the products they want, which is time consuming. This research aims to reduce that time by analyzing customers' invoices and informing the supermarket about patterns in customers' preferences. In this work, Python is used for data mining: association rules are applied to customers' purchases to retrieve the relevant information, which helps determine customers' purchasing patterns and the associations between products. To this end, customer purchase data were collected from Lulu Hypermarket for analysis. The outcome of the analysis is knowledge of customers' patterns, which makes shopping easier by placing related items and the most frequently bought items together on the same shelf.
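A minimal sketch of the kind of invoice analysis described above is given below. The abstract does not say which Python library or thresholds the authors used, so this version uses only the standard library, restricts itself to rules between pairs of items, and runs on invented invoices; the function name `pair_rules` and the threshold values are assumptions.

```python
from collections import Counter
from itertools import combinations
from typing import List, Tuple

Rule = Tuple[str, str, float, float, float]  # antecedent, consequent, support, confidence, lift


def pair_rules(
    invoices: List[List[str]],
    min_support: float = 0.5,
    min_confidence: float = 0.6,
) -> List[Rule]:
    """Derive single-item rules A -> B from invoices, with support, confidence and lift."""
    n = len(invoices)
    item_counts: Counter = Counter()
    pair_counts: Counter = Counter()
    for invoice in invoices:
        items = sorted(set(invoice))  # ignore repeated items within one invoice
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules: List[Rule] = []
    for (x, y), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        for a, b in ((x, y), (y, x)):
            confidence = count / item_counts[a]
            if confidence >= min_confidence:
                # Lift compares the rule's confidence with the baseline frequency of b.
                lift = confidence / (item_counts[b] / n)
                rules.append((a, b, support, confidence, lift))
    return rules


# Invented invoices for illustration only; not data from the study.
invoices = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["butter"],
]
for antecedent, consequent, support, confidence, lift in pair_rules(invoices):
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

Rules with lift above 1 indicate items bought together more often than their individual popularity would suggest, which is the sort of evidence a shelf rearrangement would rely on.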

Mining Rare Association Rules by Discovering Quasi-Functional Dependencies

Rare Association Rule Mining and Knowledge Discovery ◽

10.4018/978-1-60566-754-6.ch009 ◽

2010 ◽

pp. 131-149

Author(s):

Giulia Bruno ◽

Paolo Garza ◽

Elisa Quintarelli

Keyword(s):

Data Mining ◽

Association Rules ◽

Incremental Algorithm ◽

Functional Dependencies ◽

Data Mining Technique ◽

Data Set ◽

Rare Association ◽

Normal Behavior ◽

The Impact ◽

Over Time

In the context of anomaly detection, the data mining technique of extracting association rules can be used to identify rare rules, which represent infrequent situations. One method to detect rare rules is to first infer the normal behavior of objects in the form of quasi-functional dependencies (i.e., functional dependencies that frequently hold), and then analyze rare violations of them. The quasi-functional dependencies are usually inferred from the current instance of a database. However, in several applications the database is not static: data are added or deleted continuously. Thus, the anomalies have to be updated because they change over time. In this chapter, we propose an incremental algorithm to efficiently maintain up-to-date rules (i.e., functional and quasi-functional dependencies). The impact of the cardinality of the data set and the number of new tuples on the execution time is evaluated through a set of experiments on synthetic and real databases, whose results are reported here.
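The two-stage idea, inferring a dependency that frequently holds and then listing its rare violations, can be sketched as follows. The abstract does not give the chapter's exact measure, so this sketch assumes that "frequently holds" means the fraction of tuples agreeing with the dominant right-hand value for their left-hand value meets a threshold. The function name `quasi_fd`, the threshold, and the data are hypothetical, and the sketch is a batch computation rather than the incremental algorithm the chapter proposes.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Row = Dict[str, str]


def quasi_fd(
    rows: List[Row], lhs: str, rhs: str, min_strength: float = 0.9
) -> Tuple[float, List[Row]]:
    """Return the strength of lhs -> rhs and, if it is strong enough, its violations."""
    groups: Dict[str, Counter] = defaultdict(Counter)
    for row in rows:
        groups[row[lhs]][row[rhs]] += 1

    # The most common rhs value per lhs value is taken as the "normal" behavior.
    dominant = {key: counter.most_common(1)[0][0] for key, counter in groups.items()}
    agreeing = sum(counter[dominant[key]] for key, counter in groups.items())
    strength = agreeing / len(rows) if rows else 0.0

    if strength < min_strength:
        return strength, []  # the dependency is too weak to define normal behavior
    violations = [row for row in rows if row[rhs] != dominant[row[lhs]]]
    return strength, violations


# Invented data: nine tuples follow the pattern, one does not.
rows = [{"dept": "cardiology", "ward": "W1"} for _ in range(9)]
rows.append({"dept": "cardiology", "ward": "W7"})

strength, anomalies = quasi_fd(rows, "dept", "ward")
print(strength, anomalies)
```

An incremental version would keep the per-group counters between runs and adjust them as tuples are inserted or deleted, instead of rebuilding them from the full instance.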

Interactive Data Mining: A Short Background Study on Effective Interaction and Visualization by Association Rules

2nd International conference on Innovative Engineering Technologies (ICIET'2015) August 7-8, 2015 Bangkok (Thailand) ◽

10.15242/iie.e0815001 ◽

2015 ◽

Keyword(s):

Data Mining ◽

Association Rules ◽

Effective Interaction ◽

Interactive Data Mining ◽

Interactive Data

How Useful Can Be Data Mining For A Continuos Speech Therapist’s Education?

Balkan Region Conference on Engineering and Business Education ◽

10.2478/cplbu-2014-0050 ◽

2014 ◽

Vol 1 (1) ◽

pp. 339-342

Author(s):

Mirela Danubianu ◽

Dragos Mircea Danubianu

Keyword(s):

Data Mining ◽

Information And Communication Technology ◽

Association Rules ◽

Communication Technology ◽

Speech Therapy ◽

Proper Treatment ◽

Speech Impairments ◽

Information And Communication ◽

Specific Education

Speech therapy can be viewed as a business in the logopaedic area that aims to offer services for correcting language. Proper treatment of speech impairments improves the efficiency of therapy, so a therapist must continuously learn how to adjust their therapy methods to the patient's characteristics. The use of Information and Communication Technology in this area has allowed a large amount of data to be collected on various aspects of treatment. These data can be used in a data mining process to find useful and usable patterns and models that help therapists improve their specific education. Clustering, classification, or association rules can provide unexpected information that helps complete the therapist's knowledge and adapt the therapy to the patient's needs.

Research and Application of Association Rules Methods in Data Mining for Commercial Sales Analysis

2009 International Conference on Networking and Digital Society ◽

10.1109/icnds.2009.51 ◽

2009 ◽

Cited By ~ 3

Author(s):

Han Bing ◽

Li Ye-bai

Keyword(s):

Data Mining ◽

Association Rules

Application of association rules based on fuzzy formal concept analysis

The 27th Chinese Control and Decision Conference (2015 CCDC) ◽

10.1109/ccdc.2015.7161875 ◽

2015 ◽

Author(s):

Jianbo Liu ◽

Xiaomin Wang ◽

Yanyan Zhang ◽

Hongbo Feng

Keyword(s):

Association Rules ◽

Formal Concept Analysis ◽

Concept Analysis ◽

Formal Concept ◽

Fuzzy Formal Concept ◽

Fuzzy Formal Concept Analysis

Triage and diagnosis of COVID-19 from medical social media (Preprint)

10.2196/preprints.30397 ◽

2021 ◽

Author(s):

Abul Hasan ◽

Mark Levene ◽

David Weston ◽

Renate Fromson ◽

Nicolas Koslover ◽

...

Keyword(s):

Machine Learning ◽

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Models ◽

Rule Based ◽

Additional Information ◽

Processing Pipeline ◽

Machine Learning Models

BACKGROUND: The COVID-19 pandemic has created a pressing need to integrate information from disparate sources in order to assist decision makers. Social media is important in this respect; however, natural language processing methods are needed to make sense of the textual information it provides and to automate the processing of large amounts of data. Social media posts are often noisy, yet they may provide valuable insights into the severity and prevalence of the disease in the population. In particular, machine learning techniques for triage and diagnosis could allow for a better understanding of what social media may offer in this respect. OBJECTIVE: This study aims to develop an end-to-end natural language processing pipeline for triage and diagnosis of COVID-19 from patient-authored social media posts, in order to provide researchers and other interested parties with additional information on the symptoms, severity, and prevalence of the disease. METHODS: The text processing pipeline first extracts COVID-19 symptoms and related concepts such as severity, duration, negations, and body parts from patients' posts using conditional random fields. An unsupervised rule-based algorithm is then applied to establish relations between concepts in the next step of the pipeline. The extracted concepts and relations are subsequently used to construct two different vector representations of each post. These vectors are applied separately to build support vector machine learning models to triage patients into three categories and diagnose them for COVID-19. RESULTS: We report macro- and micro-averaged F1 scores in the ranges of 71-96% and 61-87%, respectively, for the triage and diagnosis of COVID-19 when the models are trained on human-labelled data. Our experimental results indicate that similar performance can be achieved when the models are trained using predicted labels from concept extraction and rule-based classifiers, thus yielding end-to-end machine learning. We also highlight important features uncovered by our diagnostic machine learning models and compare them with the most frequent symptoms revealed in another COVID-19 dataset. In particular, we found that the most important features are not always the most frequent ones. CONCLUSIONS: Our preliminary results show that it is possible to automatically triage and diagnose patients for COVID-19 from natural language narratives using a machine learning pipeline, in order to provide additional information on the severity and prevalence of the disease through the eyes of social media.
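Because the results are reported as both macro- and micro-averaged F1 scores, a small sketch of how the two averages differ may help. It is a plain-Python illustration for single-label multiclass predictions; the three category names and the label sequences are invented and are not the study's categories or data.

```python
from collections import Counter
from typing import List, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(y_true: List[str], y_pred: List[str]) -> Tuple[float, float]:
    """Return (macro-F1, micro-F1) for single-label multiclass predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    # Macro: average the per-class F1 scores, so every class weighs the same.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts first, so every example weighs the same.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical three-way triage labels, for illustration only.
y_true = ["low", "low", "medium", "medium", "high", "high"]
y_pred = ["low", "medium", "medium", "medium", "high", "low"]
macro, micro = macro_micro_f1(y_true, y_pred)
print(f"macro-F1={macro:.3f} micro-F1={micro:.3f}")
```

In the single-label setting the micro average equals plain accuracy, while the macro average penalizes poor performance on small classes more heavily.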

A Data Mining Approach for Risk Assessment in Car Insurance

International Journal of Business Intelligence Research ◽

10.4018/ijbir.2014070102 ◽

2014 ◽

Vol 5 (3) ◽

pp. 11-28

Author(s):

Ljiljana Kašćelan ◽

Vladimir Kašćelan ◽

Milijana Novović-Burić

Keyword(s):

Data Mining ◽

Risk Assessment ◽

Risk Premium ◽

Study Data ◽

Functional Dependencies ◽

Standard Methods ◽

Data Mining Techniques ◽

Data Mining Approach ◽

The One ◽

Car Insurance

This paper proposes a data mining approach for risk assessment in car insurance. Standard methods classify policies into a large number of tariff classes and assess risk on that basis. By applying data mining techniques, it is possible to obtain functional dependencies between the level of risk and the risk factors, as well as better prediction results. On the case study data, it is shown that data mining techniques can predict claim sizes and the occurrence of claims more accurately than the standard methods, and these predictions are the basis for calculating the net risk premium and for risk classification. The paper also discusses the advantages of data mining methods over standard methods for risk assessment in car insurance, as well as the specific features of the results that arise from a small insurance market, such as the one in Montenegro.
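The link between the two predictions and the net risk premium can be shown with a short sketch. It assumes the common actuarial decomposition in which the net (pure) premium is the expected claim frequency multiplied by the expected claim size; the abstract does not state the paper's exact formula, and the function name and figures below are hypothetical.

```python
def net_risk_premium(claim_probability: float, expected_claim_size: float) -> float:
    """Expected loss per policy: probability of a claim times its expected size.

    Assumes at most one claim per period, so the probability stands in for frequency.
    """
    if not 0.0 <= claim_probability <= 1.0:
        raise ValueError("claim_probability must lie in [0, 1]")
    if expected_claim_size < 0:
        raise ValueError("expected_claim_size must be non-negative")
    return claim_probability * expected_claim_size


# Hypothetical model outputs for a single policy, for illustration only.
premium = net_risk_premium(claim_probability=0.08, expected_claim_size=1500.0)
print(f"net risk premium: {premium:.2f}")
```

Under this decomposition, better predictions of either factor translate directly into a better premium estimate, which is why the abstract treats the two prediction tasks as the basis for the calculation.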
