Using Data Mining Algorithms to Discover Regular Sound Changes among Languages

This paper presents a method that uses association rule data mining algorithms to discover regular sound changes among languages. The method has great potential to facilitate linguistic studies aimed at identifying distantly related cognate languages. As an experimental example, the paper applies the data mining method to the discovery of regular sound changes between the Hungarian and Sumerian languages, which separated at least five thousand years ago when the Proto-Sumerian reached Mesopotamia. The method discovered an important regular sound change between the Hungarian word-initial /f/ and the Sumerian word-initial /b/ phonemes.
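The abstract does not say how the association rules are scored, so the following is only a minimal sketch under stated assumptions: each cognate candidate is reduced to a pair of word-initial phonemes, and a rule "initial a in language 1 implies initial b in language 2" is kept when its support and confidence pass hypothetical thresholds. The function name, the thresholds, and the toy pairs are all invented for illustration and are not the paper's data.

```python
from collections import Counter

def initial_sound_rules(pairs, min_support=0.2, min_confidence=0.6):
    """Score rules 'initial a -> initial b' over (a, b) phoneme pairs.

    support    = share of all pairs that are exactly (a, b)
    confidence = share of pairs starting with a that map to b
    """
    n = len(pairs)
    pair_counts = Counter(pairs)
    left_counts = Counter(a for a, _ in pairs)
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        confidence = count / left_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    # Strongest rules first: by confidence, then by support.
    return sorted(rules, key=lambda r: (-r[3], -r[2]))

# Hypothetical toy data: symbolic initial-phoneme pairs, NOT real word lists.
toy_pairs = (
    [("f", "b")] * 4
    + [("f", "p")] * 1
    + [("k", "k")] * 2
    + [("k", "g")] * 1
    + [("k", "h")] * 2
)

rules = initial_sound_rules(toy_pairs)
for a, b, support, confidence in rules:
    print(f"/{a}/ -> /{b}/  support={support:.2f}  confidence={confidence:.2f}")
```

On this toy input only the /f/ to /b/ rule clears both thresholds. The /k/ correspondences are spread too thinly across several targets to count as regular.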

A Survey of Feature Selection Techniques

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch289 ◽

2011 ◽

pp. 1888-1895 ◽

Cited By ~ 17

Author(s):

Barak Chizi ◽

Lior Rokach ◽

Oded Maimon

Keyword(s):

Data Mining ◽

Feature Selection ◽

Mining Method ◽

Data Set ◽

Data Mining Method ◽

Data Mining Algorithms ◽

Wrapper Approach ◽

Computationally Intensive ◽

Filter Approach ◽

Dimensionality (i.e., the number of data set attributes or groups of attributes) constitutes a serious obstacle to the efficiency of most data mining algorithms (Maimon and Last, 2000). The main reason is that data mining algorithms are computationally intensive. This obstacle is sometimes known as the “curse of dimensionality” (Bellman, 1961). The objective of feature selection is to identify the important features in the data set and to discard the others as irrelevant or redundant. Because feature selection reduces the dimensionality of the data, data mining algorithms can run faster and more effectively. In some cases feature selection also improves the performance of the data mining method, mainly because it yields a more compact, more easily interpreted representation of the target concept. The filter approach (Kohavi, 1995; Kohavi and John, 1996) operates independently of the data mining method employed subsequently: undesirable features are filtered out of the data before learning begins. These algorithms use heuristics based on general characteristics of the data to evaluate the merit of feature subsets. A sub-category of filter methods, referred to here as rankers, comprises methods that employ some criterion to score each feature and provide a ranking. From this ordering, several feature subsets can be chosen by manually setting ... There are three main approaches to feature selection: wrapper, filter and embedded. The wrapper approach (Kohavi, 1995; Kohavi and John, 1996) uses an inducer as a black box, along with a statistical re-sampling technique such as cross-validation, to select the best feature subset according to some predictive measure. The embedded approach (see for instance Guyon and Elisseeff, 2003) is similar to the wrapper approach in that the features are selected specifically for a certain inducer, but it selects the features during the learning process.
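The ranker idea described above can be sketched as follows. This is a minimal illustration, not the chapter's method. It assumes the scoring criterion is the absolute Pearson correlation between each feature and the target (the abstract names no criterion), and that the subset is the top `k` features for a hypothetical cutoff `k`. The feature names and values are toy data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no signal
    return cov / math.sqrt(vx * vy)

def rank_features(columns, target):
    """Score every feature independently of any learner, best first."""
    scores = {name: abs(pearson(vals, target)) for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

def select_top_k(columns, target, k):
    """Keep the k highest-ranked feature names (k is an assumed cutoff)."""
    return [name for name, _ in rank_features(columns, target)[:k]]

# Hypothetical toy data set: three features and a binary target.
toy_columns = {
    "informative": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "noisy": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "constant": [7.0] * 6,
}
toy_target = [0, 0, 0, 1, 1, 1]

ranking = rank_features(toy_columns, toy_target)
selected = select_top_k(toy_columns, toy_target, k=2)
print(ranking)
print(selected)
```

Because the scores never consult a learning algorithm, this is a filter in the sense described above. A wrapper would instead retrain and cross-validate an inducer for each candidate subset, which is what makes it more expensive.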

Lecture Notes in Computer Science - Knowledge-Based Intelligent Information and Engineering Systems ◽

Analysis Between Lifestyle, Family Medical History and Medical Abnormalities Using Data Mining Method – Association Rule Analysis

10.1007/11552451_22 ◽

2005 ◽

pp. 161-171 ◽

Cited By ~ 2

Author(s):

Mitsuhiro Ogasawara ◽

Hiroki Sugimori ◽

Yukiyasu Iida ◽

Katsumi Yoshida

Keyword(s):

Data Mining ◽

Medical History ◽

Association Rule ◽

Mining Method ◽

Data Mining Method ◽

Family Medical History ◽

Rule Analysis ◽

A Study on Detection of Small Size Malicious Code using Data Mining Method

Journal of Information and Security ◽

10.33778/kcsa.2019.19.1.011 ◽

2019 ◽

Vol 19 (1) ◽

pp. 11-17

Author(s):

Taek-Hyun Lee ◽

Ho Kook Kwang

Keyword(s):

Data Mining ◽

Malicious Code ◽

Mining Method ◽

Data Mining Method ◽

Review for "Identification of faulted line section in microgrids using data mining method based on feature discretisation"

10.1002/2050-7038.12353/v1/review3 ◽

2019 ◽

Author(s):

Ozan Akdag

Keyword(s):

Data Mining ◽

Mining Method ◽

Data Mining Method ◽

Line Section ◽

Simple Correlation Between Weather and COVID-19 Pandemic Using Data Mining Algorithms

IOP Conference Series Materials Science and Engineering ◽

10.1088/1757-899x/982/1/012015 ◽

2020 ◽

Vol 982 ◽

pp. 012015

Author(s):

Ari Fadli ◽

Azis Wisnu Widhi Nugraha ◽

Muhammad Syaiful Aliim ◽

Acep Taryana ◽

Yogiek Indra Kurniawan ◽

...

Keyword(s):

Data Mining ◽

Simple Correlation ◽

Data Mining Algorithms ◽

Using Data ◽

2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA) ◽

Diagnosis Using Data Mining Algorithms for Malignant Breast Cancer Cell Detection

10.1109/iceca49313.2020.9297481 ◽

2020 ◽

Author(s):

S. Saranya ◽

S. Sasikala

Keyword(s):

Breast Cancer ◽

Data Mining ◽

Cancer Cell ◽

Breast Cancer Cell ◽

Cell Detection ◽

Data Mining Algorithms ◽

Malignant Breast ◽

Using Data ◽

Cancer Cell Detection ◽

An Analysis on Watch Determinants of K-League Broadcasts Using Data Mining Method : Focused on A Portal Site Broadcasts

Journal of Sport and Leisure Studies ◽

10.51979/kssls.2016.05.64.195 ◽

2016 ◽

Vol 64 ◽

pp. 195-209

Author(s):

Wang-Sung Myung ◽

Young-Shin Won ◽

Min-Gyu Lee

Keyword(s):

Data Mining ◽

Mining Method ◽

Portal Site ◽

Data Mining Method ◽

Sustainable Communication Networks and Application - Lecture Notes on Data Engineering and Communications Technologies ◽

Social Media Analytics Using Data Mining Algorithms

10.1007/978-3-030-34515-0_2 ◽

2019 ◽

pp. 12-23

Author(s):

Harnoor Anand ◽

Sandeep Mathur

Keyword(s):

Data Mining ◽

Social Media ◽

Social Media Analytics ◽

Data Mining Algorithms ◽

Using Data ◽

Osteoporosis Risk Prediction Using Data Mining Algorithms

Journal of Community Health Research ◽

10.18502/jchr.v9i2.3401 ◽

2020 ◽

Author(s):

Efat Jabarpour ◽

Amin Abedini ◽

Abbasali Keshtkar

Keyword(s):

Data Mining ◽

Personal Information ◽

Disease Diagnosis ◽

Support Vector ◽

Data Mining Algorithms ◽

Industry Standard ◽

Disease Information ◽

Increased Risk ◽

Using Data ◽

Introduction: Osteoporosis is a disease that reduces bone density and degrades the quality of bone microstructure, leading to an increased risk of fractures. It is one of the major causes of disability and death in elderly people. The current study aims to determine the factors influencing the incidence of osteoporosis and to provide a predictive model for diagnosing the disease, in order to increase diagnostic speed and reduce diagnostic costs. Methods: Individuals' data, including personal information, lifestyle, and disease information, were reviewed. A new model is presented based on the Cross-Industry Standard Process (CRISP) methodology. Support Vector Machine (SVM) and Bayes methods (Tree Augmented Naïve Bayes (TAN)) were used as data mining methods, with Clementine12 as the data mining tool. Results: Several features were found to affect the disease, and rules were extracted that can be used as a pattern for predicting patients' status. Classification precision was 88.39% for SVM and 91.29% for TAN, so TAN was more precise than the other methods. Conclusion: The most influential factors in osteoporosis were identified and can be used to predict the possibility of osteoporosis in a new person with defined characteristics.

Proceedings of the 1st International Conference on Computer Science and Engineering Technology Universitas Muria Kudus ◽

Prediction of Job Suitability of College Graduate Candidates Using Data Mining Algorithms

10.4108/eai.24-10-2018.2280576 ◽

2018 ◽

Author(s):

Vanessa Stefanny ◽

Arief Wibowo

Keyword(s):

Data Mining ◽

College Graduate ◽

Data Mining Algorithms ◽

Using Data ◽