Discovering an Effective Measure in Data Mining

2008 ◽  
pp. 371-380
Author(s):  
Takao Ito

One of the most important issues in data mining is discovering implicit relationships between words in a large corpus and labels in a large database. The relationship between words and labels is often expressed as a function of distance measures. An effective measure is useful not only for achieving high precision in data mining but also for reducing the running time of the mining operation. Previous research has proposed many measures for calculating one-to-many relationships, such as the complementary similarity measure, mutual information, and the phi coefficient. Some research has shown that the complementary similarity measure is the most effective. In this article, the author reviews previous research on measures for one-to-many relationships and proposes a new idea, based on a heuristic approach, for obtaining an effective measure.
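
The three measures named in this abstract can be sketched for a single word/label pair. This is an illustration only, not the author's implementation: it assumes a 2x2 contingency table with hypothetical counts a (word and label both present), b (word only), c (label only), and d (neither). The abstract gives no formulas, so the complementary similarity measure below uses the form commonly cited in the literature, and the mutual information is computed over the whole table rather than a single cell.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient of a 2x2 table; symmetric in word and label."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


def mutual_information(a, b, c, d):
    """Mutual information (in bits) between the two binary variables."""
    n = a + b + c + d
    rows = (a + b, c + d)   # word present / word absent
    cols = (a + c, b + d)   # label present / label absent
    cells = ((a, 0, 0), (b, 0, 1), (c, 1, 0), (d, 1, 1))
    mi = 0.0
    for count, i, j in cells:
        if count:  # empty cells contribute nothing to the sum
            mi += (count / n) * math.log2(count * n / (rows[i] * cols[j]))
    return mi


def complementary_similarity(a, b, c, d):
    """Complementary similarity measure in its commonly cited form (assumed)."""
    denom = math.sqrt((a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0
```

Unlike the phi coefficient, this form of the complementary similarity measure is asymmetric: swapping b and c changes the denominator, which is one reason it is discussed for one-to-many relationships.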


2013 ◽  
Vol 22 (04) ◽  
pp. 1350027
Author(s):  
JAGANATHAN PALANICHAMY ◽  
KUPPUCHAMY RAMASAMY

Feature selection is essential in data mining and pattern recognition, especially for database classification. In recent years, several feature selection algorithms have been proposed to measure the relevance of various features to each class. A suitable feature selection algorithm normally maximizes the relevance and minimizes the redundancy of the selected features. The mutual information measure can successfully estimate the dependency of features over the entire sampling space, but it cannot exactly represent the redundancies among features. In this paper, a novel feature selection algorithm is proposed based on the maximum relevance and minimum redundancy criterion. Mutual information is used to measure the relevance of each feature to the class variable, and redundancy is calculated from the relationship between candidate features, selected features, and class variables. The effectiveness of the algorithm is tested on ten benchmark datasets available in the UCI Machine Learning Repository. The experimental results show better performance than some existing algorithms.
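
The maximum relevance and minimum redundancy criterion can be illustrated with a minimal greedy sketch. This is the classic difference form of the criterion (relevance minus mean redundancy with already selected features), not the novel algorithm proposed in the paper, whose redundancy term is not specified in the abstract. The function names, the dictionary input format, and the assumption that all features are discrete are choices made for this example.

```python
import math
from collections import Counter


def discrete_mi(xs, ys):
    """Empirical mutual information (bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedily pick k feature names from a dict of name -> value list."""
    relevance = {name: discrete_mi(col, labels) for name, col in features.items()}
    selected = []
    remaining = sorted(features)  # fixed order makes tie-breaking deterministic
    while remaining and len(selected) < k:
        best_name, best_score = None, -math.inf
        for name in remaining:
            redundancy = 0.0
            if selected:
                # Mean mutual information with the features chosen so far.
                redundancy = sum(
                    discrete_mi(features[name], features[s]) for s in selected
                ) / len(selected)
            score = relevance[name] - redundancy
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
        remaining.remove(best_name)
    return selected
```

With a feature set that contains an exact duplicate of the most relevant feature, the redundancy term pushes the duplicate's score below that of a weaker but less redundant feature, which is the behaviour the criterion is meant to produce.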


2020 ◽  
pp. 1-11
Author(s):  
Li Huafeng

With the arrival of the big data era and the development of computer technology, people are increasingly inclined to use data to analyze the problems encountered in teaching. Different educational data models are constructed through data analysis. By compiling statistics on different educational data and analyzing them with correlation models, the relationships between variables and the intensity of their interaction in these activities can be determined. Using computers to complete some teaching tasks and to construct a teaching mode can, to a certain extent, improve the efficiency of teachers’ teaching and students’ learning. In discussing the use of computers in classroom teaching, we should analyze how to use them to carry out new forms of teaching activities and how to evaluate the quality of computer-based teaching. This paper first summarizes the current state of development of the digital teaching mode in colleges and universities in China; it then analyzes the research results on computer teaching obtained by professionals in the new era; finally, it expounds the relevant knowledge of data mining.


Determining the similarity or distance among data objects is an important part of many research fields, such as statistics, data mining, and machine learning. Many measures are available in the literature to define the distance between two numerical data objects. It is difficult to define such a metric for the similarity between two categorical data objects, since categorical values are not ordered. Only a few distance measures are available in the literature for finding similarities among categorical data objects. This paper presents a comparative evaluation of various similarity measures for categorical data and also introduces a novel similarity measure for categorical data based on occurrence frequency and correlation. We evaluated the performance of these similarity measures on the outlier detection task in data mining using real-world data sets. Experimental results show that the proposed similarity measure outperforms the existing similarity measures in detecting outliers in categorical datasets.
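
Two existing baselines for categorical similarity can be sketched as follows. This is an illustration under assumptions, not the novel measure proposed in the paper, which the abstract does not define. The overlap measure counts matching attributes; the occurrence-frequency measure shown here follows the form commonly given in comparative studies, where a mismatch between two frequent values is penalized less than a mismatch between rare ones. Records are assumed to be equal-length tuples whose values all occur in the data set.

```python
import math
from collections import Counter


def overlap_similarity(x, y):
    """Fraction of attributes on which two records agree."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


def occurrence_frequency_similarity(x, y, data):
    """Average per-attribute similarity weighted by value frequency.

    Matches score 1; mismatches score 1 / (1 + log(N/f(a)) * log(N/f(b))),
    where f counts how often a value occurs in that attribute's column.
    Assumes every value of x and y appears in `data`.
    """
    n = len(data)
    total = 0.0
    for k, (a, b) in enumerate(zip(x, y)):
        if a == b:
            total += 1.0
        else:
            freq = Counter(row[k] for row in data)
            total += 1.0 / (1.0 + math.log(n / freq[a]) * math.log(n / freq[b]))
    return total / len(x)
```

For outlier detection, one simple use of either function is to score each record by its average similarity to the rest of the data set and flag the lowest-scoring records.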


Author(s):  
Jesse Schotter

The first chapter of Hieroglyphic Modernisms exposes the complex history of Western misconceptions of Egyptian writing from antiquity to the present. Hieroglyphs bridge the gap between modern technologies and the ancient past, looking forward to the rise of new media and backward to the dispersal of languages in the mythical moment of the Tower of Babel. The contradictory ways in which hieroglyphs were interpreted in the West come to shape the differing ways that modernist writers and filmmakers understood the relationship between writing, film, and other new media. On the one hand, poets like Ezra Pound and film theorists like Vachel Lindsay and Sergei Eisenstein use the visual languages of China and of Egypt as a more primal or direct alternative to written words. But Freud, Proust, and the later Eisenstein conversely emphasize the phonetic qualities of Egyptian writing, its similarity to alphabetical scripts. The chapter concludes by arguing that even avant-garde invocations of hieroglyphics depend on narrative form through an examination of Hollis Frampton’s experimental film Zorns Lemma.


2015 ◽  
Vol 15 (3) ◽  
pp. 33-39
Author(s):  
David Evans

This paper considers the relationship between social science and the food industry, and it suggests that collaboration can be intellectually productive and morally rewarding. It explores the middle ground that exists between paid consultancy models of collaboration on the one hand and a principled stance of nonengagement on the other. Drawing on recent experiences of researching with a major food retailer in the UK, I discuss the ways in which collaborating with retailers can open up opportunities for accessing data that might not otherwise be available to social scientists. Additionally, I put forward the argument that researchers with an interest in the sustainability—ecological or otherwise—of food systems, especially those of a critical persuasion, ought to be empirically engaging with food businesses. I suggest that this is important in terms of generating better understandings of the objectionable arrangements that they seek to critique, and in terms of opening up conduits through which to affect positive changes. Cutting across these points is the claim that while resistance to commercial engagement might be misguided, it is nevertheless important to acknowledge the power-geometries of collaboration and to find ways of leveling and/or leveraging them. To conclude, I suggest that universities have an important institutional role to play in defining the terms of engagement as well as maintaining the boundaries between scholarship and consultancy—a line that can otherwise become quite fuzzy when the worlds of commerce and academic research collide.


1968 ◽  
Vol 8 (4) ◽  
pp. 606-617
Author(s):  
Mohammad Anisur Rahman

The purpose of this paper is to re-examine the relationship between the degree of aggregate labour-intensity and the aggregate volume of saving in an economy where a Cobb-Douglas production function in its traditional form can be assumed to give a good approximation to reality. The relationship in question has an obviously important bearing on economic development policy in the area of choice of labour intensity. To the extent that and in the range where an increase in labour intensity would adversely affect the volume of savings, a conflict arises between two important social objectives, i.e., higher rate of capital formation on the one hand and greater employment and distributive equity on the other. If relative resource endowments in the economy are such that such a "competitive" range of labour-intensity falls within the nation's attainable range of choice, development planners will have to arrive at a compromise between these two social goals.
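
For reference, the traditional Cobb-Douglas form mentioned in the abstract can be written as below. The symbols are assumptions chosen for this note (output Y, capital K, labour L, scale factor A, capital exponent alpha), since the abstract gives no notation, and the paper's own saving relationship is not reproduced here.

```latex
\begin{align}
  Y &= A\,K^{\alpha} L^{1-\alpha}, \qquad 0 < \alpha < 1, \\
  \frac{\partial Y}{\partial L} &= (1-\alpha)\,\frac{Y}{L}, \qquad
  \frac{\partial Y}{\partial K} = \alpha\,\frac{Y}{K}.
\end{align}
```

If each factor is paid its marginal product, labour receives the share 1 - alpha of output and capital the share alpha, whatever the ratio of labour to capital.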


Author(s):  
Peter Coss

In the introduction to his great work of 2005, Framing the Early Middle Ages, Chris Wickham urged not only the necessity of carefully framing our studies at the outset but also the importance of closely defining the words and concepts that we employ, the avoidance of ‘cultural solipsism’ wherever possible and the need to pay particular attention to continuities and discontinuities. Chris has, of course, followed these precepts on a vast scale. My aim in this chapter is a modest one. I aim to review the framing of thirteenth-century England in terms of two only of Chris’s themes: the aristocracy and the state, and even then primarily in terms of the relationship between the two. By the thirteenth century I mean a long thirteenth century stretching from the period of the Angevin reforms of the later twelfth century on the one hand to the early to mid-fourteenth on the other; the reasons for taking this span will, I hope, become clearer during the course of the chapter, but few would doubt that it has a validity.


Cancers ◽  
2021 ◽  
Vol 13 (13) ◽  
pp. 3141
Author(s):  
Aurora Laborda-Illanes ◽  
Lidia Sánchez-Alcoholado ◽  
Soukaina Boutriq ◽  
Isaac Plaza-Andrades ◽  
Jesús Peralta-Linero ◽  
...  

In this review we summarize a possible connection between gut microbiota, melatonin production, and breast cancer. An imbalance in gut bacterial population composition (dysbiosis) or a change in the production of melatonin (circadian disruption) alters estrogen levels. On the one hand, this may be due to the bacterial composition of the estrobolome, since bacteria with β-glucuronidase activity favour estrogens in a deconjugated state, which may ultimately lead to pathologies, including breast cancer. On the other hand, it has been shown that these changes in intestinal microbiota stimulate the kynurenine pathway, moving tryptophan away from the melatonergic pathway, thereby reducing circulating melatonin levels. Because melatonin has antiestrogenic properties, it affects active and inactive estrogen levels. These changes increase the risk of developing breast cancer. Additionally, melatonin stimulates the differentiation of preadipocytes into adipocytes, which have low estrogen levels because adipocytes do not express aromatase. Consequently, melatonin also reduces the risk of breast cancer. However, more studies are needed to determine the relationship between microbiota, melatonin, and breast cancer, in addition to clinical trials to confirm the sensitizing effects of melatonin to chemotherapy and radiotherapy, and its ability to ameliorate or prevent the side effects of these therapies.

