Metric Methods in Data Mining

2008 ◽  
pp. 849-879
Author(s):  
Dan A. Simovici

This chapter presents data mining techniques that make use of metrics defined on the set of partitions of finite sets. Partitions are naturally associated with object attributes, and major data mining problems such as classification, clustering, and data preparation benefit from an algebraic and geometric study of the metric space of partitions. The metrics we find most useful are derived from a generalization of the entropic metric. We discuss techniques that produce smaller classifiers, allow incremental clustering of categorical data, and help users better prepare training data for constructing classifiers. Finally, we discuss open problems and future research directions.

2011 ◽  
pp. 1-31
Author(s):  
Dan A. Simovici

This chapter presents data mining techniques that make use of metrics defined on the set of partitions of finite sets. Partitions are naturally associated with object attributes, and major data mining problems such as classification, clustering, and data preparation benefit from an algebraic and geometric study of the metric space of partitions. The metrics we find most useful are derived from a generalization of the entropic metric. We discuss techniques that produce smaller classifiers, allow incremental clustering of categorical data, and help users better prepare training data for constructing classifiers. Finally, we discuss open problems and future research directions.


Author(s):  
Boutheina Fessi ◽  
Yacine Djemaiel ◽  
Noureddine Boudriga

This chapter reviews the usefulness of applying data mining techniques to detect intrusions within dynamic environments and their contribution to digital investigation. Numerous applications and models based on data mining analytics are described. The chapter also addresses the requirements that should be fulfilled to perform cyber-crime investigation efficiently using data mining analytics. It closes by outlining future research directions for cyber-crime investigation and by presenting new trends in data mining techniques that handle big data to detect attacks.


Author(s):  
Boutheina A. Fessi ◽  
Yacine Djemaiel ◽  
Noureddine Boudriga

This chapter reviews the usefulness of applying data mining techniques to detect intrusions within dynamic environments and their contribution to digital investigation. Numerous applications and models based on data mining analytics are described. The chapter also addresses the requirements that should be fulfilled to perform cyber-crime investigation efficiently using data mining analytics. It closes by outlining future research directions for cyber-crime investigation and by presenting new trends in data mining techniques that handle big data to detect attacks.


2021 ◽  
Vol 9 ◽  
pp. 1508-1528
Author(s):  
Piyawat Lertvittayakumjorn ◽  
Francesca Toni

Debugging a machine learning model is hard since the bug usually involves the training data and the learning process. This becomes even harder for an opaque deep learning model if we have no clue about how the model actually works. In this survey, we review papers that exploit explanations to enable humans to give feedback and debug NLP models. We call this problem explanation-based human debugging (EBHD). In particular, we categorize and discuss existing work along three dimensions of EBHD (the bug context, the workflow, and the experimental setting), compile findings on how EBHD components affect the feedback providers, and highlight open problems that could be future research directions.


2021 ◽  
Vol 23 (2) ◽  
pp. 13-22
Author(s):  
Debmalya Mandal ◽  
Sourav Medya ◽  
Brian Uzzi ◽  
Charu Aggarwal

Graph Neural Networks (GNNs), a generalization of deep neural networks on graph data, have been widely used in various domains, ranging from drug discovery to recommender systems. However, GNNs in such applications are limited when there are few available samples. Meta-learning has been an important framework to address the lack of samples in machine learning, and in recent years, researchers have started to apply meta-learning to GNNs. In this work, we provide a comprehensive survey of different meta-learning approaches involving GNNs on various graph problems, showing the power of using these two approaches together. We categorize the literature based on proposed architectures, shared representations, and applications. Finally, we discuss several exciting future research directions and open problems.


Author(s):  
Constanţa-Nicoleta Bodea ◽  
Maria-Iuliana Dascalu ◽  
Radu Ioan Mogos ◽  
Stelian Stancu

The strengthening of technology-enhanced education has transformed education into a data-intensive domain. As in many other data-intensive domains, interest in data analysis through various analytics is growing. The article starts by defining learning analytics (LA), with relevant views from the literature. A discussion of the relationships between LA, educational data mining, and academic analytics is included in the background section. In the main section of the article, learning analytics, as an emerging trend in educational systems, is described by discussing the main issues, controversies, and problems on this topic. The final part of the article presents future research directions and the conclusion.


Author(s):  
Md Mahbubur Rahim ◽  
Maryam Jabberzadeh ◽  
Nergiz Ilhan

E-procurement systems that have been in place for over a decade have begun incorporating digital tools like big data, cloud computing, internet of things, and data mining. Hence, there exists a rich literature on earlier e-procurement systems and advanced digitally-enabled e-procurement systems. Existing literature on these systems addresses many research issues (e.g., adoption) associated with e-procurement. However, one critical issue that has so far received no rigorous attention is the “unit of analysis,” an important methodological concern in the e-procurement research context. Hence, the aim of this chapter is twofold: 1) to discuss how the notion of “unit of analysis” has been conceptualised in the e-procurement literature and 2) to discuss how its use has been justified by e-procurement scholars to address the research issues under investigation. Finally, the chapter provides several interesting findings and outlines future research directions.


2022 ◽  
pp. 1477-1503
Author(s):  
Ali Al Mazari

HIV/AIDS big data analytics evolved as a potential initiative enabling the connection between three major scientific disciplines: (1) the emergence and evolution of HIV biology; (2) the complex clinical and medical problems and practices associated with the infections and diseases; and (3) the computational methods for the mining of HIV/AIDS biological, medical, and clinical big data. This chapter provides a review of the computational and data mining perspectives on HIV/AIDS in the big data era. The chapter focuses on the research opportunities in this domain, identifies the challenges facing the development of big data analytics in the HIV/AIDS domain, and then highlights the future research directions of big data in the healthcare sector.


Author(s):  
Ana Funes ◽  
Aristides Dasso

Nowadays, there is an increasing number of applications where artificial intelligence has fuelled the research and development of new methods, techniques, and tools related to knowledge acquisition and data mining. The development of data mining and other related disciplines has benefited from the existence of large volumes of data coming from the most diverse sources and domains. The KDD process and data mining methods allow for the discovery of knowledge in data that is hidden to humans, presenting this knowledge in different ways. In this chapter, the relation of data mining with other disciplines is analyzed, an overview of data mining tasks and methods is presented, and a possible classification of them is given. Finally, issues associated with the discipline and future research directions are briefly discussed.

