Tools, Technologies, and Methodologies to Support Data Science

Various devices such as smartphones, computers, tablets, biomedical equipment, sports equipment, and information systems generate large amounts of data and useful information in transactional information systems. However, this information may not be perceptible or adequately analyzed for decision-making. Technologies, tools, algorithms, and models exist that support analysis, visualization, learning, and prediction. Data science involves techniques and methods to abstract knowledge generated from diverse sources. It combines fields such as statistics, machine learning, data mining, visualization, and predictive analysis. This chapter aims to be a guide to the statistical and computational tools applicable in data science.

Latest Tools for Data Mining and Machine Learning

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.i1003.0789s19 ◽ 2019 ◽ Vol 8 (9S) ◽ pp. 18-23 ◽ Cited By ~ 2

Keyword(s): Machine Learning ◽ Data Mining ◽ Decision Making ◽ Feature Selection ◽ Open Source ◽ Predictive Analysis ◽ Learning Tools ◽ Pros And Cons ◽ Selection For ◽ Extract Information

Data Mining is now widely used to extract information from data and, in turn, to acquire knowledge for decision making. Data Mining analyzes patterns, which are used to extract information and knowledge for making decisions. Many open source and licensed tools, such as Weka, RapidMiner, KNIME, and Orange, are available for Data Mining and predictive analysis. This paper discusses the different tools available for Data Mining and Machine Learning, followed by a description and the pros and cons of each tool. The article provides details of the algorithms offered by these tools, such as classification, regression, characterization, discretization, clustering, visualization, and feature selection. It aims to support efficient decision making and suggests which tool is suitable for a given requirement.
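
As a rough illustration of one technique named in this abstract, feature selection, the following Python sketch ranks features by their absolute Pearson correlation with a numeric label and keeps the top k. It is a generic filter-style method written for this listing, not the implementation used by Weka, RapidMiner, KNIME, or Orange; the function names (`pearson`, `select_top_k`) and the toy data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either sequence is constant (correlation undefined).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_top_k(rows, labels, k):
    """Return indices of the k features most correlated (in absolute value) with the labels."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties are broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]


# Toy data, made up for illustration: feature 0 tracks the label,
# feature 1 is constant, feature 2 is only weakly related.
rows = [
    [1.0, 5.0, 3.0],
    [2.0, 5.0, 1.0],
    [3.0, 5.0, 4.0],
    [4.0, 5.0, 2.0],
]
labels = [0, 0, 1, 1]
selected = select_top_k(rows, labels, k=2)
```

A filter method like this scores each feature independently of any model, which keeps it cheap but ignores interactions between features; wrapper or embedded methods would be needed to capture those.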

An Interview with Dr. Michael Zeller, Winner of ACM SIGKDD 2020 Service Award

ACM SIGKDD Explorations Newsletter ◽ 10.1145/3447556.3447561 ◽ 2021 ◽ Vol 22 (2) ◽ pp. 6-7

Author(s): Michael Zeller

Keyword(s): Artificial Intelligence ◽ Machine Learning ◽ Data Mining ◽ Knowledge Discovery ◽ Data Science ◽ Professional Services ◽ Investment Company ◽ Service Award ◽ Virtual Conference ◽ Learning Data

Michael Zeller, Ph.D. is the recipient of the 2020 ACM SIGKDD Service Award, the highest service award in the field of knowledge discovery and data mining. The award is conferred annually on one individual or group in recognition of outstanding professional services and contributions to the field. Dr. Zeller was honored for his years of service and many accomplishments as the secretary and treasurer for ACM SIGKDD, the organizing body of the annual KDD conference. Zeller is also head of AI strategy and solutions at Temasek, a global investment company seeking to make a difference always with tomorrow in mind. He sat down with SIGKDD Explorations to discuss how he first got involved in the KDD conference in 1999, what he learned from the first-ever virtual conference, his work at Temasek, and what excites him about the future of machine learning, data science, and artificial intelligence.

An Interview with Dr. Shipeng Yu, Winner of ACM SIGKDD 2021 Service Award

ACM SIGKDD Explorations Newsletter ◽ 10.1145/3510374.3510376 ◽ 2021 ◽ Vol 23 (2) ◽ pp. 1-2

Author(s): Shipeng Yu

Keyword(s): Artificial Intelligence ◽ Machine Learning ◽ Data Mining ◽ Knowledge Discovery ◽ Data Science ◽ Professional Services ◽ Professional Network ◽ Service Award ◽ Years Of Service ◽ Learning Data

Shipeng Yu, Ph.D. is the recipient of the 2021 ACM SIGKDD Service Award, the highest service award in the field of knowledge discovery and data mining. The award is conferred annually on one individual or group in recognition of outstanding professional services and contributions to the field. Dr. Yu was honored for his years of service and many accomplishments as general chair of KDD 2017 and currently as sponsorship director for SIGKDD. Dr. Yu is Director of AI Engineering, Head of the Growth AI team at LinkedIn, the world's largest professional network. He sat down with SIGKDD Explorations to discuss how he first got involved in the KDD conference in 2006, the benefits and drawbacks of virtual conferences, his work at LinkedIn, and KDD's place in the field of machine learning, data science, and artificial intelligence.

Big Data Models and the Public Sector

Web Services ◽ 10.4018/978-1-5225-7501-6.ch007 ◽ 2019 ◽ pp. 105-126

Author(s): N. Nawin Sona

Keyword(s): Machine Learning ◽ Data Mining ◽ Big Data ◽ Public Sector ◽ Predictive Analytics ◽ Data Types ◽ The Public ◽ Emerging Trends ◽ Wide Range ◽ Learning Data

This chapter aims to give an overview of the wide range of Big Data approaches and technologies available today. The data features of Volume, Velocity, and Variety are examined against new database technologies. The chapter explores the complexity of data types; methodologies of storage, access, and computation; current and emerging trends in data analysis; and methods of extracting value from data. It aims to address the need for clarity regarding the future of RDBMS and the newer systems. It also highlights the methods, such as Machine Learning, Data Mining, and Predictive Analytics, by which Actionable Insights can be built into public sector domains.

Data Science

Deep Learning Innovations and Their Convergence With Big Data - Advances in Data Mining and Database Management ◽ 10.4018/978-1-5225-3015-2.ch008 ◽ 2018 ◽ pp. 141-151

Author(s): Sabitha Rajagopal

Keyword(s): Data Mining ◽ Decision Making ◽ Research And Development ◽ Real Time ◽ Data Science ◽ Data Warehousing ◽ Holistic Approach ◽ Digital Data ◽ Data Application ◽ Development Analysis

Data Science employs techniques and theories to create data products. A data product is a data application that acquires its value from the data itself and creates more data as a result; it is not just an application with data. Data science involves the methodical study of digital data, employing techniques of observation, development, analysis, testing, and validation. It tackles real-time challenges by adopting a holistic approach. It 'creates' knowledge about large and dynamic bases, 'develops' methods to manage data, and 'optimizes' processes to improve their performance. The goal includes vital investigation and innovation, in conjunction with functional exploration, intended to inform decision-making for individuals, businesses, and governments. This paper discusses the emergence of Data Science and its subsequent developments in the fields of Data Mining and Data Warehousing. The research focuses on the need, challenges, impact, ethics, and progress of Data Science. Finally, insights into the subsequent phases in the research and development of Data Science are provided.

Special Issue on Machine Learning, Data Science, and Artificial Intelligence in Plasma Research

IEEE Transactions on Plasma Science ◽ 10.1109/tps.2019.2961571 ◽ 2020 ◽ Vol 48 (1) ◽ pp. 1-2 ◽ Cited By ~ 5

Author(s): Zhehui Wang ◽ J. Luc Peterson ◽ Cristina Rea ◽ David Humphreys

Keyword(s): Artificial Intelligence ◽ Machine Learning ◽ Data Science ◽ Special Issue ◽ Plasma Research ◽ Learning Data

Machine learning for determining accurate outcomes in criminal trials

Law Probability and Risk ◽ 10.1093/lpr/mgaa003 ◽ 2020 ◽ Vol 19 (1) ◽ pp. 43-65

Author(s): Jane Mitchell ◽ Simon Mitchell ◽ Cliff Mitchell

Keyword(s): Machine Learning ◽ Decision Making ◽ Data Science ◽ Positive Impact ◽ Machine Learning Algorithms ◽ Test Cases ◽ Judicial Process ◽ Criminal Trials ◽ Wrongful Convictions ◽ Potential Applications

Advances in mathematical and computational technologies have brought unique and ground-breaking benefits to diverse fields throughout society (engineering, medicine, economics, etc.). Within legal systems, however, the potential applications of data science and innovative mathematical tools have yet to be embraced with the same ambition. The complex decision-making that is needed for reaching just verdicts is often seen as out of reach for such approaches and, in the case of criminal trials, this inhibits exploration into whether machine learning could have a positive impact. Here, through assigning numerical scores to prosecution and defence evidence, and employing an approach based on dimensionality reduction, we showed that evidence strands presented at historical murder trials could be used to train effective machine-learning algorithms (or models). We tested the evidence quantification approach with the trained model and showed that, through machine learning, criminal cases could be clearly classified (probability >99.9%) as belonging to either a guilty or a not-guilty category. The classification was found to be as expected for all test cases. All guilty test cases that were not wrongful convictions were correctly assigned to the guilty category by our model and, crucially, test cases that were wrongful convictions were correctly assigned to the not-guilty category. This work demonstrated the potential for machine learning to benefit criminal trial decision-making, and should motivate further testing and development of the model and datasets for assisting the judicial process.
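
The general shape of such a pipeline (numeric evidence scores, a dimensionality-reduction step, then a two-category classifier) can be sketched in Python. The abstract does not say which reduction technique or classifier the authors used, so this sketch substitutes assumptions: the first principal component, approximated by power iteration, and a nearest-centroid rule. The scores are invented toy values that describe no real trial, and every name (`first_principal_axis`, `fit`, `predict`) is hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length numeric rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def first_principal_axis(rows, iterations=200):
    """Approximate the leading principal axis of the data by power iteration.

    Assumes the start vector is not orthogonal to the leading axis.
    """
    mu = mean_vector(rows)
    centered = [[x - m for x, m in zip(r, mu)] for r in rows]
    d = len(mu)
    # Scatter matrix; its scale does not affect the direction of the axis.
    cov = [[sum(r[i] * r[j] for r in centered) for j in range(d)] for i in range(d)]
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return mu, v


def project(x, mu, axis):
    """Reduce a score vector to one number: its coordinate along the axis."""
    return sum((xi - m) * a for xi, m, a in zip(x, mu, axis))


def fit(cases):
    """Learn the axis and one centroid per label from (scores, label) pairs."""
    mu, axis = first_principal_axis([x for x, _ in cases])
    by_label = {}
    for x, label in cases:
        by_label.setdefault(label, []).append(project(x, mu, axis))
    centroids = {label: sum(p) / len(p) for label, p in by_label.items()}
    return mu, axis, centroids


def predict(model, x):
    """Assign the label whose centroid is nearest in the reduced space."""
    mu, axis, centroids = model
    z = project(x, mu, axis)
    return min(centroids, key=lambda label: abs(z - centroids[label]))


# Invented toy scores: [prosecution evidence score, defence evidence score].
training_cases = [
    ([8.0, 2.0], "guilty"),
    ([9.0, 3.0], "guilty"),
    ([7.0, 1.0], "guilty"),
    ([2.0, 8.0], "not guilty"),
    ([3.0, 9.0], "not guilty"),
    ([1.0, 7.0], "not guilty"),
]
model = fit(training_cases)
```

On this toy data the learned axis contrasts the prosecution score with the defence score, so `predict(model, [6.0, 4.0])` falls nearer the guilty centroid. The sketch produces a hard label only; it does not estimate the classification probabilities reported in the abstract.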

How Data Mining and Machine Learning Evolved from Relational Data Base to Data Science

Studies in Big Data - A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years ◽ 10.1007/978-3-319-61893-7_17 ◽ 2017 ◽ pp. 287-306 ◽ Cited By ~ 3

Author(s): G. Amato ◽ L. Candela ◽ D. Castelli ◽ A. Esuli ◽ F. Falchi ◽ ...

Keyword(s): Machine Learning ◽ Data Mining ◽ Data Base ◽ Data Science ◽ Relational Data ◽ Relational Data Base

Statistical and Machine-Learning Data Mining

10.1201/b11508 ◽ 2011 ◽ Cited By ~ 12

Author(s): Bruce Ratner ◽ Stephen Day ◽ Christopher Davies

Keyword(s): Machine Learning ◽ Data Mining ◽ Learning Data

Big Data Methods

Organizational Research Methods ◽ 10.1177/1094428116677299 ◽ 2016 ◽ Vol 21 (3) ◽ pp. 525-547 ◽ Cited By ~ 55

Author(s): Scott Tonidandel ◽ Eden B. King ◽ Jose M. Cortina

Keyword(s): Machine Learning ◽ Data Mining ◽ Big Data ◽ Data Analytics ◽ Data Science ◽ Data Sources ◽ Future Research ◽ Organizational Science ◽ Associated Data ◽ Organizational Sciences

Advances in data science, such as data mining, data visualization, and machine learning, are extremely well-suited to address numerous questions in the organizational sciences given the explosion of available data. Despite these opportunities, few scholars in our field have discussed the specific ways in which the lens of our science should be brought to bear on the topic of big data and big data's reciprocal impact on our science. The purpose of this paper is to provide an overview of the big data phenomenon and its potential for impacting organizational science in both positive and negative ways. We identify the biggest opportunities afforded by big data along with the biggest obstacles, and we discuss specifically how we think our methods will be most impacted by the data analytics movement. We also provide a list of resources to help interested readers incorporate big data methods into their existing research. Our hope is that we stimulate interest in big data, motivate future research using big data sources, and encourage the application of associated data science techniques more broadly in the organizational sciences.
