DATA PREPARATION ON LARGE DATASETS FOR DATA SCIENCE

In recent times, diagnosing heart disease has become a critical task in the medical field. In the modern age, one person dies every minute due to heart disease. Data science plays an important role in processing large amounts of data in the health sciences. Because diagnosing heart disease is complex, the assessment process should be automated to reduce the associated risks and alert the patient in advance. This paper uses the heart disease dataset available in the UCI Machine Learning Repository. The proposed work assesses a patient's risk of heart disease by applying various data mining methods: Naive Bayes, Decision Tree, KNN, Linear SVM, RBF SVM, Gaussian Process, Neural Network, AdaBoost, QDA, and Random Forest. The paper provides a comparative study of the performance of these machine learning algorithms. Test results show that the KNN algorithm achieved the highest accuracy, 97%, among the implemented algorithms.
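The comparative procedure the abstract describes (train several classifiers on the same split, then rank them by accuracy) can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's pipeline: the data is synthetic rather than the UCI heart disease records, only a hand-written k-NN and a majority-class baseline are compared, and every name here (`make_toy_data`, `knn_predict`, `majority_predict`, `accuracy`) is hypothetical.

```python
import math
import random
from collections import Counter


def make_toy_data(n=200, seed=0):
    """Synthetic two-class data; a stand-in, NOT the UCI heart disease records."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 * label  # class 1 is shifted away from class 0
        features = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        rows.append((features, label))
    return rows


def knn_predict(train, x, k=5):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def majority_predict(train, x):
    """Baseline: always predict the most frequent training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def accuracy(predict, train, test):
    """Fraction of held-out points whose label is predicted correctly."""
    hits = sum(predict(train, x) == y for x, y in test)
    return hits / len(test)


data = make_toy_data()
train, test = data[:150], data[150:]  # simple hold-out split

scores = {
    "kNN (k=5)": accuracy(knn_predict, train, test),
    "majority baseline": accuracy(majority_predict, train, test),
}

# Rank the classifiers from best to worst hold-out accuracy.
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.2f}")
```

A single hold-out split keeps the sketch short. A study reporting a figure such as 97% would normally also state how the split or cross-validation was done, which the abstract does not specify.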

Machine Learning Algorithms to Classify Future Returns Using Structured and Unstructured Data

The Journal of Investing ◽ 10.3905/joi.2021.1.169 ◽ 2021 ◽ pp. joi.2021.1.169

Author(s): Joshua Livnat ◽ Jyoti Singh

Keyword(s): Machine Learning ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Unstructured Data

Machine Learning Techniques for Internet of Things

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Integrating the Internet of Things Into Software Engineering Practices ◽ 10.4018/978-1-5225-7790-4.ch008 ◽ 2019 ◽ pp. 160-180

Author(s): P. Priakanth ◽ S. Gopikrishnan

Keyword(s): Machine Learning ◽ Data Science ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Independent Learning ◽ Machine Learning Techniques ◽ Analytical Models ◽ Guided Learning ◽ Learning Techniques ◽ Learning Machine

The idea of an intelligent, independent learning machine has fascinated humans for decades. The philosophy behind machine learning is to automate the creation of analytical models so that algorithms can learn continuously from available data. Since IoT will be among the major sources of new data, data science will contribute greatly to making IoT applications more intelligent. Machine learning can be applied where the desired outcome is known (guided learning), where the data is not known beforehand (unguided learning), or where learning results from interaction between a model and its environment (reinforcement learning). This chapter answers three questions: How can machine learning algorithms be applied to IoT smart data? What is the taxonomy of machine learning algorithms that can be adopted in IoT? And which characteristics of real-world IoT data require data analytics?
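The difference between guided (supervised) and unguided (unsupervised) learning can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the chapter: the temperature-like readings are synthetic, and `fit_centroids`, `classify`, and `two_means` are hypothetical helpers. Reinforcement learning is left out because it needs an interactive environment.

```python
import random
from statistics import mean

rng = random.Random(1)

# Hypothetical sensor readings: "normal" values near 20, "hot" values near 30.
readings = [(rng.gauss(20, 1), "normal") for _ in range(50)] + [
    (rng.gauss(30, 1), "hot") for _ in range(50)
]


def fit_centroids(labeled):
    """Guided learning: use the known labels to compute one centroid per class."""
    by_label = {}
    for value, label in labeled:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def classify(centroids, value):
    """Assign a new reading to the class with the nearest centroid."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def two_means(values, iterations=10):
    """Unguided learning: find two cluster centers without using any labels."""
    centers = [min(values), max(values)]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            idx = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[idx].append(v)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers


centroids = fit_centroids(readings)                   # labels are used
centers = two_means([value for value, _ in readings])  # labels are ignored

print("supervised centroids:", centroids)
print("unsupervised centers:", centers)
print("21.0 is classified as:", classify(centroids, 21.0))
```

Both approaches recover similar group centers on this easy data. Only the guided one can name the groups, because only it has seen the labels.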

A Literature Review on Thyroid Hormonal Problems in Women Using Data Science and Analytics

Advances in Data Mining and Database Management - Handbook of Research on Engineering, Business, and Healthcare Applications of Data Science and Analytics ◽ 10.4018/978-1-7998-3053-5.ch021 ◽ 2021 ◽ pp. 416-428

Author(s): R. Suganya ◽ Rajaram S. ◽ Kameswari M.

Keyword(s): Machine Learning ◽ Literature Review ◽ Data Science ◽ Learning Algorithms ◽ Research Literature ◽ Machine Learning Algorithms ◽ Thyroid Disorder ◽ Classification Models ◽ Indian Women ◽ Using Data

Thyroid disorders are currently common and widespread among women worldwide. In India, seven out of ten women suffer from thyroid problems. Various studies in the research literature report that goiter is prevalent in about 35% of Indian women examined. It is essential to take preventive measures at an early stage; otherwise, the disorder leads to infertility problems among women. This review discusses various analytics models used to handle different types of thyroid problems in women. The chapter analyzes and compares different classification models, covering both machine learning and deep learning algorithms, for classifying different thyroid problems. Literature on both machine learning and deep learning algorithms is considered. The review will help in analyzing the causes and characteristics of thyroid disorders. The dataset used to build and validate the algorithms was provided by the UCI Machine Learning Repository.

Machine Learning Techniques for Internet of Things

Research Anthology on Artificial Intelligence Applications in Security ◽ 10.4018/978-1-7998-7705-9.ch067 ◽ 2021 ◽ pp. 1490-1506

Author(s): P. Priakanth ◽ S. Gopikrishnan

Keyword(s): Machine Learning ◽ Data Science ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Independent Learning ◽ Machine Learning Techniques ◽ Analytical Models ◽ Guided Learning ◽ Learning Techniques ◽ Learning Machine

The idea of an intelligent, independent learning machine has fascinated humans for decades. The philosophy behind machine learning is to automate the creation of analytical models so that algorithms can learn continuously from available data. Since IoT will be among the major sources of new data, data science will contribute greatly to making IoT applications more intelligent. Machine learning can be applied where the desired outcome is known (guided learning), where the data is not known beforehand (unguided learning), or where learning results from interaction between a model and its environment (reinforcement learning). This chapter answers three questions: How can machine learning algorithms be applied to IoT smart data? What is the taxonomy of machine learning algorithms that can be adopted in IoT? And which characteristics of real-world IoT data require data analytics?

Introduction to Data Science and Machine Learning Algorithms

Data Science and Multiple Criteria Decision Making Approaches in Finance - Multiple Criteria Decision Making ◽ 10.1007/978-3-030-74176-1_1 ◽ 2021 ◽ pp. 1-15

Author(s): Gökhan Silahtaroğlu ◽ Hasan Dinçer ◽ Serhat Yüksel

Keyword(s): Machine Learning ◽ Data Science ◽ Learning Algorithms ◽ Machine Learning Algorithms

Bio-informatics and psychiatric epidemiology

Practical Psychiatric Epidemiology ◽ 10.1093/med/9780198735564.003.0021 ◽ 2020 ◽ pp. 359-372

Author(s): Nicola Voyle ◽ Maximilian Kerz ◽ Steven Kiddle ◽ Richard Dobson

Keyword(s): Machine Learning ◽ Data Cleaning ◽ Epidemiological Studies ◽ Selection Method ◽ Psychiatric Epidemiology ◽ Large Datasets ◽ Data Exploration ◽ Feature Identification ◽ Data Formats ◽ Method Selection

This chapter highlights the methodologies which are increasingly being applied to large datasets or ‘big data’, with an emphasis on bio-informatics. The first stage of any analysis is to collect data from a well-designed study. The chapter begins by looking at the raw data that arises from epidemiological studies and highlighting the first stages in creating clean data that can be used to draw informative conclusions through analysis. The remainder of the chapter covers data formats, data exploration, data cleaning, missing data (i.e. the lack of data for a variable in an observation), reproducibility, classification versus regression, feature identification and selection, method selection (e.g. supervised versus unsupervised machine learning), training a classifier, and drawing conclusions from modelling.
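As a rough illustration of the missing-data step mentioned above (an observation that lacks a value for some variable), the sketch below counts missing values per column and then shows two common responses: dropping incomplete rows and filling gaps with the column mean. The records, column names, and helper functions are hypothetical and are not drawn from the chapter.

```python
from statistics import mean

# Hypothetical study records; None marks a value missing for that observation.
records = [
    {"age": 34, "score": 12.0},
    {"age": None, "score": 15.0},
    {"age": 51, "score": None},
    {"age": 45, "score": 9.0},
]


def missing_counts(rows):
    """Data exploration: how many observations lack each variable?"""
    columns = rows[0].keys()
    return {c: sum(r[c] is None for r in rows) for c in columns}


def drop_incomplete(rows):
    """Data cleaning, option 1: keep only fully observed rows."""
    return [r for r in rows if all(v is not None for v in r.values())]


def impute_mean(rows, column):
    """Data cleaning, option 2: replace missing values with the observed mean."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


print(missing_counts(records))
print(drop_incomplete(records))
print(impute_mean(records, "score"))
```

Dropping rows is simple but discards information, while mean imputation keeps every row at the cost of shrinking the variable's variance. Which is appropriate depends on why the data is missing.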
