Evaluation of Machine Learning Predictions of a Highly Resolved Time Series of Chlorophyll-a Concentration

Pelagic chlorophyll-a concentrations are key for evaluation of the environmental status and productivity of marine systems, and data can be provided by in situ measurements, remote sensing and modelling. However, modelling chlorophyll-a is not trivial due to its nonlinear dynamics and complexity. In this study, chlorophyll-a concentrations for the Helgoland Roads time series were modeled using a number of measured water and environmental parameters. We chose three common machine learning algorithms from the literature: the support vector machine regressor, neural networks multi-layer perceptron regressor and random forest regressor. Results showed that the support vector machine regressor slightly outperformed other models. The evaluation with a test dataset and verification with an independent validation dataset for chlorophyll-a concentrations showed a good generalization capacity, evaluated by the root mean squared errors of less than 1 µg L−1. Feature selection and engineering are important and improved the models significantly, as measured in performance, improving the adjusted R2 by a minimum of 48%. We tested SARIMA in comparison and found that the univariate nature of SARIMA does not allow for better results than the machine learning models. Additionally, the computer processing time needed was much higher (prohibitive) for SARIMA.

Download Full-text

Evaluation of Machine Learning predictions of a highly resolved long time series of Chlorophyll-a concentration

10.1101/2021.05.12.443749 ◽

2021 ◽

Author(s):

Felipe de Luca Lopes de Amorim ◽

Johannes Rick ◽

Gerrit Lohmann ◽

Karen Helen Wiltshire

Keyword(s):

Machine Learning ◽

Time Series ◽

Support Vector Machine ◽

Chlorophyll A ◽

Environmental Parameters ◽

Machine Learning Algorithms ◽

Validation Dataset ◽

Support Vector ◽

Environmental Status ◽

Long Time

Pelagic Chlorophyll-a concentrations are key for evaluation of the environmental status and productivity of marine systems. In this study, chlorophyll-a concentrations for the Helgoland Roads Time Series were modeled using a number of measured water and environmental parameters. We chose three common Machine Learning algorithms from the literature: Support Vector Machine Regressor, Neural Networks Multi-layer Perceptron Regressor and Random Forest Regressor. Results showed that Support Vector Machine Regressor slightly outperformed other models. The evaluation with a test dataset and verification with an independent validation dataset for chlorophyll-a concentrations showed a good generalization capacity, evaluated by the root mean squared errors of less than 1 µg L-1. Feature selection and engineering are important and improved the models significantly, as measured in performance, improving by a minimum of 48% the adjusted R2. We tested SARIMA in comparison and found that the univariate nature of SARIMA does not allow for better results than the Machine Learning models. Additionally, the computer processing time needed was much higher (prohibitive) for SARIMA.

Download Full-text

Predicting Future Occurrence of Acute Hypotensive Episodes Using Noninvasive and Invasive Features

Military Medicine ◽

10.1093/milmed/usaa418 ◽

2021 ◽

Vol 186 (Supplement_1) ◽

pp. 445-451

Author(s):

Yifei Sun ◽

Navid Rashedi ◽

Vikrant Vaze ◽

Parikshit Shah ◽

Ryan Halter ◽

...

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Real World ◽

Short Term Memory ◽

Model Performance ◽

Learning Technologies ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor ◽

Continuous Map

ABSTRACT Introduction Early prediction of the acute hypotensive episode (AHE) in critically ill patients has the potential to improve outcomes. In this study, we apply different machine learning algorithms to the MIMIC III Physionet dataset, containing more than 60,000 real-world intensive care unit records, to test commonly used machine learning technologies and compare their performances. Materials and Methods Five classification methods including K-nearest neighbor, logistic regression, support vector machine, random forest, and a deep learning method called long short-term memory are applied to predict an AHE 30 minutes in advance. An analysis comparing model performance when including versus excluding invasive features was conducted. To further study the pattern of the underlying mean arterial pressure (MAP), we apply a regression method to predict the continuous MAP values using linear regression over the next 60 minutes. Results Support vector machine yields the best performance in terms of recall (84%). Including the invasive features in the classification improves the performance significantly with both recall and precision increasing by more than 20 percentage points. We were able to predict the MAP with a root mean square error (a frequently used measure of the differences between the predicted values and the observed values) of 10 mmHg 60 minutes in the future. After converting continuous MAP predictions into AHE binary predictions, we achieve a 91% recall and 68% precision. In addition to predicting AHE, the MAP predictions provide clinically useful information regarding the timing and severity of the AHE occurrence. Conclusion We were able to predict AHE with precision and recall above 80% 30 minutes in advance with the large real-world dataset. The prediction of regression model can provide a more fine-grained, interpretable signal to practitioners. Model performance is improved by the inclusion of invasive features in predicting AHE, when compared to predicting the AHE based on only the available, restricted set of noninvasive technologies. This demonstrates the importance of exploring more noninvasive technologies for AHE prediction.

Download Full-text

Implementation of IoT Framework with Data Analysis Using Deep Learning Methods for Occupancy Prediction in a Building

Future Internet ◽

10.3390/fi13030067 ◽

2021 ◽

Vol 13 (3) ◽

pp. 67

Author(s):

Eric Hitimana ◽

Gaurav Bajpai ◽

Richard Musabe ◽

Louis Sibomana ◽

Jayavel Kayalvizhi

Keyword(s):

Machine Learning ◽

Time Series ◽

Deep Learning ◽

Time Series Data ◽

Multivariate Time Series ◽

Machine Learning Algorithms ◽

Series Data ◽

Support Vector ◽

Human Beings ◽

Feed Forward Network

Many countries worldwide face challenges in controlling building incidence prevention measures for fire disasters. The most critical issues are the localization, identification, detection of the room occupant. Internet of Things (IoT) along with machine learning proved the increase of the smartness of the building by providing real-time data acquisition using sensors and actuators for prediction mechanisms. This paper proposes the implementation of an IoT framework to capture indoor environmental parameters for occupancy multivariate time-series data. The application of the Long Short Term Memory (LSTM) Deep Learning algorithm is used to infer the knowledge of the presence of human beings. An experiment is conducted in an office room using multivariate time-series as predictors in the regression forecasting problem. The results obtained demonstrate that with the developed system it is possible to obtain, process, and store environmental information. The information collected was applied to the LSTM algorithm and compared with other machine learning algorithms. The compared algorithms are Support Vector Machine, Naïve Bayes Network, and Multilayer Perceptron Feed-Forward Network. The outcomes based on the parametric calibrations demonstrate that LSTM performs better in the context of the proposed application.

Download Full-text

Machine Learning Models for Finger Bend Evaluation using Implemented Low cost Flex Sensor

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35742 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 3605-3611

Author(s):

Pratyush Kaware

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Low Cost ◽

Learning Algorithms ◽

Cost Effective ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Models ◽

Machine Learning Models

In this paper a cost-effective sensor has been implemented to read finger bend signals, by attaching the sensor to a finger, so as to classify them based on the degree of bent as well as the joint about which the finger was being bent. This was done by testing with various machine learning algorithms to get the most accurate and consistent classifier. Finally, we found that Support Vector Machine was the best algorithm suited to classify our data, using we were able predict live state of a finger, i.e., the degree of bent and the joints involved. The live voltage values from the sensor were transmitted using a NodeMCU micro-controller which were converted to digital and uploaded on a database for analysis.

Download Full-text

Prediction and Analysis of Gold Prices using Ensemble Machine Learning Algorithms

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.36028 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 4367-4374

Author(s):

Gudipally Chandrashakar

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Data ◽

Gold Price ◽

Machine Learning Algorithms ◽

Series Data ◽

Gradient Boosting ◽

Support Vector ◽

Average Value ◽

Ensemble Machine Learning

In this article, we used historical time series data up to the current day gold price. In this study of predicting gold price, we consider few correlating factors like silver price, copper price, standard, and poor’s 500 value, dollar-rupee exchange rate, Dow Jones Industrial Average Value. Considering the prices of every correlating factor and gold price data where dates ranging from 2008 January to 2021 February. Few algorithms of machine learning are used to analyze the time-series data are Random Forest Regression, Support Vector Regressor, Linear Regressor, ExtraTrees Regressor and Gradient boosting Regression. While seeing the results the Extra Tree Regressor algorithm gives the predicted value of gold prices more accurately.

Download Full-text

Predicting Student’s Performance Using Machine Learning Algorithm

International Journal of Advanced Research in Science, Communication and Technology ◽

10.48175/ijarsct-1209 ◽

2021 ◽

pp. 53-58

Author(s):

Sheela Rani P ◽

Dhivya S ◽

Dharshini Priya M ◽

Dharmila Chowdary A

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Prediction Model ◽

Naive Bayes ◽

Learning Algorithm ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Approaches ◽

K Nearest Neighbors

Machine learning is a new analysis discipline that uses knowledge to boost learning, optimizing the training method and developing the atmosphere within which learning happens. There square measure 2 sorts of machine learning approaches like supervised and unsupervised approach that square measure accustomed extract the knowledge that helps the decision-makers in future to require correct intervention. This paper introduces an issue that influences students' tutorial performance prediction model that uses a supervised variety of machine learning algorithms like support vector machine , KNN(k-nearest neighbors), Naïve Bayes and supplying regression and logistic regression. The results supported by various algorithms are compared and it is shown that the support vector machine and Naïve Bayes performs well by achieving improved accuracy as compared to other algorithms. The final prediction model during this paper may have fairly high prediction accuracy .The objective is not just to predict future performance of students but also provide the best technique for finding the most impactful features that influence student’s while studying.

Download Full-text

Detection of Blood Cancer Cells Using Microscopic Images

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.37536 ◽

2021 ◽

Vol 9 (8) ◽

pp. 995-1003

Author(s):

Vidyashree M S

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Cancer Cells ◽

Blood Cells ◽

White Blood Cells ◽

Machine Learning Algorithms ◽

Support Vector ◽

Support Vector Machine Algorithm ◽

Blood Cancer ◽

Acute Myeloid

Abstract: Blood Cancer cells forming a tissue is called lymphoma. Thus, disease decreases the cells to fight against the infection or cancer blood cells. Blood cancer is also categorized in too many types. The two main categories of blood cancer are Acute Lymphocytic Lymphoma and Acute Myeloid Lymphoma. In this project proposes a approach that robotic detects and segments the nucleolus from white blood cells in the microscopic Blood images. Here in this project, we have used the two Machine learning algorithms that are k-means algorithm, Support vector machine algorithm. K-mean algorithm is use for segmentation and clustering. Support vector machine algorithm is used for classification. Keywords: k-means, Support vector machine, Lymphoma, Acute Lymphocytic Lymphoma, Machine Learning

Download Full-text

Book Genre Categorization Using Machine Learning Algorithms (K-Nearest Neighbor, Support Vector Machine and Logistic Regression) using Customized Dataset

International Journal of Computer Science and Mobile Computing ◽

10.47760/ijcsmc.2021.v10i03.002 ◽

2021 ◽

Vol 10 (3) ◽

pp. 14-25

Author(s):

Parilkumar Shiroya

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Logistic Regression ◽

Nearest Neighbor ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor

Download Full-text

Research on Parallel Support Vector Machine Based on Spark Big Data Platform

Scientific Programming ◽

10.1155/2021/7998417 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Yao Huimin

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Big Data ◽

Support Vector Machines ◽

Cross Validation ◽

Machine Learning Algorithms ◽

Support Vector ◽

Lambda Architecture ◽

Vector Machines ◽

Data Platform

With the development of cloud computing and distributed cluster technology, the concept of big data has been expanded and extended in terms of capacity and value, and machine learning technology has also received unprecedented attention in recent years. Traditional machine learning algorithms cannot solve the problem of effective parallelization, so a parallelization support vector machine based on Spark big data platform is proposed. Firstly, the big data platform is designed with Lambda architecture, which is divided into three layers: Batch Layer, Serving Layer, and Speed Layer. Secondly, in order to improve the training efficiency of support vector machines on large-scale data, when merging two support vector machines, the “special points” other than support vectors are considered, that is, the points where the nonsupport vectors in one subset violate the training results of the other subset, and a cross-validation merging algorithm is proposed. Then, a parallelized support vector machine based on cross-validation is proposed, and the parallelization process of the support vector machine is realized on the Spark platform. Finally, experiments on different datasets verify the effectiveness and stability of the proposed method. Experimental results show that the proposed parallelized support vector machine has outstanding performance in speed-up ratio, training time, and prediction accuracy.

Download Full-text

PREDICTIVE MODELLING AND ANALYTICS FOR DIABETES USING A MACHINE LEARNING APPROACH

INFORMATION TECHNOLOGY IN INDUSTRY ◽

10.17762/itii.v9i1.121 ◽

2021 ◽

Vol 9 (1) ◽

pp. 215-223

Author(s):

Prateek Mishra, Dr.Anurag Sharma, Dr. Abhishek Badholia

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Machine Learning Algorithms ◽

Computational Method ◽

Supervised Machine Learning ◽

Undiagnosed Diabetes ◽

Support Vector ◽

Entire Body ◽

Data Manipulation ◽

Kernel Support Vector Machine

Adverse effects can be seen in the entire body due to the major disorders known as Diabetes. The risk of dangers like diabetic nephropathy, cardiac stroke and other disorders can increase severally because of the undiagnosed diabetes. Around the globe the people are suffering from this disease. For a healthy life early detection of this disease is very curtail. As the causes of the diabetes is increasing rapidly this disease might turn up as a reason for worldwide concern. Increasing the chances for a more accurate predictions and form experiences automatic learning by computational method may be provided by Machine Learning (ML). With the help of R data manipulation tool for trends development and with risk factor patterns detection in Pima Indian diabetes technique of machine learning is been used in the current researches. With the use of R data manipulation tool analysis and development five different predictive models is done for the categorization of patients into diabetic and non- diabetic. supervised machine learning algorithms namely multifactor dimensionality reduction (MDR), k-nearest neighbor (k-NN), artificial neural network (ANN) radial basis function (RBF) kernel support vector machine and linear kernel support vector machine (SVM-linear) are used for this purpose.

Download Full-text

Evaluation of Machine Learning Predictions of a Highly Resolved Time Series of Chlorophyll-a Concentration

Evaluation of Machine Learning predictions of a highly resolved long time series of Chlorophyll-a concentration

Predicting Future Occurrence of Acute Hypotensive Episodes Using Noninvasive and Invasive Features

Implementation of IoT Framework with Data Analysis Using Deep Learning Methods for Occupancy Prediction in a Building

Machine Learning Models for Finger Bend Evaluation using Implemented Low cost Flex Sensor

Prediction and Analysis of Gold Prices using Ensemble Machine Learning Algorithms

Predicting Student’s Performance Using Machine Learning Algorithm

Detection of Blood Cancer Cells Using Microscopic Images

Book Genre Categorization Using Machine Learning Algorithms (K-Nearest Neighbor, Support Vector Machine and Logistic Regression) using Customized Dataset﻿

Research on Parallel Support Vector Machine Based on Spark Big Data Platform

PREDICTIVE MODELLING AND ANALYTICS FOR DIABETES USING A MACHINE LEARNING APPROACH

Book Genre Categorization Using Machine Learning Algorithms (K-Nearest Neighbor, Support Vector Machine and Logistic Regression) using Customized Dataset