E-Gadget to Detect Food Freshness using IoT and ML

Author(s):  
Dr. M. P. Borawake

Abstract: The food we consume plays an important role in our daily life. It provides the energy we need to work, grow, be active, and to learn and think. Healthy food is essential for good health and nutrition. Light, oxygen, heat, humidity, temperature and spoilage bacteria can all affect both the safety and quality of perishable foods. Food kept at room temperature undergoes chemical reactions after a certain period of time, which affect its taste, texture and smell. Consuming spoiled food is harmful to consumers, as it can lead to foodborne diseases. This project aims to detect spoiled food using appropriate sensors and by monitoring the gases released by a particular food item. The sensors measure different parameters of the food, such as pH, ammonia gas, oxygen level and moisture. A microcontroller collects the readings from the sensors, and these readings are then given as input to a machine learning model that decides whether the food is spoilt based on the training data set. We also plan to implement a machine learning model that can estimate the remaining lifespan of the food item. Index Terms: Arduino Uno, Food spoilage, IoT, Machine Learning, Sensors.
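As an illustration of the kind of pipeline this abstract describes, the sketch below trains a classifier on hypothetical sensor readings (pH, ammonia, oxygen, moisture) and then scores a new reading; the feature names, value ranges and synthetic data are assumptions for illustration, not details from the paper.

```python
# Minimal sketch: classify food as fresh/spoiled from sensor readings.
# Feature names and the synthetic data are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Columns: pH, ammonia (ppm), oxygen (%), moisture (%)
fresh = rng.normal([6.5, 5.0, 20.0, 60.0], [0.3, 2.0, 1.0, 5.0], size=(200, 4))
spoiled = rng.normal([5.0, 25.0, 15.0, 75.0], [0.4, 5.0, 2.0, 5.0], size=(200, 4))
X = np.vstack([fresh, spoiled])
y = np.array([0] * 200 + [1] * 200)  # 0 = fresh, 1 = spoiled

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))

# A new reading as it might arrive from the microcontroller (e.g. over serial)
new_reading = np.array([[5.2, 22.0, 16.0, 72.0]])
print("spoiled" if model.predict(new_reading)[0] == 1 else "fresh")
```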

Author(s):  
Dr. Kalaivazhi Vijayaragavan ◽  
S. Prakathi ◽  
S. Rajalakshmi ◽  
M Sandhiya

Machine learning is a subfield of artificial intelligence in which learning algorithms make decisions based on data and attempt to mimic human behaviour. Classification is one of the most fundamental concepts in machine learning: it is the process of recognizing, understanding, and grouping ideas and objects into pre-set categories or sub-populations. Using pre-categorized training datasets, machine learning applies a variety of algorithms to classify future data into those categories. Classification algorithms use input training data to predict which of the predetermined categories subsequent data fall into. To improve classification accuracy, the design of the neural network is regarded as an effective means of obtaining better accuracy; such a design usually considers a scaling layer, perceptron layers and a probabilistic layer. In this paper, an enhanced model selection is evaluated with a training and testing strategy, and the classification accuracy is then predicted. Finally, the predicted classification accuracy is compared using two popular machine learning frameworks: PyTorch and TensorFlow. Results demonstrate that the proposed method predicts with higher accuracy. After deployment of the machine learning model, its performance was evaluated on the Iris data set.
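To make the scaling/perceptron/probabilistic structure concrete, here is a minimal sketch of a small feed-forward classifier on the Iris data set in PyTorch (one of the two frameworks the abstract names). The layer sizes, learning rate and number of epochs are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch: a small feed-forward classifier trained on the Iris data set
# in PyTorch. Layer sizes and training settings are illustrative assumptions.
import torch
import torch.nn as nn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)          # a simple "scaling layer"
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

X_train = torch.tensor(X_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.long)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_test = torch.tensor(y_test, dtype=torch.long)

model = nn.Sequential(                          # perceptron layers + probabilistic output
    nn.Linear(4, 16), nn.ReLU(),
    nn.Linear(16, 3),                           # logits; softmax is applied inside the loss
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X_train), y_train)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    accuracy = (model(X_test).argmax(dim=1) == y_test).float().mean()
print(f"test accuracy: {accuracy.item():.3f}")
```

An equivalent model could be written in TensorFlow/Keras; the comparison in the paper concerns the predicted classification accuracy across the two frameworks.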


2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
O Tan ◽  
J.Z Wu ◽  
K.K Yeo ◽  
S.Y.A Lee ◽  
J.S Hon ◽  
...  

Abstract Background Warfarin titration via International Normalised Ratio (INR) monitoring can be a challenge for patients on long-term anticoagulation due to warfarin's multiple drug interactions, long half-life and patients' individual responses, yet it is critically important because of the drug's narrow therapeutic index. Machine learning may have a role in learning and replicating existing warfarin prescribing practices, to be potentially incorporated into an automated warfarin titration model. Purpose We aim to explore the feasibility of using machine learning to develop a model that can learn and predict actual warfarin titration practices. Methods A retrospective dataset of 4,247 patients with 48,895 data points of INR values was obtained from our institutional database. Patients who had fewer than 5 recorded visits, or invalid or missing values, were excluded. Variables studied included age, warfarin indication, warfarin dose, target INR range, actual INR values, time between titrations and time in therapeutic range (TTR, as defined by the Rosenthal formula). The machine learning model was developed on an unbiased training data set (1,805 patients), further refined on a hand-picked balanced validation set (400 patients), and then evaluated on two balanced test sets of 100 patients each. The test sets were hand-picked based on the criteria of TTR (“in vs out of range”) and stability of INR results (“low vs high fluctuation”) (Table 1). Given the time-series nature of the data, a Recurrent Neural Network (RNN) was chosen to learn warfarin prescription practices. Long short-term memory (LSTM) cells were further employed to address the problem of time gaps between warfarin titration visits, which could result in vanishing gradients. Results A total of 2,163 patients with 42,622 data points were studied (mean age 65±11.7 years, 54.7% male). The mean TTR was 65.4%. The total warfarin dose per week as predicted by the RNN was compared with the actual total warfarin dose per week prescribed for each patient in the test sets. The coefficients of determination for the RNN in the “in vs out of range” and “low vs high fluctuation” test sets were 0.941 and 0.962 respectively (Figure 1). Conclusion This proof-of-concept study demonstrated that an RNN-based machine learning model was able to learn and predict warfarin dosage changes with reasonable accuracy. The findings merit further evaluation of the potential use of machine learning in an automated warfarin titration model. Funding Acknowledgement Type of funding source: None
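As a rough illustration of the architecture described (an RNN with LSTM cells mapping a patient's visit history to a weekly dose), the sketch below defines and trains such a model on synthetic sequences. The per-visit feature layout, hidden size and dummy targets are assumptions for illustration; they are not the study's variables or data.

```python
# Minimal sketch: an LSTM that maps a patient's sequence of visit features
# (e.g. INR value, current weekly dose, days since last visit) to the next
# weekly warfarin dose. Feature layout and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class WarfarinLSTM(nn.Module):
    def __init__(self, n_features=3, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)         # predicted total weekly dose (mg)

    def forward(self, x):                             # x: (batch, visits, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :]).squeeze(-1)   # use the state at the last visit

# Synthetic batch: 8 patients, 10 visits each, 3 features per visit.
x = torch.randn(8, 10, 3)
y = torch.rand(8) * 40 + 10                           # dummy weekly doses between 10 and 50 mg

model = WarfarinLSTM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
print("final training MSE:", loss.item())
```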


Author(s):  
Dhilsath Fathima.M ◽  
S. Justin Samuel ◽  
R. Hari Haran

Aim: This work develops an improved and robust machine learning model for predicting Myocardial Infarction (MI), which could have substantial clinical impact. Objectives: This paper explains how to build a machine learning based computer-aided analysis system for early and accurate prediction of Myocardial Infarction (MI), using the Framingham Heart Study dataset for validation and evaluation. The proposed computer-aided analysis model will support medical professionals in predicting myocardial infarction proficiently. Methods: The proposed model uses mean imputation to remove missing values from the data set and then applies principal component analysis (PCA) to extract the optimal features and enhance classifier performance. After PCA, the reduced features are partitioned into a training dataset and a testing dataset: 70% of the data is given as input to four well-known classifiers, namely support vector machine, k-nearest neighbor, logistic regression and decision tree, to train them, and the remaining 30% is used to evaluate the machine learning model using performance metrics such as the confusion matrix, classification accuracy, precision, sensitivity, F1-score and the AUC-ROC curve. Results: The classifier outputs were evaluated using these performance measures, and we observed that logistic regression provides higher accuracy than the k-NN, SVM and decision tree classifiers, while PCA performs well as a feature extraction method for enhancing the performance of the proposed model. From these analyses, we conclude that logistic regression has better mean accuracy and standard deviation of accuracy than the other three algorithms. The AUC-ROC curves of the proposed classifiers (Figures 4 and 5) show that logistic regression achieves a good AUC-ROC score, i.e. around 70%, compared to the k-NN and decision tree algorithms. Conclusion: From the result analysis, we infer that the proposed machine learning model can act as an optimal decision-making system for predicting acute myocardial infarction at an earlier stage than existing machine learning based prediction models, and that it can predict the presence of acute myocardial infarction from heart disease risk factors, in order to decide when to start lifestyle modification and medical treatment to prevent heart disease.
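A minimal sketch of the described pipeline (mean imputation, PCA, a 70/30 split, and a comparison of the four classifiers) is shown below. The scikit-learn breast-cancer data stands in for the Framingham data, which is not bundled with the library; the number of PCA components is also an assumption.

```python
# Minimal sketch: mean imputation -> PCA -> four classifiers on a 70/30 split.
# The breast-cancer data is a stand-in for the Framingham Heart Study dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, roc_auc_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

classifiers = {
    "SVM": SVC(probability=True),
    "k-NN": KNeighborsClassifier(),
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Decision tree": DecisionTreeClassifier(),
}

for name, clf in classifiers.items():
    pipe = make_pipeline(
        SimpleImputer(strategy="mean"),   # mean imputation for missing values
        StandardScaler(),
        PCA(n_components=10),             # feature extraction before classification
        clf,
    )
    pipe.fit(X_train, y_train)
    acc = accuracy_score(y_test, pipe.predict(X_test))
    auc = roc_auc_score(y_test, pipe.predict_proba(X_test)[:, 1])
    print(f"{name}: accuracy={acc:.3f}, AUC={auc:.3f}")
```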


2021 ◽  
Author(s):  
Junjie Shi ◽  
Jiang Bian ◽  
Jakob Richter ◽  
Kuan-Hsun Chen ◽  
Jörg Rahnenführer ◽  
...  

Abstract The predictive performance of a machine learning model depends strongly on the corresponding hyper-parameter setting, so hyper-parameter tuning is often indispensable. Normally, such tuning requires the dedicated machine learning model to be trained and evaluated on centralized data to obtain a performance estimate. However, in a distributed machine learning scenario it is not always possible to collect all the data from all nodes, due to privacy concerns or storage limitations. Moreover, if data has to be transferred through low-bandwidth connections, the time available for tuning is reduced. Model-Based Optimization (MBO) is a state-of-the-art method for tuning hyper-parameters, but its application to distributed machine learning models or federated learning lacks research. This work proposes a framework, MODES, that allows MBO to be deployed on resource-constrained distributed embedded systems. Each node trains an individual model based on its local data, and the goal is to optimize the combined prediction accuracy. The presented framework offers two optimization modes: (1) MODES-B considers the whole ensemble as a single black box and optimizes the hyper-parameters of each individual model jointly, and (2) MODES-I considers all models as clones of the same black box, which allows the optimization to be efficiently parallelized in a distributed setting. We evaluate MODES by conducting experiments on the optimization of the hyper-parameters of a random forest and a multi-layer perceptron. The experimental results demonstrate that, with an improvement in terms of mean accuracy (MODES-B), run-time efficiency (MODES-I), and statistical stability for both modes, MODES outperforms the baseline, i.e., carrying out tuning with MBO on each node individually with its local sub-data set.
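To illustrate the per-node baseline the paper compares against (MBO applied to one node's local data), here is a sketch using scikit-optimize's Gaussian-process optimizer as a stand-in for the MBO procedure. The library choice, search space and data set are assumptions, not the paper's implementation of MODES.

```python
# Minimal sketch: model-based (Bayesian) hyper-parameter optimization of a
# random forest on one node's local data, via scikit-optimize's gp_minimize.
from skopt import gp_minimize
from skopt.space import Integer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)   # stand-in for a node's local sub-data set
space = [Integer(10, 200, name="n_estimators"),
         Integer(2, 20, name="max_depth")]

def objective(params):
    n_estimators, max_depth = params
    clf = RandomForestClassifier(n_estimators=n_estimators,
                                 max_depth=max_depth, random_state=0)
    # MBO minimizes the objective, so return the negative cross-validated accuracy.
    return -cross_val_score(clf, X, y, cv=3).mean()

result = gp_minimize(objective, space, n_calls=20, random_state=0)
print("best params:", result.x, "best CV accuracy:", -result.fun)
```

In MODES-B the black box would instead be the whole ensemble's combined accuracy, while MODES-I would treat each node's model as a clone of the same black box and parallelize the evaluations.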


2020 ◽  
Vol 6 ◽  
Author(s):  
Jaime de Miguel Rodríguez ◽  
Maria Eugenia Villafañe ◽  
Luka Piškorec ◽  
Fernando Sancho Caparrini

Abstract This work presents a methodology for the generation of novel 3D objects resembling wireframes of building types. These result from the reconstruction of interpolated locations within the learnt distribution of variational autoencoders (VAEs), a deep generative machine learning model based on neural networks. The data set used features a geometry-representation scheme based on a ‘connectivity map’ that is especially suited to expressing the wireframe objects that compose it. Additionally, the input samples are generated through ‘parametric augmentation’, a strategy proposed in this study that creates coherent variations among data by enabling a set of parameters to alter representative features of a given building type. In the experiments described in this paper, more than 150k input samples belonging to two building types were processed during the training of a VAE model. The main contribution of this paper is to explore parametric augmentation for the generation of large data sets of 3D geometries, showcasing its problems and limitations in the context of neural networks and VAEs. Results show that the generation of interpolated hybrid geometries is a challenging task. Despite the difficulty of the endeavour, promising advances are presented.
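The sketch below shows the core mechanism the abstract relies on: a VAE trained on flattened vectors (standing in for the 'connectivity map' representation) and interpolation between two latent codes to produce a hybrid reconstruction. Input size, architecture and the random training data are assumptions for illustration.

```python
# Minimal sketch: a VAE over flattened "connectivity map"-style vectors, plus
# latent-space interpolation between two samples to decode a hybrid geometry.
import torch
import torch.nn as nn
import torch.nn.functional as F

INPUT_DIM, LATENT_DIM = 256, 16   # e.g. a flattened connectivity map (assumed size)

class VAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(INPUT_DIM, 128), nn.ReLU())
        self.mu = nn.Linear(128, LATENT_DIM)
        self.logvar = nn.Linear(128, LATENT_DIM)
        self.dec = nn.Sequential(nn.Linear(LATENT_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, INPUT_DIM), nn.Sigmoid())

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.dec(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    rec = F.binary_cross_entropy(recon, x, reduction="sum")      # reconstruction term
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())  # KL divergence term
    return rec + kld

model = VAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
data = torch.rand(512, INPUT_DIM)            # stand-in for connectivity-map samples

for epoch in range(20):
    optimizer.zero_grad()
    recon, mu, logvar = model(data)
    loss = vae_loss(recon, data, mu, logvar)
    loss.backward()
    optimizer.step()

# Interpolate between the latent codes of two samples to generate a hybrid.
with torch.no_grad():
    _, mu_a, _ = model(data[:1])
    _, mu_b, _ = model(data[1:2])
    hybrid = model.dec(0.5 * (mu_a + mu_b))   # decoded interpolated geometry
print("hybrid shape:", hybrid.shape)
```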


2021 ◽  
Vol 12 ◽  
Author(s):  
Sijie Chen ◽  
Wenjing Zhou ◽  
Jinghui Tu ◽  
Jian Li ◽  
Bo Wang ◽  
...  

Purpose: Establish a suitable machine learning model to identify the primary lesions of metastatic tumors using an integrated learning approach, making the diagnosis of primary lesions more accurate and efficient. Methods: After deleting features whose expression level is lower than the threshold, we use two methods to perform feature selection and use XGBoost for classification. After the optimal model is selected through 10-fold cross-validation, it is verified on an independent test set. Results: Selecting features with around 800 genes for training, the R2-score of a 10-fold CV on the training data reaches 96.38%, and the R2-score on the test data reaches 83.3%. Conclusion: These findings suggest that by combining tumor data with machine learning methods, each cancer has its corresponding classification accuracy, which can be used to predict the location of primary metastatic tumors. The machine-learning-based method can be used as an orthogonal diagnostic tool, allowing the model's output to be judged against the actual clinical pathological conditions.
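A minimal sketch of the described workflow follows: drop low-expression genes, select roughly 800 informative features, fit XGBoost, score it with 10-fold cross-validation, and verify on a held-out set. The synthetic expression matrix, thresholds and number of classes are assumptions, not the study's data.

```python
# Minimal sketch: low-expression filtering -> gene selection -> XGBoost,
# evaluated with 10-fold CV and an independent test split.
import numpy as np
from xgboost import XGBClassifier
from sklearn.feature_selection import VarianceThreshold, SelectKBest, f_classif
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.lognormal(mean=1.0, sigma=1.0, size=(600, 2000))   # samples x genes (synthetic)
y = rng.integers(0, 5, size=600)                           # 5 hypothetical tumor classes

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

pipe = make_pipeline(
    VarianceThreshold(threshold=0.5),        # remove near-constant (low-expression) genes
    SelectKBest(f_classif, k=800),           # keep ~800 informative genes
    XGBClassifier(n_estimators=200, max_depth=4, eval_metric="mlogloss"),
)

cv_scores = cross_val_score(pipe, X_train, y_train, cv=10)
print("10-fold CV accuracy:", cv_scores.mean())
pipe.fit(X_train, y_train)
print("independent test accuracy:", pipe.score(X_test, y_test))
```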


2021 ◽  
Author(s):  
Eric Sonny Mathew ◽  
Moussa Tembely ◽  
Waleed AlAmeri ◽  
Emad W. Al-Shalabi ◽  
Abdul Ravoof Shaik

Abstract A meticulous interpretation of steady-state or unsteady-state relative permeability (Kr) experimental data is required to determine a complete set of Kr curves. In this work, three different machine learning models were developed to assist in a faster estimation of these curves from steady-state drainage coreflooding experimental runs. The three models that were tested and compared were extreme gradient boosting (XGB), deep neural network (DNN) and recurrent neural network (RNN) algorithms. Based on existing mathematical models, a leading-edge framework was developed in which a large database of Kr and Pc curves was generated. This database was used to perform thousands of coreflood simulation runs representing oil-water drainage steady-state experiments. The results obtained from these simulation runs, mainly pressure drop along with other conventional core analysis data, were utilized to estimate Kr curves based on Darcy's law. These analytically estimated Kr curves, along with the previously generated Pc curves, were fed as features into the machine learning model. The entire data set was split into 80% for training and 20% for testing. A k-fold cross-validation technique was applied to increase the model accuracy by splitting the 80% training portion into 10 folds. In this manner, for each of the 10 experiments, 9 folds were used for training and the remaining one was used for model validation. Once trained and validated, the model was subjected to blind testing on the remaining 20% of the data set. The machine learning model learns to capture fluid flow behavior inside the core from the training dataset. The trained/tested model was thereby employed to estimate Kr curves based on available experimental results. The performance of the developed model was assessed using the coefficient of determination (R2) along with the loss calculated during training/validation of the model. The respective cross plots, along with comparisons of ground-truth versus AI-predicted curves, indicate that the model is capable of making accurate predictions, with error percentages between 0.2 and 0.6% on history-matching experimental data for all three tested ML techniques (XGB, DNN, and RNN). This implies that the AI-based model exhibits better efficiency and reliability in determining Kr curves when compared to conventional methods. The results also include a comparison between classical machine learning approaches and shallow and deep neural networks in terms of accuracy in predicting the final Kr curves. The various models discussed in this research work currently focus on the prediction of Kr curves for drainage steady-state experiments; however, the work can be extended to capture the imbibition cycle as well.
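A minimal sketch of the training protocol described (80/20 split, 10-fold cross-validation on the training portion, blind testing on the held-out 20%, with gradient boosting as the regressor) is shown below. The synthetic features and target are placeholders standing in for the simulated coreflood data, not the paper's database.

```python
# Minimal sketch: 80/20 split, 10-fold CV on the training data, and blind
# testing with an XGBoost regressor as a stand-in for the Kr estimator.
import numpy as np
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.random((1000, 5))                       # e.g. pressure drop, rates, saturations (assumed)
y = X[:, 0] ** 2 + 0.5 * X[:, 1] + 0.05 * rng.standard_normal(1000)  # stand-in Kr value

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
cv_r2 = cross_val_score(model, X_train, y_train, cv=10, scoring="r2")
print("10-fold CV R2 on training data:", cv_r2.mean())

model.fit(X_train, y_train)
print("blind-test R2:", r2_score(y_test, model.predict(X_test)))
```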


2020 ◽  
Vol 11 (1) ◽  
pp. 223
Author(s):  
Minsoo Kim ◽  
Sarang Yi ◽  
Seokmoo Hong

Since the pipes used for water supply are thin and difficult to fasten by welding or screws, they are fastened by a crimping joint method using a metal ring and a rubber ring. In the conventional crimping joint method, the metal ring and the rubber ring are arranged side by side; however, if water leaks past the rubber ring, the adjacent metal ring corrodes rapidly. In this study, to delay and minimize the corrosion of connected water pipes, we propose a spaced crimping joint method in which the metal rings and rubber rings are separated at appropriate intervals. This not only improves the contact performance between the connected water pipes but also minimizes the load applied to the crimping jig during crimping, preventing damage to the jig. For this, finite element analyses were performed for the crimp tool and process analysis, and the design parameters were set as the curling length at the top of the joint, the distance between the metal rings and rubber rings, and the crimp jig radius. Through FEA of 100 cases, training data for machine learning were acquired. The training data were then used to train a machine learning model, which was compared with a regression model to verify its performance (see the sketch after this abstract). When the number of training samples is small, the two methods perform similarly; however, the larger the number of training samples, the higher the prediction accuracy of the machine learning model. Finally, the spaced crimping joint with the derived optimal shape was manufactured, and the maximum pressure and pressure distribution applied during compression were obtained using a pressure film. These values are very close to those obtained by finite element analysis under the same conditions, and through this the validity of the approach proposed in this study was verified.
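The comparison mentioned above (a machine learning model versus a classical regression model fit to roughly 100 FEA cases mapping design parameters to maximum crimping pressure) could be sketched as follows; the parameter ranges, target relation and model choices are assumptions for illustration only.

```python
# Minimal sketch: compare a machine learning regressor against a polynomial
# regression model on ~100 synthetic FEA-style samples of design parameters
# (curling length, ring spacing, jig radius) -> maximum crimping pressure.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform([2.0, 1.0, 5.0], [6.0, 5.0, 12.0], size=(100, 3))  # design parameters (assumed)
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + 0.4 * X[:, 2] ** 2 + rng.normal(0, 0.5, 100)

ml_model = GradientBoostingRegressor(random_state=0)
regression_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())

for name, model in [("machine learning", ml_model), ("regression", regression_model)]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name} model: mean CV R2 = {r2:.3f}")
```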


2021 ◽  
Author(s):  
Naimahmed Nesaragi ◽  
Shivnarayan Patidar

Early identification of individuals with sepsis is very useful in assisting clinical triage and decision-making, resulting in early intervention and improved outcomes. This study aims to develop an explainable machine learning model with clinical interpretability that predicts sepsis onset 6 hours in advance and validates it with improved prediction risk power for every time interval since admission to the ICU. The retrospective observational cohort study is carried out using PhysioNet Challenge 2019 ICU data from three distinct hospital systems, viz. A, B, and C. Data from A and B were shared publicly for training and validation, while sequestered data from all three cohorts were used for scoring; however, this study is limited to the publicly available training data. The training data contain 1,552,210 patient records of 40,336 ICU patients with up to 40 clinical variables (recorded for each hour of their ICU stay), divided into two datasets based on hospital systems A and B. The clinical feature exploration and interpretation for early prediction of sepsis is achieved using the proposed framework, viz. the explainable Machine Learning model for Early Prediction of Sepsis (xMLEPS). A total of 85 features, comprising the given 40 clinical variables augmented with 10 derived physiological features and 35 time-lag difference features, are fed to xMLEPS for the said prediction task of sepsis onset. A ten-fold cross-validation scheme is employed in which an optimal prediction risk threshold is searched for each of the 10 LightGBM models. These optimal threshold values are later used by the corresponding models to refine the predictive power, in terms of utility score, for the prediction of labels in each fold. The entire framework is designed via Bayesian optimization and trained with the resultant feature set of 85 features, yielding an average normalized utility score of 0.4214 and an area under the receiver operating characteristic curve of 0.8591 on the publicly available training data. This study establishes a practical and explainable sepsis onset prediction model for ICU data using an applied ML approach, mainly gradient boosting. The study highlights the clinical significance of physiological inter-relations among the given and proposed clinical signs via feature importance and SHapley Additive exPlanations (SHAP) plots for visualized interpretation.
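The modelling core described above, a LightGBM classifier with a tuned decision threshold and SHAP values for interpretation, could be sketched as below. The synthetic features, labels and simple accuracy-based threshold search are assumptions; they are not the xMLEPS feature set or the challenge's utility score.

```python
# Minimal sketch: LightGBM on hourly ICU-style features, a per-fold-style
# threshold search on a validation split, and SHAP values for interpretation.
import numpy as np
import lightgbm as lgb
import shap
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 10))                      # stand-in for the 85 clinical features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1, 5000) > 1.5).astype(int)  # rare positive label

X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.2, random_state=0)

model = lgb.LGBMClassifier(n_estimators=300, learning_rate=0.05)
model.fit(X_train, y_train)

proba = model.predict_proba(X_valid)[:, 1]
print("validation AUROC:", roc_auc_score(y_valid, proba))

# Search a decision threshold on the validation split (the paper tunes one per fold,
# against the challenge utility score rather than plain accuracy).
thresholds = np.linspace(0.05, 0.95, 19)
best = max(thresholds, key=lambda t: ((proba >= t) == y_valid).mean())
print("selected threshold:", best)

# SHAP values for interpreting per-feature contributions.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_valid)
print("SHAP matrix shape:", np.shape(shap_values))
```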


Author(s):  
Osval Antonio Montesinos López ◽  
Abelardo Montesinos López ◽  
Jose Crossa

Abstract The overfitting phenomenon happens when a statistical machine learning model learns the noise as well as the signal present in the training data. On the other hand, underfitting occurs when only a few predictors are included in the statistical machine learning model, so that the complete structure of the data pattern is represented poorly. This problem also arises when the training data set is too small; an underfitted model then does a poor job of fitting the training data and predicts new data points unsatisfactorily. This chapter describes the importance of the trade-off between prediction accuracy and model interpretability, as well as the difference between explanatory and predictive modeling: explanatory modeling minimizes bias, whereas predictive modeling seeks to minimize the combination of bias and estimation variance. We assess the importance and different methods of cross-validation, as well as the importance and strategies of tuning, which are key to the successful use of some statistical machine learning methods. We explain the most important metrics for evaluating the prediction performance for continuous, binary, categorical, and count response variables.
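The under/overfitting trade-off and the role of cross-validation discussed in this chapter can be illustrated with a short sketch: polynomial models of increasing degree fit the same noisy data, and cross-validation exposes the model that merely memorizes noise. The data and chosen degrees are assumptions for illustration.

```python
# Minimal sketch: training score vs cross-validated score for polynomial
# models of increasing flexibility (underfit -> reasonable -> overfit).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.2, 100)   # signal plus noise

for degree in (1, 4, 15):                        # underfit, reasonable fit, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    train_r2 = model.fit(X, y).score(X, y)
    cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"degree {degree:2d}: train R2 = {train_r2:.3f}, 5-fold CV R2 = {cv_r2:.3f}")
```

A large gap between the training score and the cross-validated score is the practical signature of overfitting that the chapter's tuning and validation strategies are designed to detect.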

