Integrating a Low-Cost Electronic Nose and Machine Learning Modelling to Assess Coffee Aroma Profile and Intensity

Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2016
Author(s):  
Claudia Gonzalez Viejo ◽  
Eden Tongson ◽  
Sigfredo Fuentes

Aroma is one of the main attributes that consumers consider when appreciating and selecting a coffee; hence, it is considered an important quality trait. However, the most common methods to assess aroma are based on expensive equipment or on human senses through sensory evaluation, which is time-consuming and requires highly trained assessors to avoid subjectivity. Therefore, this study aimed to estimate coffee intensity and aromas using a low-cost, portable electronic nose (e-nose) and machine learning modeling. For this purpose, triplicates of six commercial coffee samples with different intensity levels were used. Two machine learning models were developed based on artificial neural networks using the data from the e-nose as inputs to (i) classify the samples into low, medium, and high intensity (Model 1) and (ii) predict the relative abundance of 45 different aromas (Model 2). Results showed that it is possible to estimate the intensity of coffees with high accuracy (98%; Model 1), as well as to predict the specific aromas with a high correlation coefficient (R = 0.99), and no under- or over-fitting of the models was detected. The proposed contactless, nondestructive, rapid, reliable, and low-cost method was shown to be effective in evaluating volatile compounds in coffee, making it a potential technique to be applied at all stages of the production process to detect undesirable characteristics on time and ensure high-quality products.

Author(s):  
Pratyush Kaware

In this paper, a cost-effective sensor has been implemented to read finger bend signals by attaching the sensor to a finger, so as to classify the signals based on the degree of bend as well as the joint about which the finger was being bent. This was done by testing various machine learning algorithms to find the most accurate and consistent classifier. Finally, we found that the Support Vector Machine was the algorithm best suited to classify our data; using it, we were able to predict the live state of a finger, i.e., the degree of bend and the joints involved. The live voltage values from the sensor were transmitted using a NodeMCU micro-controller, converted to digital values, and uploaded to a database for analysis.


2021 ◽  
Author(s):  
Siddharth Ghule ◽  
Sayan Bagchi ◽  
Kumar Vanka

Electricity generation is a major contributing factor for greenhouse gas emissions. Energy storage systems available today have a combined capacity to store less than 1% of the electricity being consumed worldwide. Redox Flow Batteries (RFBs) are promising candidates for green and efficient energy storage systems. RFBs are being used in renewable energy systems, but their widespread adoption is limited due to high production costs and toxicity associated with the transition-metal-based redox-active species. Therefore, cheaper and greener alternative organic redox-active species are being investigated. Recent reports have shown that organic molecules based on phenazine are promising candidates for redox-active species in RFBs. However, the large number of available organic compounds makes the conventional experimental and DFT methods impractical for screening thousands of molecules in a reasonable amount of time. In contrast, machine-learning models have low development time, short prediction time, and high accuracy; thus, they are being heavily investigated for virtual screening applications. In this work, we developed machine-learning models to predict the redox potential of phenazine derivatives in DME solvent using a small dataset of 185 molecules. 2D, 3D, and Molecular Fingerprint features were computed using readily available and easy-to-use Python libraries, making our approach easily adaptable to similar work. Twenty linear and non-linear machine-learning models were investigated in this work. These models achieved excellent performance on the unseen data (i.e., R² > 0.98, MSE < 0.008 V², and MAE < 0.07 V). Model performance was assessed in a consistent manner using the training and evaluation pipeline developed in this work. We showed that 2D molecular features are the most informative and achieve the best prediction accuracy among the four feature sets. We also showed that the often less preferred but relatively faster linear models could perform better than non-linear models when the feature set contains different types of features (i.e., 2D, 3D, and Molecular Fingerprints). Further investigations revealed that it is possible to reduce the training and inference time without sacrificing prediction accuracy by using a small subset of features. Moreover, the models were able to predict the previously reported promising redox-active compounds with high accuracy. Also, significantly low prediction errors were observed for the functional groups. Although some functional groups had only one compound in the training set, the best-performing models could achieve errors (MAPE) of less than 10%. The major source of error was a lack of data near zero and in the positive region. Therefore, this work shows that it is possible to develop accurate machine-learning models that could potentially screen millions of compounds in a short amount of time with a small training set and a limited number of easy-to-compute features. Thus, the results obtained in this report would help in the adoption of green energy by accelerating the field of materials discovery for energy storage applications.
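As a reading aid for the metrics quoted in the abstract above (R², MSE, MAE, and MAPE), the following is a minimal sketch of how such scores are conventionally computed from true and predicted values. It is not the authors' evaluation pipeline; the function name `regression_metrics` and the example potentials are hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute R^2, MSE, MAE and MAPE (in percent) for paired values.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        # MAPE divides by the true value, so it is undefined at zero.
        raise ValueError("MAPE is undefined when a true value is zero")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)          # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mse": ss_res / n,
        "mae": sum(abs(r) for r in residuals) / n,
        "mape": 100.0 * sum(abs(r / t) for r, t in zip(residuals, y_true)) / n,
    }


# Made-up redox potentials in volts, purely illustrative.
example_true = [-1.0, -0.5, -2.0, -1.5]
example_pred = [-1.1, -0.4, -1.9, -1.6]
example_scores = regression_metrics(example_true, example_pred)
```

Because MAPE divides each error by the true value, it grows quickly for targets close to zero, which is one reason a percentage error is usually reported alongside absolute measures such as MAE.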


PLoS ONE ◽  
2021 ◽  
Vol 16 (4) ◽  
pp. e0249285
Author(s):  
Limin Yu ◽  
Alexandra Halalau ◽  
Bhavinkumar Dalal ◽  
Amr E. Abbas ◽  
Felicia Ivascu ◽  
...  

Background: The Coronavirus disease 2019 (COVID-19) pandemic has affected millions of people across the globe. It is associated with a high mortality rate and has created a global crisis by straining medical resources worldwide. Objectives: To develop and validate machine-learning models for prediction of mechanical ventilation (MV) for patients presenting to the emergency room and for prediction of in-hospital mortality once a patient is admitted. Methods: Two cohorts were used for the two different aims. A total of 1980 COVID-19 patients were enrolled for the aim of predicting MV. Data from 1036 patients, including demographics, past smoking and drinking history, past medical history, vital signs at the emergency room (ER), laboratory values, and treatments, were collected for training, and 674 patients were enrolled for validation, using the XGBoost algorithm. For the second aim, to predict in-hospital mortality, 3491 patients hospitalized via the ER were enrolled. CatBoost, a new gradient-boosting algorithm, was applied for training and validation of the cohort. Results: Older age, higher temperature, increased respiratory rate (RR), and lower oxygen saturation (SpO2) from the first set of vital signs were associated with an increased risk of MV amongst the 1980 patients in the ER. The model had a high accuracy of 86.2% and a negative predictive value (NPV) of 87.8%. Meanwhile, the need for MV, a higher RR, body mass index (BMI), and a longer length of stay in the hospital were the major features associated with in-hospital mortality. The second model had a high accuracy of 80% with an NPV of 81.6%. Conclusion: Machine learning models using the XGBoost and CatBoost algorithms can predict the need for mechanical ventilation and mortality with very high accuracy in COVID-19 patients.
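The accuracy and negative predictive value (NPV) figures reported above are both derived from a binary confusion matrix. The sketch below shows the standard definitions; the function names and the counts are hypothetical, chosen for illustration, and are not the study's data.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that were correct."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def negative_predictive_value(tn: int, fn: int) -> float:
    """Fraction of negative predictions that were truly negative."""
    if tn + fn == 0:
        raise ValueError("no negative predictions were made")
    return tn / (tn + fn)


# Hypothetical counts (not from the paper): 200 patients in total.
tp, tn, fp, fn = 40, 130, 12, 18
example_accuracy = accuracy(tp, tn, fp, fn)
example_npv = negative_predictive_value(tn, fn)
```

A high NPV means that a patient predicted not to need ventilation is unlikely to need it, which is the property that matters when such a model is used to rule patients out.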


2018 ◽  
Vol 20 (47) ◽  
pp. 30006-30020 ◽  
Author(s):  
Wenwen Li ◽  
Yasunobu Ando

Recently, the machine learning (ML) force field has emerged as a powerful atomic simulation approach because of its high accuracy and low computational cost.


Processes ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 224 ◽  
Author(s):  
Sami Sader ◽  
István Husti ◽  
Miklós Daróczi

In this paper, multiclass classification is used to develop a novel approach to enhance failure mode and effects analysis and the generation of the risk priority number. This is done by developing four machine learning models using auto machine learning. Failure mode and effects analysis is a technique used in industry to identify possible failures that may occur and the effects of these failures on the system. The risk priority number is a numeric value calculated by multiplying three associated parameters, namely severity, occurrence, and detectability. The value of the risk priority number determines the next actions to be taken. A dataset that includes a one-year registry of 1532 failures with their description, severity, occurrence, and detectability is used to develop four models to predict the values of severity, occurrence, and detectability. The resulting models are evaluated using 10% of the dataset. Evaluation results show that the proposed models have high accuracy, with the average values of precision, recall, and F1 score in the ranges of 86.6–93.2%, 67.9–87.9%, and 0.892–0.765%, respectively. The proposed work helps in carrying out failure mode and effects analysis more efficiently than conventional techniques.
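The risk priority number defined above is a plain product of three ratings, which a short sketch can make concrete. The 1 to 10 integer scale, the `FailureMode` class, and the example failure modes are assumptions made for illustration; the abstract does not state the rating scale or list individual failures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One failure registry entry (hypothetical structure)."""
    description: str
    severity: int
    occurrence: int
    detectability: int


def risk_priority_number(fm: FailureMode) -> int:
    """RPN = severity x occurrence x detectability.

    Assumes each rating is an integer from 1 to 10, a common FMEA
    convention that is not confirmed by the abstract.
    """
    for name in ("severity", "occurrence", "detectability"):
        value = getattr(fm, name)
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return fm.severity * fm.occurrence * fm.detectability


# Made-up failure modes, ranked so the highest RPN is handled first.
failures = [
    FailureMode("seal leak", severity=7, occurrence=4, detectability=3),
    FailureMode("sensor drift", severity=5, occurrence=6, detectability=8),
    FailureMode("loose connector", severity=3, occurrence=2, detectability=2),
]
ranked = sorted(failures, key=risk_priority_number, reverse=True)
```

In the approach described by the abstract, the three ratings would come from the trained classifiers rather than from manual scoring; the multiplication and ranking step stays the same.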


2020 ◽  
Author(s):  
Maleeha Naseem ◽  
Hajra Arshad ◽  
Syeda Amrah Hashimi ◽  
Furqan Irfan ◽  
Fahad Shabbir Ahmed

Background: The second wave of the COVID-19 pandemic is anticipated to be worse than the initial one and will strain healthcare systems even more during the winter months. Our aim was to develop a machine learning-based model to predict mortality using the deep learning Neo-V framework. We hypothesized that this novel machine learning approach could be applied to COVID-19 patients to predict mortality successfully with high accuracy. Methods: The current Deep-Neo-V model is built on our previous statistically rigorous machine learning framework [Fahad-Liaqat-Ahmad Intensive Machine (FLAIM) framework], which evaluated statistically significant risk factors, generated new combined variables, and then supplied these risk factors to a deep neural network to predict mortality in RT-PCR positive COVID-19 patients in the inpatient setting. We analyzed adult patients (≥18 years) admitted to the Aga Khan University Hospital, Pakistan with a working diagnosis of COVID-19 infection (n=1228). We excluded patients who tested negative for COVID-19 on RT-PCR or had incomplete or missing health records. The first-phase selection of risk factors was done using Cox-regression univariate and multivariate analyses. In the second phase, we generated new variables and tested those statistically significant for mortality, and in the third and final phase we applied deep neural networks and other traditional machine learning models such as the Decision Tree Model, k-nearest neighbor models, and others. Results: A total of 1228 cases were diagnosed as COVID-19 infection; we excluded 14 patients under the exclusion criteria, and 1214 patients were analyzed. We observed that several clinical and laboratory-based variables were statistically significant in both univariate and multivariate analyses while others were not. The most significant were septic shock (hazard ratio [HR], 4.30; 95% confidence interval [CI], 2.91-6.37), supportive treatment (HR, 3.51; 95% CI, 2.01-6.14), abnormal international normalized ratio (INR) (HR, 3.24; 95% CI, 2.28-4.63), admission to the intensive care unit (ICU) (HR, 3.24; 95% CI, 2.22-4.74), treatment with invasive ventilation (HR, 3.21; 95% CI, 2.15-4.79), and laboratory lymphocytic derangement (HR, 2.79; 95% CI, 1.6-4.86). Machine learning results showed that our DNN (Neo-V) model outperformed all conventional machine learning models, with test set accuracy of 99.53%, sensitivity of 89.87%, and specificity of 95.63%; positive predictive value, 50.00%; negative predictive value, 91.05%; and area under the curve of the receiver-operator curve of 88.5. Conclusion: Our novel Deep-Neo-V model outperformed all other machine learning models. The model is easy to implement, user-friendly, and highly accurate.

