Optimising an FFQ Using a Machine Learning Pipeline to teach an Efficient Nutrient Intake Predictive Model

Food frequency questionnaires (FFQs) are the most commonly selected tools in nutrition monitoring, as they are inexpensive, easily implemented and provide useful information regarding dietary intake. They are usually carefully drafted by experts from nutritional and/or medical fields and can be validated by using other dietary monitoring techniques. FFQs can get very extensive, which could indicate that some of the questions are less significant than others and could be omitted without losing too much information. In this paper, machine learning is used to explore how reducing the number of questions affects the predicted nutrient values and diet quality score. The paper addresses the problem of removing redundant questions and finding the best subset of questions in the Extended Short Form Food Frequency Questionnaire (ESFFFQ), developed as part of the H2020 project WellCo. Eight common machine-learning algorithms were compared on different subsets of questions by using the PROMETHEE method, which compares methods and subsets via multiple performance measures. According to the results, for some of the targets, specifically sugar intake, fiber intake and protein intake, a smaller subset of questions are sufficient to predict diet quality scores. Additionally, for smaller subsets of questions, machine-learning algorithms generally perform better than statistical methods for predicting intake and diet quality scores. The proposed method could therefore be useful for finding the most informative subsets of questions in other FFQs as well. This could help experts develop FFQs that provide the necessary information and are not overbearing for those answering.

Download Full-text

Algorithmic and human prediction of success in human collaboration from visual features

Scientific Reports ◽

10.1038/s41598-021-81145-3 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Martin Saveski ◽

Edmond Awad ◽

Iyad Rahwan ◽

Manuel Cebrian

Keyword(s):

Machine Learning ◽

Visual Cues ◽

Success Factors ◽

Group Performance ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Adventure Game ◽

Group Success ◽

The Relationship ◽

Better Than

AbstractAs groups are increasingly taking over individual experts in many tasks, it is ever more important to understand the determinants of group success. In this paper, we study the patterns of group success in Escape The Room, a physical adventure game in which a group is tasked with escaping a maze by collectively solving a series of puzzles. We investigate (1) the characteristics of successful groups, and (2) how accurately humans and machines can spot them from a group photo. The relationship between these two questions is based on the hypothesis that the characteristics of successful groups are encoded by features that can be spotted in their photo. We analyze >43K group photos (one photo per group) taken after groups have completed the game—from which all explicit performance-signaling information has been removed. First, we find that groups that are larger, older and more gender but less age diverse are significantly more likely to escape. Second, we compare humans and off-the-shelf machine learning algorithms at predicting whether a group escaped or not based on the completion photo. We find that individual guesses by humans achieve 58.3% accuracy, better than random, but worse than machines which display 71.6% accuracy. When humans are trained to guess by observing only four labeled photos, their accuracy increases to 64%. However, training humans on more labeled examples (eight or twelve) leads to a slight, but statistically insignificant improvement in accuracy (67.4%). Humans in the best training condition perform on par with two, but worse than three out of the five machine learning algorithms we evaluated. Our work illustrates the potentials and the limitations of machine learning systems in evaluating group performance and identifying success factors based on sparse visual cues.

Download Full-text

Application of machine learning algorithms to predict permeability in tight sandstone formations

Nafta-Gaz ◽

10.18668/ng.2021.05.01 ◽

2021 ◽

Vol 77 (5) ◽

pp. 283-292

Author(s):

Tomasz Topór ◽

Keyword(s):

Machine Learning ◽

Oil And Gas ◽

Learning Algorithms ◽

Confining Pressure ◽

Machine Learning Algorithms ◽

Core Material ◽

Confining Stress ◽

Tight Sandstone ◽

Oil And Gas Exploration ◽

Better Than

The application of machine learning algorithms in petroleum geology has opened a new chapter in oil and gas exploration. Machine learning algorithms have been successfully used to predict crucial petrophysical properties when characterizing reservoirs. This study utilizes the concept of machine learning to predict permeability under confining stress conditions for samples from tight sandstone formations. The models were constructed using two machine learning algorithms of varying complexity (multiple linear regression [MLR] and random forests [RF]) and trained on a dataset that combined basic well information, basic petrophysical data, and rock type from a visual inspection of the core material. The RF algorithm underwent feature engineering to increase the number of predictors in the models. In order to check the training models’ robustness, 10-fold cross-validation was performed. The MLR and RF applications demonstrated that both algorithms can accurately predict permeability under constant confining pressure (R2 0.800 vs. 0.834). The RF accuracy was about 3% better than that of the MLR and about 6% better than the linear reference regression (LR) that utilized only porosity. Porosity was the most influential feature of the models’ performance. In the case of RF, the depth was also significant in the permeability predictions, which could be evidence of hidden interactions between the variables of porosity and depth. The local interpretation revealed the common features among outliers. Both the training and testing sets had moderate-low porosity (3–10%) and a lack of fractures. In the test set, calcite or quartz cementation also led to poor permeability predictions. The workflow that utilizes the tidymodels concept will be further applied in more complex examples to predict spatial petrophysical features from seismic attributes using various machine learning algorithms.

Download Full-text

Novel Privacy Preserving Non-Invasive Sensing-Based Diagnoses of Pneumonia Disease Leveraging Deep Network Model

Sensors ◽

10.3390/s22020461 ◽

2022 ◽

Vol 22 (2) ◽

pp. 461

Author(s):

Mujeeb Ur Rehman ◽

Arslan Shafique ◽

Kashif Hesham Khan ◽

Sohail Khalid ◽

Abdullah Alhumaidi Alotaibi ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Medical Records ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

X Ray ◽

Non Invasive ◽

Proposed Model ◽

Pneumonia Diagnosis ◽

Better Than

This article presents non-invasive sensing-based diagnoses of pneumonia disease, exploiting a deep learning model to make the technique non-invasive coupled with security preservation. Sensing and securing healthcare and medical images such as X-rays that can be used to diagnose viral diseases such as pneumonia is a challenging task for researchers. In the past few years, patients’ medical records have been shared using various wireless technologies. The wireless transmitted data are prone to attacks, resulting in the misuse of patients’ medical records. Therefore, it is important to secure medical data, which are in the form of images. The proposed work is divided into two sections: in the first section, primary data in the form of images are encrypted using the proposed technique based on chaos and convolution neural network. Furthermore, multiple chaotic maps are incorporated to create a random number generator, and the generated random sequence is used for pixel permutation and substitution. In the second part of the proposed work, a new technique for pneumonia diagnosis using deep learning, in which X-ray images are used as a dataset, is proposed. Several physiological features such as cough, fever, chest pain, flu, low energy, sweating, shaking, chills, shortness of breath, fatigue, loss of appetite, and headache and statistical features such as entropy, correlation, contrast dissimilarity, etc., are extracted from the X-ray images for the pneumonia diagnosis. Moreover, machine learning algorithms such as support vector machines, decision trees, random forests, and naive Bayes are also implemented for the proposed model and compared with the proposed CNN-based model. Furthermore, to improve the CNN-based proposed model, transfer learning and fine tuning are also incorporated. It is found that CNN performs better than other machine learning algorithms as the accuracy of the proposed work when using naive Bayes and CNN is 89% and 97%, respectively, which is also greater than the average accuracy of the existing schemes, which is 90%. Further, K-fold analysis and voting techniques are also incorporated to improve the accuracy of the proposed model. Different metrics such as entropy, correlation, contrast, and energy are used to gauge the performance of the proposed encryption technology, while precision, recall, F1 score, and support are used to evaluate the effectiveness of the proposed machine learning-based model for pneumonia diagnosis. The entropy and correlation of the proposed work are 7.999 and 0.0001, respectively, which reflects that the proposed encryption algorithm offers a higher security of the digital data. Moreover, a detailed comparison with the existing work is also made and reveals that both the proposed models work better than the existing work.

Download Full-text

Comparing Machine Learning Algorithms for Predicting ICU Admission and Mortality in COVID-19

10.1101/2020.11.20.20235598 ◽

2020 ◽

Author(s):

Sonu Subudhi ◽

Ashish Verma ◽

Ankit B. Patel ◽

C. Corey Hardin ◽

Melin J. Khandekar ◽

...

Keyword(s):

Machine Learning ◽

Clinical Decision Making ◽

Learning Algorithms ◽

Disease Outbreaks ◽

Clinical Decision ◽

Machine Learning Algorithms ◽

Healthcare Database ◽

Icu Admission ◽

Infectious Disease Outbreaks ◽

Better Than

AbstractAs predicting the trajectory of COVID-19 disease is challenging, machine learning models could assist physicians determine high-risk individuals. This study compares the performance of 18 machine learning algorithms for predicting ICU admission and mortality among COVID-19 patients. Using COVID-19 patient data from the Mass General Brigham (MGB) healthcare database, we developed and internally validated models using patients presenting to Emergency Department (ED) between March-April 2020 (n = 1144) and externally validated them using those individuals who encountered ED between May-August 2020 (n = 334). We show that ensemble-based models perform better than other model types at predicting both 5-day ICU admission and 28-day mortality from COVID-19. CRP, LDH, and procalcitonin levels were important for ICU admission models whereas eGFR <60 ml/min/1.73m2, ventilator use, and potassium levels were the most important variables for predicting mortality. Implementing such models would help in clinical decision-making for future COVID-19 and other infectious disease outbreaks.

Download Full-text

Proposed approach to detect distributed denial of service attacks in software defined network using machine learning algorithms

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.8.10488 ◽

2018 ◽

Vol 7 (2.8) ◽

pp. 472 ◽

Cited By ~ 1

Author(s):

Shruti Banerjee ◽

Partha Sarathi Chakraborty ◽

. .

Keyword(s):

Machine Learning ◽

Denial Of Service ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Paper Machine ◽

Software Defined Network ◽

Distributed Denial Of Service ◽

Control Plane ◽

Data Plane ◽

And Control

SDN (Software Defined Network) is rapidly gaining importance of ‘programmable network’ infrastructure. The SDN architecture separates the Data plane (forwarding devices) and Control plane (controller of the SDN). This makes it easy to deploy new versions to the infrastructure and provides straightforward network virtualization. Distributed Denial-of-Service attack is a major cyber security threat to the SDN. It is equally vulnerable to both data plane and control plane. In this paper, machine learning algorithms such as Naïve Bayesian, KNN, K Means, K-Medoids, Linear Regression, use to classify the incoming traffic as usual or unusual. Above mentioned algorithms are measured using the two metrics: accuracy and detection rate. The best fit algorithm is applied to implement the signature IDS which forms the module 1 of the proposed IDS. Second Module uses open connections to state the exact node which is an attacker and to block that particular IP address by placing it in Access Control List (ACL), thus increasing the processing speed of SDN as a whole.

Download Full-text

Is Predicting Software Security Bugs Using Deep Learning Better Than the Traditional Machine Learning Algorithms?

2018 IEEE International Conference on Software Quality, Reliability and Security (QRS) ◽

10.1109/qrs.2018.00023 ◽

2018 ◽

Cited By ~ 1

Author(s):

Caesar Jude Clemente ◽

Fehmi Jaafar ◽

Yasir Malik

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Software Security ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Better Than

Download Full-text

A systematic review of the machine learning algorithms for the computational analysis in different domains

International Journal of Advanced Technology and Engineering Exploration ◽

10.19101/ijatee.2020.762057 ◽

2020 ◽

Vol 7 (71) ◽

pp. 147-164

Author(s):

Ravita Chahar ◽

Deepinder Kaur

Keyword(s):

Machine Learning ◽

Computational Analysis ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Second Phase ◽

Paper Machine ◽

K Nearest Neighbors ◽

Systematic Analysis

In this paper machine learning algorithms have been discussed and analyzed. It has been discussed considering computational aspects in different domains. These algorithms have the capability of building mathematical and analytical model. These models may be helpful in the decision-making process. This paper elaborates the computational analysis in three different ways. The background and analytical aspect have been presented with the learning application in the first phase. In the second phase detail literature has been explored along with the pros and cons of the applied techniques in different domains. Based on the literatures, gap identification and the limitations have been discussed and highlighted in the third phase. Finally, computational analysis has been presented along with the machine learning results in terms of accuracy. The results mainly focus on the exploratory data analysis, domain applicability and the predictive problems. Our systematic analysis shows that the applicability of machine learning is wide and the results may be improved based on these algorithms. It is also inferred from the literature analysis that at the applicability of machine learning algorithm has the capability in the performance improvement. The main methods discussed here are classification and regression trees (CART), logistic regression, naïve Bayes (NB), k-nearest neighbors (KNN), support vector machine (SVM) and decision tree (DT). The domain covered mainly are disease detection, business intelligence, industry automation and sentiment analysis.

Download Full-text

Phishing websites blacklisting using machine learning algorithms

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i1.7.10646 ◽

2018 ◽

Vol 7 (1.7) ◽

pp. 179

Author(s):

Nivedhitha G ◽

Carmel Mary Belinda M.J ◽

Rupavathy N

Keyword(s):

Machine Learning ◽

Data Mining ◽

Feature Extraction ◽

Learning Algorithms ◽

Source Code ◽

Machine Learning Algorithms ◽

Paper Machine ◽

Data Mining Algorithms ◽

Mining Algorithms ◽

The Web

The development of the phishing sites is by all accounts amazing. Despite the fact that the web clients know about these sorts of phishing assaults, part of clients move toward becoming casualty to these assaults. Quantities of assaults are propelled with the point of making web clients trust that they are speaking with a trusted entity. Phishing is one among them. Phishing is consistently developing since it is anything but difficult to duplicate a whole site utilizing the HTML source code. By rolling out slight improvements in the source code, it is conceivable to guide the victim to the phishing site. Phishers utilize part of strategies to draw the unsuspected web client. Consequently an efficient mechanism is required to recognize the phishing sites from the real sites keeping in mind the end goal to spare credential data. To detect the phishing websites and to identify it as information leaking sites, the system proposes data mining algorithms. In this paper, machine-learning algorithms have been utilized for modeling the prediction task. The process of identity extraction and feature extraction are discussed in this paper and the various experiments carried out to discover the performance of the models are demonstrated.

Download Full-text

Detection of diabetes mellitus using machine learning algorithms

International Journal of Research in Pharmaceutical Sciences ◽

10.26452/ijrps.v11i4.3662 ◽

2020 ◽

Vol 11 (4) ◽

pp. 6881-6887

Author(s):

Ramya. G. Franklin ◽

Muthukumar B

Keyword(s):

Diabetes Mellitus ◽

Machine Learning ◽

Learning Algorithms ◽

Preventive Measure ◽

Heterogeneous Data ◽

Machine Learning Algorithms ◽

Health Issues ◽

Health Care Data ◽

Diagnosis Of Diabetes Mellitus ◽

Better Than

The growth of technology has brought in sophistication in our day to day activities. This sophistication has brought in many health issues. One among the most important problems that has currently become a typical issue is Diabetics Mellitus. Diabetes Mellitus has affected over 246 million people worldwide with a majority of them being women. The WHO reports that by 2025 this number is expected to rise to over 380 million. Prevention is better than cure. The health care data is large and complex. It is better to predict the disease at an earlier stage which may save the life and also have a preventive measure in controlling the diseases. In this paper, we have taken up a heterogeneous data to analyze the various factors which are affecting this disease. The various machine learning algorithms used in this paper help us to decide the attributes which play a major role in diagnosis of Diabetes Mellitus.

Download Full-text

Credit Card Fraud Detection System

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b6258.0710221 ◽

2021 ◽

Vol 10 (2) ◽

pp. 158-162

Author(s):

Kartik Madkaikar ◽

◽

Manthan Nagvekar ◽

Preity Parab ◽

Riya Raika ◽

...

Keyword(s):

Machine Learning ◽

Financial Institutions ◽

Credit Card ◽

Detection System ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Paper Machine ◽

Credit Card Fraud

Credit card fraud is a serious criminal offense. It costs individuals and financial institutions billions of dollars annually. According to the reports of the Federal Trade Commission (FTC), a consumer protection agency, the number of theft reports doubled in the last two years. It makes the detection and prevention of fraudulent activities critically important to financial institutions. Machine learning algorithms provide a proactive mechanism to prevent credit card fraud with acceptable accuracy. In this paper Machine Learning algorithms such as Logistic Regression, Naïve Bayes, Random Forest, K- Nearest Neighbor, Gradient Boosting, Support Vector Machine, and Neural Network algorithms are implemented for detection of fraudulent transactions. A comparative analysis of these algorithms is performed to identify an optimal solution.

Download Full-text