Predicting Food Safety Compliance for Informed Food Outlet Inspections: A Machine Learning Approach

Author(s):  
Rachel A. Oldroyd ◽  
Michelle A. Morris ◽  
Mark Birkin

Consumer food environments have transformed dramatically in the last decade. Food outlet prevalence has increased, and people are eating food outside the home more than ever before. Despite these developments, national spending on food control has been reduced. The National Audit Office reports that only 14% of local authorities are up to date with food business inspections, exposing consumers to unknown levels of risk. Given the scarcity of local authority resources, this paper presents a data-driven approach to predict compliance for newly opened businesses and those awaiting repeat inspections. This work capitalizes on the theory that food outlet compliance is a function of its geographic context, namely the characteristics of the neighborhood within which it sits. We explore the utility of three machine learning approaches to predict non-compliant food outlets in England and Wales using openly accessible socio-demographic, business type, and urbanness features at the output area level. We find that the synthetic minority oversampling technique alongside a random forest algorithm with a 1:1 sampling strategy provides the best predictive power. Our final model identifies 84% of all non-compliant outlets in a test set of 92,595 (sensitivity = 0.843, specificity = 0.745, precision = 0.274). The originality of this work lies in its unique methodological approach, which combines machine learning with fine-grained neighborhood data to make robust predictions of compliance.
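
A minimal sketch of the kind of pipeline described above, assuming scikit-learn and imbalanced-learn; the placeholder features and data stand in for the output-area level variables used in the paper and are not the authors' implementation:

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score, precision_score, confusion_matrix

# X: output-area level socio-demographic, business-type and urbanness features (placeholder data)
# y: 1 = non-compliant outlet, 0 = compliant outlet
X, y = np.random.rand(1000, 12), np.random.randint(0, 2, 1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

pipeline = Pipeline([
    ("smote", SMOTE(sampling_strategy=1.0, random_state=42)),   # 1:1 minority:majority ratio
    ("rf", RandomForestClassifier(n_estimators=500, random_state=42)),
])
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)

tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print("sensitivity:", recall_score(y_test, y_pred))   # tp / (tp + fn)
print("specificity:", tn / (tn + fp))
print("precision:  ", precision_score(y_test, y_pred))
```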

2017 ◽  
Author(s):  
Sabrina Jaeger ◽  
Simone Fulle ◽  
Samo Turk

Inspired by natural language processing techniques, we here introduce Mol2vec, an unsupervised machine learning approach to learn vector representations of molecular substructures. Similarly to Word2vec models, where vectors of closely related words lie in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can finally be encoded as vectors by summing up the vectors of their individual substructures, which can, for instance, be fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pre-trained once, yields dense vector representations, and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as a reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment independent and can thus also be easily used for proteins with low sequence similarities.
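
A hedged sketch of the underlying idea using gensim's Word2vec, with toy substructure identifiers standing in for Morgan substructures that would in practice be derived with a cheminformatics toolkit such as RDKit:

```python
import numpy as np
from gensim.models import Word2Vec

# Each "sentence" is one compound written as a sequence of substructure identifiers (toy corpus).
corpus = [
    ["sub_101", "sub_87", "sub_42", "sub_87"],
    ["sub_42", "sub_13", "sub_101"],
    ["sub_87", "sub_13", "sub_55", "sub_42"],
]

model = Word2Vec(corpus, vector_size=100, window=10, min_count=1, sg=1, epochs=20)

def compound_vector(substructures, model):
    """Encode a compound as the sum of its substructure vectors (unseen substructures are skipped)."""
    vecs = [model.wv[s] for s in substructures if s in model.wv]
    return np.sum(vecs, axis=0) if vecs else np.zeros(model.vector_size)

x = compound_vector(["sub_101", "sub_42"], model)  # dense feature vector for downstream models
```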


2019 ◽  
Author(s):  
Oskar Flygare ◽  
Jesper Enander ◽  
Erik Andersson ◽  
Brjánn Ljótsson ◽  
Volen Z Ivanov ◽  
...  

**Background:** Previous attempts to identify predictors of treatment outcomes in body dysmorphic disorder (BDD) have yielded inconsistent findings. One way to increase precision and clinical utility could be to use machine learning methods, which can incorporate multiple non-linear associations in prediction models. **Methods:** This study used a random forests machine learning approach to test whether it is possible to reliably predict remission from BDD in a sample of 88 individuals who had received internet-delivered cognitive behavioral therapy for BDD. The random forest models were compared to traditional logistic regression analyses. **Results:** Random forests correctly identified 78% of participants as remitters or non-remitters at post-treatment. The accuracy of prediction was lower at subsequent follow-ups (68%, 66%, and 61% correctly classified at 3-, 12-, and 24-month follow-ups, respectively). Depressive symptoms, treatment credibility, working alliance, and initial severity of BDD were among the most important predictors at the beginning of treatment. By contrast, the logistic regression models did not identify consistent and strong predictors of remission from BDD. **Conclusions:** The results provide initial support for the clinical utility of machine learning approaches in the prediction of outcomes of patients with BDD. **Trial registration:** ClinicalTrials.gov ID: NCT02010619.
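
An illustrative comparison of a random forest with logistic regression in scikit-learn; the predictor names echo those reported as important, but the data below are placeholders, not the trial data:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "depressive_symptoms": rng.normal(size=88),
    "treatment_credibility": rng.normal(size=88),
    "working_alliance": rng.normal(size=88),
    "baseline_bdd_severity": rng.normal(size=88),
})
y = rng.integers(0, 2, size=88)  # 1 = remitter at post-treatment (placeholder labels)

rf = RandomForestClassifier(n_estimators=1000, random_state=0)
lr = LogisticRegression(max_iter=1000)

print("random forest accuracy:      ", cross_val_score(rf, X, y, cv=5).mean())
print("logistic regression accuracy:", cross_val_score(lr, X, y, cv=5).mean())

# Variable importances, the basis for ranking predictors such as depressive symptoms
rf.fit(X, y)
for name, imp in sorted(zip(X.columns, rf.feature_importances_), key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```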


Author(s):  
Jeffrey G Klann ◽  
Griffin M Weber ◽  
Hossein Estiri ◽  
Bertrand Moal ◽  
Paul Avillach ◽  
...  

Abstract Introduction The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) is an international collaboration addressing COVID-19 with federated analyses of electronic health record (EHR) data. Objective We sought to develop and validate a computable phenotype for COVID-19 severity. Methods Twelve 4CE sites participated. First, we developed an EHR-based severity phenotype consisting of six code classes, and we validated it on patient hospitalization data from the 12 4CE clinical sites against the outcomes of ICU admission and/or death. We also piloted an alternative machine-learning approach and compared selected predictors of severity to the 4CE phenotype at one site. Results The full 4CE severity phenotype had pooled sensitivity of 0.73 and specificity of 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of individual code categories for acuity had high variability, differing by up to 0.65 across sites. At one pilot site, the expert-derived phenotype had mean AUC 0.903 (95% CI: 0.886, 0.921), compared to AUC 0.956 (95% CI: 0.952, 0.959) for the machine-learning approach. Billing codes were poor proxies of ICU admission, with as low as 49% precision and recall compared to chart review. Discussion We developed a severity phenotype using six code classes that proved resilient to coding variability across international institutions. In contrast, machine-learning approaches may overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold-standard outcomes, possibly due to heterogeneous pandemic conditions. Conclusion We developed an EHR-based severity phenotype for COVID-19 in hospitalized patients and validated it at 12 international sites.
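
A sketch of how a code-class phenotype can be scored against the ICU admission and/or death outcome; the column names below are hypothetical stand-ins for the six 4CE code classes, which are defined in the paper:

```python
import pandas as pd

# Hypothetical stand-ins for the six severity code classes
CODE_CLASSES = ["icu_proc", "ventilation", "vasopressor", "severe_lab", "severe_dx", "severe_med"]

def severity_phenotype(row):
    """Flag a hospitalization as severe if any of the six code classes is present."""
    return int(any(row[c] for c in CODE_CLASSES))

def sensitivity_specificity(df):
    pred = df.apply(severity_phenotype, axis=1)
    truth = df["icu_or_death"]
    tp = ((pred == 1) & (truth == 1)).sum()
    tn = ((pred == 0) & (truth == 0)).sum()
    fp = ((pred == 1) & (truth == 0)).sum()
    fn = ((pred == 0) & (truth == 1)).sum()
    return tp / (tp + fn), tn / (tn + fp)

# Toy example: two hospitalizations, one flagged severe and truly admitted to ICU.
df = pd.DataFrame([
    {"icu_proc": 1, "ventilation": 0, "vasopressor": 0, "severe_lab": 1,
     "severe_dx": 0, "severe_med": 0, "icu_or_death": 1},
    {"icu_proc": 0, "ventilation": 0, "vasopressor": 0, "severe_lab": 0,
     "severe_dx": 0, "severe_med": 0, "icu_or_death": 0},
])
print(sensitivity_specificity(df))  # -> (1.0, 1.0) on this toy data
```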


2021 ◽  
Vol 7 (1) ◽  
pp. 16-19
Author(s):  
Owes Khan ◽  
Geri Shahini ◽  
Wolfram Hardt

Automotive technologies are increasingly becoming digital. Highly autonomous driving, together with digital E/E control mechanisms, involves thousands of software applications known as software components. Given industry requirements and rigorous software development processes, mapping components into a software pool becomes very difficult. This article analyses and discusses the possibilities of integrating machine learning approaches into our previously introduced concept of mapping software components through a common software pool.


2020 ◽  
Vol 10 (24) ◽  
pp. 9062
Author(s):  
Wafa Shafqat ◽  
Yung-Cheol Byun ◽  
Namje Park

Recommendation systems aim to decipher user interests, preferences, and behavioral patterns automatically. However, it becomes trickier to make trustworthy and reliable recommendations to users, especially when their hard-earned money is at risk. The credibility of the recommendation is of paramount importance in crowdfunding project recommendations. This research work devises a hybrid machine learning-based approach for credible crowdfunding project recommendations by incorporating backers' sentiments and other influential features. The proposed model has four modules: a feature extraction module, a hybrid LDA-LSTM (latent Dirichlet allocation and long short-term memory) based latent topics evaluation module, a credibility formulation module, and a recommendation module. The credibility analysis correlates the project creator's proficiency, reviewers' sentiments, and their influence to estimate a project's authenticity level, which makes our model robust to unauthentic and untrustworthy projects and profiles. The recommendation module selects projects that match the user's interests with the highest credibility scores and recommends them. The proposed recommendation method harnesses numeric data and sentiment expressions linked with comments, backers' preferences, profile data, and the creator's credibility for quantitative examination of several alternative projects. The evaluation shows that credibility assessment based on the hybrid machine learning approach yields better results (98% accuracy) than existing recommendation models. We also evaluated our credibility assessment technique on different categories of projects, i.e., suspended, canceled, delivered, and never-delivered projects, and achieved satisfactory outcomes, with 93%, 84%, 58%, and 93% of projects, respectively, accurately classified into the desired range of credibility.
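
A hedged sketch of the hybrid LDA-LSTM idea, assuming gensim for the latent topic module and Keras for the sentiment LSTM; weights, layer sizes, and the credibility combination below are illustrative assumptions, not the authors' formulation:

```python
from gensim import corpora
from gensim.models import LdaModel
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

comments = [["great", "reward", "shipped"], ["refund", "never", "delivered"]]  # tokenized backer comments
dictionary = corpora.Dictionary(comments)
bow = [dictionary.doc2bow(doc) for doc in comments]
lda = LdaModel(bow, num_topics=5, id2word=dictionary, passes=10)               # latent topic evaluation

sentiment_model = Sequential([
    Embedding(input_dim=10000, output_dim=64),
    LSTM(32),
    Dense(1, activation="sigmoid"),          # probability that a comment is positive
])
sentiment_model.compile(optimizer="adam", loss="binary_crossentropy")

def credibility(creator_proficiency, mean_sentiment, reviewer_influence, w=(0.4, 0.4, 0.2)):
    """Combine the three signals into a single credibility score in [0, 1] (weights are illustrative)."""
    return w[0] * creator_proficiency + w[1] * mean_sentiment + w[2] * reviewer_influence
```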


Author(s):  
Nimisha Singh ◽  
Rana Gill

Retinal disease is an important issue in the medical field. To diagnose disease, the true retinal area must be detected. Artefacts such as eyelids and eyelashes appear alongside the retinal part, so artefact removal is a key task for better diagnosis of disease in the retinal area. In this paper, we propose segmentation together with machine learning approaches to detect the true retinal part. Preprocessing is performed on the original image using Gamma Normalization, which enhances the image and brings out detailed information. Segmentation is then performed on the Gamma-Normalized image by the Superpixel method. Superpixels group pixels into regions based on compactness and regional size; they reduce the complexity of image processing tasks and provide suitable primitive image patterns. Features are then generated, and a machine learning approach extracts the true retinal area. The experimental evaluation gives good results with an accuracy of 96%.
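
A minimal sketch of the described pipeline using scikit-image, assuming gamma correction for the normalization step and SLIC for the superpixel segmentation; the per-superpixel features are illustrative choices, not necessarily those used in the paper:

```python
import numpy as np
from skimage import exposure, segmentation, measure

def superpixel_features(image, n_segments=300, gamma=0.8):
    """Gamma Normalization followed by SLIC superpixels; returns labels and per-region features."""
    enhanced = exposure.adjust_gamma(image, gamma)                     # contrast enhancement
    labels = segmentation.slic(enhanced, n_segments=n_segments,
                               compactness=10, start_label=1)          # superpixel segmentation
    intensity = enhanced[..., 0] if enhanced.ndim == 3 else enhanced
    feats = [[r.mean_intensity, r.area, r.eccentricity]
             for r in measure.regionprops(labels, intensity_image=intensity)]
    return labels, np.array(feats)

labels, feats = superpixel_features(np.random.rand(128, 128, 3))       # placeholder image

# A classifier (e.g., a random forest) trained on labelled superpixel features would then
# separate the true retinal area from eyelid/eyelash artefacts.
```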


Author(s):  
Jeffrey G Klann ◽  
Griffin M Weber ◽  
Hossein Estiri ◽  
Bertrand Moal ◽  
Paul Avillach ◽  
...  

Abstract Introduction The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) includes hundreds of hospitals internationally using a federated computational approach to COVID-19 research based on the EHR. Objective We sought to develop and validate a standard definition of COVID-19 severity from readily accessible EHR data across the Consortium. Methods We developed an EHR-based severity algorithm and validated it on patient hospitalization data from 12 4CE clinical sites against the outcomes of ICU admission and/or death. We also used a machine learning approach to compare selected predictors of severity to the 4CE algorithm at one site. Results The 4CE severity algorithm performed with pooled sensitivity of 0.73 and specificity of 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of single code categories for acuity was unacceptably inaccurate, varying by up to 0.65 across sites. A multivariate machine learning approach identified codes resulting in mean AUC 0.956 (95% CI: 0.952, 0.959) compared to 0.903 (95% CI: 0.886, 0.921) using expert-derived codes. Billing codes were poor proxies of ICU admission, with 49% precision and recall compared against chart review at one partner institution. Discussion We developed a proxy measure of severity that proved resilient to coding variability internationally by using a set of 6 code classes. In contrast, machine-learning approaches may tend to overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold standard outcomes, possibly due to pandemic conditions. Conclusion We developed an EHR-based algorithm for COVID-19 severity and validated it at 12 international sites.


2020 ◽  
Author(s):  
Jia Xue ◽  
Junxiang Chen ◽  
Ran Hu ◽  
Chen Chen ◽  
Chengda Zheng ◽  
...  

BACKGROUND It is important to measure the public response to the COVID-19 pandemic. Twitter is an important data source for infodemiology studies involving public response monitoring. OBJECTIVE The objective of this study is to examine COVID-19–related discussions, concerns, and sentiments using tweets posted by Twitter users. METHODS We analyzed 4 million Twitter messages related to the COVID-19 pandemic using a list of 20 hashtags (eg, “coronavirus,” “COVID-19,” “quarantine”) from March 7 to April 21, 2020. We used a machine learning approach, Latent Dirichlet Allocation (LDA), to identify popular unigrams and bigrams, salient topics and themes, and sentiments in the collected tweets. RESULTS Popular unigrams included “virus,” “lockdown,” and “quarantine.” Popular bigrams included “COVID-19,” “stay home,” “corona virus,” “social distancing,” and “new cases.” We identified 13 discussion topics and categorized them into 5 different themes: (1) public health measures to slow the spread of COVID-19, (2) social stigma associated with COVID-19, (3) COVID-19 news, cases, and deaths, (4) COVID-19 in the United States, and (5) COVID-19 in the rest of the world. Across all identified topics, the dominant sentiments for the spread of COVID-19 were anticipation that measures can be taken, followed by mixed feelings of trust, anger, and fear related to different topics. The public tweets revealed a significant feeling of fear when people discussed new COVID-19 cases and deaths compared to other topics. CONCLUSIONS This study showed that Twitter data and machine learning approaches can be leveraged for an infodemiology study, enabling research into evolving public discussions and sentiments during the COVID-19 pandemic. As the situation rapidly evolves, several topics are consistently dominant on Twitter, such as confirmed cases and death rates, preventive measures, health authorities and government policies, COVID-19 stigma, and negative psychological reactions (eg, fear). Real-time monitoring and assessment of Twitter discussions and concerns could provide useful data for public health emergency responses and planning. Pandemic-related fear, stigma, and mental health concerns are already evident and may continue to influence public trust when a second wave of COVID-19 occurs or there is a new surge of the current pandemic.
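
A sketch of LDA topic modelling on tweet text with scikit-learn; the number of topics (13) follows the study, while the preprocessing choices and toy tweet list are simplified assumptions:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

tweets = [
    "stay home and practice social distancing",
    "new cases and deaths reported today",
    "quarantine day 12, working from home",
]  # placeholder for the ~4 million collected tweets

vectorizer = CountVectorizer(stop_words="english", ngram_range=(1, 2))   # unigrams and bigrams
dtm = vectorizer.fit_transform(tweets)

lda = LatentDirichletAllocation(n_components=13, random_state=0)         # 13 topics, as in the study
doc_topics = lda.fit_transform(dtm)

terms = vectorizer.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [terms[i] for i in topic.argsort()[-10:][::-1]]
    print(f"topic {k}: {', '.join(top)}")
```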


2007 ◽  
Vol 33 (3) ◽  
pp. 397-427 ◽  
Author(s):  
Raquel Fernández ◽  
Jonathan Ginzburg ◽  
Shalom Lappin

In this article we use well-known machine learning methods to tackle a novel task, namely the classification of non-sentential utterances (NSUs) in dialogue. We introduce a fine-grained taxonomy of NSU classes based on corpus work, and then report on the results of several machine learning experiments. First, we present a pilot study focused on one of the NSU classes in the taxonomy—bare wh-phrases or “sluices”—and explore the task of disambiguating between the different readings that sluices can convey. We then extend the approach to classify the full range of NSU classes, obtaining results of around an 87% weighted F-score. Thus our experiments show that, for the taxonomy adopted, the task of identifying the right NSU class can be successfully learned, and hence provide a very encouraging basis for the more general enterprise of fully processing NSUs.
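
For reference, the weighted F-score reported above can be computed with scikit-learn as follows (the labels are placeholders, not the corpus data or the full NSU taxonomy):

```python
from sklearn.metrics import f1_score

# y_true / y_pred hold NSU class labels such as "sluice", "short answer", "acknowledgement"
y_true = ["sluice", "short answer", "acknowledgement", "sluice"]
y_pred = ["sluice", "short answer", "sluice", "sluice"]

print("weighted F-score:", f1_score(y_true, y_pred, average="weighted"))
```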


Polymers ◽  
2021 ◽  
Vol 13 (11) ◽  
pp. 1768
Author(s):  
Chunhao Yang ◽  
Wuning Ma ◽  
Jianlin Zhong ◽  
Zhendong Zhang

The long-term mechanical properties of viscoelastic polymers are among their most important aspects. In the present research, a machine learning approach was proposed for predicting the creep properties of polyurethane elastomer, considering the effects of creep time, creep temperature, creep stress, and the hardness of the material. The approaches are based on a multilayer perceptron network, random forest, and support vector machine regression, while a genetic algorithm and k-fold cross-validation were used to tune the hyper-parameters. The results showed that all three models provided excellent fitting ability on the training set. Moreover, the three models had different prediction capabilities on the testing set depending on the changing factors considered. The correlation coefficient values between the predicted and experimental strains were larger than 0.913 (mostly larger than 0.998) on the testing set when the appropriate model was chosen.
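
A hedged sketch of the three regression models in scikit-learn, with grid search over k-fold cross-validation standing in for the genetic-algorithm hyper-parameter tuning used in the paper; data and parameter grids are placeholders:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV, KFold

X = np.random.rand(200, 4)   # [creep time, temperature, stress, hardness] (placeholder data)
y = np.random.rand(200)      # measured creep strain (placeholder)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
models = {
    "mlp": GridSearchCV(MLPRegressor(max_iter=5000), {"hidden_layer_sizes": [(32,), (64, 32)]}, cv=cv),
    "rf":  GridSearchCV(RandomForestRegressor(), {"n_estimators": [200, 500]}, cv=cv),
    "svr": GridSearchCV(SVR(), {"C": [1, 10], "gamma": ["scale", 0.1]}, cv=cv),
}
for name, model in models.items():
    model.fit(X, y)
    print(name, "best CV R^2:", model.best_score_)
```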

