Towards personalized nutritional treatment for malnutrition using machine learning-based screening tools

2021 ◽  
Vol 40 (10) ◽  
pp. 5249-5251
Author(s):  
Orit Raphaeli ◽  
Pierre Singer
2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Nita Vangeepuram ◽  
Bian Liu ◽  
Po-hsiang Chiu ◽  
Linhua Wang ◽  
Gaurav Pandey

Abstract Prediabetes and diabetes mellitus (preDM/DM) have become alarmingly prevalent among youth in recent years. However, simple questionnaire-based screening tools to reliably assess diabetes risk are only available for adults, not youth. As a first step in developing such a tool, we used a large-scale dataset from the National Health and Nutrition Examination Survey (NHANES) to examine the performance of a published pediatric clinical screening guideline in identifying youth with preDM/DM based on American Diabetes Association diagnostic biomarkers. We assessed the agreement between the clinical guideline and biomarker criteria using established evaluation measures (sensitivity, specificity, positive/negative predictive value, F-measure for the positive/negative preDM/DM classes, and Kappa). We also compared the performance of the guideline to that of machine learning (ML) based preDM/DM classifiers derived from the NHANES dataset. Approximately 29% of the 2858 youth in our study population had preDM/DM based on biomarker criteria. The clinical guideline had a sensitivity of 43.1% and specificity of 67.6%, positive/negative predictive values of 35.2%/74.5%, positive/negative F-measures of 38.8%/70.9%, and Kappa of 0.1 (95% CI: 0.06–0.14). The performance of the guideline varied across demographic subgroups. Some ML-based classifiers performed comparably to or better than the screening guideline, especially in identifying preDM/DM youth (p = 5.23 × 10⁻⁵). We demonstrated that a recommended pediatric clinical screening guideline did not perform well in identifying preDM/DM status among youth. Additional work is needed to develop a simple yet accurate screener for youth diabetes risk, potentially by using advanced ML methods and a wider range of clinical and behavioral health data.
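
The evaluation measures named in this abstract can all be derived from a 2×2 confusion matrix. The following is a minimal sketch using the standard textbook definitions, not the authors' code; the function name and the counts in the example are hypothetical and are not taken from the study.

```python
def screening_metrics(tp, fp, fn, tn):
    """Agreement between a screening rule and a reference standard.

    tp/fp/fn/tn are counts from a 2x2 confusion matrix, where "positive"
    means the reference standard (e.g. a biomarker criterion) is met.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # recall of the positive class
    specificity = tn / (tn + fp)   # recall of the negative class
    ppv = tp / (tp + fp)           # precision of the positive class
    npv = tn / (tn + fn)           # precision of the negative class
    # F-measure: harmonic mean of precision and recall, per class.
    f_pos = 2 * ppv * sensitivity / (ppv + sensitivity)
    f_neg = 2 * npv * specificity / (npv + specificity)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity, "specificity": specificity,
        "ppv": ppv, "npv": npv,
        "f_pos": f_pos, "f_neg": f_neg, "kappa": kappa,
    }


# Hypothetical counts, for illustration only.
metrics = screening_metrics(tp=40, fp=30, fn=60, tn=170)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, the negative-class F-measure combines the negative predictive value with specificity, which appears consistent with the 70.9% reported above.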


SLEEP ◽  
2021 ◽  
Vol 44 (Supplement_2) ◽  
pp. A166-A166
Author(s):  
Ankita Paul ◽  
Karen Wong ◽  
Anup Das ◽  
Diane Lim ◽  
Miranda Tan

Abstract Introduction Cancer patients are at an increased risk of moderate-to-severe obstructive sleep apnea (OSA). The STOP-Bang score is a commonly used screening questionnaire to assess risk of OSA in the general population. We hypothesize that cancer-relevant features, like radiation therapy (RT), may be used to determine the risk of OSA in cancer patients. Machine learning (ML) with non-parametric regression is applied to increase the prediction accuracy of OSA risk. Methods Ten features, namely STOP-Bang score, history of RT to the head/neck/thorax, cancer type, cancer stage, metastasis, hypertension, diabetes, asthma, COPD, and chronic kidney disease, were extracted from a database of cancer patients with a sleep study. The ML technique K-Nearest-Neighbor (KNN), with a range of k values (5 to 20), was chosen because, unlike Logistic Regression (LR), KNN makes no assumptions about the data distribution or mapping function and supports non-linear relationships among features. A correlation heatmap was computed to identify features having high correlation with OSA. Principal Component Analysis (PCA) was performed on the correlated features and then KNN was applied to the components to predict the risk of OSA. Receiver Operating Characteristic (ROC) Area Under Curve (AUC) and Precision-Recall curves were computed to compare and validate performance for different test sets and majority class scenarios. Results In our cohort of 174 cancer patients, the accuracy in determining OSA using the STOP-Bang score was 82.3% (LR) and 90.69% (KNN), but fell to 89.9% with KNN using all 10 features mentioned above. Applying PCA + KNN with STOP-Bang score and RT as features increased prediction accuracy to 94.1%. We validated our ML approach using a separate cohort of 20 cancer patients; the accuracies in OSA prediction were 85.57% (LR), 91.1% (KNN), and 92.8% (PCA + KNN). Conclusion STOP-Bang score and history of RT can be useful to predict risk of OSA in cancer patients with the PCA + KNN approach. This ML technique can refine screening tools to improve prediction accuracy of OSA in cancer patients. Larger studies investigating additional features using ML may improve OSA screening accuracy in various populations.
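
The K-Nearest-Neighbor step described in this abstract can be illustrated with a short sketch. This is a plain majority-vote KNN under Euclidean distance and omits the PCA stage; the function name, the two-feature layout (a questionnaire score and a binary RT flag), and every data point are hypothetical and are not drawn from the study cohort.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Predict a label by majority vote among the k nearest training points."""
    # Rank training points by Euclidean distance to the query.
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (questionnaire score, RT history flag) -> OSA label.
train_X = [(1, 0), (2, 0), (2, 1), (3, 0), (5, 1), (6, 1), (6, 0), (7, 1)]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]

high_risk = knn_predict(train_X, train_y, query=(6, 1), k=3)
low_risk = knn_predict(train_X, train_y, query=(2, 0), k=3)
print(high_risk, low_risk)
```

Because KNN relies on raw distances, features on different scales would normally be standardized first; that step is left out here to keep the sketch short.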


2020 ◽  
Author(s):  
Haishuai Wang ◽  
Paul Avillach

BACKGROUND In the United States, about 3 million people have autism spectrum disorder (ASD), and around 1 in 59 children is diagnosed with ASD. People with ASD have characteristic social communication deficits and repetitive behaviors. The causes of this disorder remain unknown; however, in up to 25% of cases, a genetic cause can be identified. Detecting ASD as early as possible is desirable because early detection enables timely interventions in children with ASD. Identification of ASD based on objective pathogenic mutation screening is the major first step toward early intervention and effective treatment of affected children. OBJECTIVE Recent investigations have interrogated genomics data for detecting and treating autism disorders, in addition to the conventional clinical interview as a diagnostic test. Since deep neural networks perform better than shallow machine learning models on complex and high-dimensional data, in this study, we sought to apply deep learning to genetic data obtained across thousands of simplex families at risk for ASD to identify contributory mutations and to create an advanced diagnostic classifier for autism screening. METHODS After preprocessing the genomics data from the Simons Simplex Collection, we extracted top-ranking common variants that may be protective or pathogenic for autism based on a chi-square test. A convolutional neural network–based diagnostic classifier was then designed using the identified significant common variants to predict autism. The performance was then compared with shallow machine learning–based classifiers and randomly selected common variants. RESULTS The selected contributory common variants were significantly enriched in chromosome X, while chromosome Y was also discriminatory in distinguishing autistic from nonautistic individuals. The ARSD, MAGEB16, and MXRA5 genes had the largest effect in the contributory variants. Thus, screening algorithms were adapted to include these common variants. The deep learning model yielded an area under the receiver operating characteristic curve of 0.955 and an accuracy of 88% for identifying autistic from nonautistic individuals. Our classifier demonstrated a significant improvement over standard autism screening tools, by an average of 13% in terms of classification accuracy. CONCLUSIONS Common variants are informative for autism identification. Our findings also suggest that the deep learning process is a reliable method for distinguishing the diseased group from the control group based on the common variants of autism.
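
Ranking variants by a chi-square test, as mentioned in the methods above, can be sketched as follows. This uses the standard Pearson statistic for a 2×2 table without continuity correction; whether the authors used this exact form is an assumption, and the variant names and counts are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    Rows: cases / controls. Columns: variant present / variant absent.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical carrier counts: (cases with, cases without,
#                               controls with, controls without).
variant_counts = {
    "variant_A": (30, 70, 10, 90),
    "variant_B": (20, 80, 20, 80),
    "variant_C": (25, 75, 15, 85),
}

# Rank variants from most to least associated with case status.
ranked = sorted(variant_counts,
                key=lambda name: chi_square_2x2(*variant_counts[name]),
                reverse=True)
print(ranked)
```

A larger statistic indicates a stronger departure from independence between carrying the variant and case status; in a genome-wide setting the resulting p-values would also need multiple-testing correction.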


Reports ◽  
2019 ◽  
Vol 2 (4) ◽  
pp. 26 ◽  
Author(s):  
Govind Chada

Increasing radiologist workloads and increasing primary care radiology services make it relevant to explore the use of artificial intelligence (AI) and particularly deep learning to provide diagnostic assistance to radiologists and primary care physicians in improving the quality of patient care. This study investigates new model architectures and deep transfer learning to improve the performance in detecting abnormalities of upper extremities while training with limited data. DenseNet-169, DenseNet-201, and InceptionResNetV2 deep learning models were implemented and evaluated on the humerus and finger radiographs from MURA, a large public dataset of musculoskeletal radiographs. These architectures were selected because of their high recognition accuracy in a benchmark study. The DenseNet-201 and InceptionResNetV2 models, employing deep transfer learning to optimize training on limited data, detected abnormalities in the humerus radiographs with 95% CI accuracies of 83–92% and high sensitivities greater than 0.9, allowing for these models to serve as useful initial screening tools to prioritize studies for expedited review. The performance in the case of finger radiographs was not as promising, possibly due to the limitations of large inter-radiologist variation. It is suggested that the causes of this variation be further explored using machine learning approaches, which may lead to appropriate remediation.


2021 ◽  
Author(s):  
Anna Goldenberg ◽  
Bret Nestor ◽  
Jaryd Hunter ◽  
Raghu Kainkaryam ◽  
Erik Drysdale ◽  
...  

Abstract Commercial wearable devices are surfacing as an appealing mechanism to detect COVID-19 and potentially other public health threats, due to their widespread use. To assess the validity of wearable devices as population health screening tools, it is essential to evaluate predictive methodologies based on wearable devices by mimicking their real-world deployment. Several points must be addressed to transition from statistically significant differences between infected and uninfected cohorts to COVID-19 inferences on individuals. We demonstrate the strengths and shortcomings of existing approaches on a cohort of 32,198 individuals who experienced influenza-like illness (ILI), 204 of whom reported testing positive for COVID-19. We show that, despite commonly made design mistakes resulting in overestimation of performance, wearables can, when properly designed, be effectively used as a part of the detection pipeline. For example, knowing the week of the year, combined with naive randomised test set generation, leads to substantial overestimation of COVID-19 classification performance at 0.73 AUROC. However, an average AUROC of only 0.55 +/- 0.02 would be attainable in a simulation of real-world deployment, due to the shifting prevalence of COVID-19 and non-COVID-19 ILI to trigger further testing. In this work we show how to train a machine learning model to differentiate ILI days from healthy days, followed by a survey to differentiate COVID-19 from influenza and unspecified ILI based on symptoms. In a forthcoming week, models can expect a sensitivity of 0.50 (0-0.74, 95% CI), while utilising the wearable device to reduce the burden of surveys by 35%. The corresponding false positive rate is 0.22 (0.02-0.47, 95% CI). In the future, serious consideration must be given to the design, evaluation, and reporting of wearable device interventions if they are to be relied upon as part of frequent COVID-19 or other public health threat testing infrastructures.
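
The contrast drawn above between naive randomised test sets and a simulation of real-world deployment can be illustrated with a forward-in-time split. The sketch below is one common way to do this and is not necessarily the authors' protocol; the function name, the `week` field, and the records are hypothetical.

```python
def forward_week_splits(records, min_train_weeks=1):
    """Yield (week, train, test) so that each test week lies strictly
    after every week used for training, mimicking prospective use."""
    weeks = sorted({r["week"] for r in records})
    for week in weeks[min_train_weeks:]:
        train = [r for r in records if r["week"] < week]
        test = [r for r in records if r["week"] == week]
        yield week, train, test


# Hypothetical records: one row per participant-day.
records = [
    {"week": 1, "label": 0},
    {"week": 1, "label": 1},
    {"week": 2, "label": 0},
    {"week": 3, "label": 1},
]

splits = list(forward_week_splits(records))
for week, train, test in splits:
    print(f"test week {week}: {len(train)} train rows, {len(test)} test rows")
```

A random shuffle would place rows from the same week in both train and test sets, so a model could exploit week-specific prevalence; the forward split removes that shortcut.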


2019 ◽  
Vol 32 (3) ◽  
pp. 137-144 ◽  
Author(s):  
Boaz Levy ◽  
Courtney Hess ◽  
Jacqueline Hogan ◽  
Matthew Hogan ◽  
James M. Ellison ◽  
...  

Background: Incorporating cognitive screening into busy primary care settings will require the development of highly efficient screening tools. We report the convergent validity of a very brief, self-administered, computerized assessment protocol against one of the most extensively used clinician-administered instruments, the Montreal Cognitive Assessment (MoCA). Method: Two hundred six participants (mean age = 67.44, standard deviation [SD] = 11.63) completed the MoCA and the computerized test. Three machine learning algorithms (ie, Support Vector Machine, Random Forest, and Gradient Boosting Trees) were trained to classify participants according to the clinical cutoff score of the MoCA (ie, < 26) from participant performance on 25 features of the computerized test. The analysis employed the Synthetic Minority Oversampling Technique (SMOTE) to correct the sample for class imbalance. Results: Gradient Boosting Trees achieved the highest performance (accuracy = 0.81, specificity = 0.88, sensitivity = 0.74, F1 score = 0.79, and area under the curve = 0.81). A subsequent K-means clustering of the prediction features yielded 3 categories that corresponded to the unimpaired (mean = 26.98, SD = 2.35), mildly impaired (mean = 23.58, SD = 3.19), and moderately impaired (mean = 17.24, SD = 4.23) ranges of MoCA score (F = 222.36, P < .00). In addition, compared to the MoCA, the computerized test correlated more strongly with age in unimpaired participants (ie, MoCA ≥ 26, n = 165), suggesting greater sensitivity to age-related changes in cognitive functioning. Conclusion: Future studies should examine ways to improve the sensitivity of the computerized test by expanding the cognitive domains it measures without compromising its efficiency.
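
The oversampling step mentioned in the method can be sketched in a few lines. This is a simplified illustration of the SMOTE idea (interpolating between a minority-class point and one of its nearest minority-class neighbours), not the implementation used in the study; the function name, parameters, and data points are hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by linear interpolation.

    Each synthetic point lies on the segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    Requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen point, excluding itself.
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(p, base))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical minority-class feature vectors.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_sketch(minority_points, n_new=5)
print(new_points)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points never leak into the data used for evaluation.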


2020 ◽  
Vol 35 ◽  
pp. 153331752092716
Author(s):  
Jin-Hyuck Park

Background: The mobile screening test system for mild cognitive impairment (mSTS-MCI) was developed and validated to address the low sensitivity and specificity of the Montreal Cognitive Assessment (MoCA), which is widely used clinically. Objective: This study aimed to evaluate the efficacy of machine learning algorithms based on the mSTS-MCI and the Korean version of the MoCA. Method: In total, 103 healthy individuals and 74 patients with MCI were randomly divided into training and test data sets. The algorithm, built with TensorFlow, was trained on the training data set, and its accuracy was then calculated on the test data set. The cost was calculated via logistic regression in this case. Result: The predictive power of the algorithms was higher than that of the original tests. In particular, the algorithm based on the mSTS-MCI showed the highest positive predictive value. Conclusion: The machine learning algorithms predicting MCI showed findings comparable with those of conventional screening tools.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Jingyi Zhang ◽  
Huolan Zhu ◽  
Yongkai Chen ◽  
Chenguang Yang ◽  
Huimin Cheng ◽  
...  

Abstract Background Extensive clinical evidence suggests that preventive screening for coronary heart disease (CHD) at an earlier stage can greatly reduce the mortality rate. We use 64 two-dimensional speckle tracking echocardiography (2D-STE) features and seven clinical features to predict whether a person has CHD. Methods We develop a machine learning approach that integrates a number of popular classification methods by model stacking, and generalize the traditional stacking method to a two-step stacking method to improve the diagnostic performance. Results By borrowing strengths from multiple classification models through the proposed method, we improve the CHD classification accuracy from around 70% to 87.7% on the testing set. The sensitivity of the proposed method is 0.903 and the specificity is 0.843, with an AUC of 0.904, which is significantly higher than those of the individual classification models. Conclusion Our work lays a foundation for the deployment of speckle tracking echocardiography-based screening tools for coronary heart disease.


2019 ◽  
Author(s):  
Nita Vangeepuram ◽  
Bian Liu ◽  
Po-hsiang Chiu ◽  
Linhua Wang ◽  
Gaurav Pandey

Abstract Type 2 diabetes has become alarmingly prevalent among youth in recent years. However, simple questionnaire-based screening tools to reliably identify diabetes risk and prevent the adverse effects of this serious disease are only available for adults, not for youth. As a first step in developing such a tool, we used a large-scale dataset from the National Health and Nutrition Examination Survey (NHANES) to examine the performance of a well-known adult diabetes risk self-assessment screener and published pediatric clinical screening guidelines in identifying youth with prediabetes/diabetes (pre-DM/DM) based on American Diabetes Association diagnostic biomarkers. We assessed the agreement between the adult screener/pediatric screening guidelines and biomarker diagnostic criteria by conducting comparisons using the overall data set and sub-datasets stratified by sex, race/ethnicity, and age. While the pediatric guidelines performed better than the adult screener in identifying youth with pre-DM/DM (sensitivity 43.1% vs 7.2%), both are inadequate for general deployment among youth. There were also notable differences in the performance of the pediatric guidelines across subgroups based on age, sex, and race/ethnicity. In an effort to improve pre-DM/DM screening, we also evaluated data-driven machine learning-based classification algorithms, several of which performed slightly but statistically significantly better than the pediatric screening guidelines.


2021 ◽  
Author(s):  
Mohammed Alghazal

Abstract Employers commonly use time-consuming screening tools or online matching engines, driven by manual rules and predefined keywords, to search for potential job applicants. Such traditional techniques have not kept pace with the new digital revolution in machine learning and big data analytics. This paper presents advanced artificial intelligence solutions for ranking resumes and matching CVs to job descriptions. Open-source resumes and job description documents were used to construct and validate the machine learning models in this paper. Documents were converted to images and processed via Google Cloud using an Optical Character Recognition (OCR) algorithm to extract text from all resumes and job description documents, with more than 97% accuracy. Prior to modeling, the extracted text was processed via a series of Natural Language Processing (NLP) techniques: splitting/tokenizing common words, grouping together inflected forms of words (i.e., lemmatization), and removing stop words and punctuation marks. After text processing, the resumes were modeled using the unsupervised machine learning algorithm Latent Dirichlet Allocation (LDA) for topic modeling and categorization. Given the type of resumes used, the algorithm was able to categorize them into 4 main job sectors: marketing and business, engineering, computer science/IT, and health. Scores were assigned to each resume to represent the maximum LDA probability for ranking. Another, more advanced deep learning algorithm, Doc2Vec, was also used to train and match potential resumes to relevant job descriptions. In this model, resumes are represented by unique vectors that can be used to group similar documents and to match and retrieve resumes related to a given job description document provided by HR. The similarity between each resume and the given job description file is measured to query the top job candidates. The model was tested against several job description files related to engineering, IT, and human resources, and was able to identify the top-ranking resumes from hundreds of trained resumes. This paper presents an innovative method for processing, categorizing, and ranking resumes using advanced computational models empowered by the latest fourth industrial revolution technologies. This solution is beneficial to both job seekers and employers, providing an efficient and unbiased data-driven method for finding top applicants for a given job.
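
The idea of ranking resumes by their similarity to a job description can be sketched with a much simpler stand-in than the Doc2Vec model described above: bag-of-words term counts compared by cosine similarity. This swaps in a different technique purely for illustration; the stop-word list, function names, and documents are hypothetical, and lemmatization is omitted.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "with", "for", "to"}


def tokens(text):
    """Lowercase, split into alphabetic words, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower())
            if w not in STOP_WORDS]


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_resumes(job_text, resumes):
    """Return (score, name) pairs sorted from most to least similar."""
    job = Counter(tokens(job_text))
    scored = [(cosine(job, Counter(tokens(text))), name)
              for name, text in resumes.items()]
    return sorted(scored, reverse=True)


# Hypothetical documents.
resumes = {
    "resume_1": "Python engineer with machine learning experience",
    "resume_2": "Marketing manager for retail brands",
}
ranking = rank_resumes("machine learning engineer", resumes)
print(ranking)
```

Unlike learned document embeddings, raw term counts only reward exact word overlap, so this sketch would miss resumes that describe the same skills in different words.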

