A machine learning approach to predict extreme inactivity in COPD patients using non-activity-related clinical data

PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0255977
Author(s):  
Bernard Aguilaniu ◽  
David Hess ◽  
Eric Kelkel ◽  
Amandine Briault ◽  
Marie Destors ◽  
...  

Facilitating the identification of extreme inactivity (EI) has the potential to improve morbidity and mortality in COPD patients. Apart from patients with obvious EI, identification of such behavior during a real-life consultation is unreliable. We therefore describe a machine learning algorithm to screen for EI, as actimetry measurements are difficult to implement. Complete datasets for 1409 COPD patients were obtained from COLIBRI-COPD, a database of clinicopathological data submitted by French pulmonologists. Patient- and pulmonologist-reported estimates of PA quantity (daily walking time) and intensity (domestic, recreational, or fitness-directed) were first used to assign patients to one of four PA groups (extremely inactive [EI], overtly active [OA], intermediate [INT], inconclusive [INC]). The algorithm was developed by (i) using data from 80% of patients in the EI and OA groups to identify ‘phenotype signatures’ of non-PA-related clinical variables most closely associated with EI or OA; (ii) testing its predictive validity using data from the remaining 20% of EI and OA patients; and (iii) applying the algorithm to identify EI patients in the INT and INC groups. The algorithm’s overall error for predicting EI status among EI and OA patients was 13.7%, with an area under the receiver operating characteristic curve of 0.84 (95% confidence interval: 0.75–0.92). Of the 577 patients in the INT/INC groups, 306 (53%) were reclassified as EI by the algorithm. Patient- and physician-reported estimation may underestimate EI in a large proportion of COPD patients. This algorithm may assist physicians in identifying patients in urgent need of interventions to promote PA.
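A minimal sketch (not the authors' code) of the screening workflow described above: train a classifier on non-PA-related clinical variables of EI versus OA patients, hold out 20% for validation, then apply the fitted model to the INT/INC patients. The file name, the column names ("fev1_pct", "mmrc", ...) and the gradient-boosting model are assumptions for illustration only.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

df = pd.read_csv("colibri_copd.csv")                      # hypothetical export of the cohort
features = ["age", "bmi", "fev1_pct", "mmrc", "exacerbations", "comorbidity_count"]

labelled = df[df["pa_group"].isin(["EI", "OA"])]
X, y = labelled[features], (labelled["pa_group"] == "EI").astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0)     # 80/20 split as in the abstract

model = GradientBoostingClassifier().fit(X_train, y_train)
prob = model.predict_proba(X_test)[:, 1]
print("error rate:", ((prob >= 0.5).astype(int) != y_test).mean())
print("ROC AUC   :", roc_auc_score(y_test, prob))

# Reclassify the intermediate/inconclusive patients with the trained model.
unresolved = df[df["pa_group"].isin(["INT", "INC"])]
print("reclassified as EI:", int(model.predict(unresolved[features]).sum()))
```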

2019 ◽  
Vol 06 (01) ◽  
pp. 17-28 ◽  
Author(s):  
Hoang Pham ◽  
David H. Pham

In real-life applications, we often do not have access to population data, but we can collect several samples from a large pool of data. In this paper, we propose a median-based machine-learning approach and algorithm to predict the parameter of the Bernoulli distribution. We illustrate the proposed median approach by generating various sample datasets from a Bernoulli population distribution to validate the accuracy of the proposed approach. We also analyze the effectiveness of the median methods using machine-learning techniques, including a correction method and logistic regression. Our results show that the median-based measure outperforms the mean measure in machine-learning applications that use sampling distribution approaches.
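A small simulation sketch of one interpretation of the idea above (not the authors' exact procedure): draw several samples from a Bernoulli(p) population, estimate p from each sample, and compare a median-based aggregate of those per-sample estimates with the mean-based one. The parameter values are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
p_true, n_samples, sample_size = 0.3, 50, 40

# Per-sample proportions, i.e. one estimate of p from each collected sample.
estimates = np.array([rng.binomial(1, p_true, sample_size).mean()
                      for _ in range(n_samples)])

median_estimate = np.median(estimates)
mean_estimate = estimates.mean()
print(f"true p = {p_true}")
print(f"median-based estimate = {median_estimate:.3f}, error = {abs(median_estimate - p_true):.3f}")
print(f"mean-based estimate   = {mean_estimate:.3f}, error = {abs(mean_estimate - p_true):.3f}")
```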


Entropy ◽  
2019 ◽  
Vol 21 (10) ◽  
pp. 1015 ◽  
Author(s):  
Carles Bretó ◽  
Priscila Espinosa ◽  
Penélope Hernández ◽  
Jose M. Pavía

This paper applies a machine learning approach with the aim of providing a single aggregated prediction from a set of individual predictions. Departing from the well-known maximum-entropy inference methodology, we introduce a new factor that captures the distance between the true and the estimated aggregated predictions, which poses a new problem. Algorithms such as ridge, lasso, or elastic net help in finding a new methodology to tackle this issue. We carry out a simulation study to evaluate the performance of such a procedure and apply it in order to forecast and measure predictive ability using a dataset of predictions on Spanish gross domestic product.
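A hedged sketch of the aggregation step only: combining a panel of individual forecasts into one prediction with an elastic-net regression. This illustrates the regularized-combination idea mentioned above, not the paper's maximum-entropy derivation; the data are synthetic.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV

rng = np.random.default_rng(1)
T, K = 120, 8                                    # periods, individual forecasters
truth = rng.normal(2.0, 1.0, T)                  # synthetic stand-in for GDP growth
noise_scale = [0.3, 0.5, 0.8, 1.0, 1.2, 0.4, 0.9, 0.6]
forecasts = truth[:, None] + rng.normal(0.0, noise_scale, (T, K))

# Learn combination weights on the first 100 periods, evaluate on the rest.
train, test = slice(0, 100), slice(100, T)
combiner = ElasticNetCV(cv=5).fit(forecasts[train], truth[train])
aggregated = combiner.predict(forecasts[test])

rmse = np.sqrt(np.mean((aggregated - truth[test]) ** 2))
print("combination weights:", np.round(combiner.coef_, 3))
print("out-of-sample RMSE :", round(rmse, 3))
```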


Author(s):  
B.D. Britt ◽  
T. Glagowski

Abstract: This paper describes current research toward automating the redesign process. In redesign, a working design is altered to meet new problem specifications. This process is complicated by interactions between different parts of the design, and many researchers have addressed these issues. An overview is given of a large design tool under development, the Circuit Designer's Apprentice. This tool integrates various techniques for reengineering existing circuits so that they meet new circuit requirements. The primary focus of the paper is one particular technique being used to reengineer circuits when they cannot be transformed to meet the new problem requirements. In these cases, a design plan is automatically generated for the circuit, and then replayed to solve all or part of the new problem. This technique is based upon the derivational analogy approach to design reuse. Derivational analogy is a machine learning algorithm in which a design plan is saved at the time of design so that it can be replayed on a new design problem. Because design plans were not saved for the circuits available to the Circuit Designer's Apprentice, an algorithm was developed that automatically reconstructs a design plan for any circuit. This algorithm, Reconstructive Derivational Analogy, is described in detail, including a quantitative analysis of the implementation of this algorithm.
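A toy illustration of the derivational-analogy idea described above (not the Circuit Designer's Apprentice itself): a design plan is stored as a sequence of named derivation steps, and the same plan can later be replayed against new requirements. The voltage-divider-style steps are invented placeholders.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class DesignPlan:
    steps: List[Callable[[Dict], Dict]] = field(default_factory=list)

    def record(self, step: Callable[[Dict], Dict]) -> None:
        self.steps.append(step)                  # save each step at design time

    def replay(self, spec: Dict) -> Dict:
        design = dict(spec)                      # start from the new requirements
        for step in self.steps:                  # re-derive the design step by step
            design = step(design)
        return design

# Hypothetical plan for a divider-like problem: total resistance, then the split.
plan = DesignPlan()
plan.record(lambda d: {**d, "r_total": d["v_in"] / d["i_max"]})
plan.record(lambda d: {**d,
                       "r1": d["r_total"] * (1 - d["v_out"] / d["v_in"]),
                       "r2": d["r_total"] * d["v_out"] / d["v_in"]})

# Replay the saved plan on a new specification.
print(plan.replay({"v_in": 9.0, "v_out": 3.0, "i_max": 0.001}))
```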


2020 ◽  
Vol 25 (4) ◽  
pp. 174-189 ◽  
Author(s):  
Guillaume Palacios ◽  
Arnaud Noreña ◽  
Alain Londero

Introduction: Subjective tinnitus (ST) and hyperacusis (HA) are common auditory symptoms that may become incapacitating in a subgroup of patients who thereby seek medical advice. Both conditions can result from many different mechanisms, and as a consequence, patients may report a vast repertoire of associated symptoms and comorbidities that can dramatically reduce quality of life and even lead to suicide attempts in the most severe cases. The present exploratory study is aimed at investigating patients’ symptoms and complaints using an in-depth statistical analysis of patients’ natural narratives in a real-life environment in which, thanks to the anonymization of contributions and the peer-to-peer interaction, the wording used is assumed to be free of any self-limitation and self-censorship. Methods: We applied a purely statistical, non-supervised machine learning approach to the analysis of patients’ verbatim accounts exchanged on an Internet forum. After automated data extraction, the dataset was preprocessed in order to make it suitable for statistical analysis. We used a variant of the Latent Dirichlet Allocation (LDA) algorithm to reveal clusters of symptoms and complaints of HA patients (topics). The probability distribution of words within a topic uniquely characterizes it. Convergence of the log-likelihood of the LDA model was reached after 2,000 iterations. Several statistical parameters were tested for topic modeling and for the word relevance factor within each topic. Results: Despite a rather small dataset, this exploratory study demonstrates that patients’ free speech available on the Internet constitutes valuable material for machine learning and statistical analysis aimed at categorizing ST/HA complaints. The LDA model with K = 15 topics seems to be the most relevant in terms of relative weights and correlations, with the capability to individualize subgroups of patients displaying specific characteristics. The study of the relevance factor may be useful to unveil weak but important signals that are present in patients’ narratives. Discussion/Conclusion: We claim that the LDA non-supervised approach would make it possible to gain knowledge on the patterns of ST- and HA-related complaints and on patient-centered domains of interest. The merits and limitations of the LDA algorithms are compared with other natural language processing methods and with more conventional methods of qualitative analysis of patients’ output. Future directions and research topics emerging from this innovative algorithmic analysis are proposed.
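A minimal sketch of the kind of non-supervised topic modelling described above, using scikit-learn's LatentDirichletAllocation rather than the study's own LDA variant. The posts below are invented placeholders standing in for the anonymized forum narratives; K = 15 and the iteration cap echo the abstract but are otherwise arbitrary here.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

posts = [
    "ringing in my ears worse at night cannot sleep",
    "loud sounds are painful wearing earplugs all day",
    "tinnitus spiked after a concert feeling anxious",
]  # placeholder forum posts

vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(posts)

lda = LatentDirichletAllocation(n_components=15, max_iter=2000, random_state=0)
doc_topics = lda.fit_transform(counts)           # per-post topic distribution

# Inspect the most probable words of the first few topics.
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_[:3]):
    top = [terms[i] for i in weights.argsort()[::-1][:5]]
    print(f"topic {k}: {top}")
```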


Author(s):  
Patrick Schwab ◽  
Walter Karlen

Parkinson’s disease is a neurodegenerative disease that can affect a person’s movement, speech, dexterity, and cognition. Clinicians primarily diagnose Parkinson’s disease by performing a clinical assessment of symptoms. However, misdiagnoses are common. One factor that contributes to misdiagnoses is that the symptoms of Parkinson’s disease may not be prominent at the time the clinical assessment is performed. Here, we present a machine-learning approach towards distinguishing between people with and without Parkinson’s disease using long-term data from smartphone-based walking, voice, tapping and memory tests. We demonstrate that our attentive deep-learning models achieve significant improvements in predictive performance over strong baselines (area under the receiver operating characteristic curve = 0.85) in data from a cohort of 1853 participants. We also show that our models identify meaningful features in the input data. Our results confirm that smartphone data collected over extended periods of time could in the future potentially be used as a digital biomarker for the diagnosis of Parkinson’s disease.
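A minimal sketch of attentive pooling over a sequence of test records (e.g. repeated walking, voice, tapping and memory test features per participant). This illustrates the general idea of an attentive model over long-term data, not the authors' published architecture; the dimensions and random inputs are placeholders.

```python
import torch
import torch.nn as nn

class AttentivePooling(nn.Module):
    def __init__(self, n_features: int, hidden: int = 32):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
        self.score = nn.Linear(hidden, 1)        # attention score per test record
        self.classify = nn.Linear(hidden, 1)

    def forward(self, x):                        # x: (batch, n_tests, n_features)
        h = self.encode(x)
        weights = torch.softmax(self.score(h), dim=1)   # which tests matter most
        pooled = (weights * h).sum(dim=1)                # weighted summary per person
        return torch.sigmoid(self.classify(pooled)).squeeze(-1)

model = AttentivePooling(n_features=10)
fake_tests = torch.randn(4, 30, 10)              # 4 people x 30 tests x 10 features
print(model(fake_tests))                         # predicted probability per person
```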


2021 ◽  
Author(s):  
Diti Roy ◽  
Md. Ashiq Mahmood ◽  
Tamal Joyti Roy

Heart disease is the most dominant disease, causing a large number of deaths every year. A 2016 WHO report indicated that at least 17 million people die of heart disease each year. This number is gradually increasing, and WHO has estimated that the death toll will reach 75 million by 2030. Despite modern technology and health care systems, predicting heart disease remains challenging. Because machine learning algorithms are a vital means of making predictions from available datasets, we have used a machine learning approach to predict heart disease. We collected data from the UCI repository. In our study, we used Random Forest, ZeroR, Voted Perceptron, and K-star classifiers. We obtained the best result with the Random Forest classifier, with an accuracy of 97.69%.
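A hedged sketch of the experiment described above: training a Random Forest on a local copy of the UCI heart-disease data and reporting accuracy. The file name, target column, and 10-fold evaluation are assumptions; the study also compared ZeroR, Voted Perceptron and K-star classifiers.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("uci_heart_disease.csv")        # hypothetical local copy of the dataset
X, y = df.drop(columns=["target"]), df["target"]

model = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(model, X, y, cv=10)     # 10-fold cross-validation
print(f"mean accuracy: {scores.mean() * 100:.2f}%")
```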


Author(s):  
Ganesh K. Shinde

Abstract: The most important part of information gathering is understanding how people think. Many opinion resources, such as online review sites and personal blogs, are available. In this paper we focus on Twitter. Twitter allows users to express opinions on a variety of entities. We performed sentiment analysis on tweets using text mining methods such as a lexicon and a machine learning approach. We performed sentiment analysis in two steps: first, by searching for polarity words in the pool of words predefined in a lexicon dictionary, and second, by training a machine learning algorithm using the polarities assigned in the first step. Keywords: Sentiment analysis, Social Media, Twitter, Lexicon Dictionary, Machine Learning Classifiers, SVM.
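A sketch of the two-step procedure described above: (1) label tweets with a small polarity lexicon, (2) train an SVM on the lexicon-derived labels. The lexicon and tweets here are tiny placeholders, not the author's dictionary or dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

lexicon = {"good": 1, "great": 1, "love": 1, "bad": -1, "terrible": -1, "hate": -1}
tweets = [
    "love the new phone great battery",
    "terrible service hate the delays",
    "good update but bad design choices",
    "great camera love it",
]

def lexicon_polarity(text: str) -> int:
    # Step 1: sum the polarity of words found in the predefined dictionary.
    score = sum(lexicon.get(w, 0) for w in text.split())
    return 1 if score >= 0 else 0

labels = [lexicon_polarity(t) for t in tweets]

# Step 2: train a machine learning classifier on the lexicon-assigned polarities.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(tweets)
classifier = LinearSVC().fit(X, labels)
print(classifier.predict(vectorizer.transform(["love the great design"])))
```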


PLoS ONE ◽  
2020 ◽  
Vol 15 (11) ◽  
pp. e0241239
Author(s):  
Kai On Wong ◽  
Osmar R. Zaïane ◽  
Faith G. Davis ◽  
Yutaka Yasui

Background Canada is an ethnically-diverse country, yet its lack of ethnicity information in many large databases impedes effective population research and interventions. Automated ethnicity classification using machine learning has shown potential to address this data gap, but its performance in Canada is largely unknown. This study developed a large-scale machine learning framework to predict ethnicity using a novel set of name and census location features. Methods Using the 1901 census, multiclass and binary classification machine learning pipelines were developed. The 13 ethnic categories examined were Aboriginal (First Nations, Métis, Inuit, and all-combined), Chinese, English, French, Irish, Italian, Japanese, Russian, Scottish, and others. Machine learning algorithms included regularized logistic regression, C-support vector, and naïve Bayes classifiers. Name features consisted of the entire name string, substrings, double-metaphones, and various name-entity patterns, while location features consisted of the entire location string and substrings of province, district, and subdistrict. Predictive performance metrics included sensitivity, specificity, positive predictive value, negative predictive value, F1, area under the receiver operating characteristic curve, and accuracy. Results The census included 4,812,958 unique individuals. For multiclass classification, the highest performance achieved was 76% F1 and 91% accuracy. For binary classifications for Chinese, French, Italian, Japanese, Russian, and others, F1 ranged from 68% to 95% (median 87%). The lower performance for English, Irish, and Scottish (F1 of 63–67%) was likely due to their shared cultural and linguistic heritage. Adding census location features to the name-based models strongly improved the prediction in Aboriginal classification (F1 increased from 50% to 84%). Conclusions The automated machine learning approach using only name and census location features can predict the ethnicity of Canadians, with performance varying by specific ethnic category.
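A minimal sketch of name-based ethnicity classification with character substrings and regularized logistic regression, one of the classifier families listed above. The names and labels below are invented placeholders, not 1901 census records, and census location features are omitted.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

names = ["jean tremblay", "giovanni rossi", "li wei", "sean o'brien",
         "marie gagnon", "marco bianchi", "wang fang", "patrick murphy"]
labels = ["French", "Italian", "Chinese", "Irish",
          "French", "Italian", "Chinese", "Irish"]

pipeline = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # name substrings
    LogisticRegression(C=1.0, max_iter=1000),                 # regularized classifier
)
pipeline.fit(names, labels)
print(pipeline.predict(["luigi moretti", "claire bouchard"]))
```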

