Predicting poverty. Data mining approaches to the health and demographic surveillance system in Cuatro Santos, Nicaragua

2019 ◽  
Vol 18 (1) ◽  
Author(s):  
C. Källestål ◽  
E. Blandón Zelaya ◽  
R. Peña ◽  
W. Peréz ◽  
M. Contreras ◽  
...  

Abstract Background: To further identify the interventions needed for continued poverty reduction in our study area, Cuatro Santos in northern Nicaragua, we aimed to elucidate what predicts poverty, measured by the Unsatisfied Basic Need index. The analysis applied decision tree methodology to the Cuatro Santos health and demographic surveillance databases. Methods: Using variables derived from the 2014 health and demographic surveillance update, and transferring individual data to the household level, we used the decision tree framework Conditional Inference trees to predict the outcome “poverty,” defined as two to four unsatisfied basic needs on the Unsatisfied Basic Need index. We further validated the trees with Conditional random forest analyses to assess and rank the importance of the predictors by their ability to explain the variation in the outcome “poverty.” The majority of the Cuatro Santos households provided information, and the included variables measured housing conditions, assets, demographic experiences since the last update (5 years), earlier participation in interventions, and food security during the last 4 weeks. Results: Poverty was rare in households that had some assets and a member with education beyond primary school. For these households, participation in the intervention that installed piped water with a water meter was the most important predictor, but the resulting tree showed the same results when this variable was excluded. When assets were not taken into consideration, the importance of education as a predictor of welfare was pronounced. The results were further strengthened by the validation with Conditional random forest modeling, which identified the same variables as important for predicting the outcome as the CI tree analysis. As assets can be a result rather than a predictor of affluence, our results in summary point specifically to the importance of education and participation in the water installation intervention as predictors of greater affluence. Conclusion: Predictors of poverty are useful for directing interventions, and in the Cuatro Santos area education seems the most important to prioritize. Hopefully, the lessons learned can support the continued development of the Cuatro Santos communities as well as of similar poor rural settings around the world.

2020 ◽  
Vol 48 (1) ◽  
pp. 101-135
Author(s):  
Volodymyr Dekalo

Abstract The present paper deals with item- and feature-based changes of the modal semi-schematic construction with verstehen in written German during the 20th century. To trace this development, the century is divided into four equal periods. Applying a simple collexeme analysis to each time span, the study ascertains which lexical verbs appear as typical items in the schematic slot and thus constitute its collostructional profile. A manual pairwise comparison of the distributional behavior reveals that only three verbs, namely machen, umgehen and meistern, remain highly attracted among the top collexemes of the verstehen-construction throughout the 20th century. Using a dependency-based semantic space model, the study demonstrates that the collostructional profile of the fourth period differs considerably from that of the previous time span. A random forest of conditional inference trees is used to pinpoint changes in the usage features of the modal construction. The results show that its degree of grammaticality has not increased: there are only minor changes in temporal functionality and in the realization of subject forms.


2019 ◽  
Vol 11 (4) ◽  
pp. 555-581
Author(s):  
JOAN BYBEE ◽  
RICARDO NAPOLEÃO DE SOUZA

Abstract Using ten English adjectives, this study tests the hypothesis that the vowels in adjectives in predicative constructions are longer than those in attributive constructions in spoken conversation. The analyses considered a number of factors: occurrence before a pause, lexical adjective, vowel identity, probability given surrounding words, and others. Two sets of statistical techniques were used: a Mixed-effects model and the Random Forest Analysis based on Conditional Inference Trees (CIT). Both analyses showed strong effects of predicative vs. attributive constructions and individual lexical adjectives on vowel duration in the predicted direction, as well as effects of many of the phonological variables tested. The results showed that the longer duration in the predicative construction is not due to lengthening before a pause, though it is related to whether the adjective is internal or final in the predicative construction. Nor is the effect attributable solely to the probability of the occurrence of the adjective; rather construction type has to be taken into account. The two statistical techniques complement each other, with the Mixed-effects model showing very general trends over all the data, and the Random Forest / CIT analysis showing factors that affect only subsets of the data.


2018 ◽  
Vol 16 (3) ◽  
pp. 325-339
Author(s):  
Abbas A. Rezaee ◽  
Majid Nemati ◽  
Seyyed Ehsan Golparvar

The present research examines the relative importance of the competing motivators of the sequencing of reason clauses in a corpus of research articles in applied linguistics. All the finite reason clauses in this corpus were collected together with their main clauses. A random forest of conditional inference trees was used for the statistical modelling. The findings showed that sentence-final reason clauses outnumber sentence-initial ones. Moreover, subordinator choice and bridging, which are discourse-pragmatic constraints on clause positioning, emerged as the two most powerful predictors of the ordering of reason clauses in this corpus. Furthermore, the complexity of the clause turned out to be a stronger processing-related predictor than the length of the clause.


2020 ◽  
Vol 4 (Supplement_1) ◽  
pp. 268-269
Author(s):  
Jaime Speiser ◽  
Kathryn Callahan ◽  
Jason Fanning ◽  
Thomas Gill ◽  
Anne Newman ◽  
...  

Abstract Advances in computational algorithms and the availability of large datasets with clinically relevant characteristics provide an opportunity to develop machine learning prediction models to aid in the diagnosis, prognosis, and treatment of older adults. Some studies have employed machine learning methods for prediction modeling, but skepticism about these methods remains because of a lack of reproducibility and the difficulty of understanding the complex algorithms behind the models. We aim to provide an overview of two common machine learning methods: decision tree and random forest. We focus on these methods because they provide a high degree of interpretability. We discuss the underlying algorithms of decision tree and random forest methods and present a tutorial for developing prediction models for serious fall injury using data from the Lifestyle Interventions and Independence for Elders (LIFE) study. Decision tree is a machine learning method that produces a model resembling a flow chart. Random forest consists of a collection of many decision trees whose results are aggregated. In the tutorial example, we discuss evaluation metrics and interpretation for these models. As illustrated with data from the LIFE study, prediction models for serious fall injury performed moderately at best (area under the receiver operating characteristic curve of 0.54 for decision tree and 0.66 for random forest). Machine learning methods may offer improved performance compared to traditional models for modeling outcomes in aging, but their use should be justified and their output should be carefully described. Models should be assessed by clinical experts to ensure compatibility with clinical practice.
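
To make the evaluation metric quoted in the abstract above concrete, here is a minimal sketch of the area under the ROC curve, computed as the fraction of (positive, negative) pairs that a model ranks correctly. The function name `auc` and the toy labels and scores are illustrative assumptions and do not come from the LIFE study. Under this definition a value of 0.5 corresponds to chance-level ranking, which helps put the reported 0.54 and 0.66 in context.

```python
from itertools import product


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Counts the share of (positive, negative) pairs in which the positive
    example receives the higher score; tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = event, 0 = no event); not study data.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.3, 0.2, 0.6]
toy_auc = auc(labels, scores)  # 4 of the 6 pairs are ranked correctly
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations; for larger datasets a rank-based computation gives the same value more efficiently.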


2021 ◽  
Vol 11 (4) ◽  
pp. 1378
Author(s):  
Seung Hyun Lee ◽  
Jaeho Son

Carrying objects heavier than a certain weight is known to be a major source of physical burden on the musculoskeletal system of workers at construction sites. However, because many workers operate simultaneously in an irregular space, it is difficult to determine in real time the weight of an object a worker carries or to keep track of workers carrying excess weight. This paper proposes a prototype system that tracks the weight of heavy objects carried by construction workers using smart safety shoes fitted with FSR (Force Sensitive Resistor) sensors. The system consists of the smart safety shoes with sensors attached, a mobile device for collecting the initial sensing data, and a web-based server computer for storing, preprocessing and analyzing the data. The effectiveness and accuracy of the weight tracking system were verified through experiments in which each experimenter lifted added weights from +0 kg to +20 kg in 5 kg increments. The experimental results were analyzed with a newly developed machine learning based model that adopts classification algorithms such as decision tree, random forest, gradient boosting algorithm (GBM), and light GBM. The classification algorithms achieved similar and high average accuracy in classifying the weight, in the following order: random forest (90.9%), light GBM (90.5%), decision tree (90.3%), and GBM (89%). Overall, the proposed weight tracking system has an average accuracy of 90.2% in classifying how much weight each experimenter carries.
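
As a quick arithmetic check, the overall figure in the abstract above appears to be the unweighted mean of the four per-algorithm accuracies. The sketch below assumes exactly that; the abstract does not state how the average was formed, and the variable names are illustrative.

```python
# Per-algorithm accuracies (percent) as reported in the abstract.
accuracies = {
    "random forest": 90.9,
    "light GBM": 90.5,
    "decision tree": 90.3,
    "GBM": 89.0,
}

# Unweighted mean across the four classifiers (an assumption about
# how the overall accuracy was computed).
mean_accuracy = sum(accuracies.values()) / len(accuracies)

# Algorithms ordered from most to least accurate.
ranking = sorted(accuracies, key=accuracies.get, reverse=True)
```

The mean comes to about 90.2, consistent with the reported overall accuracy.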


2021 ◽  
Vol 15 (1) ◽  
Author(s):  
Moaz Hiba ◽  
Ahmed Farid Ibrahim ◽  
Salaheldin Elkatatny ◽  
Abdulwahab Ali
