A multivariate model for predicting segmental body composition

2013 ◽  
Vol 110 (12) ◽  
pp. 2260-2270 ◽  
Author(s):  
Simiao Tian ◽  
Laurence Mioche ◽  
Jean-Baptiste Denis ◽  
Béatrice Morio

The aims of the present study were to propose a multivariate model for simultaneously predicting body, trunk and appendicular fat and lean masses from easily measured variables and to compare its predictive capacity with that of the available univariate models that predict body fat percentage (BF%). The dual-energy X-ray absorptiometry (DXA) dataset (52 % men and 48 % women) with White, Black and Hispanic ethnicities (1999–2004, National Health and Nutrition Examination Survey) was randomly divided into three sub-datasets: a training dataset (TRD), a test dataset (TED) and a validation dataset (VAD), comprising 3835, 1917 and 1917 subjects, respectively. For each sex, several multivariate prediction models were fitted from the TRD using age, weight, height and possibly waist circumference. The most accurate model was selected from the TED and then applied to the VAD and a French DXA dataset (French DB) (526 men and 529 women) to assess the prediction accuracy in comparison with that of five published univariate models, for which adjusted formulas were re-estimated using the TRD. Waist circumference was found to improve the prediction accuracy, especially in men. For BF%, the standard error of prediction (SEP) values were 3·26 (3·75) % for men and 3·47 (3·95) % for women in the VAD (French DB), as good as those of the adjusted univariate models. Moreover, the SEP values for the prediction of body and appendicular lean masses ranged from 1·39 to 2·75 kg for both sexes. The prediction accuracy was best for age < 65 years, BMI < 30 kg/m² and the Hispanic ethnicity. The application of our multivariate model to large populations could be useful to address various public health issues.
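
The three-way random split and the SEP metric described above can be sketched as follows. This is a minimal illustration, not the authors' code: the subject identifiers are hypothetical, and SEP is assumed here to be the root mean squared difference between predicted and measured values, a definition the abstract does not spell out.

```python
import math
import random

def three_way_split(ids, sizes, seed=0):
    """Shuffle ids and cut them into consecutive chunks of the given sizes."""
    ids = list(ids)
    if sum(sizes) != len(ids):
        raise ValueError("sizes must add up to the number of ids")
    random.Random(seed).shuffle(ids)
    parts, start = [], 0
    for size in sizes:
        parts.append(ids[start:start + size])
        start += size
    return parts

def sep(predicted, measured):
    """Root mean squared prediction error (assumed definition of SEP)."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)

# Hypothetical subject identifiers; the sizes are those quoted in the abstract.
trd, ted, vad = three_way_split(range(7669), (3835, 1917, 1917))
```

Fixing the seed keeps the split reproducible, so that model selection on the TED and the final assessment on the VAD always see the same subjects.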

2018 ◽  
Vol 2018 ◽  
pp. 1-15 ◽  
Author(s):  
Tanuj Chopra ◽  
Manoranjan Parida ◽  
Naveen Kwatra ◽  
Palika Chopra

The objective of the present study is to develop models that predict the deterioration of pavement distress in an urban road network. Genetic programming (GP) was used to develop five models for the prediction of pavement distress: Model 1 for cracking progression, Model 2 for ravelling progression, Model 3 for pothole progression, Model 4 for rutting progression, and Model 5 for roughness progression. The data were collected during the years 2012–2015 from a network of 16 roads in Patiala City, Punjab, India. The data were divided into two sets: a training dataset (data collected during 2012 and 2013) and a validation dataset (data collected during 2014 and 2015). Two fitness functions were used to evaluate the models: the coefficient of determination (R²) and the root mean square error (RMSE). It is inferred that the GP models predict pavement distress with high accuracy and can help decision makers allocate funds adequately and on time for the preservation of the urban road network.
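
The two fitness functions named above can be computed as in the sketch below. It assumes the coefficient of determination is defined as one minus the ratio of residual to total sum of squares; the study may instead use the squared correlation. The function names are illustrative.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

A model that fits well has an RMSE close to zero and an R² close to one; the two are usually reported together because RMSE keeps the units of the distress measure while R² is unitless.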


2020 ◽  
Vol 11 ◽  
pp. 374
Author(s):  
Masahito Katsuki ◽  
Yukinari Kakizawa ◽  
Akihiro Nishikawa ◽  
Yasunaga Yamamoto ◽  
Toshiya Uchiyama

Background: Reliable prediction models of subarachnoid hemorrhage (SAH) outcomes are needed for treatment decision-making. The SAFIRE score, which uses only four variables, is a good prediction scoring system. However, building such prediction models requires a large number of samples and time-consuming statistical analysis. Deep learning (DL), a branch of artificial intelligence, is attractive, but there have been no reports on prediction models for SAH outcomes using DL. We herein made a prediction model using the DL software Prediction One (Sony Network Communications Inc., Tokyo, Japan) and compared it with the SAFIRE score. Methods: We used data from 153 consecutive patients with aneurysmal SAH treated in our hospital between 2012 and 2019. A modified Rankin Scale (mRS) score of 0–3 at 6 months was defined as a favorable outcome. We randomly divided the patients into a 102-patient training dataset and a 51-patient external validation dataset. Prediction One made the prediction model using the training dataset with internal cross-validation. We used both the created model and the SAFIRE score to predict the outcomes in the external validation dataset. The areas under the curve (AUCs) were compared. Results: The model made by Prediction One using 28 variables had an AUC of 0.848, and its AUC for the validation dataset was 0.953 (95% CI 0.900–1.000). The AUCs calculated using the SAFIRE score were 0.875 for the training dataset and 0.960 for the validation dataset. Conclusion: We easily and quickly made prediction models using Prediction One, even with a small single-center dataset. The accuracy of the model was not markedly inferior to that of previous statistically calculated prediction models.
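
The AUC used for the comparison above can be computed directly from predicted scores and binary outcomes. The sketch below uses the rank-based (Mann-Whitney) formulation; it is an illustration only and says nothing about how Prediction One computes the value internally.

```python
def auc(scores, labels):
    """Probability that a random positive case outscores a random negative one.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Here a label of 1 would stand for a favorable outcome (mRS 0–3 at 6 months) and the score for the model's predicted probability of that outcome.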


2021 ◽  
Author(s):  
Lubna Maryam ◽  
Anjali Dhall ◽  
Sumeet Patiyal ◽  
Salman Sadullah Usmani ◽  
Neelam Sharma ◽  
...  

A number of beta-lactamase variants are able to deactivate ceftazidime, the antibiotic most commonly used for treating infections caused by Gram-negative bacteria. In this study, an attempt has been made to develop a method that can predict ceftazidime-resistant strains of bacteria from the amino acid sequences of beta-lactamases. We obtained beta-lactamase proteins from the β-lactamase database, corresponding to 87 ceftazidime-sensitive and 112 ceftazidime-resistant bacterial strains. All models developed in this study were trained, tested, and evaluated on a dataset of 199 beta-lactamase proteins. We generated 9149 features for the beta-lactamases using Pfeature and selected relevant features using different algorithms in the scikit-learn package. A wide range of machine learning techniques (such as KNN, DT, RF, GNB, LR, SVC and XGB) were used to develop prediction models. Our random forest-based model achieved the maximum performance, with an AUROC of 0.80 on the training dataset and 0.79 on the validation dataset. The study also revealed that ceftazidime-resistant beta-lactamases are rich in amino acids with non-polar side chains. In contrast, ceftazidime-sensitive beta-lactamases are rich in amino acids with polar side chains and charged entities. Finally, we developed a webserver, ABCRpred, for the scientific community working in the era of antibiotic resistance to predict the antibiotic resistance/susceptibility of beta-lactamase protein sequences. The server is freely available at http://webs.iiitd.edu.in/raghava/abcrpred/.


Author(s):  
Neelam Sharma ◽  
Sumeet Patiyal ◽  
Anjali Dhall ◽  
Akshara Pande ◽  
Chakit Arora ◽  
...  

AlgPred 2.0 is a web server developed for predicting allergenic proteins and allergenic regions in a protein. It is an updated version of AlgPred, developed in 2006. The dataset used for training, testing and validation consists of 10 075 allergens and 10 075 non-allergens. In addition, 10 451 experimentally validated immunoglobulin E (IgE) epitopes were used to identify antigenic regions in a protein. All models were trained on 80% of the data, called the training dataset, and their performance was evaluated using a 5-fold cross-validation technique. The performance of the final model trained on the training dataset was evaluated on the remaining 20% of the data, called the validation dataset; no two proteins in any two sets have more than 40% similarity. First, a Basic Local Alignment Search Tool (BLAST) search was performed against the dataset, and allergens were predicted based on the level of similarity with known allergens. Second, IgE epitopes obtained from the IEDB database were searched in the dataset to predict allergens based on their presence in a protein. Third, motif-based approaches such as multiple EM for motif elicitation/motif alignment and search tool were used to predict allergens. Fourth, allergen prediction models were developed using a wide range of machine learning techniques. Finally, an ensemble approach was used for predicting allergenic proteins by combining the prediction scores of the different approaches. Our best model achieved its maximum performance, in terms of area under the receiver operating characteristic curve, of 0.98, with a Matthews correlation coefficient of 0.85 on the validation dataset. The AlgPred 2.0 web server allows the prediction of allergens, mapping of IgE epitopes, motif search and BLAST search (https://webs.iiitd.edu.in/raghava/algpred2/).
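
The Matthews correlation coefficient reported above summarises a binary confusion matrix in a single number between -1 and 1. A minimal sketch follows; the counts used in any call are illustrative and are not taken from the paper.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Unlike plain accuracy, the coefficient stays near zero for a classifier that simply predicts the majority class, which is one reason it is often reported alongside the AUC.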


Author(s):  
Md Didarul Islam ◽  
Kazi Saiful Islam ◽  
Mohammad Mia

Land use and land cover (LULC) change has significant consequences for habitat and the environment. Scholars have developed several LULC models to identify the factors behind the changes and to simulate future LULC scenarios to assist in policymaking. Nevertheless, the accuracy of these models remains contentious and a matter of ongoing research. Additionally, most of these studies used a training dataset to train the model and a validation dataset, itself a part of the original training dataset, to validate the model's accuracy. However, to establish a model's actual predictive capability, the model must be tested on a real-world dataset that was not used in modeling. We therefore present an XGBoost model to improve the accuracy of LULC prediction. In contrast to typical studies, we use a separate test dataset to establish the model's predictive capacity in a real-world scenario. The results reveal that the XGBoost model achieves the highest scores (84% kappa and 93% accuracy) compared with two benchmark models, LR-CA (82% kappa and 92% accuracy) and ANN-CA (82% kappa and 92% accuracy). We also found that the built-up area increased from 48.7% in 2002 to 64% in 2010, while agricultural and vacant land declined by almost the same magnitude over the period, and that the most important factors in the LULC shift process in Khulna city were proximity to major roads, industry and commercial establishments. The proposed model increased the predictive accuracy, making it much more reliable for analyzing and predicting urban LULC using spatial factors.
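
The accuracy and kappa scores quoted above are both derived from a confusion matrix of reference versus predicted classes. The sketch below computes overall accuracy and Cohen's kappa; it assumes the kappa in the abstract is the standard Cohen's kappa, and the matrix layout is an illustrative choice.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    matrix[i][j] is the number of cells with reference class i that the
    model assigned to class j.
    """
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Agreement expected by chance from the row and column totals.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / total ** 2
    return observed, (observed - expected) / (1.0 - expected)
```

Kappa discounts the agreement that would occur by chance, so it is always lower than accuracy when one land cover class dominates the map.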


2020 ◽  
Author(s):  
Piyush Agrawal ◽  
Dhruv Bhagat ◽  
Manish Mahalwal ◽  
Neelam Sharma ◽  
Gajendra P. S. Raghava

The increasing use of therapeutic peptides for treating cancer has received considerable attention from the scientific community in recent years. The present study describes an in silico model developed for predicting and designing anticancer peptides (ACPs). Residue composition analysis of ACPs revealed a preference for A, F, K, L and W. Positional preference analysis revealed that residues A, F and K are preferred at the N-terminus and residues L and K are preferred at the C-terminus. Motif analysis revealed the presence of motifs such as LAKLA, AKLAK, FAKL and LAKL in ACPs. Prediction models were developed using various input features and different machine learning classifiers on two datasets, a main dataset and an alternate dataset. For the main dataset, the ETree classifier-based model developed using dipeptide composition achieved a maximum MCC of 0.51 and an AUROC of 0.83 on the training dataset. For the alternate dataset, the ETree classifier-based model developed using amino acid composition performed best and achieved the highest MCC of 0.80 and an AUROC of 0.97 on the training dataset. Models were trained and tested using a five-fold cross-validation technique, and their performance was also evaluated on the validation dataset. The best models were implemented in the webserver AntiCP 2.0, freely available at https://webs.iiitd.edu.in/raghava/anticp2. The webserver is compatible with multiple screens such as iPhone, iPad, laptop, and Android phones. The standalone version of the software is provided as a GitHub package as well as a Docker image.
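
Amino acid composition and dipeptide composition, the two input features named above, can be sketched as follows. The percentage normalisation is an assumption (the abstract does not give the formula), and the example peptide in the usage note is hypothetical, assembled from the motifs listed in the abstract.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def amino_acid_composition(peptide):
    """Percentage of each standard residue in the peptide (20 features)."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    counts = Counter(peptide)
    return {aa: 100.0 * counts[aa] / len(peptide) for aa in AMINO_ACIDS}

def dipeptide_composition(peptide):
    """Percentage of each ordered residue pair among adjacent pairs (400 features)."""
    if len(peptide) < 2:
        raise ValueError("peptide needs at least two residues")
    total = len(peptide) - 1
    pairs = Counter(peptide[i:i + 2] for i in range(total))
    return {
        a + b: 100.0 * pairs[a + b] / total
        for a, b in product(AMINO_ACIDS, repeat=2)
    }
```

For the hypothetical peptide `"LAKLAK"`, the residues L, A and K each make up one third of the sequence, and the adjacent pairs are LA, AK, KL, LA and AK. Either dictionary can be flattened into a fixed-length feature vector for a classifier.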


2020 ◽  
Vol 27 ◽  
Author(s):  
Zaheer Ullah Khan ◽  
Dechang Pi

Background: S-sulfenylation (S-sulphenylation, or sulfenic acid) of proteins is a special kind of post-translational modification that plays an important role in various physiological and pathological processes such as cytokine signaling, transcriptional regulation, and apoptosis. Given this significance, and to complement existing wet-lab methods, several computational models have been developed for predicting sulfenylation cysteine sites. However, the performance of these models was not satisfactory owing to inefficient feature schemes, severe class imbalance, and the lack of an intelligent learning engine. Objective: In this study, our motivation is to establish a strong and novel computational predictor for discriminating sulfenylation from non-sulfenylation sites. Methods: We report an innovative bioinformatics feature encoding tool, named DeepSSPred, in which the encoded features are obtained via an n-segmented hybrid feature, and the resampling technique called synthetic minority oversampling is then employed to cope with the severe imbalance between SC-sites (minority class) and non-SC sites (majority class). A state-of-the-art 2D convolutional neural network was employed with a rigorous 10-fold jackknife cross-validation technique for model validation and authentication. Results: The proposed framework, with a strong discrete presentation of the feature space, a machine learning engine, and an unbiased presentation of the underlying training data, yielded an excellent model that outperforms all existing established studies. The proposed approach is 6% higher in terms of MCC than the best existing method. The best existing study failed to provide sufficient details on an independent dataset. The model obtained an increase of 7.5% in accuracy, 1.22% in Sn, 12.91% in Sp and 13.12% in MCC on the training data and 12.13% in ACC, 27.25% in Sn, 2.25% in Sp, and 30.37% in MCC on an independent dataset in comparison with the second-best method. These empirical analyses show the superior performance of the proposed model on both the training and independent datasets in comparison with existing studies. Conclusion: In this research, we have developed a novel sequence-based automated predictor for SC-sites, called DeepSSPred. The empirical simulation outcomes with a training dataset and an independent validation dataset have revealed the efficacy of the proposed theoretical model. The good performance of DeepSSPred is due to several factors, such as novel discriminative feature encoding schemes, the SMOTE technique, and careful construction of the prediction model through the tuned 2D-CNN classifier. We believe that our research work will provide potential insight into the further prediction of S-sulfenylation characteristics and functionalities. We therefore hope that our predictor will be significantly helpful for large-scale discrimination of unknown SC-sites in particular and for designing new pharmaceutical drugs in general.
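
The synthetic minority oversampling step mentioned above can be illustrated with a small pure-Python sketch of the general SMOTE idea: each new sample is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours. This is not the authors' implementation; the function name, the default `k`, and the sample points are all hypothetical.

```python
import math
import random

def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the chosen sample, excluding itself.
        others = [x for j, x in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda x: math.dist(base, x))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Hypothetical 2D feature vectors standing in for encoded SC-sites.
sc_sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
extra = smote(sc_sites, n_new=6, k=2)
```

Oversampling of this kind should be applied to the training folds only, so that synthetic points never leak into the data used for evaluation.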


Author(s):  
Saheb Foroutaifar

The main objectives of this study were to compare the prediction accuracy of different Bayesian methods for traits with a wide range of genetic architectures using simulated and real data, and to assess the sensitivity of these methods to violations of their assumptions. For the simulation study, different scenarios were implemented based on two traits with low or high heritability, different numbers of QTLs and different distributions of QTL effects. For the real data analysis, a German Holstein dataset for milk fat percentage, milk yield, and somatic cell score was used. The simulation results showed that, with the exception of Bayes R, the methods were sensitive to changes in the number of QTLs and the distribution of QTL effects. Having a distribution of QTL effects similar to what the different Bayesian methods assume for estimating marker effects did not improve their prediction accuracy. The Bayes B method gave accuracy higher than or equal to that of the other methods. The real data analysis showed that, as in the simulation scenarios with a large number of QTLs, there was no difference between the accuracies of the different methods for any of the traits.


2021 ◽  
Vol 13 (7) ◽  
pp. 3870
Author(s):  
Mehrbakhsh Nilashi ◽  
Shahla Asadi ◽  
Rabab Ali Abumalloh ◽  
Sarminah Samad ◽  
Fahad Ghabban ◽  
...  

This study aims to develop a new approach based on machine learning techniques to assess sustainability performance. Two main dimensions of sustainability, ecological sustainability and human sustainability, were considered in this study. A set of sustainability indicators was used, and the research method was developed using cluster analysis and prediction learning techniques. A Self-Organizing Map (SOM) was applied for data clustering, while Classification and Regression Trees (CART) were applied to assess sustainability performance. The proposed method was evaluated on the Sustainability Assessment by Fuzzy Evaluation (SAFE) dataset, which comprises various indicators of sustainability performance in 128 countries. Eight clusters were found in the data through the SOM clustering technique. A prediction model was built in each cluster through the CART technique. In addition, an ensemble of CART was constructed in each SOM cluster to increase the prediction accuracy of CART. All prediction models were assessed using the adjusted coefficient of determination. The results demonstrated that the prediction accuracy values were high in all CART models. The results also indicated that the method combining ensembles of CART with clustering provides higher prediction accuracy than individual CART models. The main advantage of the proposed method is its ability to automate the extraction of decision rules from big data for prediction models. The method proposed in this study could be implemented as an effective tool for sustainability performance assessment.
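
The adjusted coefficient of determination used to assess the models penalises R² for the number of predictors. A minimal sketch of the standard formula follows; the argument values in any call are illustrative and are not taken from the study.

```python
def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof
```

Because each SOM cluster holds only a subset of the 128 countries, the penalty term matters: with few samples per cluster, adding indicators can raise R² while lowering the adjusted value.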


BMJ ◽  
2021 ◽  
pp. n365
Author(s):  
Buyun Liu ◽  
Yang Du ◽  
Yuxiao Wu ◽  
Linda G Snetselaar ◽  
Robert B Wallace ◽  
...  

Objective: To examine the trends in obesity and adiposity measures, including body mass index, waist circumference, body fat percentage, and lean mass, by race or ethnicity among adults in the United States from 2011 to 2018. Design: Population based study. Setting: National Health and Nutrition Examination Survey (NHANES), 2011-18. Participants: A nationally representative sample of US adults aged 20 years or older. Main outcome measures: Weight, height, and waist circumference among adults aged 20 years or older were measured by trained technicians using standardized protocols. Obesity was defined as body mass index of 30 or higher for non-Asians and 27.5 or higher for Asians. Abdominal obesity was defined as a waist circumference of 102 cm or larger for men and 88 cm or larger for women. Body fat percentage and lean mass were measured among adults aged 20-59 years by using dual energy x ray absorptiometry. Results: This study included 21 399 adults from NHANES 2011-18. Body mass index was measured for 21 093 adults, waist circumference for 20 080 adults, and body fat percentage for 10 864 adults. For the overall population, age adjusted prevalence of general obesity increased from 35.4% (95% confidence interval 32.5% to 38.3%) in 2011-12 to 43.4% (39.8% to 47.0%) in 2017-18 (P for trend<0.001), and age adjusted prevalence of abdominal obesity increased from 54.5% (51.2% to 57.8%) in 2011-12 to 59.1% (55.6% to 62.7%) in 2017-18 (P for trend=0.02). Age adjusted mean body mass index increased from 28.7 (28.2 to 29.1) in 2011-12 to 29.8 (29.2 to 30.4) in 2017-18 (P for trend=0.001), and age adjusted mean waist circumference increased from 98.4 cm (97.4 to 99.5 cm) in 2011-12 to 100.5 cm (98.9 to 102.1 cm) in 2017-18 (P for trend=0.01). Significant increases were observed in body mass index and waist circumference among the Hispanic, non-Hispanic white, and non-Hispanic Asian groups (all P for trend<0.05), but not for the non-Hispanic black group. For body fat percentage, a significant increase was observed among non-Hispanic Asians (30.6%, 29.8% to 31.4% in 2011-12; 32.7%, 32.0% to 33.4% in 2017-18; P for trend=0.001), but not among other racial or ethnic groups. The age adjusted mean lean mass decreased in the non-Hispanic black group and increased in the non-Hispanic Asian group, but no statistically significant changes were found in other racial or ethnic groups. Conclusions: Among US adults, an increasing trend was found in obesity and adiposity measures from 2011 to 2018, although disparities exist among racial or ethnic groups.
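
The obesity definitions stated in the abstract translate directly into threshold checks. The sketch below encodes only those cut-offs; the function names and the `"male"`/`"female"` keys are illustrative choices, and body mass index is computed with the usual weight (kg) divided by height (m) squared.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2

def is_obese(bmi_value, asian):
    """General obesity: BMI >= 27.5 for Asians, >= 30 for non-Asians."""
    return bmi_value >= (27.5 if asian else 30.0)

def is_abdominally_obese(waist_cm, sex):
    """Abdominal obesity: waist >= 102 cm for men, >= 88 cm for women."""
    thresholds = {"male": 102.0, "female": 88.0}
    return waist_cm >= thresholds[sex]
```

The same body mass index can therefore fall on different sides of the obesity definition depending on the ethnicity-specific cut-off, which matters when comparing prevalence across groups.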

