Abstract 64: An Integrated Model for Titin Truncation Mutation Interpretation

2016 ◽  
Vol 119 (suppl_1) ◽  
Author(s):  
Jun Zou ◽  
Diana Tran ◽  
Angelo Pelonero ◽  
Rahul C Deo

Background: We recently discovered a conserved internal promoter in the Titin gene, which explains why truncating mutations in the C-terminal two thirds of the zebrafish ttna protein result in more severe disease, recapitulating a puzzling observation in human dilated cardiomyopathy (DCM) patients. Here we focus on the contribution of alternative splicing to the DCM phenotype, both in zebrafish Titin truncation mutants and in the context of an integrative model for Titin mutation interpretation. Methods and Results: Using CRISPR/Cas9, we disrupted an alternatively spliced exon in the I-band of Titin, normally present in zebrafish heart but absent in skeletal muscle. The resulting mutants had, on average, a milder cardiac phenotype than those with mutations in constitutive exons but also showed striking inter-sibling variability in disease expression, ranging from intact cardiac blood flow to severe early demise. The mutant exon demonstrated nonsense-altered splicing, and disease severity paralleled selective deficiency in Titin transcript level, implying that variability in mutated exon inclusion coupled with nonsense-mediated decay (NMD) modulated phenotype. We next amassed Titin mutation information from 1785 human DCM cases and >68,000 controls to model mutation distribution and found three variance components: 1) splicing; 2) internal isoform disruption; and 3) targeting of the C-terminal 2000 amino acids. An integrated model demonstrated strong predictive performance, with an area under the receiver operating characteristic curve of 0.79, and correctly identified the highest risk individuals. Conclusions: We conclude that genetically targeted models and large-scale human data can be complementary in overcoming the challenges of genetic data interpretation.

2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Zi-Hui Tang ◽  
Fangfang Zeng ◽  
Zhongtao Li ◽  
Linuo Zhou

Background. The purpose of this study was to evaluate the predictive value of DM and resting HR on CAN in a large sample derived from a Chinese population. Materials and Methods. We conducted a large-scale, population-based, cross-sectional study to explore the relationships of CAN with DM and resting HR. A total of 387 subjects were diagnosed with CAN in our dataset. The associations of CAN with DM and resting HR were assessed by a multivariate logistic regression (MLR) analysis (using subjects without CAN as a reference group) after controlling for potential confounding factors. The area under the receiver-operating characteristic curve (AUC) was used to evaluate the predictive performance of resting HR and DM. Results. A tendency toward increased CAN prevalence with increasing resting HR was reported (P for trend < 0.001). MLR analysis showed that DM and resting HR were significantly and independently associated with CAN (P < 0.001 for both). Resting HR alone and resting HR combined with DM (DM-HR) both predicted CAN (AUC = 0.719, 95% CI 0.690–0.748 for resting HR and AUC = 0.738, 95% CI 0.710–0.766 for DM-HR). Conclusion. Our findings signify that resting HR and DM-HR have a high value in predicting CAN in the general population.
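Several abstracts in this listing report an area under the receiver operating characteristic curve (AUC). As a rough illustration of what that number measures, the sketch below computes AUC as the fraction of positive/negative pairs in which the positive case has the higher score, with ties counted as one half. The function name `auc` and the heart-rate values are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive case outranks a
    random negative case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical resting heart rates (beats/min) and CAN status (1 = CAN).
resting_hr = [62, 70, 75, 81, 88, 66, 79, 92]
has_can = [0, 0, 1, 0, 1, 0, 1, 1]

print(auc(resting_hr, has_can))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives some context for the values of about 0.72 to 0.74 reported above.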


2021 ◽  
Vol 9 ◽  
Author(s):  
Yin Xing ◽  
Jianping Yue ◽  
Zizheng Guo ◽  
Yang Chen ◽  
Jia Hu ◽  
...  

Integration of different models may improve the performance of landslide susceptibility assessment, but few studies have tested it. The present study explores a way to integrate different models and compares the results of integrated and individual models. Our objective is to answer this question: Will the integrated model have higher accuracy than the individual models? The Lvliang mountains area, a landslide-prone area in China, was taken as the study area, and ten factors were considered in the influencing factors system. Three basic machine learning models (the back propagation (BP), support vector machine (SVM), and random forest (RF) models) were integrated by an objective function in which the weight coefficients of the different models were computed by the gray wolf optimization (GWO) algorithm. 80 and 20% of the landslide data were randomly selected as the training and testing samples, respectively, and different landslide susceptibility maps were generated based on the GIS platform. The results illustrated that the accuracy, expressed by the area under the receiver operating characteristic curve (AUC), of the BP-SVM-RF integrated model was the highest (0.7898), which was better than that of the BP (0.6929), SVM (0.6582), RF (0.7258), BP-SVM (0.7360), BP-RF (0.7569), and SVM-RF models (0.7298). The experimental results supported the effectiveness of the BP-SVM-RF method, which can be a reliable model for the regional landslide susceptibility assessment of the study area. Moreover, the proposed procedure can be a good option for integrating different models to seek an “optimal” result.
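The weighted integration described above can be sketched as a convex combination of the individual model scores, with the weights chosen to maximize AUC. The abstract computes the weights with gray wolf optimization; the sketch below swaps in a plain grid search over weights that sum to one, purely to keep the example short. All names (`combine`, `search_weights`) and all scores are hypothetical.

```python
from itertools import product


def auc(scores, labels):
    """Pairwise-ranking AUC; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def combine(model_scores, weights):
    """Weighted sum of per-model scores for each sample."""
    n_samples = len(model_scores[0])
    return [sum(w * s[i] for w, s in zip(weights, model_scores))
            for i in range(n_samples)]


def search_weights(model_scores, labels, steps=10):
    """Grid search (a stand-in for GWO) over weights summing to 1."""
    best_w, best_auc = None, -1.0
    for parts in product(range(steps + 1), repeat=len(model_scores)):
        if sum(parts) != steps:
            continue  # keep only weight vectors on the simplex
        w = tuple(p / steps for p in parts)
        a = auc(combine(model_scores, w), labels)
        if a > best_auc:
            best_w, best_auc = w, a
    return best_w, best_auc


# Hypothetical susceptibility scores from three models on six cells.
bp = [0.2, 0.6, 0.4, 0.7, 0.3, 0.8]
svm = [0.5, 0.4, 0.3, 0.6, 0.2, 0.9]
rf = [0.1, 0.5, 0.6, 0.8, 0.4, 0.7]
labels = [0, 0, 1, 1, 0, 1]  # 1 = landslide observed

best_w, best_auc = search_weights([bp, svm, rf], labels)
print(best_w, best_auc)
```

Because the grid includes the weight vectors that select a single model, the combined AUC on the fitting data can never be lower than the best individual AUC. In practice the weights would be fitted on training samples and the comparison made on held-out samples, as in the 80/20 split the abstract describes.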


Cancers ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 2111
Author(s):  
Bo-Wei Zhao ◽  
Zhu-Hong You ◽  
Lun Hu ◽  
Zhen-Hao Guo ◽  
Lei Wang ◽  
...  

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with time-consuming and labor-intensive in vivo experimental methods, computational models can provide high-quality DTI candidates quickly. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI can capture the local and global structural information of the graph. Specifically, the first-order neighbor information of nodes is aggregated by a graph convolutional network (GCN), while the high-order neighbor information of nodes is learned by the graph embedding method DeepWalk. Finally, the two kinds of features are fed into a random forest classifier to train and predict potential DTIs. The results show that our method obtained an area under the receiver operating characteristic curve (AUROC) of 0.9455 and an area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. Moreover, we compare the presented method with some existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. The proposed model is also expected to bring new inspiration and provide novel perspectives to relevant researchers.


2021 ◽  
Vol 13 (11) ◽  
pp. 2074
Author(s):  
Ryan R. Reisinger ◽  
Ari S. Friedlaender ◽  
Alexandre N. Zerbini ◽  
Daniel M. Palacios ◽  
Virginia Andrews-Goff ◽  
...  

Machine learning algorithms are often used to model and predict animal habitat selection—the relationships between animal occurrences and habitat characteristics. For broadly distributed species, habitat selection often varies among populations and regions; thus, it would seem preferable to fit region- or population-specific models of habitat selection for more accurate inference and prediction, rather than fitting large-scale models using pooled data. However, where the aim is to make range-wide predictions, including areas for which there are no existing data or models of habitat selection, how can regional models best be combined? We propose that ensemble approaches commonly used to combine different algorithms for a single region can be reframed, treating regional habitat selection models as the candidate models. By doing so, we can incorporate regional variation when fitting predictive models of animal habitat selection across large ranges. We test this approach using satellite telemetry data from 168 humpback whales across five geographic regions in the Southern Ocean. Using random forests, we fitted a large-scale model relating humpback whale locations, versus background locations, to 10 environmental covariates, and made a circumpolar prediction of humpback whale habitat selection. We also fitted five regional models, the predictions of which we used as input features for four ensemble approaches: an unweighted ensemble, an ensemble weighted by environmental similarity in each cell, stacked generalization, and a hybrid approach wherein the environmental covariates and regional predictions were used as input features in a new model. We tested the predictive performance of these approaches on an independent validation dataset of humpback whale sightings and whaling catches. These multiregional ensemble approaches resulted in models with higher predictive performance than the circumpolar naive model. 
These approaches can be used to incorporate regional variation in animal habitat selection when fitting range-wide predictive models using machine learning algorithms. This can yield more accurate predictions across regions or populations of animals that may show variation in habitat selection.
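Two of the four ensemble approaches named above, the unweighted ensemble and the ensemble weighted by environmental similarity in each cell, reduce to per-cell averages of the regional predictions. The sketch below is a minimal illustration under that reading; the function names, the three regions, and all numbers are hypothetical, and how the similarity values are derived is left out.

```python
def unweighted_ensemble(regional_preds):
    """Mean of the regional model predictions in each grid cell."""
    n_models = len(regional_preds)
    return [sum(cell) / n_models for cell in zip(*regional_preds)]


def weighted_ensemble(regional_preds, similarity):
    """Per-cell mean weighted by each region's environmental similarity.

    Falls back to the unweighted mean when all weights in a cell are zero.
    """
    combined = []
    for preds, sims in zip(zip(*regional_preds), zip(*similarity)):
        total = sum(sims)
        if total == 0:
            combined.append(sum(preds) / len(preds))
        else:
            combined.append(sum(p * s for p, s in zip(preds, sims)) / total)
    return combined


# Hypothetical predictions from three regional models for two grid cells.
preds = [
    [0.2, 0.8],  # region A model
    [0.4, 0.6],  # region B model
    [0.6, 0.1],  # region C model
]
# Hypothetical similarity of each cell to each region's environment.
sims = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 0.0],
]

print(unweighted_ensemble(preds))
print(weighted_ensemble(preds, sims))
```

With these weights, the first cell takes its value entirely from the region A model and the second from the region B model, whereas the unweighted ensemble treats all regions equally everywhere.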


Author(s):  
Kazutaka Uchida ◽  
Junichi Kouno ◽  
Shinichi Yoshimura ◽  
Norito Kinjo ◽  
Fumihiro Sakakibara ◽  
...  

Abstract: With recent advances in machine learning (ML), such technologies have been applied in various fields owing to their high predictive performance. We aimed to develop a prehospital stroke scale with ML. We conducted a multi-center retrospective and prospective cohort study. The training cohort comprised eight centers in Japan from June 2015 to March 2018, and the test cohort comprised 13 centers from April 2019 to March 2020. We used three different ML algorithms (logistic regression, random forests, XGBoost) to develop models. The main outcomes were large vessel occlusion (LVO), intracranial hemorrhage (ICH), subarachnoid hemorrhage (SAH), and cerebral infarction (CI) other than LVO. The predictive abilities were validated in the test cohort with accuracy, positive predictive value, sensitivity, specificity, area under the receiver operating characteristic curve (AUC), and F score. The training cohort included 3178 patients with 337 LVO, 487 ICH, 131 SAH, and 676 CI cases, and the test cohort included 3127 patients with 183 LVO, 372 ICH, 90 SAH, and 577 CI cases. The overall accuracies were 0.65, and the positive predictive values, sensitivities, specificities, AUCs, and F scores were stable in the test cohort. The classification abilities were also fair for all ML models. The AUCs for LVO of logistic regression, random forests, and XGBoost were 0.89, 0.89, and 0.88, respectively, in the test cohort, and these values were higher than those of previously reported prediction models for LVO. The ML models developed to predict the probability and types of stroke at the prehospital stage had superior predictive abilities.
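The validation metrics listed above can all be derived from a confusion matrix. The sketch below computes them for a single outcome treated as one-vs-rest (for example, LVO versus everything else). The counts are hypothetical, and the F score is assumed here to be the F1 score, which the abstract does not specify.

```python
def binary_metrics(tp, fp, fn, tn):
    """Common validation metrics from one-vs-rest confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    ppv = tp / (tp + fp)            # positive predictive value (precision)
    sensitivity = tp / (tp + fn)    # recall, true positive rate
    specificity = tn / (tn + fp)    # true negative rate
    # F1 is an assumption: the harmonic mean of PPV and sensitivity.
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "accuracy": accuracy,
        "ppv": ppv,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for one outcome in a test cohort of 1000 patients.
metrics = binary_metrics(tp=80, fp=20, fn=20, tn=880)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When an outcome is rare, accuracy and specificity can look high even if sensitivity and PPV are modest, which is one reason to report several of these metrics together.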


Author(s):  
Prasad Nagakumar ◽  
Ceri-Louise Chadwick ◽  
Andrew Bush ◽  
Atul Gupta

Abstract: The COVID-19 pandemic caused by the SARS-CoV-2 virus fortunately resulted in few children suffering from severe disease. However, the collateral effects of the COVID-19 pandemic appear to have had significant detrimental effects on affected children and young people. There are also some positive impacts in the form of reduced prevalence of viral bronchiolitis. The new strain of SARS-CoV-2 identified recently in the UK appears to have increased transmissibility to children. However, there are no large vaccine trials set up in children to evaluate safety and efficacy. In this short communication, we review the collateral effects of the COVID-19 pandemic in children and young people. We highlight the need for urgent strategies to mitigate the risks to children due to the COVID-19 pandemic.
What is Known:
• Children and young people account for <2% of all COVID-19 hospital admissions
• The collateral impact of the COVID-19 pandemic on children and young people is devastating
• Significant reduction in influenza and respiratory syncytial virus (RSV) infection in the southern hemisphere
What is New:
• The public health measures to reduce COVID-19 infection may have also resulted in near elimination of influenza and RSV infections across the globe
• A COVID-19 vaccine has been licensed for adults. However, large-scale vaccine studies in children are yet to be initiated, although there is emerging evidence of the new SARS-CoV-2 strain spreading more rapidly through young people.
• Children and young people continue to bear the collateral effects of the COVID-19 pandemic


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Espen Jimenez-Solem ◽  
Tonny S. Petersen ◽  
Casper Hansen ◽  
Christian Hansen ◽  
Christina Lioma ◽  
...  

Abstract: Patients with severe COVID-19 have overwhelmed healthcare systems worldwide. We hypothesized that machine learning (ML) models could be used to predict risks at different stages of management and thereby provide insights into drivers and prognostic markers of disease progression and death. From a cohort of approx. 2.6 million citizens in Denmark, SARS-CoV-2 PCR tests were performed on subjects suspected of having COVID-19; 3944 cases had at least one positive test and were subjected to further analysis. SARS-CoV-2 positive cases from the United Kingdom Biobank were used for external validation. The ML models predicted the risk of death with an area under the receiver operating characteristic curve (ROC-AUC) of 0.906 at diagnosis, 0.818 at hospital admission, and 0.721 at intensive care unit (ICU) admission. Similar metrics were achieved for predicted risks of hospital and ICU admission and use of mechanical ventilation. Common risk factors included age, body mass index, and hypertension, although the top risk features shifted towards markers of shock and organ dysfunction in ICU patients. The external validation indicated fair predictive performance for mortality prediction, but suboptimal performance for predicting ICU admission. ML may be used to identify drivers of progression to more severe disease and for prognostication in patients with COVID-19. We provide access to an online risk calculator based on these findings.


Nutrients ◽  
2021 ◽  
Vol 13 (7) ◽  
pp. 2308
Author(s):  
Sunmin Park ◽  
Ting Zhang

The association between immunity and metabolic syndrome (MetS) has been studied, but its interaction with lifestyles remains unclear. We studied their association and interactions with lifestyles in 40,768 adults aged over 40 years from a large-scale, hospital-based cohort study collected during 2010–2013. White blood cell counts (WBC) and serum C-reactive protein concentrations (CRP) were used as indexes of immune status. The participants were categorized into four groups by the cutoff points of 6.2 × 10^9/L WBC (L-WBC) and <0.5 mg/dL CRP (L-CRP): L-WBC+L-CRP (n = 25,604), H-WBC+L-CRP (n = 13,880), L-WBC+H-CRP (n = 464), and H-WBC+H-CRP (n = 820). The participants in the H-WBC+L-CRP group were younger and included more males than those in the L-WBC+L-CRP group. MetS risk was higher by 1.75- and 1.86-fold in the H-WBC+L-CRP and H-WBC+H-CRP groups, respectively, than in the L-WBC+L-CRP group. MetS components, including plasma glucose and triglyceride concentrations, and SBP were elevated in H-WBC+L-CRP and H-WBC+H-CRP compared with L-WBC+L-CRP. The risk of hyperglycemia and high HbA1c was the highest in the H-WBC+H-CRP group among all groups. The areas under the receiver operating characteristic curve for WBC counts and serum CRP concentrations were 0.637 and 0.672, respectively. Daily intake of energy, carbohydrate, protein, and fat was not significantly different among the groups based on WBC counts and CRP. However, a plant-based diet (PBD), physical activity, and non-smoking were related to lower WBC counts and CRP, whereas a Western-style diet was linked to elevated CRP. PBD intake and smoking status interacted with immunity to influence MetS risk: a low PBD intake and current smoking were associated with a higher MetS risk in the H-WBC+H-CRP group. In conclusion, overactivated immunity determined by CRP and WBC was associated with MetS risk. Behavior modification with PBD and physical activity might be related to immunity regulation.
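The four-group categorization above amounts to two threshold tests. The sketch below is a minimal illustration using the cutoffs quoted in the abstract; the function name is hypothetical, and treating a value exactly at the WBC cutoff as "high" is an assumption, since the abstract states a strict inequality only for CRP.

```python
WBC_CUTOFF = 6.2  # white blood cell count, in 10^9 cells per litre
CRP_CUTOFF = 0.5  # serum C-reactive protein, in mg/dL


def immune_group(wbc, crp):
    """Assign one of the four WBC/CRP groups named in the abstract.

    Values strictly below a cutoff are labelled low (L); values at or
    above it are labelled high (H). The boundary handling for WBC is
    an assumption.
    """
    wbc_label = "L-WBC" if wbc < WBC_CUTOFF else "H-WBC"
    crp_label = "L-CRP" if crp < CRP_CUTOFF else "H-CRP"
    return f"{wbc_label}+{crp_label}"


# Hypothetical participants: (WBC in 10^9/L, CRP in mg/dL).
participants = [(5.1, 0.10), (7.4, 0.20), (5.8, 0.90), (8.0, 1.20)]
for wbc, crp in participants:
    print(wbc, crp, immune_group(wbc, crp))
```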


2018 ◽  
Vol 20 (6) ◽  
pp. 2066-2087 ◽  
Author(s):  
Chen Wang ◽  
Lukasz Kurgan

Abstract: Drug–protein interactions (DPIs) underlie the desired therapeutic actions and the adverse side effects of a significant majority of drugs. Computational prediction of DPIs facilitates research in drug discovery, characterization and repurposing. Similarity-based methods that do not require knowledge of protein structures are particularly suitable for druggable genome-wide predictions of DPIs. We review 35 high-impact similarity-based predictors that were published in the past decade. We group them based on the three types of similarities, and their combinations, that they use. We discuss and compare key aspects of these methods including source databases, internal databases and their predictive models. Using our novel benchmark database, we perform comparative empirical analysis of predictive performance of seven types of representative predictors that utilize each type of similarity individually and all possible combinations of similarities. We assess predictive quality at the database-wide DPI level and we are the first to also include evaluation over individual drugs. Our comprehensive analysis shows that predictors that use more similarity types outperform methods that employ fewer similarities, and that the model combining all three types of similarities achieves an area under the receiver operating characteristic curve of 0.93. We offer a comprehensive analysis of sensitivity of predictive performance to intrinsic and extrinsic characteristics of the considered predictors. We find that predictive performance is sensitive to low levels of similarities between sequences of the drug targets and several extrinsic properties of the input drug structures, drug profiles and drug targets. The benchmark database and a webserver for the seven predictors are freely available at http://biomine.cs.vcu.edu/servers/CONNECTOR/.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Smritikana Dutta ◽  
Anwesha Deb ◽  
Prasun Biswas ◽  
Sukanya Chakraborty ◽  
Suman Guha ◽  
...  

Abstract: Bamboos, members of the family Poaceae, present many interesting features with respect to their fast and extended vegetative growth, unusual yet divergent flowering time across species, and the impact of sudden, large-scale flowering on forest ecology. However, few studies have been conducted at the molecular level to characterize important genes that regulate vegetative and flowering habit in bamboo. In this study, two bamboo FD genes, BtFD1 and BtFD2, which are members of the florigen activation complex (FAC), were identified by sequence and phylogenetic analyses. Sequence comparisons identified one important amino acid, located in the DNA-binding basic region, that differed between BtFD1 and BtFD2 (Ala146 of BtFD1 vs. Leu100 of BtFD2). Electrophoretic mobility shift assay revealed that this alteration resulted in ten times higher binding efficiency of BtFD1 than BtFD2 to its target ACGT motif present at the promoter of the APETALA1 gene. Expression analyses in different tissues and seasons indicated the involvement of BtFD1 in flower and vegetative development, while BtFD2 was expressed at very low levels in all the tissues and conditions studied. Finally, a tenfold increase of the AtAP1 transcript level in p35S::BtFD1 Arabidopsis plants compared to wild type confirms a positive regulatory role of BtFD1 in flowering. However, constitutive expression of BtFD1 led to dwarfism and an apparent reduction in the length of the flowering stalk and the number of flowers per plant, whereas no visible phenotype was observed for BtFD2 overexpression. This signifies that timely expression of BtFD1 may be critical to perform its programmed developmental role in planta.

