ReCGBM: a gradient boosting-based method for predicting human dicer cleavage sites

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Pengyu Liu ◽  
Jiangning Song ◽  
Chun-Yu Lin ◽  
Tatsuya Akutsu

Abstract Background: Human dicer is an enzyme that cleaves pre-miRNAs into miRNAs. Several models have been developed to predict human dicer cleavage sites, including PHDCleav and LBSizeCleav. Given an input sequence, these models can predict whether the sequence contains a cleavage site. However, these models consider each sequence independently and lack interpretability. Therefore, it is necessary to develop an accurate and explainable predictor, which employs relations between different sequences, to enhance the understanding of the mechanism by which human dicer cleaves pre-miRNA. Results: In this study, we develop ReCGBM, an accurate and explainable predictor of human dicer cleavage sites. We design relational features and class features as inputs to a LightGBM model. Computational experiments show that ReCGBM achieves the best performance compared to the existing methods. Further, we find that features in close proximity to the center of the pre-miRNA are more important and make a significant contribution to the performance improvement of the developed method. Conclusions: The results of this study show that ReCGBM is an interpretable and accurate predictor. In addition, the analyses of feature importance show that it might be of particular interest to consider more informative features close to the center of the pre-miRNA in future predictors.

2017 ◽  
Author(s):  
Igor I. Titov ◽  
Pavel S. Vorozheykin

Abstract Background: MicroRNA biogenesis proceeds through different canonical and non-canonical pathways; the most frequent of the non-canonical ones is the splicing-dependent biogenesis of mirtrons. We compare the mirtrons and non-mirtrons of human and mouse to explore how their maturation is reflected in the precursor structure around the miRNA. Results: We found coherence between the overhang lengths, which indicates a dependence between the cleavage sites. To explain this dependence, we suggest a 2-lever model of the Dicer structure that couples the imprecisions in Drosha and Dicer. Considering the secondary structure of all animal pre-miRNAs, we confirmed that single-stranded nucleotides tend to be located near the miRNA boundaries and in its center and are characterized by a higher mutation rate. The 5′ end of the canonical 5′ miRNA approaches the nearest single-stranded nucleotides, which suggests extending the loop-counting rule from the Dicer to the Drosha cleavage site. The typical structure of the annotated mirtron pre-miRNAs differs from the canonical pre-miRNA structure and possesses 1- and 2-nt hanging ends at the hairpin base. Together with the excessive variability of the mirtron Dicer cleavage site (which could be partially explained by guanine at its ends inherited from splicing), this is further evidence for the 2-lever model. In contrast with the canonical miRNAs, the mirtrons have higher SNP densities and their pre-miRNAs are inversely associated with diseases. Our results therefore support the view that mirtrons are under positive selection while canonical miRNAs are under negative selection, and we suggest that mirtrons are an intrinsic source of silencing variability that produces disease-promoting variants. Finally, we considered the interference between the pre-miRNA structure and U2 snRNA:pre-mRNA base pairing. We analyzed the location of the branchpoints and found that the mirtron structure tends to expose the branchpoint site, which suggests that mirtrons can readily evolve from occasional hairpins in the immediate neighbourhood of the 3′ splice site. Conclusion: miRNA biogenesis manifests itself in footprints in the secondary structure. Close inspection of these structural properties can help to uncover new pathways of miRNA biogenesis and to refine the known miRNA data; in particular, new non-canonical miRNAs may be predicted or the known miRNAs can be re-classified.


2021 ◽  
Vol 5 (1) ◽  
Author(s):  
Osman Mamun ◽  
Madison Wenzlick ◽  
Arun Sathanur ◽  
Jeffrey Hawk ◽  
Ram Devanathan

Abstract The Larson–Miller parameter (LMP) offers an efficient and fast scheme to estimate the creep rupture life of alloy materials for high-temperature applications; however, poor generalizability and dependence on the constant C often result in sub-optimal performance. In this work, we show that direct rupture life parameterization without intermediate LMP parameterization, using a gradient boosting algorithm, can be used to train ML models for very accurate prediction of rupture life in a variety of alloys (Pearson correlation coefficient >0.9 for 9–12% Cr and >0.8 for austenitic stainless steels). In addition, the Shapley value was used to quantify feature importance, making the model interpretable by identifying the effect of various features on the model performance. Finally, a variational autoencoder-based generative model was built by conditioning on the experimental dataset to sample, from the learnt joint distribution, hypothetical synthetic candidate alloys that exist in neither the 9–12% Cr ferritic–martensitic alloy dataset nor the austenitic stainless steel dataset.
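
For readers unfamiliar with the Larson–Miller parameter, the following is a minimal sketch of the relation in its commonly quoted form, LMP = T (C + log10 t_r), with T in kelvin and the rupture time t_r in hours. The function names and the default C = 20 (a conventional textbook choice) are assumptions made for illustration; they are not taken from the abstract, and the numbers are not results from the paper.

```python
import math


def larson_miller_parameter(temp_kelvin, rupture_hours, c=20.0):
    """Return LMP = T * (C + log10(t_r)) for a temperature in K and a life in hours."""
    if temp_kelvin <= 0 or rupture_hours <= 0:
        raise ValueError("temperature and rupture time must be positive")
    return temp_kelvin * (c + math.log10(rupture_hours))


def rupture_life_hours(lmp, temp_kelvin, c=20.0):
    """Invert the relation: t_r = 10 ** (LMP / T - C)."""
    if temp_kelvin <= 0:
        raise ValueError("temperature must be positive")
    return 10 ** (lmp / temp_kelvin - c)


if __name__ == "__main__":
    # Illustrative values only: 900 K and a 1000-hour rupture life.
    lmp = larson_miller_parameter(900.0, 1000.0)
    print(lmp)
    # The same LMP read back with a different C gives a very different life,
    # which is the sensitivity to C that the abstract refers to.
    print(rupture_life_hours(lmp, 900.0, c=20.0))
    print(rupture_life_hours(lmp, 900.0, c=21.0))
```

Because C sits inside the exponent when the relation is inverted, changing C by one unit shifts the predicted life by a factor of ten at fixed LMP and temperature.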


Author(s):  
Antonio Ramón Romeu ◽  
Enric Ollé

The furin cleavage site, with an arginine doublet (RR), is one of the clues to the origin of SARS-CoV-2. This furin RR doublet is encoded by the sequence CGG-CGG. Because arginine can be encoded by six codons, in a previous work we found that in SARS-CoV-2, CGG was the minority arginine codon (3%). Also, analyzing the RR doublet from a large sample of furin cleavage sites of several kinds of viruses, we found that none of them were encoded by CGG-CGG. Here, we come back to the core of the matter, but from the perspective that in the human genome, in contrast, CGG is the majority arginine codon (21%). We highlight that the six arginine codons provide genetic markers for tracing the origin of the RR doublet in the furin site, as well as for weighing the probability of the theories about the origin of the virus.
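
The codon-usage comparison described above can be sketched as follows. This is a minimal illustration, assuming an in-frame coding sequence given as a DNA or RNA string; the function names and the toy sequence are hypothetical and are not taken from any viral or human genome, nor from the authors' analysis.

```python
from collections import Counter

# The six standard arginine codons, written in the DNA alphabet.
ARGININE_CODONS = ("CGT", "CGC", "CGA", "CGG", "AGA", "AGG")


def _normalize(cds):
    """Upper-case the sequence and map RNA 'U' to DNA 'T'."""
    return cds.upper().replace("U", "T")


def arginine_codon_usage(cds):
    """Return the fraction of arginine codons contributed by each of the six codons."""
    cds = _normalize(cds)
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in ARGININE_CODONS)
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in ARGININE_CODONS}


def find_cgg_doublets(cds):
    """Return in-frame nucleotide offsets at which a CGG-CGG doublet starts."""
    cds = _normalize(cds)
    return [i for i in range(0, len(cds) - 5, 3) if cds[i:i + 6] == "CGGCGG"]


if __name__ == "__main__":
    toy_cds = "ATGCGGCGGAGACGTTAA"  # made-up sequence for illustration only
    print(arginine_codon_usage(toy_cds))
    print(find_cgg_doublets(toy_cds))
```

Applied to a real coding sequence, the per-codon fractions correspond to the kind of percentages quoted in the abstract (for example, the share of arginine codons that are CGG), and the doublet search flags where an RR pair is encoded specifically by CGG-CGG.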


2021 ◽  
Author(s):  
Seong Hwan Kim ◽  
Eun-Tae Jeon ◽  
Sungwook Yu ◽  
Kyungmi O ◽  
Chi Kyung Kim ◽  
...  

Abstract We aimed to develop a novel prediction model for early neurological deterioration (END) based on an interpretable machine learning (ML) algorithm for atrial fibrillation (AF)-related stroke and to evaluate the prediction accuracy and feature importance of ML models. Data from multi-center prospective stroke registries in South Korea were collected. After stepwise data preprocessing, we utilized logistic regression, support vector machine, extreme gradient boosting, light gradient boosting machine (LightGBM), and multilayer perceptron models. We used the Shapley additive explanations (SHAP) method to evaluate feature importance. Of the 3,623 stroke patients, the 2,363 who had arrived at the hospital within 24 hours of symptom onset and had available information regarding END were included. Of these, 318 (13.5%) had END. The LightGBM model showed the highest area under the receiver operating characteristic curve (0.778; 95% CI, 0.726–0.830). The feature importance analysis revealed that fasting glucose level and the National Institutes of Health Stroke Scale score were the most influential factors. Among ML algorithms, the LightGBM model was particularly useful for predicting END, as it revealed new and diverse predictors. Additionally, the SHAP method can be adjusted to individualize the features’ effects on the predictive power of the model.
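
As a hedged illustration of the headline metric, the sketch below computes the area under the ROC curve from binary labels and model scores using the pairwise-ranking (Mann–Whitney) formulation, and attaches a percentile bootstrap confidence interval. The abstract does not say how its 95% CI was obtained, so the bootstrap is an assumption; the function names are hypothetical and the example data are made up.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, not the paper's)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap an AUC")
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one class; draw again
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lo = aucs[int((alpha / 2) * n_boot)]
    hi = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


if __name__ == "__main__":
    y = [1, 0, 1, 0]                # made-up outcome labels (1 = event)
    s = [0.9, 0.1, 0.4, 0.6]        # made-up predicted risks
    print(roc_auc(y, s))            # 3 of 4 positive/negative pairs are ranked correctly
    print(bootstrap_auc_ci(y, s, n_boot=200))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a few thousand patients; a rank-based implementation would be preferable for much larger cohorts.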


2017 ◽  
Vol 91 (10) ◽  
Author(s):  
Amit Gaba ◽  
Lisanework Ayalew ◽  
Niraj Makadiya ◽  
Suresh Tikoo

ABSTRACT Proteolytic maturation involving cleavage of one nonstructural and six structural precursor proteins including pVIII by adenovirus protease is an important aspect of the adenovirus life cycle. The pVIII encoded by bovine adenovirus 3 (BAdV-3) is a protein of 216 amino acids and contains two potential protease cleavage sites. Here, we report that BAdV-3 pVIII is cleaved by adenovirus protease at both potential consensus protease cleavage sites. Usage of at least one cleavage site appears essential for the production of progeny BAdV-3 virions as glycine-to-alanine mutation of both protease cleavage sites appears lethal for the production of progeny virions. However, mutation of a single protease cleavage site of BAdV-3 pVIII significantly affects the efficient production of infectious progeny virions. Further analysis revealed no significant defect in endosome escape, genome replication, capsid formation, and virus assembly. Interestingly, cleavage of pVIII at both potential cleavage sites appears essential for the production of stable BAdV-3 virions as BAdV-3 expressing pVIII containing a glycine-to-alanine mutation of either of the potential cleavage sites is thermolabile, and this mutation leads to the production of noninfectious virions. IMPORTANCE Here, we demonstrated that the BAdV-3 adenovirus protease cleaves BAdV-3 pVIII at both potential protease cleavage sites. Although cleavage of pVIII at one of the two adenoviral protease cleavage sites is required for the production of progeny virions, the mutation of a single cleavage site of pVIII affects the efficient production of infectious progeny virions. Further analysis indicated that the mutation of a single protease cleavage site (glycine to alanine) of pVIII produces thermolabile virions, which leads to the production of noninfectious virions with disrupted capsids. We thus provide evidence about the requirement of proteolytic cleavage of pVIII for production of infectious progeny virions. 
We feel that our study has significantly advanced the understanding of the requirement of adenovirus protease cleavage of pVIII.


1997 ◽  
Vol 324 (1) ◽  
pp. 263-272 ◽  
Author(s):  
Gepke O. DELWEL ◽  
Ingrid KUIKMAN ◽  
Roel C. van der SCHORS ◽  
Annemieke A. de MELKER ◽  
Arnoud SONNENBERG

The α6A and α6B integrin subunits are proteolytically cleaved during biosynthesis into a heavy chain (120 kDa) that is disulphide-linked to one of two light chains (31 or 30 kDa). Analysis of the structure of the α6A subunit on the carcinoma cell line T24 and human platelets demonstrated that the two light chains of α6 are not differentially glycosylated products of one polypeptide. Rather they possess different polypeptide backbones, which presumably result from proteolytic cleavage at distinct sites in the α6 precursor. Mutations were introduced in the codons for the R876KKR879, E883K884, R890K891 and R898K899 sequences, the potential proteolytic cleavage sites, and wild-type and mutant α6A cDNAs were transfected into K562 cells. The mutant α6A integrin subunits were expressed in association with endogenous β1 at levels comparable to that of wild-type α6Aβ1. A single α6 polypeptide chain (150 kDa) was precipitated from transfectants expressing α6A with mutations or deletions in the RKKR sequence. Mutations in the EK sequence yielded α6A subunits that were cleaved once into a heavy and a light chain, whereas α6A subunits with mutations in one of the two RK sequences were, like wild-type α6A, cleaved into one heavy and two light chains. Thus a change in the RKKR sequence prevents the cleavage of α6. The EK site is the secondary cleavage site, which is used only when the primary site (RKKR) is intact. Microsequencing of the N-termini of the two α6A light chains from platelets demonstrated that cleavage occurs after Arg879 and Lys884. Because α6RKKG, α6GKKR and α6RGGR subunits were not cleaved it seems that both the arginine residues and the lysine residues are essential for cleavage of RKKR. 
α6A mutants with the RKKR sequence shifted to the EK site, in such a way that the position of the arginine residue after which cleavage occurs corresponds exactly to Lys884, were partly cleaved, whereas α6A mutants with the RKKR sequence shifted to other positions in the α6A subunit, including one in which it was shifted two residues farther than the EK cleavage site, were not cleaved. In addition, α6A mutants with an α5-like cleavage site, i.e. arginine, lysine and histidine residues at positions -1, -2 and -6, were not cleaved. Thus both an intact RKKR sequence and its proper position are essential. After activation by the anti-β1 stimulatory monoclonal antibody TS2/16, both cleaved and uncleaved α6Aβ1 integrins bound to laminin-1. The phorbol ester PMA, which activates cleaved wild-type and mutant α6Aβ1, did not activate uncleaved α6Aβ1. Thus uncleaved α6Aβ1 is capable of ligand binding, but not of inside-out signalling. Our results suggest that cleavage of α6 is required to generate a proper conformation that enables the affinity modulation of the α6Aβ1 receptor by PMA.


Author(s):  
Bernhard N. Bohnert ◽  
Daniel Essigke ◽  
Andrea Janessa ◽  
Jonas C Schneider ◽  
Matthias Wörn ◽  
...  

Proteolytic activation of the renal epithelial sodium channel ENaC involves cleavage events in its α- and γ-subunits and is thought to mediate sodium retention in nephrotic syndrome (NS). However, detection of proteolytically processed ENaC in kidney tissue from nephrotic mice has been elusive so far. We used a refined Western blot technique to reliably discriminate full-length α- and γ-ENaC and their cleavage products after proteolysis at their proximal and distal cleavage sites (designated from the N-terminus), respectively. Proteolytic ENaC activation was investigated in kidneys from mice with experimental NS induced by doxorubicin or inducible podocin deficiency with or without treatment with the serine protease inhibitor aprotinin. Nephrotic mice developed sodium retention and increased expression of fragments of α- and γ-ENaC cleaved at both the proximal and more prominently at the distal cleavage site, respectively. Treatment with aprotinin but not with the mineralocorticoid receptor antagonist canrenoate prevented sodium retention and upregulation of the cleavage products in nephrotic mice. Increased expression of cleavage products of α- and γ-ENaC was similarly found in healthy mice treated with a low salt diet, sensitive to mineralocorticoid receptor blockade. In human nephrectomy specimens, γ-ENaC was found in the full-length form and predominantly cleaved at its distal cleavage site. In conclusion, murine experimental NS leads to aprotinin-sensitive proteolytic activation of ENaC at both proximal and more prominently distal cleavage sites of its α- and γ-subunit, most likely by urinary serine protease activity or proteasuria.


2014 ◽  
Vol 25 (2) ◽  
pp. 179-186
Author(s):  
Ruth Kleinpell ◽  
Christa A. Schorr

Sepsis is the body’s systemic response to infection that can be complicated by acute organ dysfunction and is associated with high mortality rates and adverse outcomes for acute and critically ill patients. The 2012 Surviving Sepsis Campaign guidelines advocated for implementation of evidence-based practice care for sepsis, with a focus on quality improvement. Nurses are directly involved in identification and management of sepsis. Implementing performance improvement strategies aimed at early recognition and targeted treatment can further improve sepsis care and patient outcomes. This article presents an overview of the process of implementing performance improvement initiatives for sepsis care, highlighting the significant contribution of nursing care.


2012 ◽  
Vol 302 (1) ◽  
pp. F1-F8 ◽  
Author(s):  
Christopher J. Passero ◽  
Gunhild M. Mueller ◽  
Michael M. Myerburg ◽  
Marcelo D. Carattino ◽  
Rebecca P. Hughey ◽  
...  

The epithelial sodium channel (ENaC) is activated by a unique mechanism, whereby inhibitory tracts are released by proteolytic cleavage within the extracellular loops of two of its three homologous subunits. While cleavage by furin within the biosynthetic pathway releases one inhibitory tract from the α-subunit and moderately activates the channel, full activation through release of a second inhibitory tract from the γ-subunit requires cleavage once by furin and then at a distal site by a second protease, such as prostasin, plasmin, or elastase. We now report that coexpression of mouse transmembrane protease serine 4 (TMPRSS4) with mouse ENaC in Xenopus oocytes was associated with a two- to threefold increase in channel activity and production of a unique ∼70-kDa carboxyl-terminal fragment of the γ-subunit, similar to the ∼70-kDa γ-subunit fragment that we previously observed with prostasin-dependent channel activation. TMPRSS4-dependent channel activation and production of the ∼70-kDa fragment were partially blocked by mutation of the prostasin-dependent cleavage site (γRKRK186QQQQ). Complete inhibition of TMPRSS4-dependent activation of ENaC and γ-subunit cleavage was observed when three basic residues between the furin and prostasin cleavage sites were mutated (γK173Q, γK175Q, and γR177Q), in addition to γRKRK186QQQQ. Mutation of the four basic residues associated with the furin cleavage site (γRKRR143QQQQ) also prevented TMPRSS4-dependent channel activation. We conclude that TMPRSS4 primarily activates ENaC by cleaving basic residues within the tract γK173-K186 distal to the furin cleavage site, thereby releasing a previously defined key inhibitory tract encompassing γR158-F168 from the γ-subunit.

