sequence elements
Recently Published Documents


TOTAL DOCUMENTS: 868 (FIVE YEARS: 107)

H-INDEX: 77 (FIVE YEARS: 5)

Author(s):  
Hamza Abbad ◽  
Shengwu Xiong

Automatic diacritization is an Arabic natural language processing topic based on the sequence labeling task, where the labels are the diacritics and the letters are the sequence elements. A letter can have from zero up to two diacritics. The dataset used was a subset of the preprocessed version of the Tashkeela corpus. We developed a deep learning model composed of a stack of four bidirectional long short-term memory hidden layers of the same size and an output layer at every level. The levels correspond to the groups that we classified the diacritics into (short vowels, double case-endings, Shadda, and Sukoon). Before training, the data were divided into input vectors containing letter indexes and output vectors containing the indexes of diacritics according to their groups. Both input and output vectors are concatenated, then a sliding window operation with overlapping is performed to generate continuous and fixed-size data. Such data are used for both training and evaluation. Finally, we run tests using the standard metrics with all of their variations and compare our results with two recent state-of-the-art works. Our model achieved a 3% diacritization error rate and an 8.99% word error rate when including all letters. We have also generated the confusion matrix to show the performance per output and analyzed the mismatches of the first 500 lines to classify the model errors according to their linguistic nature.


2021 ◽  
Author(s):  
Felix Nicolaus ◽  
Fatima Ibrahimi ◽  
Anne den Besten ◽  
Gunnar von Heijne

During SecYEG-mediated cotranslational insertion of membrane proteins, transmembrane helices (TMHs) first make contact with the membrane when their N-terminal end is ~45 residues away from the peptidyl transferase center. However, we recently uncovered instances where the first contact is delayed by up to ~10 residues. Here, we recapitulate these effects using a model TMH fused to two short segments from the BtuC protein: a positively charged loop and a re-entrant loop. We show that the critical residues are two Arg residues in the positively charged loop and four hydrophobic residues in the re-entrant loop. Thus, both electrostatic and hydrophobic interactions involving sequence elements that are not part of a TMH can impact the way the latter behaves during membrane insertion.


2021 ◽  
Vol 22 (23) ◽  
pp. 13021
Author(s):  
Sandra M. Fernández-Moya ◽  
Janina Ehses ◽  
Karl E. Bauer ◽  
Rico Schieweck ◽  
Anob M. Chakrabarti ◽  
...  

RNA-binding proteins (RBPs) act as posttranscriptional regulators controlling the fate of target mRNAs. Unraveling how RNAs are recognized by RBPs and in turn are assembled into neuronal RNA granules is therefore key to understanding the underlying mechanism. While RNA sequence elements have been extensively characterized, the functional impact of RNA secondary structures is only recently being explored. Here, we show that Staufen2 binds complex, long-ranged RNA hairpins in the 3′-untranslated region (UTR) of its targets. These structures are involved in the assembly of Staufen2 into RNA granules. Furthermore, we provide direct evidence that a defined Rgs4 RNA duplex regulates Staufen2-dependent RNA localization to distal dendrites. Importantly, disrupting the RNA hairpin impairs the observed effects. Finally, we show that these secondary structures differently affect protein expression in neurons. In conclusion, our data reveal the importance of RNA secondary structure in regulating RNA granule assembly, localization and eventually translation. It is therefore tempting to speculate that secondary structures represent an important code for cells to control the intracellular fate of their mRNAs.


2021 ◽  
Author(s):  
Sarah Teakel ◽  
Michealla Marama ◽  
David Aragão ◽  
Sofiya Tsimbalyuk ◽  
Jade K. Forwood ◽  
...  

We recently reported that the membrane associated progesterone receptor (MAPR) protein family (mammalian members: PGRMC1, PGRMC2, NEUFC and NENF) originated from a new class of prokaryotic cytochrome b5 (cytb5) domain proteins, called cytb5M (MAPR-like). Relative to classical cytb5 proteins, MAPR and cytb5M proteins shared unique sequence elements and a distinct heme binding orientation at an approximately 90° rotation relative to classical cytb5, as demonstrated in the archetypal crystal structure of a cytb5M protein (PDB accession number 6NZX). Here, we present the second crystal structure of an archaeal cytb5M domain (Methanococcoides burtonii WP_011499504.1, PDB:6VZ6). It exhibits similar heme-binding to the 6NZX cytb5M, supporting the deduction that MAPR-like heme orientation was inherited from the prokaryotic ancestor of the original eukaryotic MAPR gene.


2021 ◽  
Vol 12 ◽  
Author(s):  
Jinseung Jeong ◽  
Inhwan Hwang ◽  
Dong Wook Lee

Although the chloroplasts in plants are characterized by an inherent genome, the chloroplast proteome is composed of proteins encoded by not only the chloroplast genome but also the nuclear genome. Nuclear-encoded chloroplast proteins are synthesized on cytosolic ribosomes and post-translationally targeted to the chloroplasts. In the latter process, an N-terminal cleavable transit peptide serves as a targeting signal required for the import of nuclear-encoded chloroplast interior proteins. This import process is mediated via an interaction between the sequence motifs in transit peptides and the components of the TOC/TIC (translocon at the outer/inner envelope of chloroplasts) translocons. Despite a considerable diversity in primary structures, several common features have been identified among transit peptides, including N-terminal moderate hydrophobicity, multiple proline residues dispersed throughout the transit peptide, preferential usage of basic residues over acidic residues, and an absence of N-terminal arginine residues. In this review, we will recapitulate and discuss recent progress in our current understanding of the functional organization of sequence elements commonly present in diverse transit peptides, which are essential for the multi-step import of chloroplast proteins.


2021 ◽  
Vol 220 (12) ◽  
Author(s):  
Sunandini Chandra ◽  
Philip J. Mannino ◽  
David J. Thaller ◽  
Nicholas R. Ader ◽  
Megan C. King ◽  
...  

Mechanisms that turn over components of the nucleus and inner nuclear membrane (INM) remain to be fully defined. We explore how components of the INM are selected by a cytosolic autophagy apparatus through a transmembrane nuclear envelope–localized cargo adaptor, Atg39. A split-GFP reporter showed that Atg39 localizes to the outer nuclear membrane (ONM) and thus targets the INM across the nuclear envelope lumen. Consistent with this, sequence elements that confer both nuclear envelope localization and a membrane remodeling activity are mapped to the Atg39 lumenal domain; these lumenal motifs are required for the autophagy-mediated degradation of integral INM proteins. Interestingly, correlative light and electron microscopy shows that the overexpression of Atg39 leads to the expansion of the ONM and the enclosure of a network of INM-derived vesicles in the nuclear envelope lumen. Thus, we propose an outside–in model of nucleophagy where INM is delivered into vesicles in the nuclear envelope lumen, which can be targeted by the autophagosome.


Viruses ◽  
2021 ◽  
Vol 13 (11) ◽  
pp. 2181
Author(s):  
Pontus Öhlund ◽  
Juliette Hayer ◽  
Jenny C. Hesson ◽  
Anne-Lie Blomström

RNA interference (RNAi)-mediated antiviral immunity is believed to be the primary defense against viral infection in mosquitoes. The production of virus-specific small RNA has been demonstrated in mosquitoes and mosquito-derived cell lines for viruses in all of the major arbovirus families. However, many if not all mosquitoes are infected with a group of viruses known as insect-specific viruses (ISVs), and little is known about the mosquito immune response to this group of viruses. Therefore, in this study, we sequenced small RNA from an Aedes albopictus-derived cell line infected with either Lammi virus (LamV) or Hanko virus (HakV). These viruses belong to two distinct phylogenetic groups of insect-specific flaviviruses (ISFVs). The results revealed that both viruses elicited a strong virus-derived small interfering RNA (vsiRNA) response that increased over time and that targeted the whole viral genome, with a few predominant hotspots observed. Furthermore, only the LamV-infected cells produced virus-derived Piwi-like RNAs (vpiRNAs); however, they were mainly derived from the antisense genome and did not show the typical ping-pong signatures. HakV, which is more distantly related to the dual-host flaviviruses than LamV, may lack certain unknown sequence elements or structures required for vpiRNA production. Our findings increase the understanding of mosquito innate immunity and ISFVs’ effects on their host.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Andrew Savinov ◽  
Benjamin M. Brandsen ◽  
Brooke E. Angell ◽  
Josh T. Cuperus ◽  
Stanley Fields

Background: The 3′ untranslated region (UTR) plays critical roles in determining the level of gene expression through effects on activities such as mRNA stability and translation. Functional elements within this region have largely been identified through analyses of native genes, which contain multiple co-evolved sequence features. Results: To explore the effects of 3′ UTR sequence elements outside of native sequence contexts, we analyze hundreds of thousands of random 50-mers inserted into the 3′ UTR of a reporter gene in the yeast Saccharomyces cerevisiae. We determine relative protein expression levels from the fitness of transformants in a growth selection. We find that the consensus 3′ UTR efficiency element significantly boosts expression, independent of sequence context; on the other hand, the consensus positioning element has only a small effect on expression. Some sequence motifs that are binding sites for Puf proteins substantially increase expression in the library, despite these proteins generally being associated with post-transcriptional downregulation of native mRNAs. Our measurements also allow a systematic examination of the effects of point mutations within efficiency element motifs across diverse sequence backgrounds. These mutational scans reveal the relative in vivo importance of individual bases in the efficiency element, which likely reflects their roles in binding the Hrp1 protein involved in cleavage and polyadenylation. Conclusions: The regulatory effects of some 3′ UTR sequence features, like the efficiency element, are consistent regardless of sequence context. In contrast, the consequences of other 3′ UTR features appear to be strongly dependent on their evolved context within native genes.


Cells ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 2674
Author(s):  
Marie-Luise Mosbach ◽  
Christina Pfafenrot ◽  
Elke Pogge von Strandmann ◽  
Albrecht Bindereif ◽  
Christian Preußer

Extracellular vesicles (EVs) are important for intercellular communication and act as vehicles for biological material, such as various classes of coding and non-coding RNAs, a few of which were shown to be selectively targeted into vesicles. However, protein factors, mechanisms, and sequence elements contributing to this specificity remain largely elusive. Here, we use a reporter system that results in different types of modified transcripts to decipher the specificity determinants of RNAs released into EVs. First, we found that small RNAs are more efficiently packaged into EVs than large ones, and second, we determined absolute quantities for several endogenous RNA transcripts in EVs (U6 snRNA, U1 snRNA, Y1 RNA, and GAPDH mRNA). We show that RNA polymerase III (pol III) transcripts are more efficiently secreted into EVs compared to pol II-derived transcripts. Surprisingly, our quantitative analysis revealed no RNA accumulation in the vesicles relative to the total cellular levels, based on both overexpressed reporter transcripts and endogenous RNAs. RNA appears to be EV-associated only at low copy numbers, ranging between 0.02 and 1 molecule per EV. This RNA association may reflect internal EV encapsulation or a less tightly bound state at the vesicle surface.


Antioxidants ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 1565
Author(s):  
Hajnal A. Kovács ◽  
Enikő Lázár ◽  
György Várady ◽  
Gábor Sirokmány ◽  
Miklós Geiszt

Peroxidasin (PXDN) and peroxidasin-like protein (PXDNL) are members of the peroxidase-cyclooxygenase superfamily. PXDN functions in basement membrane synthesis by forming collagen IV crosslinks, while the function of PXDNL remains practically unknown. In this work, we characterized the post-translational proteolytic processing of PXDN and PXDNL. Using a novel knock-in mouse model, we demonstrate that the proteolytic cleavage of PXDN occurs in vivo. With the help of furin-specific siRNA, we also demonstrate that the proprotein convertase furin participates in the proteolytic processing of PXDN. Furthermore, we demonstrate that only the proteolytically processed PXDN integrates into the extracellular matrix, highlighting the importance of the proteolysis step in PXDN’s collagen IV-crosslinking activity. We also provide multiple lines of evidence for the importance of peroxidase activity in the proteolytic processing of PXDN. Finally, we show that PXDNL does not undergo proteolytic processing, despite containing sequence elements efficiently recognized by proprotein convertases. Collectively, our observations suggest a previously unknown protein quality control during PXDN synthesis and the importance of the peroxidase activity of PXDN in this process.

