sequence features
Recently Published Documents


Total documents: 372 (five years: 115)

H-index: 41 (five years: 7)

Synlett ◽  
2022 ◽  
Author(s):  
Hui Xiong ◽  
Adam T. Hoye

A synthesis of 2-aminopyridines from pyridine N-oxides via their corresponding N-(2-pyridyl)pyridinium salts has been demonstrated and investigated. The reaction sequence features a highly regioselective conversion of the N-oxide into its pyridinium salt followed by hydrolytic decomposition of the pyridinium moiety to furnish the 2-aminopyridine product. The method is compatible with a wide range of functional groups, is scalable, and features inexpensive reagents. 15N-labeling results gave products consistent with a Zincke reaction mechanism.


2021 ◽  
Author(s):  
Anne Bremer ◽  
Mina Farag ◽  
Wade M. Borcherds ◽  
Ivan Peran ◽  
Erik W. Martin ◽  
...  

2021 ◽  
Author(s):  
Mengling Qi ◽  
Peter D. Stenson ◽  
Edward V. Ball ◽  
John A. Tainer ◽  
Albino Bacolla ◽  
...  

Author(s):  
Norio Matsushima ◽  
Robert H. Kretsinger

Leucine-rich repeats (LRRs) occurring in tandem are 20-29 amino acids long. Eleven LRR types have been recognized. Sequence features of LRRs from viruses were previously investigated using over 600 LRR proteins from 89 species. Metagenome data for nucleo-cytoplasmic large dsDNA viruses (NCLDVs) have since been published; the 2,074 NCLDVs encode 199,021 proteins. From the NCLDVs, 549 LRR proteins were identified and analyzed. A comprehensive analysis of TpLRR and FNIP, which belong to an LRR class, was performed for the first time. The repeating unit lengths (RULs) in five types are 19 residues, which are the shortest among all LRRs. Some RULs are one to five residues shorter than those of the known, corresponding LRR types. The shrinking of RUL is also observed in FNIP. The conserved hydrophobic residues in the consensus sequences, such as Leu, Val or Ile, are frequently substituted by cysteine at one or two positions. Some unique LRR types that differ from those identified previously have been observed. The present study confirms the previous result that sequence novelty is a general feature of viral LRR proteins.


2021 ◽  
Vol 12 ◽  
Author(s):  
Yanling Peng ◽  
Huifang Kang ◽  
Jing Luo ◽  
Yubo Zhang

Super-enhancers (SEs) and broad H3K4me3 domains (BDs) are crucial regulators in the control of tissue identity in human and mouse. However, their features in pig remain largely unknown. In this study, by integrative computational analyses of epigenomic and transcriptomic data, we have characterized SEs and BDs in six pig tissues and analyzed their conservation in comparison with human and mouse tissues. Similar to human and mouse, pig SEs and BDs display higher tissue specificity than their typical counterparts. Genes proximal to SEs and BDs are associated with tissue identity in most tissues. About 55–182 SEs (5–17% in total) and 99–309 BDs (8–16% in total) across pig tissues are considered functionally conserved elements because they have orthologous SEs and BDs in human and mouse. However, these elements do not necessarily exhibit sequence conservation. The functionally conserved SEs are correlated with tissue identity in the majority of pig tissues, while the conserved BDs are linked to tissue identity in only a few tissues. Our study provides resources for future gene regulatory studies in pig. It highlights that SEs are more effective in defining tissue identity than BDs, which contrasts with a previous study. It also provides novel insights into the sequence features of functionally conserved elements.


2021 ◽  
Author(s):  
Arne de Klerk ◽  
Phillip Ivan Swanepoel ◽  
Rentia Francis Lourens ◽  
Mpumelelo Zondo ◽  
Isaac Abodunran ◽  
...  

Recombination contributes to the genetic diversity found in coronaviruses and is known to be a prominent mechanism whereby they evolve. It is apparent, both from controlled experiments and in genome sequences sampled from nature, that patterns of recombination in coronaviruses are non-random and that this is likely attributable to a combination of sequence features that favour the occurrence of recombination breakpoints at specific genomic sites, and selection disfavouring the survival of recombinants within which favourable intra-genome interactions have been disrupted. Here we leverage available whole-genome sequence data for six coronavirus subgenera to identify specific patterns of recombination that are conserved between multiple subgenera and then identify the likely factors that underlie these conserved patterns. Specifically, we confirm the non-randomness of recombination breakpoints across all six tested coronavirus subgenera, locate conserved recombination hot- and cold-spots, and determine that the locations of transcriptional regulatory sequences are likely major determinants of conserved recombination breakpoint hot-spot locations. We find that while the locations of recombination breakpoints are not uniformly associated with degrees of nucleotide sequence conservation, they display significant tendencies in multiple coronavirus subgenera to occur in low guanine-cytosine content genome regions, in non-coding regions, at the edges of genes, and at sites within the Spike gene that are predicted to be minimally disruptive of Spike protein folding. While it is apparent that sequence features such as transcriptional regulatory sequences are likely major determinants of where the template-switching events that yield recombination breakpoints most commonly occur, it is evident that selection against misfolded recombinant proteins also strongly impacts observable recombination breakpoint distributions in coronavirus genomes sampled from nature.
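The association between breakpoints and low guanine-cytosine content can be illustrated with a sliding-window GC calculation. The following is a minimal sketch, not the authors' pipeline: the function names, the window and step sizes, and the toy sequence are all assumptions made for illustration.

```python
def gc_content(sequence: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence (0.0 if empty)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def windowed_gc(genome: str, window: int, step: int):
    """Return (start, GC fraction) for each full window along the genome.

    Windows that would run past the end of the sequence are skipped.
    """
    return [
        (start, gc_content(genome[start:start + window]))
        for start in range(0, len(genome) - window + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical toy sequence: an AT-rich half followed by a GC-rich half.
    toy_genome = "AAAATTTTGGGGCCCC"
    for start, gc in windowed_gc(toy_genome, window=8, step=8):
        print(start, gc)
```

Windows with a low GC fraction could then be compared against breakpoint positions obtained from a separate recombination analysis.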


2021 ◽  
Author(s):  
Zewei Chen ◽  
Ziyi Zhao ◽  
Xinjie Hui ◽  
Junya Zhang ◽  
Yixue Hu ◽  
...  

The proteins secreted through type 1 secretion systems often play important roles in the pathogenicity of various gram-negative bacteria. However, the type 1 secretion mechanism remains unknown. In this research, we examined the sequence features of RTX proteins, a major class of type 1 secreted substrates. We found striking non-RTX-motif amino acid composition patterns at the C-termini, most typically exemplified by the enriched '[FLI][VAI]' at the most C-terminal two positions. Machine-learning models, including deep-learning models, were trained using these sequence-based non-RTX-motif features, and further combined into a tri-layer stacking model, T1SEstacker, which predicted the RTX proteins accurately, with a 5-fold cross-validated sensitivity of ~0.89 at a specificity of ~0.94. Besides substrates with RTX motifs, T1SEstacker can also distinguish non-RTX-motif type 1 secreted proteins well, further suggesting the potential existence of common secretion signals. In summary, we performed a comprehensive sequence analysis of the type 1 secreted RTX proteins, identified common sequence-based features at the C-termini, and developed a stacking model that can predict type 1 secreted proteins accurately.
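The '[FLI][VAI]' pattern and the reported sensitivity and specificity figures can be made concrete with a short sketch. This is not T1SEstacker or any part of it: the function names and the example sequences are hypothetical, and the only detail taken from the abstract is the motif itself.

```python
import re

# Motif from the abstract: F, L or I followed by V, A or I
# at the two most C-terminal positions.
C_TERMINAL_MOTIF = re.compile(r"[FLI][VAI]$")


def has_c_terminal_motif(protein_sequence: str) -> bool:
    """Return True if the last two residues match [FLI][VAI]."""
    cleaned = protein_sequence.strip().upper()
    return C_TERMINAL_MOTIF.search(cleaned) is not None


def sensitivity_specificity(true_labels, predicted_labels):
    """Compute (sensitivity, specificity) from paired boolean labels."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    print(has_c_terminal_motif("MKTAYLV"))
    print(has_c_terminal_motif("MKTAYGG"))
```

A single motif test like this is far weaker than a trained classifier; it only shows what the enriched C-terminal pattern looks like and how the two evaluation metrics are defined.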


2021 ◽  
Author(s):  
Luca Capaldo ◽  
Stefano Bonciolini ◽  
Antonio Pulcinella ◽  
Manuel Nuno ◽  
Timothy Noel

The late-stage introduction of allyl groups gives synthetic organic chemists an opportunity for subsequent diversification, providing rapid access to new chemical space. Here, we report the development of a modular synthetic sequence for the allylation of strong aliphatic C(sp3)–H bonds. Our sequence features the merger of two distinct steps to accomplish this goal: a photocatalytic hydrogen atom transfer and an ensuing Horner-Wadsworth-Emmons reaction. This practical protocol enables the modular and scalable allylation of valuable building blocks and medicinally relevant molecules.


2021 ◽  
Author(s):  
Hilmar Strickfaden ◽  
Kristal Missiaen ◽  
Justin W Knechtel ◽  
Michael J Hendzel ◽  
D Alan Underhill

Cells use multiple strategies to compartmentalize functions through a combination of membrane-bound and membraneless organelles. The latter represent complex assemblies of biomolecules that coalesce into a dense phase through low affinity, multivalent interactions and undergo rapid exchange with the surrounding dilute phase. We describe a liquid-like state for the lysine methyltransferase KMT5C characterized by diffusion within heterochromatin condensates but lacking appreciable nucleoplasmic exchange. Retention was strongly correlated with reduction of condensate surface area, suggesting formation of a liquid droplet with high connectivity. This behavior mapped to a discrete domain whose activity was dependent on multiple short linear motifs. Moreover, it was strikingly resilient to marked phylogenetic differences or targeted changes in intrinsic disorder, charge, sequence, and architecture. Collectively, these findings show that a limited number of sequence features can dominantly encode multivalency, localization, and dynamic behavior within heterochromatin condensates to confer protein retention without progression to a gel or solid.

