noncoding sequences
Recently Published Documents


TOTAL DOCUMENTS

173
(FIVE YEARS 19)

H-INDEX

38
(FIVE YEARS 2)

2022 ◽  
Author(s):  
W. Bart Bryant ◽  
Allison Yang ◽  
Susan Griffin ◽  
Wei Zhang ◽  
Xiaochun Long ◽  
...  

Microinjected transgenes, including bacterial artificial chromosomes (BACs), insert randomly in the mouse genome. Traditional methods of mapping a transgene are challenging, which complicates breeding strategies and the accurate interpretation of phenotypes, particularly when a transgene disrupts critical coding or noncoding sequences. Here, we introduce CRISPR-Cas9 long-read sequencing (CRISPR-LRS) to ascertain the transgene integration locus and estimate copy number. This method revealed integration loci for both a BAC and a Cre-driver line, and estimated the copy numbers for two other BAC mouse lines. CRISPR-LRS offers an easy approach to establish robust breeding practices and accurate phenotyping of almost any transgenic mouse line.


2021 ◽  
Author(s):  
Chris Papadopoulos ◽  
Isabelle Callebaut ◽  
Jean-Christophe Gelly ◽  
Isabelle Hatin ◽  
Olivier Namy ◽  
...  

The noncoding genome plays an important role in de novo gene birth and in the emergence of genetic novelty. Nevertheless, how noncoding sequences’ properties could promote the birth of novel genes and shape the evolution and the structural diversity of proteins remains unclear. Therefore, by combining different bioinformatic approaches, we characterized the fold potential diversity of the amino acid sequences encoded by all intergenic open reading frames (ORFs) of S. cerevisiae with the aim of (1) exploring whether the structural states’ diversity of proteomes is already present in noncoding sequences, and (2) estimating the potential of the noncoding genome to produce novel protein bricks that could either give rise to novel genes or be integrated into pre-existing proteins, thus participating in protein structure diversity and evolution. We showed that amino acid sequences encoded by most yeast intergenic ORFs contain the elementary building blocks of protein structures. Moreover, they encompass the large structural state diversity of canonical proteins, with the majority predicted as foldable. Then, we investigated the early stages of de novo gene birth by reconstructing the ancestral sequences of 70 yeast de novo genes and characterized the sequence and structural properties of intergenic ORFs with a strong translation signal. This enabled us to highlight sequence and structural factors determining de novo gene emergence. Finally, we showed a strong correlation between the fold potential of de novo proteins and one of their ancestral amino acid sequences, reflecting the relationship between the noncoding genome and the protein structure universe.
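
As a rough illustration of what enumerating intergenic ORFs can involve, the sketch below scans both strands of a sequence in all three frames and reports stop-free codon runs. It assumes a stop-to-stop style definition and a hypothetical minimum length of 60 nt; the abstract does not state the authors' exact ORF definition or threshold, and the function names are invented for this example.

```python
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def stop_free_segments(seq, min_len=60):
    """Yield (strand, frame, start, end) for stop-free codon runs.

    Coordinates are 0-based, half-open, and relative to the strand being
    scanned. min_len (in nucleotides) is an illustrative threshold, not a
    value taken from the abstract. Runs that reach the end of the input
    without meeting a stop codon are also reported.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start = frame
            for i in range(frame, len(s) - 2, 3):
                if s[i:i + 3] in STOPS:
                    if i - run_start >= min_len:
                        yield strand, frame, run_start, i
                    run_start = i + 3
            # End of the last complete codon in this frame.
            last = frame + ((len(s) - frame) // 3) * 3
            if last - run_start >= min_len:
                yield strand, frame, run_start, last


if __name__ == "__main__":
    toy = "TAA" + "GCT" * 20 + "TAG"  # toy input, not real yeast sequence
    for segment in stop_free_segments(toy):
        print(segment)
```

Each reported segment could then be translated and passed to whatever fold-potential predictor is in use; that step is outside this sketch.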


Entropy ◽  
2021 ◽  
Vol 23 (10) ◽  
pp. 1324
Author(s):  
Garin Newcomb ◽  
Khalid Sayood

One of the important steps in the annotation of genomes is the identification of regions in the genome which code for proteins. Most annotation approaches rely on signals extracted from genomic regions to decide whether a region is protein coding. Motivated by the fact that these regions are information-bearing structures, we propose signals based on measures motivated by the average mutual information for use in this task. We show that these signals can be used to identify coding and noncoding sequences with high accuracy. We also show that these signals are robust across species, phyla, and kingdoms and can, therefore, be used in species-agnostic genome annotation algorithms for identifying protein coding regions. These in turn could be used for gene identification.
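
For orientation, the standard average mutual information (AMI) between nucleotides a fixed distance apart can be computed as in the sketch below. This is the textbook quantity, I(k) = Σ p_k(x, y) log2[p_k(x, y) / (p(x) p(y))], not the specific signals proposed in the paper, which the abstract does not define. The function names are invented for this example.

```python
import math
from collections import Counter


def average_mutual_information(seq, lag):
    """AMI in bits between nucleotides separated by `lag` positions.

    Pairs containing a symbol other than A, C, G or T are skipped.
    """
    seq = seq.upper()
    pairs = [
        (seq[i], seq[i + lag])
        for i in range(len(seq) - lag)
        if seq[i] in "ACGT" and seq[i + lag] in "ACGT"
    ]
    if not pairs:
        return 0.0
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    ami = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) simplifies to count * n / (left[a] * right[b]).
        ami += p_ab * math.log2(count * n / (left[a] * right[b]))
    return ami


def ami_profile(seq, max_lag):
    """Return the AMI values for lags 1..max_lag as a list."""
    return [average_mutual_information(seq, k) for k in range(1, max_lag + 1)]


if __name__ == "__main__":
    toy = "ATGGCCAAGCTGACCGAT" * 10  # toy input, not real genomic data
    for lag, value in enumerate(ami_profile(toy, 9), start=1):
        print(lag, round(value, 4))
```

A profile over several lags is the usual way to look at such a signal, since protein-coding regions are commonly reported to show a period-3 pattern; a classifier built on top of it would be a separate step.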


2021 ◽  
Author(s):  
Nilmini Hettiarachchi

Conserved noncoding sequences (CNSs) are extensively studied for their regulatory properties and functional importance to organisms. Many features of these sequences, such as location, proximity to the likely target gene, lineage specificity, functionality of likely target genes, and nucleotide composition, have been investigated and have provided meaningful insight into the evolutionary importance of these elements. How to assign function to noncoding regions of eukaryotic genomes is another area of active study. On one hand, evolutionary analyses, including signatures of selection or conservation that can indicate the presence of constraint, suggest that sequences evolving non-neutrally are candidates for functionality. On the other hand, there is evidence based on experimental profiling of transcription, methylation, histone modifications, and chromatin state. While these types of data are very important and are associated with function in most cases, this is not always the case. Evolutionary conservation, though highly conservative because it mostly considers elements identifiable in more than one species, is still used as the initial guideline for investigating function experimentally. An understanding of the experimental profiles of conserved noncoding regions, which may show patterns often associated with these potentially functional elements, could make it easier to construe their functionality. In an effort to integrate experimental profile data, we investigated evidence of expression of conserved noncoding sequences (CNSs).
For CNSs from ten primates, we assessed transcription, histone modifications, the level of evolutionary constraint or accelerated evolution, possible target genes, tissue expression profiles of likely target genes (as some CNSs may be enhancers, and some may be ncRNAs that interact directly with mRNA), and clustering patterns of CNSs. In total, we found 153,475 CNSs conserved across all ten primates. Of these, 59,870 overlapped noncoding regions of ncRNA genes. H3K4Me1 marks (often associated with active enhancers) were highly correlated with CNSs, whereas H4K20Me1 (linked to, e.g., DNA damage repair) was highly correlated with conserved ncRNA regions (ncRNA-gene-CEs). Both CNSs and conserved ncRNA showed evidence of being under purifying selection. The CNSs in our dataset overall exhibited lower allele frequencies, consistent with higher levels of evolutionary constraint. We also found that CNSs and ncRNA-gene-CEs form mutually exclusive groups. The analyses also suggest that both types of conserved elements have undergone waves of accelerated evolution, which we speculate may indicate changes in regulatory requirements following divergence events. Finally, we find that likely target genes for hominoidae, primate and mammalian-specific CNSs and ncRNA-gene-CEs are predominantly associated with brain-related function in humans. The deeply conserved primate CNSs and ncRNA-gene-CEs signify functional importance, suggesting ongoing recruitment of these elements into brain-related functions, consistent with King and Wilson's hypothesis that regulatory changes may account for rapid changes in phenotype among primates.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Jianke Du ◽  
Chunfeng Ge ◽  
Tingting Li ◽  
Sanhong Wang ◽  
Zhihong Gao ◽  
...  

Strawberry (Fragaria spp.) is a member of the Rosoideae subfamily in the family Rosaceae. The self-incompatibility (SI) of some diploid species is a key agronomic trait that acts as a basic pollination barrier; however, the genetic mechanism underlying SI control in strawberry remains unclear. Two candidate S-RNases (Sa- and Sb-RNase) identified in the transcriptome of the styles of the self-incompatible Fragaria viridis 42 were confirmed to be SI determinants at the S locus following genotype identification and intraspecific hybridization using selfing progenies. Whole-genome collinearity and RNase T2 family analysis revealed that only an S locus exists in Fragaria; however, none of the compatible species contained S-RNase. Although the results of interspecific hybridization experiments showed that F. viridis (SI) styles could accept pollen from F. mandshurica (self-compatible), the reciprocal cross was incompatible. Sa- and Sb-RNase contain large introns, and their noncoding sequences (promoters and introns) can be transcribed into long noncoding RNAs (lncRNAs). Overall, the genus Fragaria exhibits S-RNase-based gametophytic SI, and S-RNase loss occurs at the S locus of compatible germplasms. In addition, a type of SI-independent unilateral incompatibility exists between compatible and incompatible Fragaria species. Furthermore, the large introns and neighboring lncRNAs in S-RNase in Fragaria could offer clues about S-RNase expression strategies.


2021 ◽  
Author(s):  
Wendell J Pereira ◽  
Sara A Knaack ◽  
Daniel Conde ◽  
Sanhita Chakraborty ◽  
Ryan A Folk ◽  
...  

Nitrogen is one of the most inaccessible plant nutrients, but certain species have overcome this limitation by establishing symbiotic interactions with nitrogen-fixing bacteria in the root nodule. This root nodule symbiosis (RNS) is restricted to species within a single clade of angiosperms, suggesting a critical evolutionary event at the base of this clade, which has not yet been determined. While genes implicated in the RNS are present in most plant species (nodulating or not), gene sequence conservation alone does not imply functional conservation: developmental or phenotypic differences can arise from variation in the regulation of transcription. To identify putative regulatory sequences implicated in the evolution of RNS, we aligned the genomes of 25 species capable of nodulation. We detected 3,091 conserved noncoding sequences (CNS) in the nitrogen-fixing clade that are absent from outgroup species. Functional analysis revealed that chromatin accessibility of 452 CNS significantly correlates with the differential regulation of genes responding to lipo-chitooligosaccharides in Medicago truncatula. These included 38 CNS in proximity to 19 known genes involved in RNS. Five such regions are upstream of MtCRE1, Cytokinin Response Element 1, required to activate a suite of downstream transcription factors necessary for nodulation in M. truncatula. Genetic complementation of an Mtcre1 mutant showed a significant association between nodulation and the presence of these CNS when they drive the expression of a functional copy of MtCRE1. Conserved noncoding sequences, therefore, may be required for the regulation of genes controlling the root nodule symbiosis in M. truncatula.


2021 ◽  
Vol 118 (25) ◽  
pp. e2102683118
Author(s):  
Hao Yin ◽  
Chunyao Wei ◽  
Jeannie T. Lee

Mammalian cells equalize X-linked dosages between the male (XY) and female (XX) sexes by silencing one X chromosome in the female sex. This process, known as “X chromosome inactivation” (XCI), requires a master switch within the X inactivation center (Xic). The Xic spans several hundred kilobases in the mouse and includes a number of regulatory noncoding genes that produce functional transcripts. Over three decades, transgenic and deletional analyses have demonstrated both the necessity and sufficiency of the Xic to induce XCI, including the steps of X chromosome counting, choice, and initiation of whole-chromosome silencing. One recent study, however, reported that deleting the noncoding sequences of the Xic surprisingly had no effect on XCI and attributed sufficiency to drive counting to the coding gene, Rnf12/Rlim. Here, we revisit the question by creating independent Xic deletion cell lines. Multiple independent clones carrying heterozygous deletions of the Xic display an inability to up-regulate Xist expression, consistent with a counting defect. This defect is rescued by a second site mutation in Tsix occurring in trans, bypassing the defect in counting. These findings reaffirm the essential nature of noncoding Xic elements for the initiation of XCI.


2021 ◽  
pp. gr.266528.120
Author(s):  
Baoxing Song ◽  
Edward S. Buckler ◽  
Hai Wang ◽  
Yaoyao Wu ◽  
Evan Rees ◽  
...  

2021 ◽  
Author(s):  
Chris Papadopoulos ◽  
Isabelle Callebaut ◽  
Jean-Christophe Gelly ◽  
Isabelle Hatin ◽  
Olivier Namy ◽  
...  

The noncoding genome plays an important role in de novo gene birth and in the emergence of genetic novelty. Nevertheless, how noncoding sequences' properties could promote the birth of novel genes and shape the evolution and the structural diversity of proteins remains unclear. Therefore, by combining different bioinformatic approaches, we characterized the fold potential diversity of the amino acid sequences encoded by all intergenic open reading frames (ORFs) of S. cerevisiae with the aim of (i) exploring whether the large structural diversity observed in proteomes is already present in noncoding sequences, and (ii) estimating the potential of the noncoding genome to produce novel protein bricks that can either give rise to novel genes or be integrated into pre-existing proteins, thus participating in protein structure diversity and evolution. We showed that amino acid sequences encoded by most yeast intergenic ORFs contain the elementary building blocks of protein structures. Moreover, they encompass the large structural diversity of canonical proteins, with, strikingly, the majority predicted as foldable. Then, we investigated the early stages of de novo gene birth by identifying intergenic ORFs with a strong translation signal in ribosome profiling experiments and by reconstructing the ancestral sequences of 70 yeast de novo genes. This enabled us to highlight sequence and structural factors determining de novo gene emergence. Finally, we showed a strong correlation between the fold potential of de novo proteins and that of their ancestral amino acid sequences, reflecting the relationship between the noncoding genome and the protein structure universe.

