intergenic regions
Recently Published Documents



2022 ◽  
Author(s):  
Siu Lung Ng ◽  
Sophia A. Kammann ◽  
Gabi Steinbach ◽  
Tobias Hoffmann ◽  
Peter J. Yunker ◽  
...  

Mutations in regulatory mechanisms that control gene expression contribute to phenotypic diversity and thus facilitate the adaptation of microbes to new niches. Regulatory architecture is often inferred from transcription factor identification and genome analysis using purely computational approaches. However, there are few examples of phenotypic divergence that arise from the rewiring of bacterial regulatory circuitry by mutations in intergenic regions, because locating regulatory elements within regions of DNA that do not code for protein requires genomic and experimental data. We show that a single cis-acting single nucleotide polymorphism (SNP) dramatically alters control of the type VI secretion system (T6), a common weapon for inter-bacterial competition. Tight T6 regulatory control is necessary for adaptation of the waterborne pathogen Vibrio cholerae to in vivo conditions within the human gut, which we show can be altered by this single non-coding SNP that results in constitutive expression in vitro. Our results support a model of pathogen evolution through cis-regulatory mutation and preexisting, active transcription factors, thus conferring different fitness advantages to tightly regulated strains inside a human host and unfettered strains adapted to environmental niches.


2022 ◽  
Author(s):  
Albert Agustinus ◽  
Ramya Raviram ◽  
Bhargavi Dameracharla ◽  
Jens Luebeck ◽  
Stephanie Stransky ◽  
...  

Chromosomal instability (CIN) and epigenetic alterations are characteristics of advanced and metastatic cancers [1-4], yet whether they are mechanistically linked is unknown. Here we show that missegregation of mitotic chromosomes, their sequestration in micronuclei [5, 6], and subsequent micronuclear envelope rupture [7] profoundly disrupt normal histone post-translational modifications (PTMs), a phenomenon conserved across humans and mice as well as cancer and non-transformed cells. Some of the changes to histone PTMs occur due to micronuclear envelope rupture whereas others are inherited from mitotic abnormalities prior to micronucleus formation. Using orthogonal techniques, we show that micronuclei exhibit extensive differences in chromatin accessibility with a strong positional bias between promoters and distal or intergenic regions. Finally, we show that inducing CIN engenders widespread epigenetic dysregulation and that chromosomes which transit in micronuclei experience durable abnormalities in their accessibility long after they have been reincorporated into the primary nucleus. Thus, in addition to genomic copy number alterations, CIN can serve as a vehicle for epigenetic reprogramming and heterogeneity in cancer.


BMC Biology ◽  
2022 ◽  
Vol 20 (1) ◽  
Author(s):  
Dongseok Kim ◽  
JunMo Lee ◽  
Chung Hyun Cho ◽  
Eun Jeung Kim ◽  
Debashish Bhattacharya ◽  
...  

Background: Group II introns are mobile genetic elements that can insert at specific target sequences; however, their origins are often challenging to reconstruct because of rapid sequence decay following invasion and spread into different sites. To advance understanding of group II intron spread, we studied the intron-rich mitochondrial genome (mitogenome) in the unicellular red alga, Porphyridium. Results: Analysis of mitogenomes in three closely related species in this genus revealed they were 3–6-fold larger in size (56–132 kbp) than in other red algae, which have genomes of 21–43 kbp. This discrepancy is explained by two factors: group II intron invasion and expansion of repeated sequences in large intergenic regions. Phylogenetic analysis demonstrates that many mitogenome group II intron families are specific to Porphyridium, whereas others are closely related to sequences in fungi and in the red alga-derived plastids of stramenopiles. Network analysis of intron-encoded proteins (IEPs) shows a clear link between plastid and mitochondrial IEPs in distantly related species, with both groups associated with prokaryotic sequences. Conclusion: Our analysis of group II introns in Porphyridium mitogenomes demonstrates the dynamic nature of group II intron evolution, strongly supports the lateral movement of group II introns among diverse eukaryotes, and reveals their ability to proliferate once integrated in mitochondrial DNA.


2022 ◽  
Author(s):  
Sahil Mahfooz ◽  
Jitendra Narayan ◽  
Ruba Mustafa Elsaid Ahmed ◽  
Amel Bakri Mohammed El Hag ◽  
Nuha Abdel Rahman Khalil Mohammed ◽  
...  

Pathogenic bacteria use phase variation of surface molecules and other characteristics as a significant adaptation mechanism. Repetitive sequences made up of numerous identical repeat units can be found in many phase-variable genes. Here, we investigated the frequency and distribution of long simple sequence repeats (long-SSRs) in 15 human pathogenic Staphylococcus, Streptococcus, and Enterococcus bacteria. Long-SSRs were found to be distributed differently in the genic and intergenic sequences. In the genic sequences, 61.3 SSRs were discovered on average, while 16.2 SSRs were found in the intergenic regions. Staphylococci exhibited the highest frequency of SSRs, followed by Enterococci, and Streptococci had the lowest frequency of SSRs. Higher A+T content was found to be the best predictor of long-SSRs in these human pathogens. Tetranucleotide repeats predominated in intergenic regions, while trinucleotide repeats predominated in genic regions. In human pathogenic Streptococcus and Staphylococcus bacteria, genus-specific encoding of amino acids by trinucleotide SSRs was observed. A genetic relationship between these human pathogenic bacteria was derived based on the presence of SSRs in the housekeeping genes and compared to the phylogeny generated based on the 16S ribosomal RNA gene.
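As an illustration of what tallying SSRs by motif length involves, here is a minimal Python sketch of a regex-based repeat finder. It is not the authors' pipeline: the function name `find_ssrs`, the motif-length range (1 to 6 bp), and the `min_repeats` threshold are assumptions made for this example, and the abstract does not state how "long-SSR" was defined.

```python
import re
from typing import List, Tuple

def find_ssrs(seq: str, min_unit: int = 1, max_unit: int = 6,
              min_repeats: int = 3) -> List[Tuple[int, int, str, int]]:
    """Return (start, end, motif, repeat_count) for perfect tandem repeats.

    Thresholds are illustrative assumptions, not values from the abstract.
    Coordinates are 0-based and half-open.
    """
    hits = []
    for k in range(min_unit, max_unit + 1):
        # One motif of length k, followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "ATAT" is already reported as "AT").
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)

if __name__ == "__main__":
    # Toy sequence with one dinucleotide and one trinucleotide repeat.
    print(find_ssrs("GGCATATATATCTACGACGACGTT"))
    # [(3, 11, 'AT', 4), (13, 22, 'ACG', 3)]
```

To compare genic and intergenic sequences as the abstract describes, one would run such a finder on each annotated region separately and tally the hits by motif length; the annotation step is outside this sketch.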


2021 ◽  
Author(s):  
Marketa Nykrynova ◽  
Vojtech Barton ◽  
Roman Jakubicek ◽  
Matej Bezdicek ◽  
Martina Lengerova ◽  
...  

Recently, nanopore sequencing has come to the fore as library preparation is rapid and simple, sequencing can be done almost anywhere, and longer reads are obtained than with next-generation sequencing. The main bottleneck still lies in data postprocessing, which consists of basecalling, genome assembly, and localizing significant sequences. This postprocessing is time consuming and computationally demanding, thus prolonging delivery of crucial results for clinical practice. Here, we present a neural network-based method capable of detecting and classifying specific genomic regions directly in raw nanopore signals (squiggles). The basecalling process can therefore be omitted entirely, as the raw signals of significant genes or intergenic regions can be analysed directly; if the nucleotide sequences are required, the identified squiggles can be basecalled in preference to others. The proposed neural network could be included directly in the sequencing run, allowing real-time squiggle processing.


2021 ◽  
Vol 12 ◽  
Author(s):  
Ning Chen ◽  
Li-Na Sha ◽  
Yi-Ling Wang ◽  
Ling-Juan Yin ◽  
Yue Zhang ◽  
...  

To investigate the pattern of chloroplast genome variation in Triticeae, we comprehensively analyzed the indels in protein-coding genes and intergenic sequences, gene loss/pseudogenization, intron variation, expansion/contraction in inverted repeat regions, and the relationship between sequence characteristics and chloroplast genome size in 34 monogenomic Triticeae plants. Ancestral genome reconstruction suggests that major length variations occurred in four-stem branches of monogenomic Triticeae followed by independent changes in each genus. The chloroplast genome sizes of monogenomic Triticeae were shown to be highly variable. The chloroplast genomes of Pseudoroegneria, Dasypyrum, Lophopyrum, Thinopyrum, Eremopyrum, Agropyron, Australopyrum, and Henrardia in Triticeae had evolved toward size reduction, largely because of pseudogene elimination events and deletions of fragments in intergenic regions. The Aegilops/Triticum complex, Taeniatherum, Secale, Crithopsis, Heteranthelium, and Hordeum in Triticeae had a larger chloroplast genome size. The large size variations in major lineages and their subclades are most likely consequences of adaptive processes, since these variations were significantly correlated with divergence time and historical climatic changes. We also found that several intergenic regions, such as petN–trnC and psbE–petL, contain unique genetic information and can be used as important tools to identify the maternal relationships among Triticeae species. Our results contribute novel knowledge of plastid genome evolution in Triticeae.


2021 ◽  
Vol 9 ◽  
Author(s):  
Jiao Fang ◽  
Yangliang Chen ◽  
Guoxiang Liu ◽  
Heroen Verbruggen ◽  
Huan Zhu

A positive relationship between cell size and chloroplast genome size within chloroplast-bearing protists has been hypothesized in the past and shown in some case studies, but other factors influencing chloroplast genome size during the evolution of chlorophyte algae have been less studied. We study chloroplast genome size and GC content as a function of habitat and cell size of chlorophyte algae. The chloroplast genome size of green algae in freshwater, marine and terrestrial habitats differed significantly, with terrestrial algae having larger chloroplast genome sizes in general. The most important contributor to these enlarged genomes in terrestrial species was the length of intergenic regions. There was no clear difference in the GC content of chloroplast genomes from the three habitat categories. Functional morphological categories also showed differences in chloroplast genome size, with filamentous algae having substantially larger genomes than other forms of algae, and foliose algae having lower GC content than other groups. Chloroplast genome size showed no significant differences among the classes Ulvophyceae, Trebouxiophyceae, and Chlorophyceae, but the GC content of Chlorophyceae chloroplast genomes was significantly lower than that of Ulvophyceae and Trebouxiophyceae. There was a certain positive relationship between chloroplast genome size and cell size for the Chlorophyta as a whole and within each of three major classes. Our data also confirmed previous reports that ancestral quadripartite architecture had been lost many times independently in Chlorophyta. Finally, the comparison of the phenotype of chlorophyte algae harboring plastids uncovered that most of the investigated Chlorophyta algae housed a single plastid per cell.


2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Klairton L. Brito ◽  
Andre R. Oliveira ◽  
Alexsandro O. Alexandrino ◽  
Ulisses Dias ◽  
Zanoni Dias

Background: In the comparative genomics field, one of the goals is to estimate a sequence of genetic changes capable of transforming one genome into another. Genome rearrangement events are mutations that can alter the genetic content or the arrangement of elements from the genome. Reversal and transposition are two of the most studied genome rearrangement events. A reversal inverts a segment of a genome while a transposition swaps two consecutive segments. Initial studies in the area considered only the order of the genes. Recent works have incorporated other genetic information in the model, in particular the size of intergenic regions, which are structures between each pair of genes and in the extremities of a linear genome. Results and conclusions: In this work, we investigate the sorting by intergenic reversals and transpositions problem on genomes sharing the same set of genes, considering the cases where the orientation of genes is known and unknown. In addition, we explored a variant of the problem, which generalizes the transposition event. As a result, we present an approximation algorithm that guarantees an approximation factor of 4 for both cases considering the reversal and transposition (classic definition) events, an improvement from the 4.5-approximation previously known for the scenario where the orientation of the genes is unknown. We also present a 3-approximation algorithm by incorporating the generalized transposition event, and we propose a greedy strategy to improve the performance of the algorithms. We performed practical tests adopting simulated data which indicated that the algorithms, in both cases, tend to perform better when compared with the best-known algorithms for the problem. Lastly, we conducted experiments using real genomes to demonstrate the applicability of the algorithms.
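The two rearrangement events described in this abstract can be sketched in a few lines of Python. This is a simplified illustration only, not the paper's approximation algorithm: the function names and the data layout (a list of signed genes plus a list of intergenic sizes with one more entry than there are genes) are assumptions, and the sketch cuts only at intergenic region boundaries, so it does not model any redistribution of nucleotides within the cut regions.

```python
from typing import List, Tuple

Genome = Tuple[List[int], List[int]]  # (signed genes, intergenic sizes)

def reversal(genes: List[int], sizes: List[int], i: int, j: int) -> Genome:
    """Invert genes[i..j] (inclusive), flipping each gene's orientation.

    sizes[k] is the intergenic region to the left of genes[k]; sizes[-1]
    is the region after the last gene. Regions strictly inside the segment
    are reversed along with it; the two flanking regions stay in place
    (a simplifying assumption of this sketch).
    """
    new_genes = genes[:i] + [-g for g in reversed(genes[i:j + 1])] + genes[j + 1:]
    new_sizes = sizes[:i + 1] + list(reversed(sizes[i + 1:j + 1])) + sizes[j + 1:]
    return new_genes, new_sizes

def transposition(genes: List[int], sizes: List[int],
                  i: int, j: int, k: int) -> Genome:
    """Swap the consecutive segments genes[i..j-1] and genes[j..k-1]."""
    new_genes = genes[:i] + genes[j:k] + genes[i:j] + genes[k:]
    # Inner regions travel with their segment; the region between the two
    # segments (sizes[j]) remains between them after the swap.
    new_sizes = (sizes[:i + 1] + sizes[j + 1:k] + [sizes[j]]
                 + sizes[i + 1:j] + sizes[k:])
    return new_genes, new_sizes

if __name__ == "__main__":
    genes = [1, 2, 3, 4, 5]
    sizes = [10, 20, 30, 40, 50, 60]
    print(reversal(genes, sizes, 1, 3))
    # ([1, -4, -3, -2, 5], [10, 20, 40, 30, 50, 60])
    print(transposition(genes, sizes, 0, 2, 5))
    # ([3, 4, 5, 1, 2], [10, 40, 50, 30, 20, 60])
```

Both operations conserve the total intergenic length, so the sum of `sizes` is unchanged by either event. For the case where gene orientation is unknown, the sign flip in `reversal` would simply be dropped.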


GigaScience ◽  
2021 ◽  
Vol 10 (12) ◽  
Author(s):  
Youri Hoogstrate ◽  
Malgorzata A Komor ◽  
René Böttcher ◽  
Job van Riet ◽  
Harmen J G van de Werken ◽  
...  

Background: Fusion genes are typically identified by RNA sequencing (RNA-seq) without elucidating the causal genomic breakpoints. However, non–poly(A)-enriched RNA-seq contains large proportions of intronic reads that also span genomic breakpoints. Results: We have developed an algorithm, Dr. Disco, that searches for fusion transcripts by taking an entire reference genome into account as search space. This includes exons but also introns, intergenic regions, and sequences that do not meet splice junction motifs. Using 1,275 RNA-seq samples, we investigated to what extent genomic breakpoints can be extracted from RNA-seq data and their implications regarding poly(A)-enriched and ribosomal RNA–minus RNA-seq data. Comparison with whole-genome sequencing data revealed that most genomic breakpoints are not, or minimally, transcribed while, in contrast, the genomic breakpoints of all 32 TMPRSS2-ERG–positive tumours were present at RNA level. We also revealed tumours in which the ERG breakpoint was located before ERG, which co-existed with additional deletions and messenger RNA that incorporated intergenic cryptic exons. In breast cancer we identified rearrangement hot spots near CCND1 and in glioma near CDK4 and MDM2 and could directly associate this with increased expression. Furthermore, in all datasets we find fusions to intergenic regions, often spanning multiple cryptic exons that potentially encode neo-antigens. Thus, fusion transcripts other than classical gene-to-gene fusions are prominently present and can be identified using RNA-seq. Conclusion: By using the full potential of non–poly(A)-enriched RNA-seq data, sophisticated analysis can reliably identify expressed genomic breakpoints and their transcriptional effects.


Agronomy ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 2298
Author(s):  
Manosh Kumar Biswas ◽  
Dhima Biswas ◽  
Mita Bagchi ◽  
Ganjun Yi ◽  
Guiming Deng

Microsatellites, or simple sequence repeats (SSRs), are distributed in genes, intergenic regions and transposable elements in the genome. SSRs have been identified for developing markers from draft genome assemblies, transcriptome sequences and genome survey sequences in plants and animals. The identification, distribution, and density of microsatellites in pre-microRNAs (miRNAs) are not well documented in plants. In this study, SSRs were identified in 16,892 pre-miRNA sequences from 292 plant species in six taxonomic groups (algae to dicots). Fifty-one percent of pre-miRNA sequences contained SSRs. Mononucleotide repeats were the most abundant, followed by di- and trinucleotide repeats. Tetra-, penta-, and hexanucleotide repeats were rare. A total of 9,498 (57.46%) microsatellite loci had potential as pre-miRNA SSR markers. Of the markers, 3,573 (37.62%) were non-redundant, and 2,341 (65.51%) primer pairs could be transferred to at least one of the plant taxonomic groups. All data and primer pairs were deposited in a user-friendly, freely accessible plant miRNA SSR marker database. The data presented in this study accelerate the understanding of pre-miRNA evolution and serve as a valuable genomic resource for genetic improvement in a wide range of crops, including legumes, cereals, and cruciferous crops.

