Identification of Open Chromatin Regions in Plant Genomes Using ATAC-Seq

Author(s):  
Marko Bajic ◽  
Kelsey A. Maher ◽  
Roger B. Deal
Keyword(s):  
2020 ◽  
Author(s):  
Yin Shen ◽  
Ling-Ling Chen ◽  
Junxiang Gao

Abstract: Chromatin accessibility is a highly informative structural feature for understanding the regulation of gene transcription because it indicates the degree to which nuclear macromolecules such as proteins and RNA can access chromosomal DNA. Studies show that chromatin accessibility is highly dynamic during stress response, stimulus response, and developmental transition. Moreover, physical access to chromosomal DNA in eukaryotes is highly cell-specific. Therefore, current technologies such as DNase-seq, ATAC-seq, and FAIRE-seq reveal only a portion of the open chromatin regions (OCRs) present in a given species, and the genome-wide distribution of OCRs remains unknown. In this study, we developed a bioinformatics tool called CharPlant for the de novo prediction of chromatin-accessible regions in plant genomes. To develop this tool, we constructed a three-layer convolutional neural network (CNN) and trained it using DNase-seq and ATAC-seq datasets from four plant species. The model simultaneously learns the sequence motifs and the regulatory logic, which are jointly used to determine DNA accessibility. All of these steps are integrated into CharPlant, which can be run from a simple command line. The results of data analysis using CharPlant in this study demonstrate its predictive power and computational efficiency. To our knowledge, CharPlant is the first de novo prediction tool that can identify potential OCRs across the whole genome. The source code of CharPlant and supporting files are freely downloadable from https://github.com/Yin-Shen/CharPlant.
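To make the idea of a sequence-based CNN concrete, the following is a minimal sketch of how a DNA window might be one-hot encoded and passed through three stacked convolutional layers to produce an accessibility score. It is not CharPlant's code: the function names, the filter counts and widths, the global max pooling, and the logistic output are all assumptions chosen for illustration, and the weights are random placeholders rather than values trained on DNase-seq or ATAC-seq data.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as 4-element rows; unknown bases (e.g. N) become all zeros."""
    rows = []
    for base in seq.upper():
        row = [0.0] * 4
        idx = BASES.find(base)
        if idx >= 0:
            row[idx] = 1.0
        rows.append(row)
    return rows

def conv1d_relu(x, kernels):
    """Valid 1D convolution followed by ReLU.

    x: list of positions, each a list of in_channels values.
    kernels: list of filters, each a list of `width` rows of in_channels weights.
    Returns one row per window position with one value per filter.
    """
    width = len(kernels[0])
    out = []
    for pos in range(len(x) - width + 1):
        row = []
        for kernel in kernels:
            total = sum(
                w * v
                for k_row, x_row in zip(kernel, x[pos:pos + width])
                for w, v in zip(k_row, x_row)
            )
            row.append(max(total, 0.0))
        out.append(row)
    return out

def random_kernels(rng, n_filters, width, in_channels):
    """Untrained placeholder weights (hypothetical); a real model would learn these."""
    return [[[rng.gauss(0.0, 0.1) for _ in range(in_channels)]
             for _ in range(width)] for _ in range(n_filters)]

def predict_accessibility(seq, layers, w_out, b_out=0.0):
    """Stacked conv layers, then global max pool, then a logistic output in (0, 1)."""
    h = one_hot(seq)
    for kernels in layers:
        h = conv1d_relu(h, kernels)
    pooled = [max(col) for col in zip(*h)]  # strongest response of each filter
    logit = sum(w * p for w, p in zip(w_out, pooled)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))

# Assumed toy architecture: three layers of 8 filters, each 5 positions wide.
rng = random.Random(0)
layers = [
    random_kernels(rng, 8, 5, 4),
    random_kernels(rng, 8, 5, 8),
    random_kernels(rng, 8, 5, 8),
]
w_out = [rng.gauss(0.0, 0.1) for _ in range(8)]
score = predict_accessibility("ACGTTGCAAGCTAGCTAGGCTAACGTTAGC", layers, w_out)
print(f"accessibility score: {score:.3f}")
```

A first-layer filter acts as a motif detector: calling `conv1d_relu(one_hot("TACGT"), [one_hot("ACG")])` responds only at the position where the window matches ACG exactly. Deeper layers combine such responses, which is one way to read the abstract's statement that motifs and regulatory logic are learned together.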


2014 ◽  
Vol 143 (1-3) ◽  
pp. 18-27 ◽  
Author(s):  
Wenli Zhang ◽  
Tao Zhang ◽  
Yufeng Wu ◽  
Jiming Jiang

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Flavia Mascagni ◽  
Gabriele Usai ◽  
Andrea Cavallini ◽  
Andrea Porceddu

Abstract: We identified and characterized the pseudogene complements of five plant species: four dicots (Arabidopsis thaliana, Vitis vinifera, Populus trichocarpa and Phaseolus vulgaris) and one monocot (Oryza sativa). Retroposition was of modest importance for pseudogene formation in all investigated species except V. vinifera, which showed an unusually high number of retro-pseudogenes in non-coding genic regions. Using a pipeline for the classification of sequence duplicates in plant genomes, we compared the relative importance of whole-genome, tandem, proximal, transposed and dispersed duplication modes in the pseudogene and functional gene complements. Pseudogenes showed a stronger tendency toward genomic dispersion than functional genes. Dispersed pseudogenes were predominantly fragmented and showed high sequence divergence in their flanking regions. In contrast, pseudogenes deriving from whole-genome duplication were proportionally fewer than expected from observations on functional loci and showed higher levels of flanking sequence conservation than dispersed pseudogenes. Pseudogenes deriving from tandem and proximal duplications were in excess compared to functional loci, probably reflecting the high evolutionary rate associated with these duplication modes in plant genomes. These data are compatible with high rates of sequence turnover at neutral sites and with duplication mechanisms mediated by double-strand break repair.

