sequence window
Recently Published Documents


2021 ◽  
Author(s):  
Travis Wrightsman ◽  
Alexandre P. Marand ◽  
Peter A. Crisp ◽  
Nathan M. Springer ◽  
Edward S. Buckler

Accessible chromatin regions are critical components of gene regulation, but modeling them directly from sequence remains challenging, especially in plants, whose mechanisms of chromatin remodeling are less well understood than those of animals. We trained an existing deep learning architecture, DanQ, on leaf ATAC-seq data from 12 angiosperm species to predict the chromatin accessibility of sequence windows within and across species. We also trained DanQ on DNA methylation data from 10 angiosperms, because unmethylated regions have been shown to overlap significantly with accessible chromatin regions in some plants. The across-species models have comparable or even superior performance to a model trained within species, suggesting strong conservation of chromatin mechanisms across angiosperms. Testing a maize held-out model on a multi-tissue scATAC panel revealed that our models are best at predicting constitutively accessible chromatin regions, with diminishing performance as cell-type specificity increases. Using a combination of interpretation methods, we ranked JASPAR motifs by their importance to each model and saw that the TCP and AP2/ERF transcription factor families consistently ranked highly. We embedded the top three JASPAR motifs for each model at all possible positions on both strands in our sequence window and observed position- and strand-specific patterns in their importance to the model. With our cross-species "a2z" model it is now feasible to predict the chromatin accessibility and methylation landscape of any angiosperm genome.
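
The motif-embedding step described in this abstract can be illustrated with a small sketch. It only generates the candidate windows: a motif placed at every possible offset, on both strands. The window length, the all-`N` background, and the motif string `GGACC` are placeholders chosen for illustration and are not taken from the paper, and scoring each window with a trained model is left out entirely.

```python
# Minimal sketch: enumerate every placement of a motif in a sequence window,
# on both strands. Background, window length and motif are illustrative
# assumptions, not values from the paper.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def embed_motif(window, motif):
    """Yield (position, strand, sequence) for each placement of the motif.

    The motif overwrites the window bases it covers, so every yielded
    sequence has the same length as the original window.
    """
    for strand, oriented in (("+", motif), ("-", reverse_complement(motif))):
        for pos in range(len(window) - len(oriented) + 1):
            yield pos, strand, window[:pos] + oriented + window[pos + len(oriented):]


background = "N" * 10   # hypothetical neutral background window
variants = list(embed_motif(background, "GGACC"))
```

Each sequence in `variants` could then be passed to a model, and the predictions compared by position and strand to look for the kind of position- and strand-specific patterns the abstract mentions.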


2019 ◽  
Author(s):  
Rina C. Sakata ◽  
Soh Ishiguro ◽  
Hideto Mori ◽  
Mamoru Tanaka ◽  
Motoaki Seki ◽  
...  

While several Cas9-derived base editors have been developed to induce either C-to-T or A-to-G mutation at target genomic sites, the possible genome editing space when using the current base editors remains limited. Here, we present a novel base editor, Target-ACE, which integrates the abilities of both of the previously developed C-to-T and A-to-G base editors by fusing an activation-induced cytidine deaminase (AID) and an engineered tRNA adenosine deaminase (TadA) to a catalytically impaired Streptococcus pyogenes Cas9. In mammalian cells, Target-ACE enabled heterologous editing of multiple bases in a small sequence window of target sites with increased efficiency compared with a mixture of two relevant base editor enzymes, each of which may block the same target DNA molecule from the other. Furthermore, by modeling editing patterns using deep sequencing data, the editing spectra of Target-ACE and other base editors were simulated across the human genome, demonstrating the highest potency of Target-ACE to edit amino acid coding patterns. Taking these findings together, Target-ACE is a new tool that broadens the capabilities for base editing for various applications.
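
As a rough illustration of why combining both deaminase activities enlarges the editing space within a sequence window, the sketch below enumerates the sequences reachable by C-to-T edits, A-to-G edits, or both. The target sequence and window coordinates are hypothetical, and the sketch ignores editing efficiency, strand, and PAM constraints; it is a combinatorial toy, not the modeling approach used in the paper.

```python
# Minimal sketch: count the sequences reachable inside an editing window
# for a C-to-T editor, an A-to-G editor, and a dual-function editor.
# The target and window coordinates are illustrative assumptions.
from itertools import product

CBE = {"C": "T"}              # cytosine base editing only
ABE = {"A": "G"}              # adenine base editing only
DUAL = {"C": "T", "A": "G"}   # both activities in one editor


def edit_outcomes(target, start, end, edits):
    """Return all sequences reachable by editing bases in target[start:end].

    Each editable base inside the window may be left unchanged or converted,
    so the unedited target is always part of the result.
    """
    options = [
        (base, edits[base]) if start <= i < end and base in edits else (base,)
        for i, base in enumerate(target)
    ]
    return {"".join(combo) for combo in product(*options)}


target = "GACCT"      # hypothetical target site
window = (1, 3)       # hypothetical editing window: positions 1 and 2

cbe_only = edit_outcomes(target, *window, CBE)
abe_only = edit_outcomes(target, *window, ABE)
dual = edit_outcomes(target, *window, DUAL)
```

In this toy case the dual editor reaches outcomes that carry both a C-to-T and an A-to-G change on the same molecule, which neither single-activity editor can produce on its own.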


2019 ◽  
Vol 35 (17) ◽  
pp. 3020-3027 ◽  
Author(s):  
Sebastian Maurer-Stroh ◽  
Nora L Krutz ◽  
Petra S Kern ◽  
Vithiagaran Gunalan ◽  
Minh N Nguyen ◽  
...  

Motivation: Due to the risk of inducing an immediate Type I (IgE-mediated) allergic response, proteins intended for use in consumer products must be investigated for their allergenic potential before introduction into the marketplace. The FAO/WHO guidelines for computational assessment of the allergenic potential of proteins, which are based on short peptide hits and linear sequence window identity thresholds, misclassify many proteins as allergens. Results: We developed AllerCatPro, which predicts the allergenic potential of proteins based on the similarity of their 3D protein structure as well as their amino acid sequence to a data set of known protein allergens comprising 4180 unique allergenic protein sequences derived from the union of the major databases Food Allergy Research and Resource Program, Comprehensive Protein Allergen Resource, WHO/International Union of Immunological Societies, UniProtKB and Allergome. We extended the hexamer hit rule by removing peptides with a high probability of random occurrence, as measured by sequence entropy, and by requiring 3 or more hexamer hits consistent with natural linear epitope patterns in known allergens. This is complemented by detection of gluten-like repeat patterns. We also switched from a linear sequence window similarity to a B-cell epitope-like 3D surface similarity window, which became possible through extensive 3D structure modeling covering the majority (74%) of allergens. If no structural similarity is found, the decision workflow reverts to the old linear sequence window rule. The overall accuracy of AllerCatPro is 84%, compared with other current methods, which range from 51 to 73%. Both the FAO/WHO rules and AllerCatPro achieve the highest sensitivity, but AllerCatPro provides a 37-fold increase in specificity. Availability and implementation: https://allercatpro.bii.a-star.edu.sg/ Supplementary information: Supplementary data are available at Bioinformatics online.
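
The extended hexamer rule described above can be sketched roughly as follows: collect six-residue peptides shared between a query and a known allergen, discard low-entropy (repetitive) peptides, and require at least three surviving hits. The entropy cutoff of 1.5 bits is a hypothetical value chosen for illustration, the sequences are made up, and the check that hits are consistent with natural linear epitope patterns is not implemented. This is a sketch of the idea, not AllerCatPro's actual implementation.

```python
# Minimal sketch of a hexamer-hit rule with an entropy filter.
# The entropy cutoff and the example sequences are assumptions.
import math
from collections import Counter


def shannon_entropy(peptide):
    """Shannon entropy (bits) of the residue composition of a peptide."""
    counts = Counter(peptide)
    n = len(peptide)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def hexamer_hits(query, allergen, min_entropy=1.5):
    """Return hexamers shared by query and allergen that pass the entropy filter."""
    k = 6
    allergen_kmers = {allergen[i:i + k] for i in range(len(allergen) - k + 1)}
    hits = set()
    for i in range(len(query) - k + 1):
        kmer = query[i:i + k]
        if kmer in allergen_kmers and shannon_entropy(kmer) >= min_entropy:
            hits.add(kmer)
    return hits


def flagged(query, allergen, min_hits=3, min_entropy=1.5):
    """Flag the query only if enough informative hexamers match."""
    return len(hexamer_hits(query, allergen, min_entropy)) >= min_hits


# Made-up sequences: one informative match, one low-complexity match.
informative = flagged("MKTAYIAKQR", "GGMKTAYIAKQRGG")
repetitive = flagged("QQQQQQQQ", "QQQQQQQQQQ")
```

The second call shows the purpose of the entropy filter: a run of identical residues matches exactly but carries little information, so it does not count toward the hit threshold.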

