BLAT-Based Comparative Analysis for Transposable Elements: BLATCAT

2014 ◽  
Vol 2014 ◽  
pp. 1-7
Author(s):  
Sangbum Lee ◽  
Sumin Oh ◽  
Keunsoo Kang ◽  
Kyudong Han

The availability of several whole genome sequences makes comparative analyses possible. In primate genomes, transposable elements (TEs) have become a much higher research priority because they account for ~45% of the primate genomes, they can regulate the gene expression level, and they are associated with genomic fluidity in their host genomes. Here, we developed the BLAST-like alignment tool (BLAT)-based comparative analysis for transposable elements (BLATCAT) program. The BLATCAT program can compare specific regions of six representative primate genome sequences (human, chimpanzee, gorilla, orangutan, gibbon, and rhesus macaque) on the basis of BLAT and simultaneously carry out RepeatMasker and/or Censor functions, which are widely used Windows-based web-server functions to detect TEs. All results can be stored as an HTML file for manual inspection of a specific locus. BLATCAT will be very convenient and efficient for comparative analyses of TEs in various primate genomes.

2010 ◽  
Vol 3 ◽  
pp. BII.S3846 ◽  
Author(s):  
Ying Chen ◽  
Rebekah Wu ◽  
James Felton ◽  
David M. Rocke ◽  
Anu Chakicherla

Motivation: Whole genome microarrays are increasingly becoming the method of choice to study responses in model organisms to disease, stressors or other stimuli. However, whole genome sequences are available for only some model organisms, and there are still many species whose genome sequences are not yet available. Cross-species studies, where arrays developed for one species are used to study gene expression in a closely related species, have been used to address this gap, with some promising results. Current analytical methods have included filtration of some probes or genes that showed low hybridization activities, but consensus filtration schemes are still not available. Results: We propose a novel masking procedure, based on currently available target species sequences, to filter out probes, and we study a cross-species data set using this masking procedure and gene-set analysis. Gene-set analysis evaluates the association of a priori defined gene groups with a phenotype of interest. Two methods, Gene Set Enrichment Analysis (GSEA) and Test of Test Statistics (ToTS), were investigated. The results showed that the masking procedure together with the ToTS method worked well in our data set. The results from an alternative way to study cross-species hybridization experiments without masking are also presented. We hypothesize that the multi-probe structure of Affymetrix microarrays makes it possible to aggregate the effects of both well-hybridized and poorly hybridized probes to study a group of genes. The principles of gene-set analysis were applied to the probe-level data instead of gene-level data. The results showed that ToTS can give valuable information and thus can be used as a powerful technique for analyzing cross-species hybridization experiments. Availability: Software in the form of R code is available at http://anson.ucdavis.edu/~ychen/cross-species.html Supplementary Data: Supplementary data are available at http://anson.ucdavis.edu/~ychen/cross-species.html


2017 ◽  
Vol 5 (39) ◽  
Author(s):  
Hervé Tettelin ◽  
Thomas A. Hooven ◽  
Xuechu Zhao ◽  
Qi Su ◽  
Lisa Sadzewicz ◽  
...  

ABSTRACT Bordetella holmesii causes respiratory and invasive diseases in humans, but its pathogenesis remains poorly understood. We report here the genome sequences of seven bacteremia isolates of B. holmesii, including the type strain. Comparative analysis of these sequences may aid studies of B. holmesii biology and assist in the development of species-specific diagnostic strategies.


2020 ◽  
Author(s):  
Eric Minwei Liu ◽  
Alexander Martinez-Fundichely ◽  
Rajesh Bollapragada ◽  
Maurice Spiewack ◽  
Ekta Khurana

ABSTRACT Most mutations in cancer genomes occur in the non-coding regions with unknown impact on tumor development. Although the increase in the number of cancer whole-genome sequences has revealed numerous putative non-coding cancer drivers, their information is dispersed across multiple studies, and thus it is difficult to bridge the understanding of non-coding alterations, the genes they impact and the supporting evidence for their role in tumorigenesis across multiple cancer types. To address this gap, we have developed CNCDatabase, the Cornell Non-Coding Cancer driver Database (https://cncdatabase.med.cornell.edu/), which contains detailed information about predicted non-coding drivers at gene promoters, 5’ and 3’ UTRs (untranslated regions), enhancers, CTCF insulators and non-coding RNAs. CNCDatabase documents 1,111 protein-coding genes and 90 non-coding RNAs with reported drivers in their non-coding regions from 32 cancer types by computational predictions of positive selection in whole-genome sequences; differential gene expression in samples with and without mutations; or another set of experimental validations including luciferase reporter assays and genome editing. The database can be easily modified and scaled as lists of non-coding drivers are revised in the community with larger whole-genome sequencing studies, CRISPR screens and further experimental validations. Overall, CNCDatabase provides a helpful resource for researchers to explore the pathological role of non-coding alterations and their associations with gene expression in human cancers.


2017 ◽  
Vol 47 (10-11) ◽  
pp. 655-665 ◽  
Author(s):  
D.G. Teixeira ◽  
G.R.G. Monteiro ◽  
D.R.A. Martins ◽  
M.Z. Fernandes ◽  
V. Macedo-Silva ◽  
...  

Author(s):  
Christine Pourcel ◽  
Marie Touchon ◽  
Nicolas Villeriot ◽  
Jean-Philippe Vernadet ◽  
David Couvin ◽  
...  

Abstract In Archaea and Bacteria, the arrays called CRISPRs for ‘clustered regularly interspaced short palindromic repeats’ and the CRISPR associated genes or cas provide adaptive immunity against viruses, plasmids and transposable elements. Short sequences called spacers, corresponding to fragments of invading DNA, are stored in-between repeated sequences. The CRISPR–Cas systems target sequences homologous to spacers, leading to their degradation. To facilitate investigations of CRISPRs, we developed a website hosting CRISPRdb 12 years ago. We now propose CRISPRCasdb, a completely new version giving access to both CRISPRs and cas genes. We used CRISPRCasFinder, a program that identifies CRISPR arrays and cas genes and determines the system's type and subtype, to process public whole genome assemblies. Strains are displayed either in an alphabetic list or in taxonomic order. The database is part of the CRISPR-Cas++ website, which also offers the possibility to analyse submitted sequences and to download programs. A BLAST search against lists of repeats and spacers extracted from the database is also offered. To date, 16 990 complete prokaryote genomes (16 650 bacteria from 2973 species and 340 archaea from 300 species) are included. CRISPR–Cas systems were found in 36% of Bacteria and 75% of Archaea strains. CRISPRCasdb is freely accessible at https://crisprcas.i2bc.paris-saclay.fr/.


2006 ◽  
Vol 51 (10) ◽  
pp. 1199-1209 ◽  
Author(s):  
Wu Wei ◽  
Guohui Ding ◽  
Xiaojing Wang ◽  
Jingchun Sun ◽  
Kang Tu ◽  
...  

Author(s):  
Mauricio J. Lozano ◽  
Miguel Redondo-Nieto ◽  
Daniel Garrido-Sanz ◽  
Elías Mongiardini ◽  
J. Ignacio Quelas ◽  
...  

The genetic and genomic changes that occur under laboratory conditions in Bradyrhizobium diazoefficiens genomes remain poorly studied. Only a few genome sequences of this important nitrogen-fixing species are available, and there are no genome-wide comparative analyses of related strains.

