gene coverage
Recently Published Documents


TOTAL DOCUMENTS

24
(FIVE YEARS 13)

H-INDEX

6
(FIVE YEARS 2)

2022 ◽  
Author(s):  
Dao-jin Xue ◽  
Zheng Zhen ◽  
Ke-xin Wang ◽  
Jia-lin Zhao ◽  
Yao Gao ◽  
...  

Abstract Background Chinese herbal medicine (CHM) is characterized by “multi-compound, multi-target and multi-pathway” action, which offers advantages in preventing and treating complex diseases. Unsolved issues remain, chiefly the unclear material basis and underlying mechanisms of prescriptions. Integrated pharmacology is an active interdisciplinary research area based on systems biology, mathematics and polypharmacology. It can systematically and comprehensively investigate the therapeutic effects of compounds or drugs on pathogenic gene networks, and is especially suitable for the study of complex CHM systems. Intracerebral hemorrhage (ICH) is one of the main causes of death among Chinese residents and is characterized by high mortality and a high disability rate. In recent years, the treatment of ICH with CHM has been studied in depth. Xue Fu Zhu Yu Decoction (XFZYD) is one of the prescriptions commonly used to treat ICH in the clinic, but its mechanism in treating ICH remains unclear. Methods Here, we established a strategy based on compound-target relationships, pathogenic genes, network analysis and node importance calculation. Using this strategy, the core compounds group (CCG) of XFZYD was predicted and validated by in vitro experiments. The molecular mechanism of XFZYD in treating ICH was deduced based on the CCG and its targets. Results The CCG of 43 compounds predicted by this model is highly consistent with the corresponding Compound-Target (C-T) network in terms of gene coverage, enriched pathway coverage and accumulated contribution of key nodes, at 89.49%, 88.72% and 90.11%, respectively, which confirms the reliability and accuracy of our proposed strategy for optimizing effective compound groups and inferring mechanisms.
Conclusions Our strategy of optimizing effective compound groups and inferring mechanisms provides a reference for explaining the optimization and inferring the molecular mechanisms of CHM prescriptions in treating complex diseases.
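
The abstract above reports gene coverage as a percentage but does not give the formula. As a rough sketch, assuming gene coverage means the share of the full C-T network's disease-relevant targets that the core compounds group still hits, it could be computed as a set overlap. The function name, argument names and toy data below are hypothetical and are not taken from the paper.

```python
def gene_coverage(core_compounds, compound_targets, pathogenic_genes):
    """Percent of the network's disease-relevant targets retained by the core group.

    Assumed definition (not stated in the abstract): targets are restricted to
    pathogenic genes, and coverage is |core targets| / |all targets| * 100.
    """
    # Disease-relevant targets hit by any compound in the full network
    all_targets = set().union(*compound_targets.values()) & pathogenic_genes
    # Disease-relevant targets hit by the core compounds only
    core_targets = set().union(
        *(compound_targets[c] for c in core_compounds)
    ) & pathogenic_genes
    if not all_targets:
        return 0.0
    return 100.0 * len(core_targets) / len(all_targets)


# Toy data, purely illustrative
toy_targets = {"c1": {"A", "B"}, "c2": {"B", "C"}, "c3": {"D"}}
toy_pathogenic = {"A", "B", "C", "D"}
print(gene_coverage(["c1", "c2"], toy_targets, toy_pathogenic))
```

With the toy data, the two core compounds hit three of the four disease-relevant targets.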


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Fredy E. Villena ◽  
Stephen E. Lizewski ◽  
Christie A. Joya ◽  
Hugo O. Valdivia

Abstract Previous studies have shown that P. falciparum parasites in South America have undergone population bottlenecks resulting in clonal lineages that are differentially distributed and that have been responsible for several outbreaks in different endemic regions. In this study, we explored the genomic profile of 18 P. falciparum samples collected in the Peruvian Amazon Basin (Loreto) and 6 from the Peruvian North Coast (Tumbes). Our results showed the presence of three subpopulations that matched previously typed lineages in Peru: Bv1 (n = 17), Clonet D (n = 4) and Acre-Loreto type (n = 3). Gene coverage analysis showed that none of the Bv1 samples presented coverage for pfhrp2 and pfhrp3. Genotyping of drug resistance markers showed a high prevalence of chloroquine (CQ) resistance mutations S1034C/N1042D/D1246Y in pfmdr1 (62.5%) and K45T in pfcrt (87.5%). Mutations associated with sulfadoxine and pyrimethamine treatment failure were found in 88.8% of the Bv1 samples, which were triple mutants for pfdhfr (50R/51I/108N) and pfdhps (437G/540E/581G). Analysis of the pfS47 gene, which allows P. falciparum to evade mosquito immune responses, showed that the Bv1 lineage presented one pfS47 haplotype exclusive to Loreto and another haplotype that was present in both Loreto and Tumbes. Furthermore, a possible expansion of Bv1 has been detected in Loreto since 2011. This replacement could be a result of the high prevalence of CQ resistance polymorphisms in Bv1, which could have provided a selective advantage under the indirect selection pressure driven by the use of CQ for P. vivax treatment.


2021 ◽  
Vol 31 (10) ◽  
pp. 1706-1718 ◽  
Author(s):  
Ruben Dries ◽  
Jiaji Chen ◽  
Natalie del Rossi ◽  
Mohammed Muzamil Khan ◽  
Adriana Sistig ◽  
...  

Spatial transcriptomics is a rapidly growing field that promises to comprehensively characterize tissue organization and architecture at the single-cell or subcellular resolution. Such information provides a solid foundation for mechanistic understanding of many biological processes in both health and disease that cannot be obtained by using traditional technologies. The development of computational methods plays important roles in extracting biological signals from raw data. Various approaches have been developed to overcome technology-specific limitations such as spatial resolution, gene coverage, sensitivity, and technical biases. Downstream analysis tools formulate spatial organization and cell–cell communications as quantifiable properties, and provide algorithms to derive such properties. Integrative pipelines further assemble multiple tools in one package, allowing biologists to conveniently analyze data from beginning to end. In this review, we summarize the state of the art of spatial transcriptomic data analysis methods and pipelines, and discuss how they operate on different technological platforms.


2021 ◽  
Vol 12 ◽  
Author(s):  
Ryan Musich ◽  
Lance Cadle-Davidson ◽  
Michael V. Osier

Aligning short-read sequences is the foundational step to most genomic and transcriptomic analyses, but not all tools perform equally, and choosing among the growing body of available tools can be daunting. Here, in order to increase awareness in the research community, we discuss the merits of common algorithms and programs in a way that should be approachable to biologists with limited experience in bioinformatics. We will only in passing consider the effects of data cleanup, a precursor analysis to most alignment tools, and no consideration will be given to downstream processing of the aligned fragments. To compare aligners [Bowtie2, Burrows Wheeler Aligner (BWA), HISAT2, MUMmer4, STAR, and TopHat2], an RNA-seq dataset was used containing data from 48 geographically distinct samples of the grapevine powdery mildew fungus Erysiphe necator. Based on alignment rate and gene coverage, all aligners performed well with the exception of TopHat2, which HISAT2 superseded. BWA perhaps had the best performance in these metrics, except for longer transcripts (>500 bp) for which HISAT2 and STAR performed well. HISAT2 was ~3-fold faster than the next fastest aligner in runtime, which we consider a secondary factor in most alignments. At the end, this direct comparison of commonly used aligners illustrates key considerations when choosing which tool to use for the specific sequencing data and objectives. No single tool meets all needs for every user, and there are many quality aligners available.


BMC Biology ◽  
2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Keith M. Bayless ◽  
Michelle D. Trautwein ◽  
Karen Meusemann ◽  
Seunggwan Shin ◽  
Malte Petersen ◽  
...  

Abstract Background The most species-rich radiation of animal life in the 66 million years following the Cretaceous extinction event is that of schizophoran flies: a third of fly diversity including Drosophila fruit fly model organisms, house flies, forensic blow flies, agricultural pest flies, and many other well and poorly known true flies. Rapid diversification has hindered previous attempts to elucidate the phylogenetic relationships among major schizophoran clades. A robust phylogenetic hypothesis for the major lineages containing these 55,000 described species would be critical to understand the processes that contributed to the diversity of these flies. We use protein encoding sequence data from transcriptomes, including 3145 genes from 70 species, representing all superfamilies, to improve the resolution of this previously intractable phylogenetic challenge. Results Our results support a paraphyletic acalyptrate grade including a monophyletic Calyptratae and the monophyly of half of the acalyptrate superfamilies. The primary branching framework of Schizophora is well supported for the first time, revealing the primarily parasitic Pipunculidae and Sciomyzoidea stat. rev. as successive sister groups to the remaining Schizophora. Ephydroidea, Drosophila’s superfamily, is the sister group of Calyptratae. Sphaeroceroidea has modest support as the sister to all non-sciomyzoid Schizophora. We define two novel lineages corroborated by morphological traits, the ‘Modified Oviscapt Clade’ containing Tephritoidea, Nerioidea, and other families, and the ‘Cleft Pedicel Clade’ containing Calyptratae, Ephydroidea, and other families. Support values remain low among a challenging subset of lineages, including Diopsidae. The placement of these families remained uncertain in both concatenated maximum likelihood and multispecies coalescent approaches. 
Rogue taxon removal was effective in increasing support values compared with strategies that maximise gene coverage or minimise missing data. Conclusions Dividing most acalyptrate fly groups into four major lineages is supported consistently across analyses. Understanding the fundamental branching patterns of schizophoran flies provides a foundation for future comparative research on the genetics, ecology, and biocontrol.


Blood ◽  
2020 ◽  
Vol 136 (Supplement 1) ◽  
pp. 38-39
Author(s):  
Ha Jin Lim ◽  
Jun Hyung Lee ◽  
Ju-Hyeon Shin ◽  
Seung Yeob Lee ◽  
Hyun-Woo Choi ◽  
...  

Introduction Targeted RNA sequencing (RNA-seq) is a highly accurate method for sequencing transcripts of interest and can overcome limitations regarding resolution, throughput, and multistep workflow. However, RNA-seq has not been widely performed in clinical molecular laboratories due to the complexity of data processing and interpretation. We developed a customized targeted RNA-seq panel with a data processing protocol and validated its analytical performance for gene fusion detection using a subset of samples with different hematologic malignancies. Additionally, we investigated its applicability for identifying transcript variants and expression analysis using the targeted panel. Methods The target panel and customized oligonucleotide probes were designed to capture 84 genes associated with hematologic malignancies. Libraries were prepared from 800 to 1,500 ng of total RNA using GeneMediKit NGS-Leukemia-RNA kit (GeneMedica, Gwangju, Korea) and sequenced using MiSeq reagent kit v3 (300 cycles) and MiSeqDx (Illumina, San Diego, CA, USA). The diagnostic samples included one reference DNA (NA12878), one reference RNA (Cat no. 740000, Agilent Technologies), 14 normal peripheral blood (PB) samples, four validation bone marrow (BM) samples with known gene fusions, and 30 clinical BM or PB samples from seven categories of hematologic malignancies. The clinical samples included 27 BM aspirates and three PB samples composed of six acute myeloid leukemia, nine B-lymphoblastic leukemia/lymphoma, four T-lymphoblastic leukemia/lymphoma, three mature B-cell neoplasms, six MPN, one myelodysplastic/myeloproliferative neoplasm, and one myeloid/lymphoid neoplasm with eosinophilia and gene rearrangement. For the analytical validation of fusion detection, target gene coverage, between-run and within-run repeatability, and dilution tests (1:2 to 1:8 dilution) were performed. 
For the comparative analysis of fusion detection, the RNA-seq data were analyzed by STAR-Fusion and FusionCatcher and processed with a stepwise filtering and prioritization strategy (Figure 1), and the results were compared to those of multiplex RT-PCR (HemaVision kit; DNA Technology, Aarhus, Denmark) or FISH (MetaSystems GmbH, Altlussheim, Germany) using 30 clinical samples. The RNA-seq data from clinical samples were additionally analyzed by FreeBayes for variant detection and by StringTie for expression profiling (Figure 1). Results First, the analytical validation showed reliable results in target gene coverage, between-run and within-run repeatability, and linearity tests. The uniformity of coverage (% of base pairs higher than 0.2 × total average depth) was calculated to be 99.8%, which revealed even coverage for the target genes in the panel using the reference DNA. In both the within-run and between-run tests, the read counts and FFPM (fusion fragments per million) of all replicates showed reliable repeatability (r² = 0.9655 and 0.9874, respectively). The FFPM of the diluted analytical samples including BCR-ABL1 and PML-RARA showed linear log2-fold-changes (r² = 0.9852 and 0.9447, respectively). Second, compared to multiplex RT-PCR and FISH using 30 clinical samples, targeted RNA-seq combined with filtering and prioritization strategies detected all 13 known fusions and 17 additional fusions. Finally, 16 disease- and drug resistance-associated variants on the expressed transcripts of ABL1, GATA2, IKZF1, JAK2, RUNX1, and WT1 were simultaneously identified, and expression analysis showed four distinct clusters of clinical samples according to the cancer subtypes and lineages. Conclusions Our customized targeted RNA-seq system provided a stable analytical performance and a more sensitive identification of gene fusions than conventional molecular methods in various clinical samples. 
In addition, clinically significant variants in the transcripts and expression profiling could be simultaneously identified directly from the RNA-seq data without the need for additional parallel testing. Our study identified the advantages of the clinical laboratory-oriented targeted RNA-seq system in enhancing the diagnostic yield for gene fusion detection and simplifying the diagnostic steps by providing a comprehensive tool for analyzing hematologic malignancies in the clinical laboratory. Figure 1 Disclosures Lee: National Research Foundation of Korea: Research Funding.
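
The two metrics named in the Results above can be sketched in a few lines. The uniformity definition follows the abstract (percentage of base pairs with depth higher than 0.2 × the average depth). The FFPM formula, fusion-supporting fragments scaled to one million total fragments, is an assumption based on the name, since the abstract does not spell it out. Function names and toy numbers are hypothetical.

```python
def uniformity_of_coverage(depths, fraction=0.2):
    """Percent of positions whose depth exceeds `fraction` x mean depth."""
    if not depths:
        return 0.0
    mean_depth = sum(depths) / len(depths)
    threshold = fraction * mean_depth
    above = sum(1 for d in depths if d > threshold)
    return 100.0 * above / len(depths)


def ffpm(fusion_fragments, total_fragments):
    """Fusion fragments per million total fragments (assumed definition)."""
    return fusion_fragments * 1_000_000 / total_fragments


# Toy per-base depths: the mean is 80, so the threshold is 16
toy_depths = [100, 120, 80, 10, 90]
print(uniformity_of_coverage(toy_depths))
print(ffpm(25, 5_000_000))
```

In the toy data only the position with depth 10 falls below the threshold, so four of five positions count as evenly covered.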


2020 ◽  
Vol 2 (2) ◽  
Author(s):  
Aaron Ayllon-Benitez ◽  
Romain Bourqui ◽  
Patricia Thébault ◽  
Fleur Mougin

Abstract The revolution in new sequencing technologies is leading to new understandings of the relations between genotype and phenotype. To interpret and analyze data that are grouped according to a phenotype of interest, methods based on statistical enrichment have become a standard in biology. However, these methods synthesize the biological information by a priori selecting the over-represented terms and may suffer from focusing on the most studied genes, which represent a limited coverage of annotated genes within a gene set. Semantic similarity measures have shown great results in pairwise gene comparison by taking advantage of the underlying structure of the Gene Ontology. We developed GSAn, a novel gene set annotation method that uses semantic similarity measures to synthesize a priori Gene Ontology annotation terms. The originality of our approach is to identify the best compromise between the number of retained annotation terms, which has to be drastically reduced, and the number of related genes, which has to be as large as possible. Moreover, GSAn offers interactive visualization facilities dedicated to the multi-scale analysis of gene set annotations. Compared to enrichment analysis tools, GSAn has shown excellent results in terms of maximizing the gene coverage while minimizing the number of terms.
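
The trade-off described above, few annotation terms versus many covered genes, resembles a set-cover problem. The sketch below uses a plain greedy set-cover heuristic to illustrate that trade-off. It is not GSAn's algorithm, which relies on semantic similarity over the Gene Ontology, and all names and data are hypothetical.

```python
def greedy_term_selection(term_to_genes, gene_set, max_terms):
    """Pick up to `max_terms` terms, each time taking the one that covers the
    most still-uncovered genes. Returns (selected terms, fraction covered)."""
    uncovered = set(gene_set)
    selected = []
    while uncovered and term_to_genes and len(selected) < max_terms:
        # Term that adds the most new genes; ties go to the first term listed
        best = max(term_to_genes, key=lambda t: len(term_to_genes[t] & uncovered))
        gained = term_to_genes[best] & uncovered
        if not gained:
            break  # no remaining term covers anything new
        selected.append(best)
        uncovered -= gained
    total = len(set(gene_set))
    coverage = (total - len(uncovered)) / total if total else 0.0
    return selected, coverage


# Toy annotation data, purely illustrative
toy_terms = {"t1": {"g1", "g2", "g3"}, "t2": {"g3", "g4"}, "t3": {"g4"}}
toy_genes = {"g1", "g2", "g3", "g4", "g5"}
print(greedy_term_selection(toy_terms, toy_genes, max_terms=2))
```

Here two terms cover four of the five genes; g5 has no annotation, so no term budget could raise the coverage further.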


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Chao Li ◽  
Xiang Li ◽  
Zhenghong Bi ◽  
Ken Sugino ◽  
Guangqin Wang ◽  
...  

Inner ear cochlear spiral ganglion neurons (SGNs) transmit sound information to the brainstem. Recent single cell RNA-Seq studies have revealed heterogeneities within SGNs. Nonetheless, much remains unknown about the transcriptome of SGNs, especially which genes are specifically expressed in SGNs. To address these questions, we needed a deeper and broader gene coverage than that in previous studies. We performed bulk RNA-Seq on mouse SGNs at five ages, and on two reference cell types (hair cells and glia). Their transcriptome comparison identified genes previously unknown to be specifically expressed in SGNs. To validate our dataset and provide useful genetic tools for this research field, we generated two knockin mouse strains: Scrt2-P2A-tdTomato and Celf4-3xHA-P2A-iCreER-T2A-EGFP. Our comprehensive analysis confirmed the SGN-selective expression of the candidate genes, testifying to the quality of our transcriptome data. These two mouse strains can be used to temporally label SGNs or to sort them.


2019 ◽  
Author(s):  
Imad Abugessaisa ◽  
Shuhei Noguchi ◽  
Melissa Cardon ◽  
Akira Hasegawa ◽  
Kazuhide Watanabe ◽  
...  

Abstract Analysis and interpretation of single-cell RNA-sequencing (scRNA-seq) experiments are compromised by the presence of poor quality cells. For meaningful analyses, such poor quality cells should be excluded to avoid biases and large variation. However, no clear guidelines exist. We introduce SkewC, a novel quality-assessment method to identify poor quality single cells in scRNA-seq experiments. The method is based on the assessment of gene coverage for each single cell and its skewness as a quality measure. To validate the method, we investigated the impact of poor quality cells on downstream analyses and compared biological differences between typical and poor quality cells. Moreover, we measured the ratio of intergenic expression, suggesting genomic contamination, and foreign organism contamination of single-cell samples. SkewC was tested on 37,993 single cells generated by 15 scRNA-seq protocols. We envision SkewC as an indispensable QC method to be incorporated into scRNA-seq experiments to preclude the possibility of scRNA-seq data misinterpretation.
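
To make the idea of coverage skewness concrete, here is a minimal sketch that computes the moment-based (Fisher-Pearson) skewness of one cell's coverage values across gene-body bins. The abstract does not describe SkewC's actual procedure, so the binning, the choice of estimator and the example profiles are all assumptions.

```python
def skewness(values):
    """Population (Fisher-Pearson) skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # flat profile: skewness defined as 0 here
    return m3 / m2 ** 1.5


# Hypothetical coverage per gene-body bin for two cells
even_cell = [10, 10, 10, 10, 10]   # flat coverage
biased_cell = [1, 1, 1, 1, 10]     # coverage piled up in one bin
print(skewness(even_cell), skewness(biased_cell))
```

A cell whose coverage is concentrated in a few bins yields a clearly positive value, while a flat profile yields zero. A real workflow would still need a rule for separating typical from poor quality cells, which this sketch does not attempt.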


2019 ◽  
Vol 66 (1) ◽  
pp. 199-206 ◽  
Author(s):  
Garrett Gotway ◽  
Eric Crossley ◽  
Julia Kozlitina ◽  
Chao Xing ◽  
Judy Fan ◽  
...  

Abstract BACKGROUND Exome sequencing has become a commonly used clinical diagnostic test. Multiple studies have examined the diagnostic utility and individual laboratory performance of exome testing; however, no previous study has surveyed and compared the data quality from multiple clinical laboratories. METHODS We examined sequencing data from 36 clinical exome tests from 3 clinical laboratories. Exome data were compared in terms of overall characteristics and coverage of specific genes and nucleotide positions. The sets of genes examined included genes in Consensus Coding Sequence (CCDS) (n = 17723), a subset of genes clinically relevant to epilepsy (n = 108), and genes that are recommended for reporting of secondary findings (n = 57; excludes X-linked genes). RESULTS The average exome nucleotide coverage (≥20×) of each laboratory varied at 96.49% (CV = 3%), 96.54% (CV = 1%), and 91.68% (CV = 4%), for laboratories A, B, and C, respectively. For CCDS genes, the average number of completely covered genes varied at 12184 (CV = 29%), 11687 (CV = 13%), and 5989 (CV = 37%), for laboratories A, B, and C, respectively. With smaller subsets of genes related to epilepsy and secondary findings, the CV revealed low consistency, with a maximum CV seen in laboratory C for both epilepsy genes (CV = 60%) and secondary findings genes (CV = 71%). CONCLUSIONS Poor consistency in complete gene coverage was seen in the clinical exome laboratories surveyed. The degree of consistency varied widely between the laboratories.
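
The quantities compared in this abstract can be sketched as follows: the percentage of positions covered at ≥20×, the list of genes in which every position reaches that depth, and the coefficient of variation (CV) across samples. The abstract does not state whether the CV uses the sample or population standard deviation; the sample version is assumed here. Function names and data are hypothetical.

```python
import statistics


def pct_bases_covered(depths, min_depth=20):
    """Percent of positions with depth >= min_depth."""
    return 100.0 * sum(1 for d in depths if d >= min_depth) / len(depths)


def completely_covered_genes(gene_depths, min_depth=20):
    """Genes in which every position reaches min_depth."""
    return [g for g, depths in gene_depths.items()
            if depths and min(depths) >= min_depth]


def coefficient_of_variation(values):
    """CV in percent, using the sample standard deviation (an assumption)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Toy data, purely illustrative
toy_gene_depths = {"GENE1": [25, 30, 22], "GENE2": [25, 5, 40]}
print(pct_bases_covered([25, 30, 10, 20]))
print(completely_covered_genes(toy_gene_depths))
print(coefficient_of_variation([12000, 11000, 13000]))
```

A single low-depth position is enough to drop a gene from the completely covered list, which is one reason complete gene coverage can vary far more between samples than the overall percentage of covered bases.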

