Characterization and Mapping of retr04, retr05 and retr06 Broad-Spectrum Resistances to Turnip Mosaic Virus in Brassica juncea, and the Development of Robust Methods for Utilizing Recalcitrant Genotyping Data

2022 ◽  
Vol 12 ◽  
Author(s):  
Lawrence E. Bramham ◽  
Tongtong Wang ◽  
Erin E. Higgins ◽  
Isobel A. P. Parkin ◽  
Guy C. Barker ◽  
...  

Turnip mosaic virus (TuMV) induces disease in susceptible hosts, notably impacting cultivation of important crop species of the Brassica genus. Few effective plant viral disease management strategies exist with the majority of current approaches aiming to mitigate the virus indirectly through control of aphid vector species. Multiple sources of genetic resistance to TuMV have been identified previously, although the majority are strain-specific and have not been exploited commercially. Here, two Brassica juncea lines (TWBJ14 and TWBJ20) with resistance against important TuMV isolates (UK 1, vVIR24, CDN 1, and GBR 6) representing the most prevalent pathotypes of TuMV (1, 3, 4, and 4, respectively) and known to overcome other sources of resistance, have been identified and characterized. Genetic inheritance of both resistances was determined to be based on a recessive two-gene model. Using both single nucleotide polymorphism (SNP) array and genotyping by sequencing (GBS) methods, quantitative trait loci (QTL) analyses were performed using first backcross (BC1) genetic mapping populations segregating for TuMV resistance. Pairs of statistically significant TuMV resistance-associated QTLs with additive interactive effects were identified on chromosomes A03 and A06 for both TWBJ14 and TWBJ20 material. Complementation testing between these B. juncea lines indicated that one resistance-linked locus was shared. Following established resistance gene nomenclature for recessive TuMV resistance genes, these new resistance-associated loci have been termed retr04 (chromosome A06, TWBJ14, and TWBJ20), retr05 (A03, TWBJ14), and retr06 (A03, TWBJ20). Genotyping by sequencing data investigated in parallel to robust SNP array data was highly suboptimal, with informative data not established for key BC1 parental samples. This necessitated careful consideration and the development of new methods for processing compromised data. 
Using reductive screening of potential markers according to allelic variation and the recombination observed across BC1 samples genotyped, compromised GBS data was rendered functional with near-equivalent QTL outputs to the SNP array data. The reductive screening strategy employed here offers an alternative to methods relying upon imputation or artificial correction of genotypic data and may prove effective for similar biparental QTL mapping studies.
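
The reductive screening idea in the abstract above can be illustrated with a small sketch. It is not the authors' pipeline: the function names, the genotype coding ("A" for homozygous recurrent-parent, "H" for heterozygous, "-" for missing), and all thresholds are assumptions chosen for illustration. The sketch discards markers rather than imputing or correcting calls. It drops a marker if it shows no allelic variation, has too many missing calls, departs strongly from the 1:1 ratio expected in a BC1 population, or implies an implausible number of recombinations relative to the previously retained marker.

```python
from typing import List, Tuple


def chi_square_1to1(n_a: int, n_h: int) -> float:
    """Chi-square statistic against a 1:1 expectation (1 degree of freedom)."""
    total = n_a + n_h
    if total == 0:
        return float("inf")
    expected = total / 2
    return ((n_a - expected) ** 2 + (n_h - expected) ** 2) / expected


def recombination_fraction(calls_1: str, calls_2: str) -> float:
    """Fraction of samples, called at both markers, whose genotype differs."""
    pairs = [(x, y) for x, y in zip(calls_1, calls_2) if x != "-" and y != "-"]
    if not pairs:
        return 1.0  # nothing comparable: treat as uninformative
    return sum(1 for x, y in pairs if x != y) / len(pairs)


def screen_markers(
    markers: List[Tuple[str, str]],
    max_missing: float = 0.2,      # assumed threshold
    max_chi2: float = 3.84,        # 5% critical value for 1 degree of freedom
    max_recomb: float = 0.3,       # assumed threshold between neighbours
) -> List[str]:
    """Return names of retained markers; input is ordered along one chromosome."""
    kept: List[str] = []
    previous = None
    for name, calls in markers:
        n_a, n_h = calls.count("A"), calls.count("H")
        if n_a == 0 or n_h == 0:
            continue  # no allelic variation across the population
        if calls.count("-") / len(calls) > max_missing:
            continue  # too many missing calls
        if chi_square_1to1(n_a, n_h) > max_chi2:
            continue  # segregation distortion
        if previous is not None and recombination_fraction(previous, calls) > max_recomb:
            continue  # implausible recombination against the last kept marker
        kept.append(name)
        previous = calls
    return kept


# Hypothetical genotype calls for eight BC1 samples at six ordered markers.
example = [
    ("m1", "AAHHAHHA"),
    ("m2", "AAAAAAAA"),  # monomorphic
    ("m3", "AAHHAHH-"),
    ("m4", "HHAAHAAH"),  # inverted relative to its neighbours
    ("m5", "---HA---"),  # mostly missing
    ("m6", "AAHHAHAA"),
]
retained = screen_markers(example)
```

Because each marker is compared only with the last retained one, an erroneous marker at the start of a chromosome could cause good markers to be rejected; a fuller treatment would compare against a window of neighbours.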

2017 ◽  
Author(s):  
Zilu Zhou ◽  
Weixin Wang ◽  
Li-San Wang ◽  
Nancy Ruonan Zhang

Abstract Motivation Copy number variations (CNVs) are gains and losses of DNA segments and have been associated with disease. Many large-scale genetic association studies are performing CNV analysis using whole exome sequencing (WES) and whole genome sequencing (WGS). In many of these studies, previous SNP-array data are available. An integrated cross-platform analysis is expected to improve resolution and accuracy, yet there is no tool for effectively combining data from sequencing and array platforms. The detection of CNVs using sequencing data alone can also be further improved by the utilization of allele-specific reads. Results We propose a statistical framework, integrated Copy Number Variation detection algorithm (iCNV), which can be applied to multiple study designs: WES only, WGS only, SNP array only, or any combination of SNP and sequencing data. iCNV applies platform-specific normalization, utilizes allele-specific reads from sequencing, and integrates matched NGS and SNP-array data by a Hidden Markov Model (HMM). We compare integrated two-platform CNV detection using iCNV to naive intersection or union of platforms and show that iCNV increases sensitivity and robustness. We also assess the accuracy of iCNV on WGS data only, and show that the utilization of allele-specific reads improves CNV detection accuracy compared to existing methods. Availability https://github.com/zhouzilu/[email protected], [email protected] Supplementary information Supplementary data are available at Bioinformatics online.
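
To make the HMM step concrete, here is a minimal sketch of Viterbi decoding over a three-state copy-number model. It is a generic illustration and not iCNV's model: it uses a single track of log2 intensity or coverage ratios, Gaussian emissions, and a fixed transition matrix, and the state means, standard deviation, and stay probability are all assumed values.

```python
import math
from typing import List

STATES = ["deletion", "normal", "duplication"]
MEANS = [-1.0, 0.0, 0.58]  # assumed log2-ratio means for 1, 2, and 3 copies
SIGMA = 0.3                # assumed emission standard deviation
STAY = 0.98                # assumed probability of remaining in the same state


def log_gauss(x: float, mean: float, sigma: float) -> float:
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mean) ** 2 / (2 * sigma ** 2)


def viterbi(log_ratios: List[float]) -> List[str]:
    """Most likely state path for a sequence of log2 ratios."""
    if not log_ratios:
        return []
    k = len(STATES)
    log_stay = math.log(STAY)
    log_move = math.log((1 - STAY) / (k - 1))
    # Uniform prior over the initial state.
    score = [math.log(1 / k) + log_gauss(log_ratios[0], MEANS[s], SIGMA) for s in range(k)]
    back: List[List[int]] = []
    for x in log_ratios[1:]:
        new_score, pointers = [], []
        for s in range(k):
            candidates = [score[p] + (log_stay if p == s else log_move) for p in range(k)]
            best = max(range(k), key=lambda p: candidates[p])
            pointers.append(best)
            new_score.append(candidates[best] + log_gauss(x, MEANS[s], SIGMA))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(k), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]


# Hypothetical log2 ratios: a three-probe deletion flanked by normal copy number.
calls = viterbi([0.05, -0.1, 0.0, -0.9, -1.1, -1.0, 0.1, 0.0])
```

The high stay probability is what lets the model ignore an isolated noisy probe while still calling a run of consistently low values as a deletion.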


2019 ◽  
Vol 35 (17) ◽  
pp. 2924-2931
Author(s):  
Mark R Zucker ◽  
Lynne V Abruzzo ◽  
Carmen D Herling ◽  
Lynn L Barron ◽  
Michael J Keating ◽  
...  

Abstract Motivation Clonal heterogeneity is common in many types of cancer, including chronic lymphocytic leukemia (CLL). Previous research suggests that the presence of multiple distinct cancer clones is associated with clinical outcome. Detection of clonal heterogeneity from high throughput data, such as sequencing or single nucleotide polymorphism (SNP) array data, is important for gaining a better understanding of cancer and may improve prediction of clinical outcome or response to treatment. Here, we present a new method, CloneSeeker, for inferring clonal heterogeneity from sequencing data, SNP array data, or both. Results We generated simulated SNP array and sequencing data and applied CloneSeeker along with two other methods. We demonstrate that CloneSeeker is more accurate than existing algorithms at determining the number of clones, distribution of cancer cells among clones, and mutation and/or copy numbers belonging to each clone. Next, we applied CloneSeeker to SNP array data from samples of 258 previously untreated CLL patients to gain a better understanding of the characteristics of CLL tumors and to elucidate the relationship between clonal heterogeneity and clinical outcome. We found that a significant majority of CLL patients appear to have multiple clones distinguished by copy number alterations alone. We also found that the presence of multiple clones corresponded with significantly worse survival among CLL patients. These findings may prove useful for improving the accuracy of prognosis and design of treatment strategies. Availability and implementation Code available on R-Forge: https://r-forge.r-project.org/projects/CloneSeeker/ Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Jie Huang ◽  
Stefano Pallotti ◽  
Qianling Zhou ◽  
Marcus Kleber ◽  
Xiaomeng Xin ◽  
...  

Abstract The identification of rare haplotypes may greatly expand our knowledge of the genetic architecture of both complex and monogenic traits. To this aim, we developed PERHAPS (Paired-End short Reads-based HAPlotyping from next-generation Sequencing data), a new and simple approach to directly call haplotypes from short-read, paired-end Next Generation Sequencing (NGS) data. To benchmark this method, we considered the APOE classic polymorphism (*1/*2/*3/*4), since it represents one of the best examples of functional polymorphism arising from the haplotype combination of two Single Nucleotide Polymorphisms (SNPs). We leveraged the large Whole Exome Sequencing (WES) and SNP-array data sets obtained from the multi-ethnic UK BioBank (UKBB, N=48,855). By applying PERHAPS, which pieces together the paired-end reads according to their FASTQ labels, we extracted the haplotype data, along with their frequencies and the individual diplotypes. Concordance rates between diplotypes called directly from WES and those generated through statistical pre-phasing and imputation of SNP-array data are extremely high (>99%), whether the sample is stratified by SNP-array genotyping batch or by self-reported ethnic group. Hardy-Weinberg Equilibrium tests and the comparison of the obtained haplotype frequencies with those available from the 1000 Genomes Project further supported the reliability of PERHAPS. Notably, we were able to determine the existence of the rare APOE*1 haplotype in two unrelated African subjects from UKBB, supporting its presence at appreciable frequency (approximately 0.5%) in the African Yoruba population. Despite some acknowledged technical shortcomings, PERHAPS represents a novel and simple approach that will partly overcome the limitations in direct haplotype calling from short read-based sequencing.
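
The read-pair idea described above can be sketched in a few lines. This is an illustration only, not the PERHAPS implementation: the input format (one tuple per allele observation, already extracted from aligned reads), the function name, and the minimum-support threshold are assumptions. The two APOE SNPs are identified here by their rsIDs, rs429358 and rs7412, which the abstract does not name. Observations are grouped by read name, so that both mates of a pair contribute to one fragment, and only fragments covering both SNPs yield a haplotype.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Tuple

# APOE haplotypes keyed by (rs429358 allele, rs7412 allele).
APOE_HAPLOTYPES = {
    ("C", "T"): "*1",
    ("T", "T"): "*2",
    ("T", "C"): "*3",
    ("C", "C"): "*4",
}


def call_haplotypes(
    observations: List[Tuple[str, str, str]],
    min_support: int = 2,  # assumed minimum number of fragments per haplotype
) -> Tuple[Counter, Optional[Tuple[str, str]]]:
    """Count haplotypes seen on single fragments and call a diplotype.

    Each observation is (read_name, snp_id, base); both mates of a pair
    share the same read_name.
    """
    per_fragment: Dict[str, Dict[str, str]] = defaultdict(dict)
    for read_name, snp, base in observations:
        per_fragment[read_name][snp] = base

    counts: Counter = Counter()
    for alleles in per_fragment.values():
        if "rs429358" in alleles and "rs7412" in alleles:
            key = (alleles["rs429358"], alleles["rs7412"])
            if key in APOE_HAPLOTYPES:
                counts[APOE_HAPLOTYPES[key]] += 1

    supported = sorted(h for h, c in counts.items() if c >= min_support)
    if len(supported) == 1:
        diplotype = (supported[0], supported[0])
    elif len(supported) == 2:
        diplotype = (supported[0], supported[1])
    else:
        diplotype = None  # no call: too little support or conflicting evidence
    return counts, diplotype


# Hypothetical observations for one individual.
obs = [
    ("pair1", "rs429358", "T"), ("pair1", "rs7412", "C"),
    ("pair2", "rs429358", "T"), ("pair2", "rs7412", "C"),
    ("pair3", "rs429358", "T"), ("pair3", "rs7412", "C"),
    ("pair4", "rs429358", "C"), ("pair4", "rs7412", "C"),
    ("pair5", "rs429358", "C"), ("pair5", "rs7412", "C"),
    ("pair6", "rs429358", "T"), ("pair6", "rs7412", "T"),  # single fragment only
    ("pair7", "rs429358", "T"),                            # covers one SNP only
]
haplotype_counts, diplotype = call_haplotypes(obs)
```

Requiring a minimum number of supporting fragments is one simple guard against sequencing errors creating a spurious rare haplotype; a real analysis would also need to consider base and mapping quality.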


2011 ◽  
Vol 30 (2) ◽  
pp. 309-318 ◽  
Author(s):  
Lei Zhu ◽  
Yanman Li ◽  
Neelam Ara ◽  
Jinghua Yang ◽  
Mingfang Zhang

2018 ◽  
Author(s):  
Andrew Whalen ◽  
Gregor Gorjanc ◽  
John M Hickey

Abstract In this paper we evaluate using genotype-by-sequencing (GBS) data to perform parentage assignment in lieu of traditional array data. The use of GBS data raises two issues: First, for low-coverage GBS data, it may not be possible to call the genotype at many loci, a critical first step for detecting opposing homozygous markers. Second, the amount of sequencing coverage may vary across individuals, making it challenging to directly compare the likelihood scores between putative parents. To address these issues we extend the probabilistic framework of Huisman (2017) and evaluate putative parents by comparing their (potentially noisy) genotypes to a series of proposal distributions. These distributions describe the expected genotype probabilities for the relatives of an individual. We assign putative parents as a parent if they are classified as a parent (as opposed to e.g., an unrelated individual), and if the assignment score passes a threshold. We evaluated this method on simulated data and found that (1) high-coverage GBS data performs similarly to array data and requires only a small number of markers to correctly assign parents and (2) low-coverage GBS data (as low as 0.1x) can also be used, provided that it is obtained across a large number of markers. When analysing the low-coverage GBS data, we also found a high number of false positives if the true parent is not contained within the list of candidate parents, but that this false positive rate can be greatly reduced by hand-tuning the assignment threshold. We provide this parentage assignment method as a standalone program called AlphaAssign.
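
A simplified sketch of scoring a candidate parent from read counts, without first calling genotypes, is given below. It is not the AlphaAssign algorithm: it compares only two hypotheses (parent versus unrelated), assumes biallelic loci in Hardy-Weinberg equilibrium with known alternative-allele frequencies strictly between 0 and 1, and uses an assumed per-read error rate. All function names are hypothetical.

```python
import math
from typing import List, Tuple

ERROR = 0.01  # assumed per-read sequencing error rate


def read_likelihoods(ref: int, alt: int) -> List[float]:
    """P(read counts | genotype) for genotypes carrying 0, 1, or 2 alt alleles."""
    alt_read_prob = [ERROR, 0.5, 1.0 - ERROR]
    coeff = math.comb(ref + alt, alt)
    return [coeff * q ** alt * (1.0 - q) ** ref for q in alt_read_prob]


def hardy_weinberg(p: float) -> List[float]:
    """Genotype probabilities for an unrelated individual."""
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]


def offspring_given_parent(parent_genotype: int, p: float) -> List[float]:
    """Offspring genotype probabilities when the other parent is unknown."""
    t = parent_genotype / 2.0  # probability the known parent transmits alt
    return [(1 - t) * (1 - p), t * (1 - p) + (1 - t) * p, t * p]


def parent_score(
    offspring_reads: List[Tuple[int, int]],
    candidate_reads: List[Tuple[int, int]],
    alt_freqs: List[float],
) -> float:
    """Sum over loci of log[P(data | parent) / P(data | unrelated)]."""
    score = 0.0
    for (o_ref, o_alt), (c_ref, c_alt), p in zip(offspring_reads, candidate_reads, alt_freqs):
        lik_o = read_likelihoods(o_ref, o_alt)
        lik_c = read_likelihoods(c_ref, c_alt)
        prior = hardy_weinberg(p)
        as_parent = sum(
            prior[gc] * lik_c[gc] * offspring_given_parent(gc, p)[go] * lik_o[go]
            for gc in range(3)
            for go in range(3)
        )
        as_unrelated = sum(prior[g] * lik_c[g] for g in range(3)) * sum(
            prior[g] * lik_o[g] for g in range(3)
        )
        score += math.log(as_parent / as_unrelated)
    return score


# Hypothetical data: five loci, (ref reads, alt reads) per individual.
freqs = [0.5] * 5
offspring = [(0, 10)] * 5        # consistently alt-only reads
matching = [(0, 10)] * 5         # candidate compatible with being a parent
opposing = [(10, 0)] * 5         # candidate that looks opposing-homozygous
uncovered = [(0, 0)] * 5         # candidate with no reads at all

score_matching = parent_score(offspring, matching, freqs)
score_opposing = parent_score(offspring, opposing, freqs)
score_uncovered = parent_score(offspring, uncovered, freqs)
```

A locus with no reads contributes nothing to the score, which illustrates why scores for individuals with different coverage are hard to compare directly and why a threshold on the score needs care.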


2019 ◽  
Author(s):  
Barbara Tabak ◽  
Gordon Saksena ◽  
Coyin Oh ◽  
Galen F. Gao ◽  
Barbara Hill Meyers ◽  
...  

Abstract Motivation Somatic copy-number alterations (SCNAs) play an important role in cancer development. Systematic noise in sequencing and array data presents a significant challenge to the inference of SCNAs for cancer genome analyses. As part of The Cancer Genome Atlas (TCGA), the Broad Institute Genome Characterization Center developed the Tangent copy-number inference pipeline to generate copy-number profiles using single-nucleotide polymorphism (SNP) array and whole-exome sequencing (WES) data from over 10,000 pairs of tumors and matched normal samples. Here, we describe the Tangent pipeline, which begins with DNA sequencing data in the form of .bam files or raw SNP array probe-level intensity data, and ends with segmented copy-number calls to facilitate the identification of novel genes potentially targeted by SCNAs. We also describe a modification of Tangent, Pseudo-Tangent, which enables denoising through comparisons between tumor profiles when few normal samples are available. Results Tangent Normalization offers substantial signal-to-noise ratio (SNR) improvements compared to conventional normalization methods in both SNP array and WES analyses. The improvement in SNRs is achieved primarily through noise reduction with minimal effect on signal. Pseudo-Tangent also reduces noise when few normal samples are available. Tangent and Pseudo-Tangent are broadly applicable and enable more accurate inference of SCNAs from DNA sequencing and array data. Availability and Implementation Tangent is available at https://github.com/coyin/tangent and as a Docker image (https://hub.docker.com/r/coyin/tangent). Tangent is also the normalization method for the Copy Number pipeline in Genome Analysis Toolkit 4 (GATK4). [email protected], [email protected], [email protected]

