variant filtering
Recently Published Documents



Cells ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 118
Author(s):  
Caroline Cazin ◽  
Yasmine Neirijnck ◽  
Corinne Loeuillet ◽  
Lydia Wehrli ◽  
Françoise Kühne ◽  
...  

The genetic landscape of male infertility is highly complex. It is estimated that at least 4000 genes are involved in human spermatogenesis, but only a few have so far been extensively studied. In this study, we investigated by whole exome sequencing two cases of idiopathic non-obstructive azoospermia (NOA) due to severe hypospermatogenesis. After variant filtering and prioritization, we retained for each patient a homozygous loss-of-function (LoF) variant in a testis-specific gene, C1orf185 (c.250C>T; p.Gln84Ter) and CCT6B (c.615-2A>G), respectively. Both variants are rare according to the gnomAD database and absent from our local control cohort (n = 445). To verify the involvement of these candidate genes in NOA, we used the CRISPR/Cas9 system to inactivate the mouse orthologs 4930522H14Rik and Cct6b and produced two knockout (KO) mouse lines. Sperm and testis parameters of homozygous KO adult male mice were analyzed and compared with those of wild-type animals. We showed that homozygous KO males were fertile and displayed normal sperm parameters and functional spermatogenesis. Overall, these results demonstrate that not all genes highly and specifically expressed in the testes are essential for spermatogenesis, and in particular, we conclude that bi-allelic variants of C1orf185 and CCT6B are most likely not involved in NOA and male fertility.
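
The filtering criteria named in this abstract (homozygous genotype, loss-of-function consequence, rarity in gnomAD, absence from a local control cohort) can be illustrated with a minimal sketch. The record fields, the consequence labels, the 1% allele-frequency cut-off and every number in the toy input are assumptions made for illustration; the abstract does not state the thresholds or tools the authors used.

```python
from dataclasses import dataclass

# Hypothetical consequence labels treated as loss-of-function (assumption).
LOF_CONSEQUENCES = {"stop_gained", "frameshift", "splice_acceptor", "splice_donor"}


@dataclass
class Variant:
    gene: str
    hgvs: str
    consequence: str       # predicted effect label
    zygosity: str          # "hom" or "het"
    gnomad_af: float       # population allele frequency (0.0 if unobserved)
    control_carriers: int  # carriers seen in the local control cohort


def keep_candidate(v: Variant, max_af: float = 0.01) -> bool:
    """Return True if the variant passes all four illustrative filters."""
    return (
        v.zygosity == "hom"
        and v.consequence in LOF_CONSEQUENCES
        and v.gnomad_af <= max_af
        and v.control_carriers == 0
    )


def filter_variants(variants, max_af: float = 0.01):
    """Keep only variants that pass keep_candidate, preserving input order."""
    return [v for v in variants if keep_candidate(v, max_af)]


# Toy input: the two HGVS names come from the abstract; all numbers are invented.
example = [
    Variant("C1orf185", "c.250C>T", "stop_gained", "hom", 0.0001, 0),
    Variant("CCT6B", "c.615-2A>G", "splice_acceptor", "hom", 0.0002, 0),
    Variant("GENE_X", "c.1A>G", "missense", "hom", 0.0001, 0),    # not LoF
    Variant("GENE_Y", "c.2del", "frameshift", "het", 0.0001, 0),  # heterozygous
    Variant("GENE_Z", "c.3C>T", "stop_gained", "hom", 0.2, 0),    # too common
]
retained = filter_variants(example)
```

In practice such filters are usually applied to annotated VCF records instead of hand-built objects, and the cut-offs depend on the assumed inheritance model.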


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Nuno Maia ◽  
Maria João Nabais Sá ◽  
Manuel Melo-Pires ◽  
Arjan P. M. de Brouwer ◽  
Paula Jorge

Abstract: Intellectual disability (ID) can be caused by non-genetic and genetic factors, the latter being responsible for more than 1700 ID-related disorders. The broad phenotypic and genetic heterogeneity of ID, as well as the difficulty of establishing the inheritance pattern, often delay the diagnosis. It has become apparent that massive parallel sequencing can overcome these difficulties. In this review we address: (i) ID genetic aetiology, (ii) clinical/medical settings testing, (iii) massive parallel sequencing, (iv) variant filtering and prioritization, (v) variant classification guidelines and functional studies, and (vi) ID diagnostic yield. Furthermore, constant updating of the methodologies and functional tests is essential. Thus, international collaborations that gather expertise, data and resources through multidisciplinary contributions are fundamental to keeping track of the fast progress in ID gene discovery.


2021 ◽  
Vol 11 (22) ◽  
pp. 10788
Author(s):  
Ali Fallah ◽  
Steven van de Par

Speech intelligibility in public places can be degraded by environmental noise and reverberation. In this study, a new near-end listening enhancement (NELE) approach is proposed in which a time-varying filter jointly enhances onsets and reduces overlap masking. The optimization requires some look-ahead in the clean speech and prior knowledge of the room impulse response (RIR). By optimizing a defined cost function, the spectro-temporal envelope of the reverberant speech is brought as close as possible to that of the clean speech; in this cost function, speech onsets are given increased weight. This approach differs from the overlap-masking ratio (OMR) and speech enhancement (OE) approaches (Grosse, van de Par, 2017, J. Audio Eng. Soc., Vol. 65(1/2), pp. 31–41), which consider only previous frames in each time slot when determining the time-variant filtering. The SRT measurements show that the new optimization framework enhances speech intelligibility by up to 2 dB more than OE.


2021 ◽  
Author(s):  
Tim H. Heupink ◽  
Lennert Verboven ◽  
Robin M. Warren ◽  
Annelies Van Rie

Abstract: Improved understanding of the genomic variants that allow Mycobacterium tuberculosis (Mtb) to acquire drug resistance, or tolerance, and increase its virulence is an important factor in controlling the current tuberculosis epidemic. Current approaches to Mtb sequencing, however, cannot reveal Mtb’s full genomic diversity due to the strict requirements of low contamination levels, high Mtb sequence coverage, and elimination of complex regions. We developed the XBS (compleX Bacterial Samples) bioinformatics pipeline, which implements joint calling and machine-learning-based variant filtering tools to specifically improve variant detection in the important Mtb samples that do not meet these criteria, such as those from unbiased sputum samples. Using novel simulated datasets that permit exact accuracy verification, XBS was compared to the UVP and MTBseq pipelines. Accuracy statistics showed that all three pipelines performed equally well for sequence data that resemble those obtained from high depth coverage and low-level contamination culture isolates. In the complex genomic regions, however, XBS accurately identified 9.0% more single nucleotide polymorphisms and 8.1% more single nucleotide insertions and deletions than the WHO-endorsed unified analysis variant pipeline. XBS also had superior accuracy for sequence data that resemble those obtained directly from sputum samples, where depth of coverage is typically very low and contamination levels are high. XBS was the only pipeline not affected by low depth of coverage (5-10×), type of contamination and excessive contamination levels (>50%). Simulation results were confirmed using WGS data from clinical samples, which showed the superior performance of XBS, with a higher sensitivity (98.8%) when analysing culture isolates and identification of 13.9% more variable sites in WGS data from sputum samples as compared to MTBseq, without evidence for false positive variants when ribosomal RNA regions were excluded. The XBS pipeline facilitates sequencing of less-than-perfect Mtb samples. These advances will benefit future clinical applications of Mtb sequencing, especially whole genome sequencing directly from clinical specimens, thereby avoiding in vitro biases and making many more samples available for drug resistance and other genomic analyses. The additional genetic resolution and increased sample success rate will improve genome-wide association studies and sequence-based transmission studies.
Impact statement: Mycobacterium tuberculosis (Mtb) DNA is usually extracted from culture isolates to obtain high quantities of non-contaminated DNA, but this process can change the make-up of the bacterial population and is time-consuming. Furthermore, current analytic approaches exclude complex genomic regions where DNA sequences are repeated to avoid inference of false positive genetic variants, which may result in the loss of important genetic information. We designed the compleX Bacterial Sample (XBS) variant caller to overcome these limitations. XBS employs joint variant calling and machine-learning-based variant filtering to ensure that high quality variants can be inferred from low coverage and highly contaminated genomic sequence data obtained directly from sputum samples. Simulation and clinical data analyses showed that XBS performs better than other pipelines as it can identify more genetic variants and can handle complex (low depth, highly contaminated) Mtb samples. The XBS pipeline was designed to analyse Mtb samples but can easily be adapted to analyse other complex bacterial samples.
Data summary: Simulated sequencing data have been deposited in SRA BioProject PRJNA706121. All detailed findings are available in the Supplementary Material. Scripts for running the XBS variant calling core are available at https://github.com/TimHHH/XBS. The authors confirm all supporting data, code and protocols have been provided within the article or through supplementary data files.


2021 ◽  
Author(s):  
Giulio Formenti ◽  
Arang Rhie ◽  
Brian P Walenz ◽  
Francoise Thibaud-Nissen ◽  
Kishwar Shafin ◽  
...  

Read mapping and variant calling approaches have been widely used for accurate genotyping and improving consensus quality assembled from noisy long reads. Variant calling accuracy relies heavily on the read quality, the precision of the read mapping algorithm and variant caller, and the criteria adopted to filter the calls. However, it is impossible to define a single set of optimal parameters, as they vary depending on the quality of the read set, the variant caller of choice, and the quality of the unpolished assembly. To overcome this issue, we have devised a new tool called Merfin (k-mer based finishing tool), a k-mer based variant filtering algorithm for improved genotyping and polishing. Merfin evaluates the accuracy of a call based on expected k-mer multiplicity in the reads, independently of the quality of the read alignment and variant caller internal score. Moreover, we introduce novel assembly quality and completeness metrics that account for the expected genomic copy numbers. Merfin significantly increased the precision of a variant call and reduced frameshift errors when applied to PacBio HiFi, PacBio CLR, or Nanopore long read based assemblies. We demonstrate the utility while polishing the first complete human genome, a fully phased human genome, and non-human high-quality genomes.
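
The central idea in this abstract, judging a call by the k-mers it creates instead of by alignment or caller scores, can be illustrated with a toy sketch. This is not Merfin's implementation: it only checks whether each k-mer occurs in the reads at all, whereas the abstract describes using expected k-mer multiplicity. The function names, the substitution-only variant model, the sequences and k = 5 are all assumptions for illustration.

```python
from collections import Counter


def count_kmers(seqs, k):
    """Count every k-mer occurring in a collection of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def missing_kmers(seq, read_kmers, k):
    """Number of k-mers in seq that never occur in the reads."""
    return sum(1 for i in range(len(seq) - k + 1) if read_kmers[seq[i:i + k]] == 0)


def accept_substitution(assembly, pos, alt, read_kmers, k):
    """Accept a single-base substitution only if it lowers the number of
    assembly k-mers unsupported by the reads (toy criterion, an assumption)."""
    lo = max(0, pos - k + 1)          # window covering every k-mer that spans pos
    hi = min(len(assembly), pos + k)
    before = assembly[lo:hi]
    after = assembly[lo:pos] + alt + assembly[pos + 1:hi]
    return missing_kmers(after, read_kmers, k) < missing_kmers(before, read_kmers, k)


# Invented example: the draft assembly has one wrong base at index 6 (T instead of C).
truth = "ACGTTGCAAGGCTA"
assembly = "ACGTTGTAAGGCTA"
read_kmers = count_kmers([truth], k=5)       # error-free "reads" for simplicity

good_call = accept_substitution(assembly, 6, "C", read_kmers, k=5)  # fixes the error
bad_call = accept_substitution(assembly, 2, "A", read_kmers, k=5)   # breaks a correct base
```

A multiplicity-aware version would compare observed read counts with the copy number expected for each k-mer, which is what lets such a filter work in repeated regions.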


2021 ◽  
Vol 9 ◽  
Author(s):  
Niaz Muhammad Khan ◽  
Basharat Hussain ◽  
Chenqing Zheng ◽  
Ayaz Khan ◽  
Muhammad Shareef Masoud ◽  
...  

Microcephaly (MCPH) is a genetically heterogeneous disorder characterized by non-progressive intellectual disability, small head circumference, and small brain size compared with the age- and sex-matched population. MCPH manifests as an isolated condition or as part of another clinical syndrome; so far, 25 genes have been linked with MCPH. Many of these genes have been reported in the Pakistani population, but due to a high rate of consanguinity, a significant proportion of the MCPH cohort is yet to be explored. MCPH5 is the most frequently reported type, alone accounting for up to 68.75% in a genetically constrained population like that of Pakistan. In the current study, whole exome sequencing (WES) was performed on probands from 10 families sampled from South Waziristan and two families from rural areas of the Pakistani Punjab. Candidate variants were validated through Sanger sequencing in all available family members. Variant filtering and in silico analysis identified three known mutations in ASPM, an MCPH5-associated gene. The founder mutation p.Trp1326* segregated in 10 families, which further supports the evidence that it is the most prominent mutation among people of Pashtun ethnicity living in Pakistan and Afghanistan. Furthermore, the previously known mutations p.Arg3244* and p.Arg1019* were inherited in two families with a Punjab ethnic profile. Collectively, this study added 12 more families to the mutational paradigm of ASPM and expanded the Pakistani MCPH cohort.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Yu Xu ◽  
Yong-Biao Zhang ◽  
Li-Jun Liang ◽  
Jia-Li Tian ◽  
Jin-Ming Lin ◽  
...  

Abstract Background: Hereditary hemorrhagic telangiectasia (HHT) is a disease characterized by arteriovenous malformations in the skin and mucous membranes. We enrolled a large pedigree comprising 32 living members, and screened for mutations responsible for HHT. Methods: We performed whole-exome sequencing to identify novel mutations in the pedigree after excluding three previously reported HHT-related genes using Sanger sequencing. We then performed in silico functional analysis of candidate mutations that were obtained using a variant filtering strategy to identify mutations responsible for HHT. Results: After screening the HHT-related genes, activin A receptor-like type 1 (ACVRL1), endoglin (ENG), and SMAD family member 4 (SMAD4), we did not detect any co-segregated mutations in this pedigree. Whole-exome sequencing analysis of 7 members and Sanger sequencing analysis of 16 additional members identified a mutation (c.784A>G) in the NSF attachment protein gamma (NAPG) gene that co-segregated with the disease. Functional prediction showed that the mutation was deleterious and might change the conformational stability of the NAPG protein. Conclusions: NAPG c.784A>G may potentially lead to HHT. These results expand the current understanding of the genetic contributions to HHT pathogenesis.


2021 ◽  
Vol 22 (4) ◽  
pp. 2187
Author(s):  
Caroline Cazin ◽  
Yasmine Boumerdassi ◽  
Guillaume Martinez ◽  
Selima Fourati Ben Mustapha ◽  
Marjorie Whitfield ◽  
...  

Acephalic spermatozoa syndrome (ASS) is a rare but extremely severe type of teratozoospermia, defined by the presence of a majority of headless flagella and a minority of tail-less sperm heads in the ejaculate. Like the other severe monomorphic teratozoospermias, ASS has a strong genetic basis and is most often caused by bi-allelic variants in SUN5 (Sad1 and UNC84 domain-containing 5). Using whole exome sequencing (WES), we investigated a cohort of nine infertile subjects displaying ASS. These subjects were recruited in three centers located in France and Tunisia, but all originated from North Africa. Sperm from subjects carrying candidate genetic variants were subjected to immunofluorescence analysis and transmission electron microscopy. Moreover, fluorescent in situ hybridization (FISH) was performed on sperm nuclei to assess their chromosomal content. Variant filtering permitted us to identify the same SUN5 homozygous frameshift variant (c.211+1_211+2dup) in 7/9 individuals (78%). SUN5 encodes a protein localized on the posterior part of the nuclear envelope that is necessary for the attachment of the tail to the sperm head. Immunofluorescence assays performed on sperm cells from three mutated subjects revealed a total absence of SUN5, thus demonstrating the deleterious impact of the identified variant on protein expression. Transmission electron microscopy showed a conserved flagellar structure and a slightly decondensed chromatin. FISH did not highlight a higher rate of chromosome aneuploidy in spermatozoa from SUN5 patients compared to controls, indicating that intra-cytoplasmic sperm injection (ICSI) can be proposed for patients carrying the c.211+1_211+2dup variant. These results suggest that the identified SUN5 variant is the main cause of ASS in the North African population. 
Consequently, simple and inexpensive genotyping of the c.211+1_211+2dup variant could be beneficial for affected men of North African origin before resorting to more exhaustive genetic analyses.


2021 ◽  
Author(s):  
Paniz Miar ◽  
Sina Narrei ◽  
Mohammad Amin Tabatabaiefar ◽  
Mohammad Reza Pourreza ◽  
Morteza Hashemzadeh-Chaleshtori ◽  
...  

Abstract Purpose: Lynch syndrome is one of the most common hereditary cancer syndromes and is caused by a germline mutation in one of the mismatch-repair (MMR) genes. It results in early-onset colorectal cancer (CRC) and other Lynch-associated cancers in an autosomal dominant pattern. In this article, a new pathogenic variant is described in a Persian family with familial CRCs and positive Amsterdam II criteria. Methods: IHC for MMR proteins was performed on tissue sections from the tumor and its adjacent healthy tissue of the proband. Microsatellite instability (MSI) testing was also performed on DNA extracted from tumor and healthy tissue using a Promega kit. Next Generation Sequencing (NGS) was finally performed on genomic DNA of the proband using a 12-gene panel including MMR genes. Variant filtering and prioritization were done using bioinformatics tools. Co-segregation analysis was used to evaluate the identified pathogenic variant. Results: The proband was a woman aged 38 years at diagnosis with CRC located in the sigmoid colon. A family history of cancer was observed in three successive generations. IHC staining was absent for MSH2 and MSH6, and MSI testing reported MSI-High. NGS analysis identified a new stop-gained mutation in the first exon of the MSH2 gene, a substitution of A to T (MSH2: c.364A>T), which was pathogenic according to the variant interpretation guidelines of the American College of Medical Genetics and Genomics. Conclusion: Clarifying the molecular features of Lynch syndrome among Iranian populations could lead to the identification of at-risk people for early care and prevention of cancer.


2021 ◽  
Author(s):  
Albert Rosenberger ◽  
Viola Tozzi ◽  
Rayjean J Hung ◽  
David C Christiani ◽  
Neil E Caporaso ◽  
...  

Abstract Background: Imputation of untyped markers is a standard tool in genome-wide association studies to close the gap between directly genotyped and other known DNA variants. However, it is fundamental that genotypes are imputed with high accuracy. Several accuracy measures have been proposed and some are implemented in imputation software but, unfortunately, they differ across platforms. In the present paper we introduce I’am hiQ, an independent pair of accuracy measures that can be applied to dosage files, the output of all imputation software. I’am (imputation accuracy measure) quantifies the average amount of individual-specific versus population-specific genotype information in a linear manner. hiQ (heterogeneity in quantities of dosages) addresses the inter-individual heterogeneity between dosages of a marker across the sample at hand. Results: Applying both measures to a large case-control sample of the International Lung Cancer Consortium (ILCCO), comprising 27,065 individuals, we found meaningful thresholds for I’am and hiQ suitable to classify markers of poor accuracy. We demonstrate how Manhattan-like plots and moving averages of I’am and hiQ can be useful to identify regions enriched with less accurately imputed markers, whereas these regions would be missed when applying the accuracy measure info (implemented in IMPUTE2). Conclusion: We recommend using I’am hiQ in addition to other accuracy scores for variant filtering before stepping into the analysis of imputed GWAS data.

