Solving patients with rare diseases through programmatic reanalysis of genome-phenome data

Author(s):  
Leslie Matalonga ◽  
Carles Hernandez-Ferrer ◽  
Davide Piscia ◽  
Rebecca Schüle ◽  
...  

Abstract Reanalysis of inconclusive exome/genome sequencing data increases the diagnostic yield for patients with rare diseases. However, the cost and effort required for reanalysis prevent its routine implementation in research and clinical environments. The Solve-RD project aims to reveal the molecular causes underlying undiagnosed rare diseases. One of its goals is to implement innovative approaches to reanalyse the exomes and genomes from thousands of well-studied undiagnosed cases. The raw genomic data is submitted to Solve-RD through the RD-Connect Genome-Phenome Analysis Platform (GPAP) together with standardised phenotypic and pedigree data. We have developed a programmatic workflow to reanalyse genome-phenome data. It uses the RD-Connect GPAP’s Application Programming Interface (API) and relies on the big-data technologies upon which the system is built. We have applied the workflow to prioritise rare known pathogenic variants from 4411 undiagnosed cases. The queries returned an average of 1.45 variants per case, which were first evaluated in bulk by a panel of disease experts and afterwards specifically by the submitter of each case. A total of 120 index cases (21.2% of prioritised cases, 2.7% of all exome/genome-negative samples) have already been solved, with others still under investigation. Implementing solutions such as the one described here provides the technical framework to enable periodic case-level data re-evaluation in clinical settings, as recommended by the American College of Medical Genetics.

GigaScience ◽  
2019 ◽  
Vol 8 (8) ◽  
Author(s):  
Marek Wiewiórka ◽  
Agnieszka Szmurło ◽  
Wiktor Kuśmirek ◽  
Tomasz Gambin

Abstract Background Depth of coverage calculation is an important and computationally intensive preprocessing step in a variety of next-generation sequencing pipelines, including the analysis of RNA-sequencing data, detection of copy number variants, or quality control procedures. Results Building upon big data technologies, we have developed SeQuiLa-cov, an extension to the recently released SeQuiLa platform, which provides efficient depth of coverage calculations, reaching >100× speedup over the state-of-the-art tools. The performance and scalability of our solution allow for exome and genome-wide calculations running locally or on a cluster while hiding the complexity of distributed computing behind a Structured Query Language (SQL) application programming interface (API). Conclusions SeQuiLa-cov provides a significant performance gain in depth of coverage calculations, streamlining widely used bioinformatic processing pipelines.
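To make the term concrete, the following is a minimal single-machine sketch of what a depth of coverage calculation produces. It is not SeQuiLa-cov's implementation, which the abstract does not describe; the event-based (difference array) approach, the half-open 0-based read intervals, and the function names `depth_of_coverage` and `to_blocks` are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def depth_of_coverage(reads: Iterable[Tuple[int, int]], contig_length: int) -> List[int]:
    """Return per-base depth for one contig.

    `reads` holds (start, end) alignments as 0-based, half-open intervals
    (an assumed convention). Each read adds +1 at its start and -1 at its
    end; a running sum over these events gives the depth at every base.
    """
    delta = [0] * (contig_length + 1)
    for start, end in reads:
        # Clip reads to the contig and skip empty intervals.
        start = max(start, 0)
        end = min(end, contig_length)
        if start >= end:
            continue
        delta[start] += 1
        delta[end] -= 1

    depth = []
    running = 0
    for pos in range(contig_length):
        running += delta[pos]
        depth.append(running)
    return depth


def to_blocks(depth: List[int]) -> List[Tuple[int, int, int]]:
    """Run-length encode per-base depth into (start, end, coverage) blocks."""
    blocks: List[Tuple[int, int, int]] = []
    for pos, cov in enumerate(depth):
        if blocks and blocks[-1][2] == cov:
            start, _, _ = blocks[-1]
            blocks[-1] = (start, pos + 1, cov)  # extend the current block
        else:
            blocks.append((pos, pos + 1, cov))
    return blocks


if __name__ == "__main__":
    example_reads = [(0, 4), (2, 6), (5, 8)]  # made-up alignments
    per_base = depth_of_coverage(example_reads, contig_length=10)
    for block in to_blocks(per_base):
        print(block)
```

The event-based form touches each read twice and each base once, which is why this style of calculation lends itself to partitioning by contig or genomic window in a distributed setting.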


BMJ ◽  
2021 ◽  
pp. n214
Author(s):  
Weedon MN ◽  
Jackson L ◽  
Harrison JW ◽  
Ruth KS ◽  
Tyrrell J ◽  
...  

Abstract Objective To determine whether the sensitivity and specificity of SNP chips are adequate for detecting rare pathogenic variants in a clinically unselected population. Design Retrospective, population based diagnostic evaluation. Participants 49 908 people recruited to the UK Biobank with SNP chip and next generation sequencing data, and an additional 21 people who purchased consumer genetic tests and shared their data online via the Personal Genome Project. Main outcome measures Genotyping (that is, identification of the correct DNA base at a specific genomic location) using SNP chips versus sequencing, with results split by frequency of that genotype in the population. Rare pathogenic variants in the BRCA1 and BRCA2 genes were selected as an exemplar for detailed analysis of clinically actionable variants in the UK Biobank, and BRCA related cancers (breast, ovarian, prostate, and pancreatic) were assessed in participants through use of cancer registry data. Results Overall, genotyping using SNP chips performed well compared with sequencing; sensitivity, specificity, positive predictive value, and negative predictive value were all above 99% for 108 574 common variants directly genotyped on the SNP chips and sequenced in the UK Biobank. However, the likelihood of a true positive result decreased dramatically with decreasing variant frequency; for variants that are very rare in the population, with a frequency below 0.001% in UK Biobank, the positive predictive value was very low and only 16% of 4757 heterozygous genotypes from the SNP chips were confirmed with sequencing data. Results were similar for SNP chip data from the Personal Genome Project, and 20/21 individuals analysed had at least one false positive rare pathogenic variant that had been incorrectly genotyped. 
For pathogenic variants in the BRCA1 and BRCA2 genes, which are individually very rare, the overall performance metrics for the SNP chips versus sequencing in the UK Biobank were: sensitivity 34.6%, specificity 98.3%, positive predictive value 4.2%, and negative predictive value 99.9%. Rates of BRCA related cancers in UK Biobank participants with a positive SNP chip result were similar to those for age matched controls (odds ratio 1.31, 95% confidence interval 0.99 to 1.71) because the vast majority of variants were false positives, whereas sequence positive participants had a significantly increased risk (odds ratio 4.05, 2.72 to 6.03). Conclusions SNP chips are extremely unreliable for genotyping very rare pathogenic variants and should not be used to guide health decisions without validation.
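The drop in positive predictive value for rare variants follows directly from Bayes' rule: when true carriers are scarce, even a small false positive rate produces more false calls than true ones. The sketch below illustrates this with the BRCA sensitivity (34.6%) and specificity (98.3%) reported in the abstract; the prevalence values are illustrative assumptions, not figures from the study, and the function names are hypothetical.

```python
import math


def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """P(truly positive | test positive), from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        return math.nan  # the test never returns a positive result
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """P(truly negative | test negative), from Bayes' rule."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    if true_neg + false_neg == 0.0:
        return math.nan  # the test never returns a negative result
    return true_neg / (true_neg + false_neg)


if __name__ == "__main__":
    sens, spec = 0.346, 0.983  # reported for BRCA1/BRCA2 in the abstract
    # Assumed carrier prevalences, chosen only to show the trend.
    for prevalence in (0.0001, 0.001, 0.01, 0.1):
        ppv = positive_predictive_value(sens, spec, prevalence)
        npv = negative_predictive_value(sens, spec, prevalence)
        print(f"prevalence={prevalence:.4f}  PPV={ppv:.3f}  NPV={npv:.4f}")
```

With sensitivity and specificity held fixed, the positive predictive value shrinks as prevalence falls while the negative predictive value stays close to 1, which matches the pattern the abstract reports.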


GigaScience ◽  
2021 ◽  
Vol 10 (5) ◽  
Author(s):  
Colin Farrell ◽  
Michael Thompson ◽  
Anela Tosevska ◽  
Adewale Oyetunde ◽  
Matteo Pellegrini

Abstract Background Bisulfite sequencing is commonly used to measure DNA methylation. Processing bisulfite sequencing data is often challenging owing to the computational demands of mapping a low-complexity, asymmetrical library and the lack of a unified processing toolset to produce an analysis-ready methylation matrix from read alignments. To address these shortcomings, we have developed BiSulfite Bolt (BSBolt), a fast and scalable bisulfite sequencing analysis platform. BSBolt performs a pre-alignment sequencing read assessment step to improve efficiency when handling asymmetrical bisulfite sequencing libraries. Findings We evaluated BSBolt against simulated and real bisulfite sequencing libraries. We found that BSBolt provides accurate and fast bisulfite sequencing alignments and methylation calls. We also compared BSBolt to several existing bisulfite alignment tools and found BSBolt outperforms Bismark, BSSeeker2, BISCUIT, and BWA-Meth based on alignment accuracy and methylation calling accuracy. Conclusion BSBolt offers streamlined processing of bisulfite sequencing data through an integrated toolset that offers support for simulation, alignment, methylation calling, and data aggregation. BSBolt is implemented as a Python package and command line utility for flexibility when building informatics pipelines. BSBolt is available at https://github.com/NuttyLogic/BSBolt under an MIT license.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Panagiotis Moulos

Abstract Background The relentless continuing emergence of new genomic sequencing protocols and the resulting generation of ever larger datasets continue to challenge the meaningful summarization and visualization of the underlying signal generated to answer important qualitative and quantitative biological questions. As a result, the need for novel software able to reliably produce quick, comprehensive, and easily repeatable genomic signal visualizations in a user-friendly manner is rapidly re-emerging. Results recoup is a Bioconductor package for quick, flexible, versatile, and accurate visualization of genomic coverage profiles generated from Next Generation Sequencing data. Coupled with a database of precalculated genomic regions for multiple organisms, recoup offers processing mechanisms for quick, efficient, and multi-level data interrogation with minimal effort, while at the same time creating publication-quality visualizations. Special focus is given to plot reusability, reproducibility, and real-time exploration and formatting options, operations that similar visualization tools rarely support in depth. recoup was assessed using several qualitative user metrics and found to balance the tradeoff between important package features, including speed, visualization quality, overall friendliness, and the reusability of the results with minimal additional calculations. Conclusion While some existing solutions for the comprehensive visualization of NGS data signal offer satisfying results, they are often compromised regarding issues such as effortless tracking of processing and preparation steps under a common computational environment, visualization quality and user friendliness. recoup is a unique package presenting a balanced tradeoff for a combination of assessment criteria while remaining fast and friendly.


2020 ◽  
Vol 28 (12) ◽  
pp. 1763-1768
Author(s):  
Thomas Bourinaris ◽  
Damian Smedley ◽  
Valentina Cipriani ◽  
Isabella Sheikh ◽  
...  

Abstract Hereditary spastic paraplegia (HSP) is a group of heterogeneous inherited degenerative disorders characterized by lower limb spasticity. Fifty percent of HSP patients remain genetically undiagnosed. The 100,000 Genomes Project (100KGP) is a large UK-wide initiative to provide genetic diagnosis to previously undiagnosed patients and families with rare conditions. Over 400 HSP families were recruited to the 100KGP. In order to obtain genetic diagnoses, gene-based burden testing was carried out for rare, predicted pathogenic variants using candidate variants from the Exomiser analysis of the genome sequencing data. A significant gene-disease association was identified for UBAP1 and HSP. Three protein truncating variants were identified in 13 patients from 7 families. All patients presented with a juvenile form of pure HSP, with median age at onset of 10 years, showing autosomal dominant inheritance or de novo occurrence. Additional clinical features included parkinsonism and learning difficulties, but their association with UBAP1 needs to be established.


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 10537-10537
Author(s):  
Michelle J McSweeny ◽  
Susan Montgomery ◽  
Kristen Danielle Whitaker ◽  
Mary Beryl Daly ◽  
Michael J. Hall

Background: Lynch syndrome (LS) is among the most common hereditary cancer (CA) syndromes. Pathogenic variants (PVs) in MSH6 are 2-4 fold more common in the population (1/758) than those in MLH1 (1/1946) or MSH2 (1/2841), and are increasingly regarded as lower penetrance for colorectal cancer (CRC) due to published data supporting later mean age of CRC onset and lower CRC risk. Unlike for MLH1/MSH2, NCCN 2020 CA risk estimates recognize only endometrial CA (EC) and CRC risks in MSH6+ carriers as clearly above SEER population estimates. Further, risks of other LS manifestations such as skin disease/Muir-Torre, ovarian CA (OC), and possible rare tumors in LS like sarcoma, have been minimally characterized in MSH6+ carriers. Methods: Pedigree data for 44 MSH6+ index (first-evaluated family member by our program) pts consecutively ascertained since 2009 at Fox Chase (FCCC) were reviewed. 1 pt w/a rare MSH6 uncertain variant w/personal history (PHx) of MSH6-expression deficient EC (age 50) and MSH6-deficient sebaceous skin CA (age 50) and a strong family history (FHx) c/w LS is also included here. 34% (15/44) index pts were referred to FCCC for cascade testing due to a known MSH6 PV in the family. Of the remaining 29 index pts, ascertainment included: 14% w/positive universal LS tumor screening, 21% w/early-onset or synchronous LS CA, 14% w/multi-gene panel for PHx of OC, 10% w/incidental MSH6+ result (2 had testing for PHx breast CA, 1 tumor genomic profiling), and 28% w/PHx and/or FHx of LS CA warranting genetic testing. Age of CA onset and path data were verified in > 90% index pts. Results: Index pts had a mean age of 55.5 yrs, and 77% were female. Overall, 11% (5/44) of MSH6+ index pts were found to have LS at diagnosis of synchronous primary CAs (3 EC/OC, 1 CRC/CRC, 1 CRC/EC), and 4/5 of these occurred <50 yrs. An additional 20% (9/44) index pts reported PHx of >2 metachronous LS CAs.
OC was the presenting CA in 14% (6/44) female index pts; 2 additional index pts had rarer OC variants (Mullerian duct @ 41, primary peritoneal CA @ 50). Skin manifestations of LS were documented in 9.1% (4/44) index pts (3 sebaceous, 1 SCC in-situ/Bowen’s disease); 1 other family had documented sebaceous CAs in an FDR (father) but the 2 daughters seen @FCCC (both 30s) had yet to develop skin lesions. 2 index pts were found to have LS after developing early-onset breast CA (age 39) and contralateral breast CA (ages 50 and 54). Finally, 7% (3/44) index pts had a PHx of sarcoma: 2 were liposarcomas (ages 57 and 67), and 1 was a dermatofibrosarcoma. 2 other index pts had siblings w/childhood sarcomas. Conclusions: Our data, encompassing 44 MSH6+ pts evaluated in our clinic and consecutively ascertained, suggest MSH6 PV carriers develop synchronous primaries (11%), common and rare OC histologic types (18%), sarcomas (7%) and skin disease/Muir-Torre (9%). While common in the population and lower penetrance for CRC, MSH6 PV can behave in uncommon ways and may have significant extra-colonic CA risks such as OC, sarcoma and skin manifestations.


Advances in technology have brought people considerable convenience and luxury. At the same time, our current way of life damages the environment through water pollution, air pollution, carbon dioxide (CO2) emissions, and so forth. CO2 emissions are one of the major causes of environmental pollution. Most of what we use in daily life leads to CO2 being emitted into the environment, which in turn contributes to global warming and climate change. Carbon auditing (carbon footprint analysis) is therefore the essential first step in reviewing energy use, improving energy conservation, and allowing a building to go green. A carbon audit is needed to reduce raw material usage, waste generation, and so forth in order to minimise GHG emissions. The carbon audit is conducted within the building’s boundary and includes the following stages: a People Survey to gather employee-level data, a Building Survey to gather building-operation data, a Carbon Footprint Analysis to evaluate greenhouse gas (GHG) emissions, and a Final Carbon Audit Report to provide tailored recommendations for going green, along with an action plan to get started.


2021 ◽  
Author(s):  
Jet van der Spek ◽  
Joery den Hoed ◽  
Lot Snijders Blok ◽  
Alexander J. M. Dingemans ◽  
Dick Schijven ◽  
...  

Interpretation of next-generation sequencing data of individuals with an apparent sporadic neurodevelopmental disorder (NDD) often focusses on pathogenic variants in genes associated with NDD, assuming full clinical penetrance with limited variable expressivity. Consequently, inherited variants in genes associated with dominant disorders may be overlooked when the transmitting parent is clinically unaffected. While de novo variants explain a substantial proportion of cases with NDDs, a significant number remains undiagnosed possibly explained by coding variants associated with reduced penetrance and variable expressivity. We characterized twenty families with inherited heterozygous missense or protein-truncating variants (PTVs) in CHD3, a gene in which de novo variants cause Snijders Blok-Campeau syndrome, characterized by intellectual disability, speech delay and recognizable facial features (SNIBCPS). Notably, the majority of the inherited CHD3 variants were maternally transmitted. Computational facial and human phenotype ontology-based comparisons demonstrated that the phenotypic features of probands with inherited CHD3 variants overlap with the phenotype previously associated with de novo variants in the gene, while carrier parents are mildly or not affected, suggesting variable expressivity. Additionally, similarly reduced expression levels of CHD3 protein in cells of an affected proband and of related healthy carriers with a CHD3 PTV, suggested that compensation of expression from the wildtype allele is unlikely to be an underlying mechanism. Our results point to a significant role of inherited variation in SNIBCPS, a finding that is critical for correct variant interpretation and genetic counseling and warrants further investigation towards understanding the broader contributions of such variation to the landscape of human disease.


2019 ◽  
Vol 2 (1) ◽  
pp. 1-12
Author(s):  
Doddy - Lombardo ◽  
Edward Rosyidi

ABSTRACT PT Jasa Marga (Persero), Tbk, a company engaged in the development and operation of toll roads with a Current, Safe and Comfortable Quality Policy, is increasingly required to improve the quality of its services. The 4 substations available at the Kuningan 2 Toll Gate cannot handle the traffic flow, which increases during rush hour. The queue exceeds the service standard set by the government of a maximum of 5 vehicles per substation. This study used the FIFO queue model and distribution testing with ProModel 7.0 Student Version software to determine the distributions of the arrival-rate and service-level data. Mean values were tested with a one-way ANOVA, preceded by tests of data adequacy, uniformity and normality. Data were collected while long queues were occurring at the toll gate. After passing these tests, the data give an arrival rate λ = 2,004 vehicles/hour and a service rate μ = 417 vehicles/hour, with service time = 8.63 seconds/vehicle. Processing the data with queuing theory gives N (optimal) = 6, n (number of vehicles in the system) = 5 vehicles, q (number of vehicles in queue) = 4 vehicles, d (time of vehicle in system) = 43.37 seconds, and w (time of vehicle in queue) = 34.74 seconds. These results were further processed using tables to obtain an optimal employee schedule: 3 employees on shift 1, 9 on shift 2 and 2 on shift 3 on weekdays, and 3 on shift 1, 3 on shift 2 and 2 on shift 3 on holidays. Keywords: Queue Method, Toll Gate, Planning, Optimization
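The reported figures can be checked with a short calculation. The sketch below takes the reported 2,004 vehicles/hour as the total arrival rate and 417 vehicles/hour as the service rate of one substation. It assumes, beyond what the abstract states, that N is the number of substations, that arrivals split evenly across them, and that each substation behaves as an independent M/M/1 queue (Poisson arrivals, exponential service times, FIFO discipline). Under those assumptions, N = 6 is the smallest count that keeps the mean number of vehicles per substation within the standard of 5, and the calculation reproduces the reported times of 43.37 s in the system and 34.74 s in the queue; rounding the mean counts up gives the reported 5 and 4 vehicles.

```python
import math

ARRIVAL_RATE = 2004.0  # vehicles/hour at the gate (from the abstract)
SERVICE_RATE = 417.0   # vehicles/hour per substation (from the abstract)
MAX_VEHICLES = 5       # service standard per substation (from the abstract)


def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state M/M/1 metrics for one substation (rates per hour)."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    rho = arrival_rate / service_rate                        # utilisation
    in_system = rho / (1.0 - rho)                            # mean vehicles in system
    in_queue = rho * in_system                               # mean vehicles waiting
    time_in_system_s = 3600.0 / (service_rate - arrival_rate)
    time_in_queue_s = rho * time_in_system_s
    return {
        "utilisation": rho,
        "in_system": in_system,
        "in_queue": in_queue,
        "time_in_system_s": time_in_system_s,
        "time_in_queue_s": time_in_queue_s,
    }


def smallest_substation_count(total_arrival_rate: float, service_rate: float,
                              max_vehicles: float) -> int:
    """Smallest number of substations whose mean vehicles in system meets the standard."""
    count = math.floor(total_arrival_rate / service_rate) + 1  # first stable count
    while mm1_metrics(total_arrival_rate / count, service_rate)["in_system"] > max_vehicles:
        count += 1
    return count


if __name__ == "__main__":
    n_opt = smallest_substation_count(ARRIVAL_RATE, SERVICE_RATE, MAX_VEHICLES)
    metrics = mm1_metrics(ARRIVAL_RATE / n_opt, SERVICE_RATE)
    print("substations:", n_opt)
    for name, value in metrics.items():
        print(f"{name}: {value:.2f}")
```

A pooled M/M/c model with a single shared queue would give different waiting times; the per-substation M/M/1 reading is used here only because it matches the numbers in the abstract.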


2020 ◽  
Vol 21 (18) ◽  
pp. 6950
Author(s):  
Anastasiya V. Snezhkina ◽  
Dmitry V. Kalinin ◽  
Vladislav S. Pavlov ◽  
Elena N. Lukyanova ◽  
Alexander L. Golovyuk ◽  
...  

Carotid paragangliomas (CPGLs) are rare neuroendocrine tumors often associated with mutations in SDHx genes. The immunohistochemistry of succinate dehydrogenase (SDH) subunits has been considered a useful instrument for the prediction of SDHx mutations in paragangliomas/pheochromocytomas. We compared the mutation status of SDHx genes with the immunohistochemical (IHC) staining of SDH subunits in CPGLs. To identify pathogenic/likely pathogenic variants in SDHx genes, exome sequencing data analysis among 42 CPGL patients was performed. IHC staining of SDH subunits was carried out for all CPGLs studied. We encountered SDHx variants in 38% (16/42) of the cases in SDHx genes. IHC showed negative (5/15) or weak diffuse (10/15) SDHB staining in most tumors with variants in any of SDHx (94%, 15/16). In SDHA-mutated CPGL, SDHA expression was completely absent and weak diffuse SDHB staining was detected. Positive immunoreactivity for all SDH subunits was found in one case with a variant in SDHD. Notably, CPGL samples without variants in SDHx also demonstrated negative (2/11) or weak diffuse (9/11) SDHB staining (42%, 11/26). Obtained results indicate that SDH immunohistochemistry does not fully reflect the presence of mutations in the genes; diagnostic effectiveness of this method was 71%. However, given the high sensitivity of SDHB immunohistochemistry, it could be used for initial identifications of patients potentially carrying SDHx mutations for recommendation of genetic testing.
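The summary statistics in this abstract can be recomputed from the counts it reports. The sketch below treats negative or weak diffuse SDHB staining as a positive test and the presence of an SDHx variant as the reference standard, and it reads "diagnostic effectiveness" as overall accuracy; both readings are assumptions, although they reproduce the reported 94% and 71%. The class and property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of test result against reference standard."""
    true_pos: int
    false_pos: int
    false_neg: int
    true_neg: int

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg

    @property
    def sensitivity(self) -> float:
        return self.true_pos / (self.true_pos + self.false_neg)

    @property
    def specificity(self) -> float:
        return self.true_neg / (self.true_neg + self.false_pos)

    @property
    def accuracy(self) -> float:
        return (self.true_pos + self.true_neg) / self.total


# Counts taken from the abstract: 16 of 42 tumors carried an SDHx variant.
sdhb_ihc = ConfusionCounts(
    true_pos=15,   # variant present, SDHB staining negative or weak diffuse (15/16)
    false_neg=1,   # variant present, positive staining (the SDHD case)
    false_pos=11,  # no variant, SDHB staining negative or weak diffuse (11/26)
    true_neg=15,   # no variant, positive staining (26 - 11)
)

if __name__ == "__main__":
    print(f"sensitivity: {sdhb_ihc.sensitivity:.1%}")
    print(f"specificity: {sdhb_ihc.specificity:.1%}")
    print(f"accuracy:    {sdhb_ihc.accuracy:.1%}")
```

Seen this way, the high sensitivity and modest specificity explain the abstract's conclusion that SDHB staining suits initial screening but cannot replace genetic testing.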

