Nebula: ultra-efficient mapping-free structural variant genotyper

2021 ◽  
Author(s):  
Parsoa Khorsand ◽  
Fereydoun Hormozdiari

Abstract Large-scale catalogs of common genetic variants (including indels and structural variants) are being created using data from second- and third-generation whole-genome sequencing technologies. However, genotyping these variants in newly sequenced samples is a nontrivial task that requires extensive computational resources. Furthermore, current approaches are mostly limited to specific types of variants and are generally prone to errors and ambiguities when genotyping complex events. We propose an ultra-efficient approach for genotyping any type of structural variation that is not limited by the shortcomings and complexities of current mapping-based approaches. Our method, Nebula, uses changes in k-mer counts to predict the genotype of structural variants. We show that Nebula is not only an order of magnitude faster than mapping-based approaches for genotyping structural variants, but also has accuracy comparable to state-of-the-art approaches. Furthermore, Nebula is a generic framework not limited to any specific type of event. Nebula is publicly available at https://github.com/Parsoa/Nebula.
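The idea of genotyping from k-mer counts, without mapping reads, can be illustrated with a toy sketch. This is not Nebula's algorithm: the function names, the hand-picked signature k-mers, and the 0.2 / 0.8 thresholds are all assumptions made for illustration only.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in a collection of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def genotype(counts, ref_kmers, alt_kmers):
    """Call a genotype from the share of signature k-mer counts that support the variant.

    ref_kmers / alt_kmers are hypothetical sets of k-mers seen only on the
    reference allele or only on the variant allele. The 0.2 and 0.8
    thresholds are illustrative assumptions, not values from Nebula.
    """
    ref = sum(counts[kmer] for kmer in ref_kmers)
    alt = sum(counts[kmer] for kmer in alt_kmers)
    if ref + alt == 0:
        return "./."  # no informative k-mers observed
    alt_fraction = alt / (ref + alt)
    if alt_fraction < 0.2:
        return "0/0"
    if alt_fraction > 0.8:
        return "1/1"
    return "0/1"


# Toy reads: two contain the reference k-mer ACGG, two contain the variant k-mer ACTT.
reads = ["TTACGGA", "GGACTTC", "ACGGTT", "CACTTG"]
counts = count_kmers(reads, k=4)
print(genotype(counts, ref_kmers={"ACGG"}, alt_kmers={"ACTT"}))  # prints 0/1
```

Because the call depends only on counts of a small set of k-mers, no alignment step is needed, which is the general source of the speed-up such approaches aim for.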

2019 ◽  
Author(s):  
Parsoa Khorsand ◽  
Fereydoun Hormozdiari

Abstract Motivation: Large-scale catalogs of common genetic variants (including indels and structural variants) are being created using data from second- and third-generation whole-genome sequencing technologies. However, genotyping these variants in newly sequenced samples is a nontrivial task that requires extensive computational resources. Furthermore, current approaches are mostly limited to specific types of variants and are generally prone to errors and ambiguities when genotyping events in repeat regions. We therefore propose an ultra-efficient approach for genotyping any type of structural variation that is not limited by the shortcomings and complexities of current mapping-based approaches. Results: Our method, Nebula, uses changes in k-mer counts to predict the genotype of common structural variations. We show that Nebula is not only an order of magnitude faster than mapping-based approaches for genotyping deletions and mobile-element insertions, but also has accuracy comparable to state-of-the-art approaches. Furthermore, Nebula is a generic framework not limited to any specific type of event. Availability: Nebula is publicly available at https://github.com/Parsoa/NebulousSerendipity


Author(s):  
Robert C. Edgar

Abstract Mapping of reads to reference sequences is an essential step in a wide range of biological studies. The large size of datasets generated with next-generation sequencing technologies motivates the development of fast mapping software. Here, I describe URMAP, a new read mapping algorithm. URMAP is an order of magnitude faster than BWA and Bowtie2 with comparable accuracy on a benchmark test using simulated paired 150nt reads of a well-studied human genome. Software is freely available at https://drive5.com/urmap.


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e9338
Author(s):  
Robert Edgar

Mapping of reads to reference sequences is an essential step in a wide range of biological studies. The large size of datasets generated with next-generation sequencing technologies motivates the development of fast mapping software. Here, I describe URMAP, a new read mapping algorithm. URMAP is an order of magnitude faster than BWA with comparable accuracy on several validation tests. On a Genome in a Bottle (GIAB) variant calling test with 30× coverage 2×150 reads, URMAP achieves high accuracy (precision 0.998, sensitivity 0.982 and F-measure 0.990) with the strelka2 caller. However, GIAB reference variants are shown to be biased against repetitive regions which are difficult to map and may therefore pose an unrealistically easy challenge to read mappers and variant callers.
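The reported F-measure can be checked against the reported precision and sensitivity. The sketch below assumes the F-measure is the standard F1 score (the harmonic mean of the two); the abstract does not state which variant was used, and the function name is illustrative.

```python
def f_measure(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (recall), i.e. the F1 score."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# Values quoted in the abstract for URMAP with the strelka2 caller.
print(f"{f_measure(0.998, 0.982):.3f}")  # prints 0.990
```

Under that assumption, the three quoted figures are mutually consistent.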


2019 ◽  
Author(s):  
Rémi Allio ◽  
Alex Schomaker-Bastos ◽  
Jonathan Romiguier ◽  
Francisco Prosdocimi ◽  
Benoit Nabholz ◽  
...  

Abstract Thanks to the development of high-throughput sequencing technologies, target enrichment sequencing of nuclear ultraconserved DNA elements (UCEs) now routinely allows phylogenetic relationships to be inferred from thousands of genomic markers. Recently, it has been shown that mitochondrial DNA (mtDNA) is frequently sequenced alongside the targeted loci in such capture experiments. Despite its broad evolutionary interest, mtDNA is rarely assembled and used in conjunction with nuclear markers in capture-based studies. Here, we developed MitoFinder, a user-friendly bioinformatic pipeline, to efficiently assemble and annotate mitogenomic data from hundreds of UCE libraries. As a case study, we used ants (Formicidae), for which 501 UCE libraries have been sequenced whereas only 29 mitogenomes are available. We compared the efficiency of four different assemblers (IDBA-UD, MEGAHIT, MetaSPAdes, and Trinity) for assembling both UCE and mtDNA loci. Using MitoFinder, we show that metagenomic assemblers, in particular MetaSPAdes, are well suited to assembling both UCEs and mtDNA. Mitogenomic signal was successfully extracted from all 501 UCE libraries, allowing species identification to be confirmed using COI barcoding. Moreover, our automated procedure retrieved 296 cases in which the mitochondrial genome was assembled in a single contig, thus increasing the number of available ant mitogenomes by an order of magnitude. By leveraging the power of metagenomic assemblers, MitoFinder provides an efficient tool to extract complementary mitogenomic data from UCE libraries, making it possible to test for potential mito-nuclear discordance. Our approach is potentially applicable to other sequence capture methods, transcriptomic data, and whole genome shotgun sequencing in diverse taxa.


2018 ◽  
Vol 84 (10) ◽  
pp. 23-28
Author(s):  
D. A. Golentsov ◽  
A. G. Gulin ◽  
Vladimir A. Likhter ◽  
K. E. Ulybyshev

Destruction of bodies is accompanied by the formation of both large and microscopic fragments. Numerous experiments on the rupture of different samples show that those fragments carry a positive electric charge. This phenomenon is of interest from the viewpoint of its potential application to contactless diagnostics of the early stage of destruction of elements in various technical devices. However, the lack of understanding of the nature of this phenomenon restricts its practical application. Experimental studies were carried out using an apparatus that allowed direct measurement of the total charge of the microparticles formed upon sample rupture and determination of their size and quantity. The results of rupture tests of duralumin and electrical steel showed that the size of the microparticles is several tens of microns, the charge per particle is on the order of 10⁻¹⁴ C, and their number can be estimated as the ratio of the cross-sectional area of the sample at the point of discontinuity to the square of the microparticle size. A model of charge formation on the microparticles is developed from the experimental data and the current concept of the electron gas in metals. The model makes it possible to determine the charge of a microparticle using data on the particle size and the mechanical and electrical properties of the material. Model estimates of the total charge of the particles show order-of-magnitude agreement with the experimental data.
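The particle-count estimate described above (cross-sectional area divided by the square of the particle size) can be written out as a short calculation. The input values below are hypothetical and chosen only to match the stated orders of magnitude, with the charge per particle read as about 1e-14 C; they are not measurements from the study, and the function names are illustrative.

```python
def estimate_particle_count(cross_section_m2, particle_size_m):
    """Number of microparticles: rupture cross-section divided by particle size squared."""
    return cross_section_m2 / particle_size_m ** 2


def estimate_total_charge(cross_section_m2, particle_size_m, charge_per_particle_c):
    """Total charge in coulombs: particle count times charge per particle."""
    count = estimate_particle_count(cross_section_m2, particle_size_m)
    return count * charge_per_particle_c


# Hypothetical inputs: 10 mm^2 cross-section, 30 micron particles, 1e-14 C each.
area = 10e-6          # m^2
size = 30e-6          # m
charge = 1e-14        # C per particle (assumed order of magnitude)

print(f"{estimate_particle_count(area, size):.0f} particles")
print(f"{estimate_total_charge(area, size, charge):.2e} C total")
```

With these assumed inputs the estimate gives roughly ten thousand particles and a total charge on the order of 1e-10 C.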


NASPA Journal ◽  
1998 ◽  
Vol 35 (4) ◽  
Author(s):  
Jackie Clark ◽  
Joan Hirt

The creation of small communities has been proposed as a way of enhancing the educational experience of students at large institutions. Using data from a survey of students living in large and small residences at a public research university, this study does not support the common assumption that small-scale social environments are more conducive to positive community life than large-scale social environments.


Author(s):  
Paul Oehlmann ◽  
Paul Osswald ◽  
Juan Camilo Blanco ◽  
Martin Friedrich ◽  
Dominik Rietzel ◽  
...  

Abstract With industries pushing towards digitalized production, adaptation to expectations and increasing requirements for modern applications has brought additive manufacturing (AM) to the forefront of Industry 4.0. In fact, AM is a main accelerator for digital production with its possibilities in structural design, such as topology optimization, production flexibility, customization, and product development, to name a few. Fused Filament Fabrication (FFF) is a widespread and practical tool for rapid prototyping that also demonstrates the importance of AM technologies through its accessibility to the general public by creating cost-effective desktop solutions. An increasing integration of systems in an intelligent production environment also enables the generation of large-scale data to be used for process monitoring and process control. Deep learning, a form of artificial intelligence (AI) and, more specifically, a method of machine learning (ML), is well suited to handling big data. This study uses a trained artificial neural network (ANN) model as a digital shadow to predict the force within the nozzle of an FFF printer using filament speed and nozzle temperatures as input data. After the ANN model was tested using data from a theoretical model, it was implemented to predict the behavior using real-time printer data. For this purpose, an FFF printer was equipped with sensors that collect real-time printer data during the printing process. The ANN model reflected the kinematics of melting and flow predicted by models currently available for various speeds of printing. The model allows for a deeper understanding of the influencing process parameters, which ultimately results in the determination of the optimum combination of process speed and print quality.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Bohan Liu ◽  
Pan Liu ◽  
Lutao Dai ◽  
Yanlin Yang ◽  
Peng Xie ◽  
...  

Abstract The pandemic of Coronavirus Disease 2019 (COVID-19) is causing enormous loss of life globally. Prompt case identification is critical. The reference method is the real-time reverse transcription PCR (RT-PCR) assay, whose limitations may curb its prompt large-scale application. COVID-19 manifests with chest computed tomography (CT) abnormalities, some even before the onset of symptoms. We tested the hypothesis that the application of deep learning (DL) to 3D CT images could help identify COVID-19 infections. Using data from 920 COVID-19 and 1,073 non-COVID-19 pneumonia patients, we developed a modified DenseNet-264 model, COVIDNet, to classify CT images to either class. When tested on an independent set of 233 COVID-19 and 289 non-COVID-19 pneumonia patients, COVIDNet achieved an accuracy rate of 94.3% and an area under the curve of 0.98. As of March 23, 2020, the COVIDNet system had been used 11,966 times with a sensitivity of 91.12% and a specificity of 88.50% in six hospitals with PCR confirmation. Application of DL to CT images may improve both efficiency and capacity of case detection and long-term surveillance.


Cancers ◽  
2021 ◽  
Vol 13 (13) ◽  
pp. 3247
Author(s):  
Petar Brlek ◽  
Anja Kafka ◽  
Anja Bukovac ◽  
Nives Pećina-Šlaus

Diffuse gliomas are a heterogeneous group of tumors with aggressive biological behavior and a lack of effective treatment methods. Despite new molecular findings, the differences between pathohistological types still require better understanding. In this in silico analysis, we investigated AKT1, AKT2, AKT3, CHUK, GSK3β, EGFR, PTEN, and PIK3AP1 as participants of EGFR-PI3K-AKT-mTOR signaling using data from the publicly available cBioPortal platform. Integrative large-scale analyses investigated changes in copy number aberrations (CNA), methylation, mRNA transcription and protein expression within 751 samples of diffuse astrocytomas, anaplastic astrocytomas and glioblastomas. The study showed a significant percentage of CNA in PTEN (76%), PIK3AP1 and CHUK (75% each), EGFR (74%), AKT2 (39%), AKT1 (32%), AKT3 (19%) and GSK3β (18%) in the total sample. Comprehensive statistical analyses show how genomics and epigenomics affect the expression of examined genes differently across various pathohistological types and grades, suggesting that genes AKT3, CHUK and PTEN behave like tumor suppressors, while AKT1, AKT2, EGFR, and PIK3AP1 show oncogenic behavior and are involved in enhanced activity of the EGFR-PI3K-AKT-mTOR signaling pathway. Our findings contribute to the knowledge of the molecular differences between pathohistological types and ultimately offer the possibility of new treatment targets and personalized therapies in patients with diffuse gliomas.


Smart Cities ◽  
2021 ◽  
Vol 4 (2) ◽  
pp. 662-685
Author(s):  
Stephan Olariu

Under present-day practices, the vehicles on our roadways and city streets are mere spectators that witness traffic-related events without being able to participate in the mitigation of their effect. This paper lays the theoretical foundations of a framework for harnessing the on-board computational resources in vehicles stuck in urban congestion in order to assist transportation agencies with preventing or dissipating congestion through large-scale signal re-timing. Our framework is called VACCS: Vehicular Crowdsourcing for Congestion Support in Smart Cities. What makes this framework unique is that we suggest that in such situations the vehicles have the potential to cooperate with various transportation authorities to solve problems that otherwise would either take an inordinate amount of time to solve or cannot be solved for lack of adequate municipal resources. VACCS offers direct benefits to both the driving public and the Smart City. By developing timing plans that respond to current traffic conditions, overall traffic flow will improve, carbon emissions will be reduced, and economic impacts of congestion on citizens and businesses will be lessened. It is expected that drivers will be willing to donate under-utilized on-board computing resources in their vehicles to develop improved signal timing plans in return for the direct benefits of time savings and reduced fuel consumption costs. VACCS allows the Smart City to dynamically respond to traffic conditions while simultaneously reducing investments in the computational resources that would be required for traditional adaptive traffic signal control systems.

