sequence graph
Recently Published Documents


TOTAL DOCUMENTS

32
(FIVE YEARS 20)

H-INDEX

5
(FIVE YEARS 2)

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Chunlong Zhang ◽  
Hongtao He

Existing motion recognition systems track and recognize athletes with low accuracy because their recognition algorithms handle edge detection poorly. To address this problem, a machine vision-based gymnast pose-tracking recognition system is designed. The software part mainly optimizes the tracking recognition algorithm: it uses a spatiotemporal graph convolution algorithm to construct the sequence graph structure of human joints, applies a label subset division strategy, and completes pose tracking according to changes in the information dimension. System performance tests show that the designed system improves tracking recognition accuracy and reduces convergence time compared with the original system.
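The sequence graph of joints mentioned above can be pictured with a minimal sketch. It assumes a hypothetical five-joint skeleton (the abstract does not list the joints or bones used) and only builds the graph structure that a spatiotemporal graph convolution would operate on: spatial edges link joints connected by a bone within one frame, and temporal edges link the same joint across consecutive frames. No convolution or label subset division is implemented here.

```python
# Hypothetical skeleton: the joint names and bones are assumptions for illustration.
BONES = [
    ("head", "torso"),
    ("torso", "left_hand"),
    ("torso", "right_hand"),
    ("torso", "hips"),
]
JOINTS = sorted({joint for bone in BONES for joint in bone})


def build_spatiotemporal_edges(num_frames, bones=BONES, joints=JOINTS):
    """Return the undirected edge set over nodes of the form (frame, joint)."""
    edges = set()
    for t in range(num_frames):
        # Spatial edges: joints connected by a bone in the same frame.
        for a, b in bones:
            edges.add(frozenset({(t, a), (t, b)}))
        # Temporal edges: the same joint in consecutive frames.
        if t + 1 < num_frames:
            for joint in joints:
                edges.add(frozenset({(t, joint), (t + 1, joint)}))
    return edges


edges = build_spatiotemporal_edges(num_frames=3)
print(len(edges), "edges over", 3 * len(JOINTS), "nodes")
```

With three frames, the sketch yields 4 spatial edges per frame plus 5 temporal edges per frame transition.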


2021 ◽  
Author(s):  
Samuel Martin ◽  
Martin Ayling ◽  
Livia Patrono ◽  
Mario Caccamo ◽  
Pablo Murcia ◽  
...  

The assembly of contiguous sequence from metagenomic samples presents a particular challenge, due to the presence of multiple species, often closely related, at varying levels of abundance. Capturing diversity within species, for example viral haplotypes or bacterial strain-level diversity, is even more challenging. We present MetaCortex, a metagenome assembler based on data structures from the Cortex de novo assembler. MetaCortex captures intra-species diversity by searching for signatures of local variation along assembled sequences in the underlying assembly graph and outputting these sequences in sequence graph format. MetaCortex also implements a novel assembly algorithm for representing intra-species diversity in standard linear format. We show that MetaCortex produces accurate assemblies with higher genome coverage and contiguity than other popular metagenomic assemblers on mock viral communities with high levels of strain-level diversity, and on simulated communities containing simulated strains. We also show that accuracy can be increased further by using the sequence graph produced by MetaCortex to create highly accurate single contig sequences.
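To make "local variation in sequence graph format" concrete, here is a small sketch that writes a single-base variant bubble as GFA 1 text (header, segment, and link lines). GFA is used only as one widely known sequence graph format; the abstract does not name the format, and the sequences, segment names, and the helper `to_gfa` are hypothetical.

```python
def to_gfa(segments, links):
    """Serialize segments {name: sequence} and links [(from, to)] as GFA 1 text.

    Assumption: every link joins forward strands with no overlap ("0M").
    """
    lines = ["H\tVN:Z:1.0"]
    for name, seq in segments.items():
        lines.append(f"S\t{name}\t{seq}")
    for src, dst in links:
        lines.append(f"L\t{src}\t+\t{dst}\t+\t0M")
    return "\n".join(lines) + "\n"


# Toy data: two strains share both flanks and differ by one base (A vs G).
segments = {"1": "ACGT", "2": "A", "3": "G", "4": "TTCA"}
links = [("1", "2"), ("1", "3"), ("2", "4"), ("3", "4")]

gfa_text = to_gfa(segments, links)
print(gfa_text, end="")
```

Segments 2 and 3 form the two branches of the bubble; a linear assembly would have to pick one of them or break the contig.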


2021 ◽  
Vol 16 ◽  
Author(s):  
Chuanyan Wu ◽  
Bentao Lin ◽  
Kai Shi ◽  
Qingju Zhang ◽  
Rui Gao ◽  
...  

Background: Essential proteins play an important role in life processes and can be identified by experimental methods and computational approaches. Experimental approaches are highly accurate but time- and resource-consuming. Objective: Herein, we present a computational model (PEPRF) to identify essential proteins based on machine learning. Methods: Different features of proteins were extracted. Topological features were extracted from the Protein-Protein Interaction (PPI) network. Based on the protein sequence, graph theory-based features, information-based features, composition, and physicochemical features, etc., were extracted. In total, 282 features were constructed. To select the features that contributed most to the identification, the ReliefF-based feature selection method was adopted to measure the weights of these features. As a result, 212 features were retained to train random forest classifiers. Finally, PEPRF obtained an AUC of 0.71 and an accuracy of 0.742. Conclusion: Our results show that PEPRF may be applied as an efficient tool to identify essential proteins.
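The weight-then-select step can be illustrated with a simplified sketch. Note the substitution: this is the basic Relief algorithm (binary classes, one nearest hit and one nearest miss per sample), not the ReliefF variant the authors used, and it stops before classifier training. The toy data, function names, and the assumption that features are already scaled to [0, 1] are all hypothetical.

```python
def relief_weights(X, y):
    """Basic Relief feature weights (simplified stand-in for ReliefF).

    Assumes two classes, at least two samples per class, and features
    already scaled to [0, 1].
    """
    n, d = len(X), len(X[0])
    weights = [0.0] * d

    def dist(a, b):
        # Manhattan distance between two feature vectors.
        return sum(abs(p - q) for p, q in zip(a, b))

    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: dist(X[i], X[j]))
        miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            # Reward features that differ on the nearest miss,
            # penalize features that differ on the nearest hit.
            weights[f] += (abs(X[i][f] - X[miss][f]) - abs(X[i][f] - X[hit][f])) / n
    return weights


def top_k(weights, k):
    """Indices of the k highest-weighted features."""
    return sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)[:k]


# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.5], [0.8, 0.4], [0.9, 0.8], [1.0, 0.6]]
y = [0, 0, 0, 1, 1, 1]

w = relief_weights(X, y)
print(w, top_k(w, 1))
```

In the setting described above, the same ranking idea would reduce 282 candidate features to the 212 passed to the random forest.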


2021 ◽  
Author(s):  
Jonas A. Sibbesen ◽  
Jordan M. Eizenga ◽  
Adam M. Novak ◽  
Jouni Sirén ◽  
Xian Chang ◽  
...  

Pangenomics is emerging as a powerful computational paradigm in bioinformatics. This field uses population-level genome reference structures, typically consisting of a sequence graph, to mitigate reference bias and facilitate analyses that were challenging with previous reference-based methods. In this work, we extend these methods into transcriptomics to analyze sequencing data using the pantranscriptome: a population-level transcriptomic reference. Our novel toolchain can construct spliced pangenome graphs, map RNA-seq data to these graphs, and perform haplotype-aware expression quantification of transcripts in a pantranscriptome. This workflow improves accuracy over state-of-the-art RNA-seq mapping methods, and it can efficiently quantify haplotype-specific transcript expression without needing to characterize a sample’s haplotypes beforehand.


Author(s):  
Jouni Sirén ◽  
Jean Monlong ◽  
Xian Chang ◽  
Adam M. Novak ◽  
Jordan M. Eizenga ◽  
...  

We introduce Giraffe, a pangenome short read mapper that can efficiently map to a collection of haplotypes threaded through a sequence graph. Giraffe, part of the variation graph toolkit (vg), maps reads to thousands of human genomes at around the same speed BWA-MEM2 maps reads to a single reference genome, while maintaining comparable accuracy to VG-MAP, vg’s original mapper. We have developed efficient genotyping pipelines using Giraffe. We demonstrate improvements in genotyping for single nucleotide variations (SNVs), insertions and deletions (indels) and structural variations (SVs) genome-wide. We use Giraffe to genotype and phase 167 thousand structural variations ascertained from long read studies in 5,202 human genomes sequenced with short reads, including the complete 1000 Genomes Project dataset, at an average cost of $1.50 per sample. We determine the frequency of these variations in diverse human populations, characterize their complex allelic variations and identify thousands of expression quantitative trait loci (eQTLs) driven by these variations.
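The phrase "haplotypes threaded through a sequence graph" can be illustrated with a toy model: nodes carry sequence, and each haplotype is a path of node IDs. The sketch below spells out each haplotype and finds which ones contain a read by exact substring search. This naive search is only for illustration and is not how Giraffe maps reads; the node contents, haplotype names, and function names are hypothetical.

```python
# Toy sequence graph: node ID -> sequence (hypothetical data).
NODES = {1: "ACGT", 2: "A", 3: "G", 4: "TTCA"}

# Each haplotype is a path through the graph, given as a list of node IDs.
HAPLOTYPES = {"hap_a": [1, 2, 4], "hap_b": [1, 3, 4]}


def spell(path, nodes=NODES):
    """Concatenate node sequences along a path."""
    return "".join(nodes[node_id] for node_id in path)


def haplotypes_containing(read, haplotypes=HAPLOTYPES, nodes=NODES):
    """Names of haplotypes whose spelled sequence contains the read exactly."""
    return sorted(
        name for name, path in haplotypes.items() if read in spell(path, nodes)
    )


print(spell(HAPLOTYPES["hap_a"]))
print(spell(HAPLOTYPES["hap_b"]))
print(haplotypes_containing("GTGTT"))
```

A read that spans the variant node matches only the haplotype carrying that allele, while a read that falls entirely within a shared flank matches both.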

