CaSe4SR: Using category sequence graph to augment session-based recommendation

2021 ◽  
Vol 212 ◽  
pp. 106558
Author(s):  
Lin Liu ◽  
Li Wang ◽  
Tao Lian
Author(s):  
Jouni Sirén ◽  
Jean Monlong ◽  
Xian Chang ◽  
Adam M. Novak ◽  
Jordan M. Eizenga ◽  
...  

Abstract: We introduce Giraffe, a pangenome short read mapper that can efficiently map to a collection of haplotypes threaded through a sequence graph. Giraffe, part of the variation graph toolkit (vg)1, maps reads to thousands of human genomes at around the same speed BWA-MEM2 maps reads to a single reference genome, while maintaining comparable accuracy to VG-MAP, vg’s original mapper. We have developed efficient genotyping pipelines using Giraffe. We demonstrate improvements in genotyping for single nucleotide variations (SNVs), insertions and deletions (indels) and structural variations (SVs) genome-wide. We use Giraffe to genotype and phase 167 thousand structural variations ascertained from long read studies in 5,202 human genomes sequenced with short reads, including the complete 1000 Genomes Project dataset, at an average cost of $1.50 per sample. We determine the frequency of these variations in diverse human populations, characterize their complex allelic variations and identify thousands of expression quantitative trait loci (eQTLs) driven by these variations.


2019 ◽  
Vol 35 (22) ◽  
pp. 4754-4756 ◽  
Author(s):  
Egor Dolzhenko ◽  
Viraj Deshpande ◽  
Felix Schlesinger ◽  
Peter Krusche ◽  
Roman Petrovski ◽  
...  

Abstract. Summary: We describe a novel computational method for genotyping repeats using sequence graphs. This method addresses the long-standing need to accurately genotype medically important loci containing repeats adjacent to other variants or imperfect DNA repeats such as polyalanine repeats. Here we introduce a new version of our repeat genotyping software, ExpansionHunter, that uses this method to perform targeted genotyping of a broad class of such loci. Availability and implementation: ExpansionHunter is implemented in C++ and is available under the Apache License Version 2.0. The source code, documentation, and Linux/macOS binaries are available at https://github.com/Illumina/ExpansionHunter/. Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 12 (1) ◽  
pp. 45-56
Author(s):  
Soumya George ◽  
M. Sudheep Elayidom ◽  
T. Santhanakrishnan

Research trends are dynamic and change over time. They indicate the latest innovations, the current areas of research, and the latest technologies and developments in each field of research. They also support future innovations and developments by exposing current challenges and opportunities. This article proposes an efficient method to find research trends in each field of research of any subject area by using the graph-based subject classification of published papers. This methodology can be used efficiently to find research trends at any point in time, based on the publication year of academic publications. A study of the change in research trends in three subject areas (physics, mathematics, and computer science) has been successfully conducted based on a total of 4500 publications since 2004.


2020 ◽  
Vol 184 ◽  
pp. 01009
Author(s):  
Bharathi Panduri ◽  
Madhurika Vummenthala ◽  
Spoorthi Jonnalagadda ◽  
Garwandha Ashwini ◽  
Naruvadi Nagamani ◽  
...  

The Internet of Things (IoT) largely comprises a wide range of Internet-connected devices and hubs. In the context of military and defence systems, where it is called the Internet of Battlefield Things (IoBT), these devices can be wearable combat equipment, tracking devices, cameras, clinical devices, and so on. The integrity and safety of these devices are critical to mission success, and keeping them secure is of utmost importance. A typical attack on these devices uses malware, whose aim may be to compromise the device and/or breach its communications. Because of the value they hold, IoBT devices and hubs are generally a more significant target for cyber criminals than IoT devices. In this paper we attempt to create a deep learning based procedure to detect, classify and track such malware in the IoBT through operational code (OpCode) sequences. This is achieved by transforming the OpCodes into a vector space, upon which a deep eigenspace learning technique is applied to differentiate between harmful and safe applications. For robust classification, support vector machine and n-gram sequencing algorithms are proposed in this paper. Moreover, we evaluate the quality of our proposed approach in malware recognition and its resilience against garbage code injection attacks. These results are presented on a web page that has separate components and levels of accessibility for user and admin credentials. To track the prevalence of various malware on the network, counts and trends of different malicious opcodes are displayed for both user and admin.
Our proposed approach will therefore be beneficial for users, especially those who want to communicate confidential information within the network. It is also beneficial if a user wants to know whether a message is secure or not. The malware test data has also been made accessible, which should benefit future research.
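
As a rough illustration of the OpCode-to-vector step mentioned in this abstract, the Python sketch below counts opcode n-grams and normalises them into a fixed-length vector. The function names, the choice of bigrams and the sample opcodes are assumptions made for illustration only; the abstract does not specify the paper's feature construction, and the deep eigenspace learning and SVM stages are not shown.

```python
from collections import Counter


def opcode_ngrams(opcodes, n=2):
    """Count every run of n consecutive opcodes (hypothetical helper)."""
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


def to_vector(counts, vocabulary):
    """Map n-gram counts to relative frequencies in vocabulary order."""
    total = sum(counts.values()) or 1  # avoid division by zero on empty input
    return [counts.get(gram, 0) / total for gram in vocabulary]


# Made-up opcode trace, not taken from any real sample.
trace = ["mov", "add", "mov", "add", "jmp"]
counts = opcode_ngrams(trace, n=2)
vocabulary = sorted(counts)          # fixed feature order for this toy example
vector = to_vector(counts, vocabulary)
```

In practice the vocabulary would be fixed across all samples so that every application maps to a vector of the same length.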


2020 ◽  
Vol 206 ◽  
pp. 104171 ◽  
Author(s):  
Jo Nie Sua ◽  
Si Yi Lim ◽  
Mulyadi Halim Yulius ◽  
Xingtong Su ◽  
Edward Kien Yee Yapp ◽  
...  

2020 ◽  
Vol 36 (Supplement_1) ◽  
pp. i146-i153
Author(s):  
Xian Chang ◽  
Jordan Eizenga ◽  
Adam M Novak ◽  
Jouni Sirén ◽  
Benedict Paten

Abstract. Motivation: Graph representations of genomes are capable of expressing more genetic variation and can therefore better represent a population than standard linear genomes. However, due to the greater complexity of genome graphs relative to linear genomes, some functions that are trivial on linear genomes become much more difficult in genome graphs. Calculating distance is one such function that is simple in a linear genome but complicated in a graph context. In read mapping algorithms such distance calculations are fundamental to determining if seed alignments could belong to the same mapping. Results: We have developed an algorithm for quickly calculating the minimum distance between positions on a sequence graph using a minimum distance index. We have also developed an algorithm that uses the distance index to cluster seeds on a graph. We demonstrate that our implementations of these algorithms are efficient and practical to use for a new generation of mapping algorithms based upon genome graphs. Availability and implementation: Our algorithms have been implemented as part of the vg toolkit and are available at https://github.com/vgteam/vg.
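
To make the distance problem in this abstract concrete, the Python sketch below computes the minimum distance between two positions on a small directed sequence graph with a plain Dijkstra search. This is a naive baseline, not the minimum distance index the authors describe and not the vg API; the graph encoding (a dict of node lengths, a dict of successor lists, positions as `(node, offset)` pairs) and the function name are assumptions chosen for illustration.

```python
import heapq


def min_distance(lengths, edges, src, dst):
    """Fewest bases to step from position src to position dst, or None.

    lengths: node id -> sequence length
    edges:   node id -> list of successor node ids (forward strand only)
    src/dst: (node id, 0-based offset within the node)
    """
    src_node, src_off = src
    dst_node, dst_off = dst
    best = None
    if src_node == dst_node and dst_off >= src_off:
        best = dst_off - src_off  # both positions on one node, dst downstream

    # Dijkstra over "distance from src to the first base of a node".
    dist = {}
    leave = lengths[src_node] - src_off  # steps needed to exit the source node
    heap = [(leave, nxt) for nxt in edges.get(src_node, ())]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if node in dist:
            continue
        dist[node] = d
        if node == dst_node:
            break
        for nxt in edges.get(node, ()):
            if nxt not in dist:
                heapq.heappush(heap, (d + lengths[node], nxt))

    if dst_node in dist:
        candidate = dist[dst_node] + dst_off
        best = candidate if best is None else min(best, candidate)
    return best


# Toy bubble: node 1 branches into node 2 (1 base) or node 3 (3 bases),
# and both branches rejoin at node 4.
lengths = {1: 4, 2: 1, 3: 3, 4: 5}
edges = {1: [2, 3], 2: [4], 3: [4]}
d = min_distance(lengths, edges, (1, 2), (4, 1))
```

Each query here searches the graph from scratch, which is the cost a precomputed distance index is meant to avoid.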


2021 ◽  
Vol 16 ◽  
Author(s):  
Chuanyan Wu ◽  
Bentao Lin ◽  
Kai Shi ◽  
Qingju Zhang ◽  
Rui Gao ◽  
...  

Background: Essential proteins play an important role in the processes of life and can be identified by experimental methods and computational approaches. Experimental approaches to identifying essential proteins are highly accurate but time- and resource-consuming. Objective: Herein, we present a computational model (PEPRF) to identify essential proteins based on machine learning. Methods: Different features of proteins were extracted. Topological features based on the Protein-Protein Interaction (PPI) network were extracted. Based on the protein sequence, graph theory-based features, information-based features, composition features, physicochemical features, etc., were extracted. In total, 282 features were constructed. To select the features that contributed most to the identification, the ReliefF-based feature selection method was adopted to measure the weights of these features. As a result, 212 features were curated to train random forest classifiers. Finally, PEPRF obtained an AUC of 0.71 and an accuracy of 0.742. Conclusion: Our results show that PEPRF may be applied as an efficient tool to identify essential proteins.
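
The feature-weighting idea in the Methods above can be sketched in a few lines of Python. Note that this is the basic Relief algorithm (one nearest hit and one nearest miss per sample, two classes), used here as a simplified stand-in for ReliefF; it is not the PEPRF implementation, the random forest stage is omitted, and the function names and toy data are assumptions for illustration.

```python
def relief_weights(X, y):
    """Basic Relief: reward features that differ on the nearest miss
    and agree on the nearest hit. X is a list of numeric rows."""
    n, m = len(X), len(X[0])
    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for j in range(m):
        column = [row[j] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / spans[j]

    def distance(a, b):
        return sum(diff(j, a, b) for j in range(m))

    weights = [0.0] * m
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # no same-class or other-class neighbour for this sample
        hit = min(hits, key=lambda k: distance(X[i], X[k]))
        miss = min(misses, key=lambda k: distance(X[i], X[k]))
        for j in range(m):
            weights[j] += (diff(j, X[i], X[miss]) - diff(j, X[i], X[hit])) / n
    return weights


def select_top(weights, k):
    """Indices of the k highest-weighted features, in ascending index order."""
    ranked = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    return sorted(ranked[:k])


# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.8], [1.0, 0.2], [0.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]
w = relief_weights(X, y)
kept = select_top(w, 1)
```

The selected column indices would then be used to subset the feature matrix before training a classifier.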

