Similarity searching in sequences of complex events

Emotive Trending And Tracking of Tweets

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v7i11.463 ◽

2017 ◽

Vol 7 (11) ◽

pp. 25

Author(s):

Sanjay Chhataru Gupta

Keyword(s):

Social Networks ◽

Social Media ◽

Social Network ◽

The Social ◽

The World ◽

Complex Events

The popularity of social media, and the importance individuals attach to it, has increased significantly in the last few years. As more people join social networks such as Twitter and Facebook, the information flowing through these networks can give us a good understanding of what is happening in our locality, state, nation, or even the world. The motivation behind the project is to develop a system that analyses a topic searched on Twitter. It is designed to assist information analysts in understanding and exploring complex events as they unfold in the world. The system tracks changes in emotions over events, signalling possible flashpoints or abatement. For each trending topic, the system also shows a sentiment graph of how positive and negative sentiments change as the topic trends.
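The abstract does not say how sentiment is scored, so the following is only a minimal sketch of the idea of a per-topic sentiment trend. It assumes a tiny hand-made word list (`POSITIVE`, `NEGATIVE`) and hourly buckets; both are illustrative choices, not the authors' method.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical toy lexicon; a real system would use a trained model or a full lexicon.
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "angry"}


def score(text):
    """Return (#positive words - #negative words) for one tweet."""
    words = [w.strip(".,!?#@") for w in text.lower().split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def sentiment_trend(tweets):
    """Count positive and negative tweets per hour.

    tweets: iterable of (datetime, text) pairs for one topic.
    Returns a dict mapping each hour to {"positive": n, "negative": n}.
    """
    trend = defaultdict(lambda: {"positive": 0, "negative": 0})
    for timestamp, text in tweets:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        s = score(text)
        if s > 0:
            trend[hour]["positive"] += 1
        elif s < 0:
            trend[hour]["negative"] += 1
    return dict(sorted(trend.items()))


example_tweets = [
    (datetime(2017, 11, 1, 10, 5), "I love this"),
    (datetime(2017, 11, 1, 10, 40), "this is terrible"),
    (datetime(2017, 11, 1, 11, 15), "great news, happy day"),
]
example_trend = sentiment_trend(example_tweets)
```

Plotting the two counts against the hour would give the kind of positive/negative trend graph the abstract describes.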


Medicinal Chemistry Database GDBMedChem

10.26434/chemrxiv.7770809.v1 ◽

2019 ◽

Author(s):

Mahendra Awale ◽

Finton Sirockin ◽

Nikolaus Stiefl ◽

Jean-Louis Reymond

Keyword(s):

Small Molecules ◽

Natural Product ◽

Medicinal Chemistry ◽

3D Visualization ◽

Molecular Size ◽

Similarity Searching ◽

Complex Molecules ◽

Synthetic Accessibility ◽

Simple Chemical ◽

Reduced Complexity

The generated database GDB17 enumerates 166.4 billion possible molecules up to 17 atoms of C, N, O, S and halogens following simple chemical stability and synthetic feasibility rules, however medicinal chemistry criteria are not taken into account. Here we applied rules inspired by medicinal chemistry to exclude problematic functional groups and complex molecules from GDB17, and sampled the resulting subset evenly across molecular size, stereochemistry and polarity to form GDBMedChem as a compact collection of 10 million small molecules.

This collection has reduced complexity and better synthetic accessibility than the entire GDB17 but retains higher sp3-carbon fraction and natural product likeness scores compared to known drugs. GDBMedChem molecules are more diverse and very different from known molecules in terms of substructures and represent an unprecedented source of diversity for drug design. GDBMedChem is available for 3D-visualization, similarity searching and for download at http://gdb.unibe.ch.
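Sampling a filtered set "evenly" across properties can be read as stratified sampling: group the items into bins and draw the same number from each. The sketch below illustrates that reading only. The record fields (`heavy_atoms`, `polar_atoms`) and the `per_bin` parameter are hypothetical, and the abstract does not specify the actual binning used for GDBMedChem.

```python
import random
from collections import defaultdict


def sample_evenly(items, key, per_bin, seed=0):
    """Draw up to `per_bin` items at random from each bin defined by `key`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    bins = defaultdict(list)
    for item in items:
        bins[key(item)].append(item)
    sample = []
    for bin_key in sorted(bins):
        members = bins[bin_key]
        # Small bins contribute everything they have.
        sample.extend(rng.sample(members, min(per_bin, len(members))))
    return sample


# Hypothetical records: many molecules of one size, few of another.
molecules = (
    [{"heavy_atoms": 10, "polar_atoms": 2} for _ in range(100)]
    + [{"heavy_atoms": 11, "polar_atoms": 3} for _ in range(5)]
    + [{"heavy_atoms": 12, "polar_atoms": 2} for _ in range(50)]
)
subset = sample_evenly(
    molecules,
    key=lambda m: (m["heavy_atoms"], m["polar_atoms"]),
    per_bin=10,
)
```

Each well-populated bin contributes the same number of molecules, so abundant sizes do not dominate the subset.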


A Similarity Searching System for Biological Phenotype Images Using Deep Convolutional Encoder-decoder Architecture

Current Bioinformatics ◽

10.2174/1574893614666190204150109 ◽

2019 ◽

Vol 14 (7) ◽

pp. 628-639 ◽

Cited By ~ 10

Author(s):

Bizhi Wu ◽

Hangxiao Zhang ◽

Limei Lin ◽

Huiyuan Wang ◽

Yubang Gao ◽

...

Keyword(s):

Neural Network ◽

Retrieval System ◽

Sequence Similarity ◽

Local Alignment ◽

Similarity Searching ◽

Loss Of Function ◽

Biological Images ◽

The Neural Network ◽

Convolutional Autoencoder ◽

Biological Phenotype

Background: The BLAST (Basic Local Alignment Search Tool) algorithm has been widely used for sequence similarity searching. Analogously, public phenotype images should be efficiently retrievable using biological images as queries, so that phenotypes with high similarity can be identified. Despite the accumulation of genotype-phenotype mapping data, no system for searching for similar phenotypes is available, owing to the bottleneck of image processing. Objective: In this study, we focus on identifying phenotypic images similar to a query by searching a biological phenotype database that includes information about loss-of-function and gain-of-function. Methods: We propose a deep convolutional autoencoder architecture to segment biological phenotypic images and develop a phenotype retrieval system to enable a better understanding of genotype–phenotype correlation. Results: This study shows how a deep convolutional autoencoder architecture can be trained on images of biological phenotypes to achieve state-of-the-art performance in a phenotypic image retrieval system. Conclusion: Taken together, the phenotype analysis system can provide further information on the correlation between genotype and phenotype. Additionally, the neural network model for image segmentation and the phenotype retrieval system are suitable for any species that has enough phenotype images to train the neural network.


LINGO-DL: a text-based approach for molecular similarity searching

Journal of Computer-Aided Molecular Design ◽

10.1007/s10822-021-00383-9 ◽

2021 ◽

Author(s):

Ammar Abdo ◽

Maude Pupin

Keyword(s):

Molecular Similarity ◽

Similarity Searching


Nebula: ultra-efficient mapping-free structural variant genotyper

Nucleic Acids Research ◽

10.1093/nar/gkab025 ◽

2021 ◽

Author(s):

Parsoa Khorsand ◽

Fereydoun Hormozdiari

Keyword(s):

Large Scale ◽

Structural Variants ◽

Sequencing Technologies ◽

Generic Framework ◽

Common Genetic Variants ◽

Order Of Magnitude ◽

Complex Events ◽

Comparable Accuracy ◽

Using Data ◽

Computational Resources

Large-scale catalogs of common genetic variants (including indels and structural variants) are being created using data from second- and third-generation whole-genome sequencing technologies. However, the genotyping of these variants in newly sequenced samples is a nontrivial task that requires extensive computational resources. Furthermore, current approaches are mostly limited to specific types of variants and are generally prone to various errors and ambiguities when genotyping complex events. We propose an ultra-efficient approach for genotyping any type of structural variation that is not limited by the shortcomings and complexities of current mapping-based approaches. Our method, Nebula, utilizes the changes in the count of k-mers to predict the genotype of structural variants. We show that Nebula is not only an order of magnitude faster than mapping-based approaches for genotyping structural variants, but also has comparable accuracy to state-of-the-art approaches. Furthermore, Nebula is a generic framework not limited to any specific type of event. Nebula is publicly available at https://github.com/Parsoa/Nebula.
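To illustrate the general idea of genotyping from k-mer counts without mapping reads, here is a toy sketch. It is not Nebula's algorithm: the function names, the thresholds `low` and `high`, and the simple allele-fraction rule are assumptions made for illustration, and the sketch ignores reverse complements, sequencing errors, and repeated k-mers.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmers_of(seq, k):
    """Return the set of k-mers in one sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def toy_genotype(reads, ref_allele, alt_allele, k=5, low=0.2, high=0.8):
    """Call 0/0, 0/1 or 1/1 from the counts of allele-specific k-mers."""
    counts = count_kmers(reads, k)
    ref_only = kmers_of(ref_allele, k) - kmers_of(alt_allele, k)
    alt_only = kmers_of(alt_allele, k) - kmers_of(ref_allele, k)
    if not ref_only or not alt_only:
        return None  # alleles are not distinguishable at this k
    # Mean count per signature k-mer, so alleles with more k-mers are not favoured.
    ref_support = sum(counts[km] for km in ref_only) / len(ref_only)
    alt_support = sum(counts[km] for km in alt_only) / len(alt_only)
    total = ref_support + alt_support
    if total == 0:
        return None  # no read covers either allele
    alt_fraction = alt_support / total
    if alt_fraction < low:
        return "0/0"
    if alt_fraction > high:
        return "1/1"
    return "0/1"


# Hypothetical deletion of "GGG" between the two alleles.
REF = "AAACCCGGGTTT"
ALT = "AAACCCTTT"
call = toy_genotype([REF, REF, ALT, ALT], REF, ALT)
```

Because only k-mer counts are consulted, no read alignment step is needed, which is the property the abstract credits for the speed-up.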
