indexing methods
Recently Published Documents


TOTAL DOCUMENTS

122
(FIVE YEARS 26)

H-INDEX

12
(FIVE YEARS 2)

PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0255260
Author(s):  
Altti Ilari Maarala ◽  
Ossi Arasalo ◽  
Daniel Valenzuela ◽  
Veli Mäkinen ◽  
Keijo Heljanko

Computational pan-genomics utilizes information from multiple individual genomes in large-scale comparative analysis. Genetic variation between case-controls, ethnic groups, or species can be discovered thoroughly using pan-genomes of such subpopulations. Whole-genome sequencing (WGS) data volumes are growing rapidly, making genomic data compression and indexing methods very important. Despite current space-efficient repetitive sequence compression and indexing methods, the deployed compression methods are often sequential, computationally time-consuming, and do not provide efficient sequence alignment performance on vast collections of genomes such as pan-genomes. For performing rapid analytics with the ever-growing genomics data, data compression and indexing methods have to exploit distributed and parallel computing more efficiently. Instead of strict genome data compression methods, we will focus on the efficient construction of a compressed index for pan-genomes. Compressed hybrid-index enables fast sequence alignments to several genomes at once while shrinking the index size significantly compared to traditional indexes. We propose a scalable distributed compressed hybrid-indexing method for large genomic data sets enabling pan-genome-based sequence search and read alignment capabilities. We show the scalability of our tool, DHPGIndex, by executing experiments in a distributed Apache Spark-based computing cluster comprising 448 cores distributed over 26 nodes. The experiments have been performed both with human and bacterial genomes. DHPGIndex built a BLAST index for n = 250 human pan-genome with an 870:1 compression ratio (CR) in 342 minutes and a Bowtie2 index with 157:1 CR in 397 minutes. For n = 1,000 human pan-genome, the BLAST index was built in 1520 minutes with 532:1 CR and the Bowtie2 index in 1938 minutes with 76:1 CR. Bowtie2 aligned 14.6 GB of paired-end reads to the compressed (n = 1,000) index in 31.7 minutes on a single node. 
Compressing the 488 GB GenBank database (n = 13,375,031) to a BLAST index took 575 minutes and resulted in a CR of 62:1. BLASTing 189,864 CRISPR-Cas9 gRNA target sequences (23 MB in total) against the compressed index of the human pan-genome (n = 1,000) finished in 45 minutes on a single node. 30 MB of mixed bacterial sequences (n = 599) were BLASTed against the compressed index of the 488 GB GenBank database (n = 13,375,031) in 26 minutes on 25 nodes. 78 MB of mixed sequences (n = 4,167) were BLASTed against the compressed index of an 18 GB E. coli sequence database (n = 745,409) in 5.4 minutes on a single node.
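
As a rough reading aid for the ratios quoted above, the short Python sketch below converts an N:1 compression ratio into an implied index size. It assumes that CR means original size divided by compressed index size, which the abstract does not define explicitly; the function name is hypothetical and not part of DHPGIndex.

```python
def compressed_size_gb(original_gb: float, ratio: float) -> float:
    """Size implied by an N:1 compression ratio.

    Assumption: ratio = original size / compressed size.
    """
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return original_gb / ratio


# 488 GB GenBank database at the reported 62:1 ratio.
genbank_index_gb = compressed_size_gb(488, 62)
print(f"Implied GenBank BLAST index size: {genbank_index_gb:.2f} GB")
```

Under that assumption, the GenBank BLAST index would occupy a little under 8 GB.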


2021 ◽  
Author(s):  
Frank Appiah

On understanding intermedia animation with the image-loop method using a transition indexing approach. Image-looping over a photo image, graphics object, or graph figure creates a motion picture from still images, with a delay of some seconds between frames.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Amrita Namtirtha ◽  
Animesh Dutta ◽  
Biswanath Dutta ◽  
Amritha Sundararajan ◽  
Yogesh Simmhan

Influential spreaders are the crucial nodes in a complex network that can act as a controller or a maximizer of a spreading process. For example, we can control virus propagation in an epidemiological network by controlling the behavior of such influential nodes, and amplify information propagation in a social network by using them as a maximizer. Many indexing methods have been proposed in the literature to identify the influential spreaders in a network. Nevertheless, we have noticed that individual networks have different connectivity structures, which we classify as complete, incomplete, or in-between based on their components and density. These structures affect the accuracy of existing indexing methods in identifying the best influential spreaders. Thus, no single indexing strategy is sufficient for all varieties of network connectivity structure. This article proposes a new indexing method, Network Global Structure-based Centrality (ngsc), which intelligently combines the existing kshell and sum of neighbors’ degree methods with knowledge of the network’s global structural properties, such as the giant component, average degree, and percolation threshold. The experimental results show that our proposed method yields better spreading performance of the seed spreaders over a large variety of network connectivity structures, and correlates well with a ranking based on an SIR model used as ground truth. It also outperforms contemporary techniques and is competitive with more sophisticated approaches that are computationally costly.
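
The two existing measures that ngsc builds on can be sketched in a few lines of Python. The abstract does not say how ngsc weights or combines them with the global structural properties, so the sketch below computes only the two ingredients; the helper names and the toy graph are illustrative assumptions.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def k_shell(adj):
    """k-shell index of every node, found by iteratively pruning low-degree nodes."""
    degree = {v: len(ns) for v, ns in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # The next shell starts at the smallest degree still present.
        k = max(k, min(degree[v] for v in remaining))
        queue = [v for v in remaining if degree[v] <= k]
        while queue:
            v = queue.pop()
            if v not in remaining:
                continue
            remaining.discard(v)
            shell[v] = k
            # Removing v lowers its neighbours' degrees, which may cascade.
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] <= k:
                        queue.append(u)
    return shell


def neighbor_degree_sum(adj):
    """Sum of the degrees of each node's neighbours."""
    return {v: sum(len(adj[u]) for u in ns) for v, ns in adj.items()}


# Toy graph: a triangle a-b-c with a pendant node d attached to a.
adj = build_adjacency([("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")])
shells = k_shell(adj)
nd_sums = neighbor_degree_sum(adj)
```

In the toy graph the triangle nodes land in shell 2 and the pendant node in shell 1, while the neighbour-degree sum separates the pendant node (3) from the triangle nodes (5 each).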


2020 ◽  
Vol 12 (22) ◽  
pp. 9727
Author(s):  
Hawon Chu ◽  
Jaeseong Kim ◽  
Seounghyeon Kim ◽  
Young-Kyoon Suh ◽  
Ryong Lee ◽  
...  

Recently, various environmental data, such as microdust pollution, temperature, humidity, etc., have been continuously collected by widely deployed Internet of Things (IoT) sensors. Although these data can provide great insight into developing sustainable application services, it is challenging to rapidly retrieve such data, due to their multidimensional properties and huge growth in volume over time. Existing indexing methods for efficiently locating those data expose several problems, such as high administrative cost, spatial overhead, and slow retrieval performance. To mitigate these problems, we propose a novel indexing scheme termed ST-Trie, for efficient retrieval over spatiotemporal IoT environment data. Given IoT sensor data with latitude, longitude, and time, the proposed scheme first converts the three-dimensional attributes to one-dimensional index keys. The scheme then builds a trie-based index, consisting of internal nodes inserted by the converted keys and leaf nodes containing the keys and pointers to actual IoT data. We leverage this index to process various types of queries. In our experiments with three real-world datasets, we show that the proposed ST-Trie index outperforms existing approaches by a substantial margin regarding response time. Furthermore, we show that the query processing performance via ST-Trie also scales very well with an increasing time interval. Finally, we demonstrate that when compressed, the ST-Trie index can significantly reduce its space overhead by approximately a factor of seven.
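
The two steps described above (mapping latitude, longitude, and time to a one-dimensional key, then inserting the key into a trie whose leaves point at the data) can be illustrated with a small Python sketch. The abstract does not specify the key conversion, so bit interleaving (a Z-order, or Morton, code) is used here purely as an assumption; the class and function names are hypothetical and are not the ST-Trie implementation.

```python
def quantize(value, lo, hi, bits):
    """Map a value in [lo, hi] to an integer in [0, 2**bits - 1]."""
    if not lo <= value <= hi:
        raise ValueError("value out of range")
    return int((value - lo) / (hi - lo) * ((1 << bits) - 1))


def interleave(a, b, c, bits):
    """Interleave the bits of three integers into one Z-order code."""
    key = 0
    for i in range(bits - 1, -1, -1):
        for part in (a, b, c):
            key = (key << 1) | ((part >> i) & 1)
    return key


def st_key(lat, lon, t, t_min, t_max, bits=8):
    """One-dimensional key as a bit string (assumed encoding, not the paper's)."""
    code = interleave(
        quantize(lat, -90.0, 90.0, bits),
        quantize(lon, -180.0, 180.0, bits),
        quantize(t, t_min, t_max, bits),
        bits,
    )
    return format(code, "0{}b".format(3 * bits))


class Trie:
    """Dict-based trie; the node reached by a full key stores record ids."""

    def __init__(self):
        self.root = {}

    def insert(self, key, record_id):
        node = self.root
        for ch in key:
            node = node.setdefault(ch, {})
        node.setdefault("ids", []).append(record_id)

    def prefix_search(self, prefix):
        """Return ids of all records whose key starts with the prefix."""
        node = self.root
        for ch in prefix:
            if ch not in node:
                return []
            node = node[ch]
        found, stack = [], [node]
        while stack:
            cur = stack.pop()
            for k, child in cur.items():
                if k == "ids":
                    found.extend(child)
                else:
                    stack.append(child)
        return sorted(found)


trie = Trie()
points = [(1, 37.5, 127.0, 100), (2, 37.5, 127.0, 101), (3, -33.9, 18.4, 900)]
for rid, lat, lon, t in points:
    trie.insert(st_key(lat, lon, t, 0, 1000), rid)
```

With this encoding, records that are close in space and time share long key prefixes, so a prefix lookup acts as a coarse spatiotemporal range query. For example, `trie.prefix_search("1")` returns the two northern-hemisphere records.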


2020 ◽  
Vol 76 (6) ◽  
pp. 719-734
Author(s):  
Adam Morawiec

The task of determining the orientations of crystals is usually performed by indexing reflections detected on diffraction patterns. The well known underlying principle of indexing methods is universal: they are based on matching experimental scattering vectors to some vectors of the reciprocal lattice. Despite this, the standard attitude has been to devise algorithms applicable to patterns of a particular type. This paper provides a broader perspective. A general approach to indexing of diffraction patterns of various types is presented. References are made to formally similar problems in other research fields, e.g. in computational geometry, computer science, computer vision or star identification. Besides a general description of available methods, concrete algorithms are presented in detail and their applicability to patterns of various types is demonstrated; a program based on these algorithms is shown to index Kikuchi patterns, Kossel patterns and Laue patterns, among others.


2020 ◽  
Vol 71 (3) ◽  
pp. 230-234
Author(s):  
R.K. Karataev ◽  
А.Т. Baibaktina

The number of resources available on the Internet is growing significantly. The Internet therefore holds a large amount of information, but without any command of its content. In this vast data warehouse, modern search engines do not return results that exactly meet users' needs. This is largely due to the indexing methods used (keywords, thesaurus). As a result, users spend most of their time exploring a large number of web pages looking for what they need, because the network provides no services to help with this. The article presents a technology for creating a bibliographic information search system. We also consider the choice of environment for building an information search system and the provision of advanced search modes (standard, advanced, professional, dictionary, etc.).


2020 ◽  
Vol 235 (6-7) ◽  
pp. 203-212
Author(s):  
Ivan Šimeček ◽  
Aleksandr Zaloga ◽  
Jan Trdlička

One of the key parts of solving a crystal structure from powder diffraction data is the determination of the lattice parameters from experimental data, called indexing for short. The successive dichotomy method is one of the most common methods for this process because it allows an exhaustive search. In this paper, we discuss several improvements to this indexing method that significantly reduce the search space and decrease the solution time. We also propose a combination of this method with other indexing methods: grid search and TREOR. The effectiveness and time consumption of the resulting algorithm were tested on several datasets, including orthorhombic, monoclinic, and triclinic examples. Finally, we discuss the impacts of the proposed improvements.
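
The basic idea of successive dichotomy can be illustrated with a minimal Python sketch for the simplest (cubic) case, where Q = 1/d² = (h² + k² + l²)/a² and only one lattice parameter is searched. This is not the authors' improved algorithm and does not cover the orthorhombic, monoclinic, or triclinic cases tested in the paper; the function names, tolerance, and synthetic peak list are assumptions made for illustration.

```python
def allowed_sums(max_index):
    """All non-zero values of h^2 + k^2 + l^2 with indices up to max_index."""
    return sorted({h * h + k * k + l * l
                   for h in range(max_index + 1)
                   for k in range(h + 1)
                   for l in range(k + 1)} - {0})


def interval_fits(a_lo, a_hi, q_obs, sums, tol):
    """True if every observed Q can be matched by some hkl for a in [a_lo, a_hi]."""
    for q in q_obs:
        # For a in [a_lo, a_hi], n / a^2 ranges over [n / a_hi^2, n / a_lo^2].
        if not any(n / a_hi**2 <= q + tol and n / a_lo**2 >= q - tol
                   for n in sums):
            return False
    return True


def dichotomy_cubic(q_obs, a_lo, a_hi, max_index=6, tol=1e-4, min_width=1e-3):
    """Halve the interval for a repeatedly, discarding halves that cannot fit."""
    sums = allowed_sums(max_index)
    solutions = []
    stack = [(a_lo, a_hi)]
    while stack:
        lo, hi = stack.pop()
        if not interval_fits(lo, hi, q_obs, sums, tol):
            continue  # no lattice parameter in this interval explains all lines
        if hi - lo <= min_width:
            solutions.append((lo, hi))
            continue
        mid = 0.5 * (lo + hi)
        stack.append((lo, mid))
        stack.append((mid, hi))
    return sorted(solutions)


# Synthetic peak list for a cubic cell with a = 4.0 (arbitrary length units).
q_obs = [n / 4.0**2 for n in (1, 2, 3, 4, 5, 6, 8)]
cells = dichotomy_cubic(q_obs, 3.0, 5.0)
```

The search is exhaustive within the starting interval because an interval is discarded only when some observed line cannot be matched anywhere inside it. Widening the range to include a = 4√2 would also return that larger cell, a known ambiguity that real programs resolve with figures of merit.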

