hybrid index
Recently Published Documents



2021 ◽  
Vol 11 (22) ◽  
pp. 10803
Author(s):  
Jiagang Song ◽  
Yunwu Lin ◽  
Jiayu Song ◽  
Weiren Yu ◽  
Leyuan Zhang

Massive volumes of multimedia data with geographical information (geo-multimedia) are collected and stored on the Internet owing to the wide application of location-based services (LBS). Finding the high-level semantic relationships between geo-multimedia data and constructing an efficient index are crucial for large-scale geo-multimedia retrieval. To address this challenge, the paper proposes a deep cross-modal hashing framework for geo-multimedia retrieval, termed Triplet-based Deep Cross-Modal Retrieval (TDCMR), which uses a deep neural network and an enhanced triplet constraint to capture high-level semantics. In addition, a novel hybrid index, called TH-Quadtree, is developed by combining cross-modal binary hash codes with a quadtree to support high-performance search. Extensive experiments are conducted on three commonly used benchmarks, and the results show the superior performance of the proposed method.
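The idea of pairing binary hash codes with a quadtree can be illustrated with a minimal sketch. This is not the TH-Quadtree from the paper, whose node layout and search procedure are not given in the abstract. It only assumes that each item carries a location and a binary hash code, and that a query combines a spatial box with a Hamming-distance radius. The names `Item`, `QuadNode`, `search`, `capacity`, and `max_depth` are all hypothetical.

```python
from dataclasses import dataclass, field


def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary hash codes stored as ints."""
    return bin(a ^ b).count("1")


@dataclass
class Item:
    x: float
    y: float
    code: int  # binary hash code (assumed to come from some hashing model)


@dataclass
class QuadNode:
    x0: float
    y0: float
    x1: float
    y1: float
    capacity: int = 4    # split a leaf once it holds more items than this
    depth: int = 0
    max_depth: int = 16  # stop splitting here so duplicate points cannot recurse forever
    items: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def insert(self, item: Item) -> bool:
        if not (self.x0 <= item.x < self.x1 and self.y0 <= item.y < self.y1):
            return False  # outside this node's half-open cell
        if self.children:
            return any(child.insert(item) for child in self.children)
        self.items.append(item)
        if len(self.items) > self.capacity and self.depth < self.max_depth:
            self._split()
        return True

    def _split(self) -> None:
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        cells = [(self.x0, self.y0, mx, my), (mx, self.y0, self.x1, my),
                 (self.x0, my, mx, self.y1), (mx, my, self.x1, self.y1)]
        self.children = [QuadNode(*c, self.capacity, self.depth + 1, self.max_depth)
                         for c in cells]
        pending, self.items = self.items, []
        for item in pending:
            any(child.insert(item) for child in self.children)

    def search(self, box, code: int, radius: int) -> list:
        """Items inside the half-open box whose code is within `radius` bits of `code`."""
        qx0, qy0, qx1, qy1 = box
        if qx1 <= self.x0 or qx0 >= self.x1 or qy1 <= self.y0 or qy0 >= self.y1:
            return []  # spatial pruning: the box misses this cell entirely
        hits = [it for it in self.items
                if qx0 <= it.x < qx1 and qy0 <= it.y < qy1
                and hamming(it.code, code) <= radius]
        for child in self.children:
            hits.extend(child.search(box, code, radius))
        return hits
```

In this sketch the quadtree prunes by location first and the hash codes filter the surviving candidates. A real hybrid index would likely also summarize codes at internal nodes so that whole subtrees can be skipped on the semantic side as well.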


2021 ◽  
Author(s):  
Tao Xu ◽  
Aopeng Xu ◽  
Joseph Mango ◽  
Pengfei Liu ◽  
Xiaqing Ma ◽  
...  

The rapid popularization of high-speed mobile communication technology and the continuous development of mobile network devices have given spatial textual big data (STBD) new dimensions, owing to their ability to record geographical objects from multiple sources and with complex attributes. Data mining from spatial textual datasets has become a meaningful area of study. As a popular topic for STBD, the top-k spatial keyword query has been developed in various forms to deal with different retrieval requirements. However, previous research focused mainly on indexing locational attributes and retrieving a few target attributes, and the correlations between large numbers of textual attributes have not been fully studied or demonstrated. To further explore interrelated knowledge in the textual attributes, this paper defines the top-k frequent spatial keyword query (tfSKQ) and proposes a novel hybrid index structure, named RCL-tree, based on concept lattice theory. We also develop tfSKQ algorithms to retrieve the most frequent and nearest spatial objects in STBD. One existing method and two baseline algorithms are implemented, and a series of experiments is carried out on real datasets to evaluate performance. The results demonstrate the effectiveness and efficiency of the proposed RCL-tree for tfSKQ under complex spatial multi-keyword query conditions.


Plants ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 2168
Author(s):  
Fabrizio Olivieri ◽  
Salvatore Graci ◽  
Silvana Francesca ◽  
Maria Manuela Rigano ◽  
Amalia Barone

The constitution of heat-tolerant F1 hybrids is a challenge to ensure high yield and good fruit quality in the global climate. In the present work, we evaluated 15 genotypes for yield-related traits highly affected by high temperatures (HT). This phenotypic analysis allowed us to identify four parental genotypes showing promising yield performance under HT conditions. Two of these genotypes also exhibited good fruit quality traits. A molecular marker analysis was carried out for six genes conferring resistance to the pathogens that most affect tomatoes. This analysis revealed a maximum of three resistant alleles in the parental genotypes. Exploring single nucleotide polymorphisms (SNPs) revealed by two high-throughput genotyping platforms identified an additional 12 genes potentially involved in resistance to biotic stress, to be further investigated. Following these considerations, 13 F1 hybrids were constituted by combining the parental genotypes and then evaluated for multiple traits under HT conditions. By estimating a hybrid index based on yield performance, desirable quality traits, and resistance genes, we identified seven hybrids showing the best performance. The promising results obtained in the present work should be confirmed by evaluating the best hybrids over additional years and environments before proposing them as novel commercial hybrids that could maintain high performance under HT conditions.


Author(s):  
Walter Bossert ◽  
Frank Stehling

We examine the notion of a price index as the solution to the problem of minimizing the distance between the index values and the vector of price ratios. To do so, the choice of a suitable distance function is of crucial importance. We use a generalized least-squares criterion for this purpose and show that the generalized quasilinear functions are the only solutions to the problem of minimizing the distance thus defined. There are numerous special cases that are obtained for specific choices of the requisite functions and weights. In particular, we show that, in addition to the well-established indexes of Laspeyres, Paasche, Marshall-Edgeworth, Walsh, and Törnqvist, the arithmetic-current-period index, the arithmetic-hybrid index, the harmonic-base-period index, and the harmonic-hybrid index can be obtained with suitably chosen distance functions. Furthermore, the logarithmic least-squares criterion is employed to obtain indexes that are based on geometric means.
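As a small worked illustration of the least-squares view, the sketch below computes the five well-established indexes named above from their textbook formulas and checks the simplest special case: the value P that minimizes the weighted squared distance sum_i w_i (P - p1_i/p0_i)^2 is the weighted arithmetic mean of the price ratios, and with base-period expenditure shares as weights it coincides with the Laspeyres index. The paper's generalized criterion is broader than this case, and the function names here are hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def laspeyres(p0, p1, q0, q1):
    return _dot(p1, q0) / _dot(p0, q0)


def paasche(p0, p1, q0, q1):
    return _dot(p1, q1) / _dot(p0, q1)


def marshall_edgeworth(p0, p1, q0, q1):
    q = [a + b for a, b in zip(q0, q1)]  # summed base and current quantities
    return _dot(p1, q) / _dot(p0, q)


def walsh(p0, p1, q0, q1):
    q = [math.sqrt(a * b) for a, b in zip(q0, q1)]  # geometric-mean quantities
    return _dot(p1, q) / _dot(p0, q)


def tornqvist(p0, p1, q0, q1):
    v0, v1 = _dot(p0, q0), _dot(p1, q1)
    log_index = sum(
        0.5 * (a * c / v0 + b * d / v1) * math.log(b / a)  # averaged expenditure shares
        for a, b, c, d in zip(p0, p1, q0, q1)
    )
    return math.exp(log_index)


def weighted_least_squares_index(ratios, weights):
    """Minimizer of sum_i w_i * (P - r_i)**2 over P: the weighted arithmetic mean."""
    return _dot(weights, ratios) / sum(weights)


def base_period_shares(p0, q0):
    total = _dot(p0, q0)
    return [a * b / total for a, b in zip(p0, q0)]
```

For example, with illustrative prices `p0 = [1, 2]`, `p1 = [2, 3]` and quantities `q0 = [10, 5]`, `q1 = [8, 6]`, calling `weighted_least_squares_index` on the price ratios with `base_period_shares(p0, q0)` as weights returns the same value as `laspeyres(p0, p1, q0, q1)`.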


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0255260
Author(s):  
Altti Ilari Maarala ◽  
Ossi Arasalo ◽  
Daniel Valenzuela ◽  
Veli Mäkinen ◽  
Keijo Heljanko

Computational pan-genomics utilizes information from multiple individual genomes in large-scale comparative analysis. Genetic variation between case-controls, ethnic groups, or species can be discovered thoroughly using pan-genomes of such subpopulations. Whole-genome sequencing (WGS) data volumes are growing rapidly, making genomic data compression and indexing methods very important. Despite current space-efficient repetitive sequence compression and indexing methods, the deployed compression methods are often sequential, computationally time-consuming, and do not provide efficient sequence alignment performance on vast collections of genomes such as pan-genomes. To perform rapid analytics on ever-growing genomics data, data compression and indexing methods have to exploit distributed and parallel computing more efficiently. Instead of strict genome data compression methods, we focus on the efficient construction of a compressed index for pan-genomes. A compressed hybrid index enables fast sequence alignments to several genomes at once while shrinking the index size significantly compared to traditional indexes. We propose a scalable distributed compressed hybrid-indexing method for large genomic data sets enabling pan-genome-based sequence search and read alignment capabilities. We show the scalability of our tool, DHPGIndex, by executing experiments in a distributed Apache Spark-based computing cluster comprising 448 cores distributed over 26 nodes. The experiments have been performed both with human and bacterial genomes. DHPGIndex built a BLAST index for n = 250 human pan-genome with an 870:1 compression ratio (CR) in 342 minutes and a Bowtie2 index with 157:1 CR in 397 minutes. For n = 1,000 human pan-genome, the BLAST index was built in 1520 minutes with 532:1 CR and the Bowtie2 index in 1938 minutes with 76:1 CR. Bowtie2 aligned 14.6 GB of paired-end reads to the compressed (n = 1,000) index in 31.7 minutes on a single node. Compressing the n = 13,375,031 (488 GB) GenBank database to a BLAST index resulted in a CR of 62:1 in 575 minutes. BLASTing 189,864 CRISPR-Cas9 gRNA target sequences (23 MB in total) to the compressed index of the human pan-genome (n = 1,000) finished in 45 minutes on a single node. 30 MB of mixed bacterial sequences (n = 599) were blasted to the compressed index of the 488 GB GenBank database (n = 13,375,031) in 26 minutes on 25 nodes. 78 MB of mixed sequences (n = 4,167) were blasted to the compressed index of an 18 GB E. coli sequence database (n = 745,409) in 5.4 minutes on a single node.


2021 ◽  
Author(s):  
Pengting Du ◽  
Yingjian Liu ◽  
Yue Li ◽  
Haoyu Yin ◽  
Limin Zhang

2021 ◽  
pp. 1-47
Author(s):  
Ty A. Dickinson ◽  
Michael B. Richman ◽  
Jason C. Furtado

Extreme precipitation across multiple timescales is a natural hazard that creates a significant risk to life, with a commensurately large cost through property loss. We devise a method to create 14-day extreme event windows that characterize precipitation events in the contiguous United States (CONUS) for the years 1915 through 2018. Our algorithm imposes thresholds for both total precipitation and the duration of the precipitation to identify events with sufficient length to accentuate the synoptic and longer time scale contribution to the precipitation event. Kernel density estimation is employed to create extreme event polygons which are formed into a database spanning from 1915 through 2018. Using the developed database, we clustered events into regions using a k-means algorithm. We define the “Hybrid Index”, a weighted composite of silhouette score and number of clustered events, to show the optimal number of clusters is 14. We also show that 14-day extreme precipitation events are increasing in the CONUS, specifically in the Dakotas and much of New England. The algorithm presented in this work is designed to be sufficiently flexible to be extended to any desired number of days on the subseasonal-to-seasonal (S2S) timescale (e.g., 30 days). Additional databases generated using this framework are available for download from our GitHub. Consequently, these S2S databases can be analyzed in future works to determine the climatology of S2S extreme precipitation events and be used for predictability studies for identified events.
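The abstract does not give the formula for the Hybrid Index, so the following is only one plausible reading of "a weighted composite of silhouette score and number of clustered events". It assumes that both quantities are min-max normalized across the candidate cluster counts and then mixed with a single weight. The function name `hybrid_index`, the normalization, and the default weight of 0.5 are all assumptions, not the authors' definition.

```python
def hybrid_index(candidates, weight=0.5):
    """Score each candidate k from its (silhouette, n_clustered_events) pair.

    `candidates` maps k -> (silhouette score, number of clustered events).
    Both quantities are min-max normalized across the candidates (an assumed
    choice) and combined as weight * silhouette + (1 - weight) * events.
    Returns a dict k -> composite score.
    """
    def normalize(values):
        lo, hi = min(values), max(values)
        span = hi - lo
        # If every candidate has the same value, it carries no information.
        return [0.0 if span == 0 else (v - lo) / span for v in values]

    ks = list(candidates)
    sil = normalize([candidates[k][0] for k in ks])
    events = normalize([candidates[k][1] for k in ks])
    return {k: weight * s + (1 - weight) * e for k, s, e in zip(ks, sil, events)}


def best_k(candidates, weight=0.5):
    """Candidate k with the highest composite score."""
    scores = hybrid_index(candidates, weight)
    return max(scores, key=scores.get)
```

The point of a composite like this is that silhouette score alone can favor solutions that leave many events unclustered, so the second term rewards cluster counts that keep more events assigned.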


2021 ◽  
pp. 219-227
Author(s):  
S. A. Bakhoum

Immigrant narrow-barred Spanish mackerel, West African Spanish mackerel and specimens with an external appearance somewhere between these putative parents were collected from Abu Qir Bay, East Alexandria, Egypt. The hybrid index results and univariate and multivariate analysis indicated a natural hybridization between these two species. The discriminant function analysis successfully classified individual fish in the data to one of the three fish groups. Squared Mahalanobis distances extracted from the groups indicated the three groups were clearly distinct from each other. Moreover, distances between the hybrid and Scomberomorus tritor were longer than those of the hybrid and S. commerson. The mean values of the condition factor for the hybrid were significantly higher than those of S. commerson. Natural mortality of the hybrid was significantly lower than that of the exotic parent (S. commerson), indicating that the environmental conditions in the examined region are more suitable for the hybrid type species than for the invasive parental species.
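For readers unfamiliar with the distance used above, the squared Mahalanobis distance between two group centroids a and b under a shared covariance matrix S is (a - b)^T S^{-1} (a - b). The sketch below evaluates it for two measured traits using a hand-written 2x2 inverse. The two-trait restriction and the function names are assumptions made for brevity; the study's actual variables and covariance estimate are not stated in the abstract.

```python
def centroid(points):
    """Mean of a list of (trait1, trait2) measurements."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def squared_mahalanobis(a, b, cov):
    """(a - b)^T cov^{-1} (a - b) for 2-D points and a 2x2 covariance matrix."""
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = a[0] - b[0], a[1] - b[1]
    # cov^{-1} = (1 / det) * [[s22, -s12], [-s21, s11]]
    return (dx * (s22 * dx - s12 * dy) + dy * (-s21 * dx + s11 * dy)) / det
```

With an identity covariance the result reduces to the squared Euclidean distance. A larger variance along one trait shrinks that trait's contribution, which is why the measure suits comparing groups on traits with different scales.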


Author(s):  
Ibrahim Balogun ◽  
Nii Attoh-Okine

In discussions of track geometry, track safety takes precedence over other requirements because its shortfall often leads to unrecoverable loss. Track geometry is widely regarded as the index for safety evaluation, whether corrective or predictive, used to determine the appropriate maintenance regime based on track conditions. A recent study has shown that track defect probability thresholds can best be explored using a hybrid index. Hence, a dimension reduction technique that combines both safety components and geometry quality is needed. It is observed that a dimensional space representation of track parameters without prior covariate shift evaluation could affect the overall distribution, as the underlying discrepancies could reduce the accuracy of the prediction. In this study, the authors applied a covariate shift framework to track geometry parameters before applying the dimension reduction techniques. Whilst both principal component analysis (PCA) and t-distributed stochastic neighbour embedding (t-SNE) are viable techniques that express the probability distribution of parameters based on correlation in their embedded space and inclination to maximize the variance, shift distribution evaluation should be considered. In conclusion, we demonstrate that our framework can detect and evaluate a covariate shift likelihood in a high-dimensional track geometry defect problem.


2021 ◽  
Author(s):  
Shengnan Ke ◽  
Jun Gong ◽  
Songnian Li ◽  
Qing Zhu ◽  
Xintao Liu ◽  
...  

In recent years, there has been tremendous growth in the field of indoor and outdoor positioning sensors, which continuously produce huge volumes of trajectory data used in many fields such as location-based services and location intelligence. Trajectory data is growing massively and is semantically complex, which poses a great challenge for spatio-temporal data indexing. This paper proposes a spatio-temporal data indexing method, named HBSTR-tree, which is a hybrid index structure comprising a spatio-temporal R-tree, a B*-tree, and a hash table. To improve the index generation efficiency, rather than directly inserting trajectory points, we group consecutive trajectory points as nodes according to their spatio-temporal semantics and then insert them into the spatio-temporal R-tree as leaf nodes. The hash table is used to manage the latest leaf nodes to reduce the frequency of insertion. A new spatio-temporal interval criterion and a new node-choosing sub-algorithm are also proposed to optimize spatio-temporal R-tree structures. In addition, a B*-tree sub-index of leaf nodes is built to query the trajectories of targeted objects efficiently. Furthermore, a database storage scheme based on a NoSQL-type DBMS is also proposed for the purpose of cloud storage. Experimental results show that HBSTR-tree outperforms TB*-tree in aspects such as generation efficiency, query performance, and supported query types.
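The buffering step described above, in which a hash table holds each object's latest leaf node so that points are grouped before anything is inserted into the tree, can be sketched as follows. This is a simplified stand-in and not the HBSTR-tree itself: the grouping rule here (a point-count limit and a maximum time gap) is an assumed substitute for the paper's spatio-temporal semantics, and a plain list named `flushed` takes the place of the R-tree insertion. All class and parameter names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Leaf:
    obj_id: str
    points: list = field(default_factory=list)  # consecutive (t, x, y) samples

    def mbr(self):
        """Spatio-temporal bounding box: (min_x, min_y, min_t, max_x, max_y, max_t)."""
        ts = [p[0] for p in self.points]
        xs = [p[1] for p in self.points]
        ys = [p[2] for p in self.points]
        return (min(xs), min(ys), min(ts), max(xs), max(ys), max(ts))


class TrajectoryBuffer:
    def __init__(self, max_points=32, max_gap=60.0):
        self.max_points = max_points  # assumed leaf capacity
        self.max_gap = max_gap        # assumed largest time gap inside one leaf
        self.open_leaves = {}         # hash table: obj_id -> leaf still being filled
        self.flushed = []             # stand-in for leaves inserted into the R-tree

    def add(self, obj_id, t, x, y):
        leaf = self.open_leaves.get(obj_id)
        if leaf is not None:
            full = len(leaf.points) >= self.max_points
            stale = t - leaf.points[-1][0] > self.max_gap
            if full or stale:
                self.flushed.append(leaf)  # one tree insertion per group of points
                leaf = None
        if leaf is None:
            leaf = Leaf(obj_id)
            self.open_leaves[obj_id] = leaf
        leaf.points.append((t, x, y))
```

Because a leaf is handed to the tree only when it is full or stale, the number of tree insertions drops from one per point to one per group, which is the efficiency argument made in the abstract.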

