Distributed Search: Latest Research Papers

Search has been central to the development of the Web, enabling increasing engagement by a growing number of users. Proposals for the redecentralisation of the Web such as SOLID aim to give individuals sovereignty over their data by means of personal online datastores (pods). However, it is not clear whether search utilities that we currently take for granted would work efficiently in a redecentralised Web. In this paper we discuss the challenges of supporting distributed search across a large number of pods. We present a system architecture which can allow research, development and testing of new algorithms for decentralised search across pods. We undertake an initial validation of this architecture by usage scenarios for decentralised search under user-defined access control and data governance constraints. We conclude with research directions for decentralised search algorithms and deployment.
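To make the idea of search across pods under user-defined access control more concrete, here is a minimal sketch. It is not the architecture from the paper and does not use any Solid API: the `Pod` class, its per-document reader sets, and the `search_pods` function are all hypothetical names introduced for illustration, and matching is a plain substring test.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Hypothetical personal online datastore with a per-document access list."""
    owner: str
    # Maps document text to the set of users allowed to read it (assumed model).
    documents: dict[str, set[str]] = field(default_factory=dict)

    def search(self, query: str, requester: str) -> list[str]:
        """Return matching documents that the requester is allowed to read."""
        term = query.lower()
        return [
            text
            for text, readers in self.documents.items()
            if requester in readers and term in text.lower()
        ]


def search_pods(pods: list[Pod], query: str, requester: str) -> list[tuple[str, str]]:
    """Fan the query out to every pod and merge the (owner, document) hits."""
    hits = []
    for pod in pods:
        for text in pod.search(query, requester):
            hits.append((pod.owner, text))
    return sorted(hits)


pods = [
    Pod("alice", {"Hiking notes for the Alps": {"alice", "bob"},
                  "Private diary about hiking": {"alice"}}),
    Pod("carol", {"Hiking gear checklist": {"bob", "carol"}}),
]
results = search_pods(pods, "hiking", requester="bob")
print(results)
```

Each pod applies its own access check before returning anything, so the merging step never sees documents the requester may not read. A real deployment would also have to decide which pods to contact at all, which is one of the scaling questions the abstract raises.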


QiBAM: Approximate Sub-String Index Search on Quantum Accelerators Applied to DNA Read Alignment

Electronics ◽

10.3390/electronics10192433 ◽

2021 ◽

Vol 10 (19) ◽

pp. 2433

Author(s):

Aritra Sarkar ◽

Zaid Al-Ars ◽

Carmen G. Almudever ◽

Koen L. M. Bertels

Keyword(s):

Dna Sequences ◽

Search Algorithm ◽

Scale Up ◽

Quantum Algorithm ◽

Small Scale ◽

Distributed Search ◽

Sequence Reconstruction ◽

Grover’s Search Algorithm ◽

Physics Labs ◽

Pipeline Design

With small-scale quantum processors transitioning from experimental physics labs to industrial products, these processors are expected within a few years to scale up and become more robust for efficiently computing important algorithms in various fields. In this paper, we propose a quantum algorithm to address the challenging field of data processing for genome sequence reconstruction. This research describes an architecture-aware implementation of a quantum algorithm for sub-sequence alignment. A new algorithm named QiBAM (quantum indexed bidirectional associative memory) is proposed, which uses approximate pattern-matching based on Hamming distances. QiBAM extends Grover’s search algorithm in two ways, allowing: (1) approximate matches needed for read errors in genomics, and (2) a distributed search for multiple solutions over the quantum encoding of DNA sequences. This approach gives a quadratic speedup over the classical algorithm. A full implementation of the algorithm is provided and verified using the OpenQL compiler and QX Simulator framework. Our implementation represents a first exploration towards a full-stack quantum accelerated genome sequencing pipeline design.
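For orientation, the following is a classical sketch of approximate pattern matching based on Hamming distance, the matching criterion the abstract mentions. It is not QiBAM and involves no quantum computation: it simply scans every alignment position of a short read against a reference string. The function names and the example sequences are assumptions made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def approximate_matches(reference: str, read: str, max_mismatches: int) -> list[int]:
    """Start positions where the read aligns with at most max_mismatches errors."""
    m = len(read)
    return [
        i
        for i in range(len(reference) - m + 1)
        if hamming(reference[i:i + m], read) <= max_mismatches
    ]


reference = "ACGTTGCAACGT"   # made-up reference sequence
read = "ACGA"                # made-up read with one error relative to "ACGT"
positions = approximate_matches(reference, read, max_mismatches=1)
print(positions)  # [0, 8]
```

The scan visits every candidate position in the reference, which is the kind of classical linear search that the abstract's quadratic speedup claim is measured against.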


NetANNS: A High-Performance Distributed Search Framework Based On In-Network Computing

10.1109/ispa-bdcloud-socialcom-sustaincom52081.2021.00047 ◽

2021 ◽

Author(s):

Penghao Zhang ◽

Heng Pan ◽

Zhenyu Li ◽

Gaogang Xie ◽

Penglai Cui

Keyword(s):

High Performance ◽

Distributed Search ◽

Network Computing


Efficient distributed discovery of bidirectional order dependencies

The VLDB Journal ◽

10.1007/s00778-021-00683-4 ◽

2021 ◽

Author(s):

Sebastian Schmidl ◽

Thorsten Papenbrock

Keyword(s):

Canonical Form ◽

Search Strategy ◽

State Of The Art ◽

Publication Date ◽

Consistency Checking ◽

Distributed Search ◽

Exponential Complexity ◽

Pruning Techniques ◽

Discovery Algorithms ◽

Relational Table

Bidirectional order dependencies (bODs) capture order relationships between lists of attributes in a relational table. They can express that, for example, sorting books by publication date in ascending order also sorts them by age in descending order. The knowledge about order relationships is useful for many data management tasks, such as query optimization, data cleaning, or consistency checking. Because the bODs of a specific dataset are usually not explicitly given, they need to be discovered. The discovery of all minimal bODs (in set-based canonical form) is a task with exponential complexity in the number of attributes, though, which is why existing bOD discovery algorithms cannot process datasets of practically relevant size in a reasonable time. In this paper, we propose the distributed bOD discovery algorithm DISTOD, whose execution time scales with the available hardware. DISTOD is a scalable, robust, and elastic bOD discovery approach that combines efficient pruning techniques for bOD candidates in set-based canonical form with a novel, reactive, and distributed search strategy. Our evaluation on various datasets shows that DISTOD outperforms both single-threaded and distributed state-of-the-art bOD discovery algorithms by up to orders of magnitude; it can, in particular, process much larger datasets.
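The books example can be checked directly. The sketch below is a naive pairwise validity test for a single candidate dependency ("ascending on one attribute implies descending on another"); it is not DISTOD and performs no discovery or pruning. The function name, the attribute names, and the sample rows are assumptions chosen for illustration.

```python
from itertools import combinations


def holds_asc_desc(rows: list[dict], lhs: str, rhs: str) -> bool:
    """True if sorting by lhs ascending also sorts the rows by rhs descending."""
    for a, b in combinations(rows, 2):
        if a[lhs] == b[lhs]:
            # Rows tied on lhs can appear in either order, so they must tie on rhs.
            if a[rhs] != b[rhs]:
                return False
        else:
            lo, hi = (a, b) if a[lhs] < b[lhs] else (b, a)
            # The row with the smaller lhs value must not have a smaller rhs value.
            if lo[rhs] < hi[rhs]:
                return False
    return True


books = [
    {"published": 1990, "age": 35},
    {"published": 2005, "age": 20},
    {"published": 2020, "age": 5},
]
print(holds_asc_desc(books, "published", "age"))  # True

# A row with an inconsistent age violates the dependency.
dirty = books + [{"published": 2010, "age": 30}]
print(holds_asc_desc(dirty, "published", "age"))  # False
```

Validating one candidate this way costs a quadratic number of row comparisons. The harder part described in the abstract is that the number of candidates grows exponentially with the number of attributes, which is what motivates pruning and distribution.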


Accelerating LSH-based Distributed Search with In-network Computation

IEEE INFOCOM 2021 - IEEE Conference on Computer Communications ◽

10.1109/infocom42981.2021.9488722 ◽

2021 ◽

Author(s):

Penghao Zhang ◽

Heng Pan ◽

Zhenyu Li ◽

Peng He ◽

Zhibin Zhang ◽

...

Keyword(s):

Distributed Search


Ensemble Distributed Search-FSGM-CRD Compressed Cache Algorithm for Large Datasets

Turkish Journal of Computer and Mathematics Education (TURCOMAT) ◽

10.17762/turcomat.v12i2.2317 ◽

2021 ◽

Vol 12 (2) ◽

pp. 2854-2858

Author(s):

M. Sailaja et al.

Keyword(s):

Graph Mining ◽

Pattern Mining ◽

Frequent Pattern Mining ◽

Large Datasets ◽

Frequent Pattern ◽

Distributed Search ◽

The Past ◽

Huge Data ◽

Challenging Tasks ◽

Complex Relationships

Frequent sub-graph mining (FSM) is a variant of frequent pattern mining in which the patterns are graphs. Graph-based representations are used to represent complex relationships among entities effectively. Various graph mining techniques have been developed over the past years, and frequent sub-graph mining is among the most challenging tasks in graph mining. Many existing FSM algorithms consider only the graph structure and do not consider the relationships among the entities involved or their strength. Handling complex and very large data is important, and there is high demand for distributed computational approaches. In this paper, an Ensemble Distributed Search-FSGM-CRD Compressed Cache Algorithm is developed and implemented to find frequent sub-graphs.


Efficient Lévy walks in virtual human foraging

Scientific Reports ◽

10.1038/s41598-021-84542-w ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Ketika Garg ◽

Christopher T Kello

Keyword(s):

Home Range ◽

Distributed Search ◽

Costs And Benefits ◽

Distributed Resources ◽

Task Constraints ◽

Human Scale ◽

Lévy Walks ◽

Human Foraging ◽

Plan Movement ◽

Time And Energy

Efficient foraging depends on decisions that account for the costs and benefits of various activities like movement, perception, and planning. We conducted a virtual foraging experiment set in the foothills of the Himalayas to examine how time and energy are expended to forage efficiently, and how foraging changes when constrained to a home range. Two hundred players foraged the human-scale landscape with simulated energy expenditure in search of naturally distributed resources. Results showed that efficient foragers produced periods of locomotion interleaved with perception and planning that approached theoretical expectations for Lévy walks, regardless of the home-range constraint. Despite this constancy, efficient home-range foraging trajectories were less diffusive by virtue of restricting locomotive search and spending more time instead scanning the environment to plan movement and detect far-away resources. Altogether, results demonstrate that humans can forage efficiently by arranging and adjusting Lévy-distributed search activities in response to environmental and task constraints.
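As background on what a Lévy walk is, here is a small simulation sketch. It draws step lengths from a power-law distribution with density proportional to l^(-mu) for l >= min_step, using inverse-transform sampling, and picks a uniformly random heading for each step. It does not reproduce the experiment or its analysis; the exponent `mu=2.0`, the minimum step, the seed, and the function names are assumptions for illustration.

```python
import math
import random


def levy_step(rng: random.Random, mu: float = 2.0, min_step: float = 1.0) -> float:
    """Sample a step length with density proportional to l**(-mu), l >= min_step."""
    if mu <= 1.0:
        raise ValueError("mu must be greater than 1 for a normalisable distribution")
    u = rng.random()  # uniform in [0, 1), so 1 - u lies in (0, 1]
    return min_step * (1.0 - u) ** (-1.0 / (mu - 1.0))


def levy_walk(n_steps: int, mu: float = 2.0, min_step: float = 1.0,
              seed: int = 0) -> list[tuple[float, float]]:
    """Return the visited (x, y) positions of a 2-D walk starting at the origin."""
    rng = random.Random(seed)
    x = y = 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        length = levy_step(rng, mu, min_step)
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x += length * math.cos(angle)
        y += length * math.sin(angle)
        path.append((x, y))
    return path


path = levy_walk(100)
print(len(path), path[-1])
```

Most sampled steps are short, but the heavy tail occasionally produces a very long relocation, which is the pattern of clustered local search punctuated by long moves that the term refers to.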


NetSHa: In-network Acceleration of LSH-based Distributed Search

IEEE Transactions on Parallel and Distributed Systems ◽

10.1109/tpds.2021.3135842 ◽

2021 ◽

pp. 1-1

Author(s):

Penghao Zhang ◽

Heng Pan ◽

Zhenyu Li ◽

Penglai Cui ◽

Ru Jia ◽

...

Keyword(s):

Distributed Search


Research on Distributed Search Technology of Multiple Data Sources Intelligent Information Based on Knowledge Graph

Journal of Signal Processing Systems ◽

10.1007/s11265-020-01592-5 ◽

2020 ◽

Author(s):

Jihong Li ◽

Zhiqiang Wang ◽

Yuan Wang ◽

Zhaoyun Hua ◽

Wenfeng Jing

Keyword(s):

Data Sources ◽

Knowledge Graph ◽

Distributed Search ◽

Multiple Data Sources ◽

Multiple Data ◽

Intelligent Information


Optimizing Search and Data Analytics of Twitter Data using Elastic Search Algorithms

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f1535.089620 ◽

2020 ◽

Vol 9 (6) ◽

pp. 427-433

Keyword(s):

Social Media ◽

Business Intelligence ◽

Data Analytics ◽

Distributed Search ◽

Time Data ◽

Social Media Data ◽

Twitter Data ◽

Crucial Part ◽

Media Data ◽

Better Than

Quick data acquisition and analysis has become an important tool in the contemporary era. Real-time data is made available on the World Wide Web (WWW) and social media. Social media data in particular is rich in the opinions of people from all walks of life. Searching and analysing such data provides the business intelligence (BI) required by applications in various real-world domains, such as politics, banking, insurance or healthcare. With the emergence of cloud computing, large volumes of data are added to cloud storage infrastructure, and this data is growing exponentially. In this context, Elasticsearch is the distributed search and analytics engine that is a crucial part of the Elastic Stack. Beats and Logstash are used to collect, aggregate and enrich data, which is then stored in Elasticsearch. Kibana is used for interactive exploration and visualization. Elasticsearch helps in indexing data, searching efficiently and performing data analytics. In this paper, the utility of Elasticsearch is evaluated for optimising search and data analytics of Twitter data. An empirical study is made with the Elasticsearch tool configured for Windows and also using Amazon Elasticsearch, and the results are compared with the state of the art. The experimental results revealed that Elasticsearch performs better than the existing approaches.
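To illustrate the kind of request that combines search with analytics in Elasticsearch, the sketch below builds a Query DSL body containing a full-text `match` query and a `terms` aggregation. It only constructs and prints the JSON and does not contact a cluster. The field names `text` and `user.keyword`, the aggregation name, and the helper `build_tweet_search` are assumptions, not taken from the paper.

```python
import json


def build_tweet_search(term: str, size: int = 10) -> dict:
    """Build a search body: full-text match plus a per-user tweet count."""
    return {
        "size": size,  # number of matching documents to return
        "query": {
            "match": {"text": term}  # "text" is an assumed field name
        },
        "aggs": {
            "tweets_per_user": {
                # "user.keyword" is an assumed keyword field holding the author
                "terms": {"field": "user.keyword"}
            }
        },
    }


body = build_tweet_search("healthcare")
print(json.dumps(body, indent=2))
```

A body like this would be sent to an index's `_search` endpoint. The response would then carry both the matching documents and the aggregation buckets, so search and analytics are served by a single request.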


distributed search
Recently Published Documents


Search in a Redecentralised Web

QiBAM: Approximate Sub-String Index Search on Quantum Accelerators Applied to DNA Read Alignment

NetANNS: A High-Performance Distributed Search Framework Based On In-Network Computing

Efficient distributed discovery of bidirectional order dependencies

Accelerating LSH-based Distributed Search with In-network Computation

Ensemble Distributed Search-FSGM-CRD Compressed Cache Algorithm for Large Datasets

Efficient Lévy walks in virtual human foraging

NetSHa: In-network Acceleration of LSH-based Distributed Search

Research on Distributed Search Technology of Multiple Data Sources Intelligent Information Based on Knowledge Graph

Optimizing Search and Data Analytics of Twitter Data using Elastic Search Algorithms
