Improving Query Response Time for Graph Data Using Materialization

Author(s):  
Abdul Waheed ◽  
Dr. Syed Saif ur Rahman

2019 ◽  
pp. 353-388
Author(s):  
S. Vasavi ◽  
Mallela Padma Priya ◽  
Anu A. Gokhale

Digitization is connecting devices such as sensors and cameras to the internet, and these devices produce big data. The variety of this data has paved the way for NoSQL databases, such as Cassandra, that aim for scalability and availability, while the Hadoop framework was developed for storing and processing distributed data. In this chapter, the authors investigate the storage and retrieval of geospatial data by integrating Hadoop and Cassandra using two techniques: prefix-based partitioning and Cassandra's default partitioning algorithm (Murmur3Partitioner). A geohash value is generated, which acts as the partition key and also supports effective search, reducing the time taken to retrieve data. When users issue spatial queries, such as finding the nearest locations, the search in the Cassandra database is run under both partitioning techniques, and the query response times are compared to determine which method is more effective. The results show that prefix-based partitioning is more efficient than Murmur3 partitioning.
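To make the idea of a geohash-derived partition key concrete, the following is a minimal sketch of standard geohash encoding, with a hypothetical `partition_key` helper that keeps only a short prefix of the hash. The function names, the prefix length, and the choice of Python are assumptions for illustration; the chapter's actual partitioning scheme is not described in this abstract.

```python
# Standard geohash base-32 alphabet (omits a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits yield one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def partition_key(lat, lon, prefix_len=4):
    """Hypothetical prefix-based key: nearby points share the same prefix."""
    return geohash_encode(lat, lon, precision=prefix_len)
```

Because a shorter geohash is always a prefix of the longer one for the same point, points that fall in the same geohash cell receive the same key, which is one plausible reason a prefix-based key could help nearest-location queries.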


1995 ◽  
Vol 24 (2) ◽  
pp. 293-303 ◽  
Author(s):  
Weimin Du ◽  
Ming-Chien Shan ◽  
Umeshwar Dayal

2002 ◽  
Vol 11 (01n02) ◽  
pp. 119-144 ◽  
Author(s):  
NAVEEN ASHISH ◽  
CRAIG KNOBLOCK ◽  
CYRUS SHAHABI

There is currently great interest in building information mediators that can integrate information from multiple data sources such as databases or Web sources. The query response time for such mediators is typically quite high, mainly due to the time spent retrieving data from remote sources. We present an approach for optimizing the performance of information mediators by selectively materializing data. We first present our overall framework for materialization in a mediator environment and outline the factors that are considered in selecting data to materialize. We then present an algorithm for identifying classes of data to materialize by analyzing one of these factors, namely the distribution of user queries. We present results from an implemented version of our optimization system for the Ariadne information mediator, which show the effectiveness of our algorithm in extracting patterns of frequently accessed classes from user queries. We also demonstrate the effectiveness of our approach in optimizing mediator performance by materializing such classes.
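As a rough illustration of selecting data to materialize from the distribution of user queries, the sketch below counts how often each class of data is requested and keeps the classes whose share of the query log meets a threshold. It is a deliberate simplification and not the paper's algorithm: the function name, the `min_share` parameter, and the assumption that each logged query maps to a single class label are all hypothetical.

```python
from collections import Counter


def select_classes_to_materialize(query_log, min_share=0.2):
    """Return class labels whose share of logged queries is at least min_share.

    query_log is assumed to be a list with one class label per user query.
    Results are ordered from most to least frequently requested.
    """
    counts = Counter(query_log)
    total = sum(counts.values())
    if total == 0:
        return []
    return [cls for cls, n in counts.most_common() if n / total >= min_share]
```

For example, a log in which class "a" accounts for four of seven queries and class "b" for two would select both at the default threshold, while a class requested only once would be left at the remote source.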


The data generated on social media platforms such as Twitter, Facebook, and LinkedIn are highly connected. Such data can be stored and analyzed efficiently using graph databases, because graphs inherently model connected data. Various indexing techniques are used to reduce the time needed to retrieve data from very large graph databases. This paper presents an extensive empirical analysis of three popular graph databases, namely Neo4j, ArangoDB, and OrientDB, with the aim of measuring the effect of primitive indexing techniques on query response time when identifying influential entities in Twitter data. The analysis shows that Neo4j performs efficiently and stably for load, relation, and property queries compared to the other two databases, whereas the performance of OrientDB can be improved using primitive indexing.

