Research of CouchDB Storage Plugin for Big Data Query Engine Apache Drill

Author(s):  
Yulei Liao ◽  
Liang Tan
Keyword(s):  
Big Data ◽  
2013 ◽  
Vol 756-759 ◽  
pp. 916-921
Author(s):  
Ye Liang

The amount of data in our industry and the world is exploding. Data is being collected and stored at unprecedented rates. The challenge is not only to store and manage this vast volume of data, also called big data, but also to analyze and query it. To provide a general method for answering mobile big data queries, queries over massive numbers of mobile objects in space are separated and grouped by query type. The grid-based indexing method for grouping mobile objects (GG TPR-tree) manages a massive number of mobile objects within a limited area very efficiently, but used alone it meets only part of the requirements for mobile big data queries. This thesis offers solutions to simple immediate queries, simple continuous queries, active window queries, continuous window queries, dynamic condition queries, and other query requests by employing the DTDI index structure. The experiments show that, with the support of the DTDI index structure, queries over massive mobile objects achieve higher precision and better query performance.


2017 ◽  
pp. 179-217
Author(s):  
Mohamed A. Soliman
Keyword(s):  
Big Data ◽  

2016 ◽  
Vol 9 (12) ◽  
pp. 1005-1016 ◽  
Author(s):  
Hai Liu ◽  
Dongqing Xiao ◽  
Pankaj Didwania ◽  
Mohamed Y. Eltabakh

2018 ◽  
Vol 232 ◽  
pp. 01004
Author(s):  
Wenshuai Ge ◽  
Gang He ◽  
Xinwen Liu

This paper proposes a big data query system for customized queries based on specific business needs, and introduces the system's components and structure. ANTLR is used as the language recognizer to design and implement a customized SQL dialect. The system builds a simpler, easier-to-use query interface on top of Spark SQL, which satisfies the query requirements of an Internet user behavior analysis platform.


Author(s):  
Nurfadhlina Mohd Sharef ◽  
Yasser M. Shafazand ◽  
Mohd Zakree Ahmad Nazri ◽  
Nor Azura Husin ◽  
...  

2014 ◽  
Vol 989-994 ◽  
pp. 4594-4597
Author(s):  
Chun Zhi Xing

With the development of the Internet, Internet-based large-scale data of various kinds faces increasing competition. Satisfying the demand for data queries requires data mining and distributed processing. This paper therefore proposes a large-scale data mining and distributed processing method based on a decision tree algorithm.


2021 ◽  
Vol 12 (3) ◽  
Author(s):  
Sávio S. T. De Oliveira ◽  
Vagner J. S. Rodrigues ◽  
Wellington S. Martins

Spatiotemporal data has always been big data. Big data analytics for spatiotemporal data is now receiving considerable attention because it allows users to analyze huge amounts of data. Traditional big data platforms cannot handle all the challenges of processing spatiotemporal data. Although some big data platforms have been proposed to process massive volumes of spatiotemporal data, none is considered a clear winner for all possible scenarios. This paper presents the SmarT query engine, a machine learning-based solution that chooses the best big data platform for processing spatiotemporal queries on the fly. In a detailed experimental evaluation considering the Apache Spark, Elasticsearch, and SciDB big data platforms, the response time decreased by up to 22% when using SmarT.
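
The idea of choosing a platform per query can be illustrated with a minimal sketch. This is not SmarT's actual model, which the abstract does not describe. It assumes a hypothetical log of past queries, each reduced to two invented features (spatial extent and temporal extent, scaled to [0, 1]) and labeled with the platform that answered fastest, and it uses a simple 1-nearest-neighbour rule as a stand-in for the learned model. The function name `choose_platform` and all data values are illustrative assumptions.

```python
from math import dist

# Hypothetical history of past queries: (features, fastest platform).
# Features are invented for illustration: (spatial extent, temporal extent).
HISTORY = [
    ((0.05, 0.10), "Elasticsearch"),
    ((0.10, 0.05), "Elasticsearch"),
    ((0.80, 0.90), "Apache Spark"),
    ((0.90, 0.70), "Apache Spark"),
    ((0.50, 0.95), "SciDB"),
]


def choose_platform(features, history=HISTORY):
    """Return the platform label of the most similar past query.

    A 1-nearest-neighbour rule on Euclidean distance; an assumed
    stand-in for whatever model a real selector would train.
    """
    _, platform = min(history, key=lambda item: dist(item[0], features))
    return platform


if __name__ == "__main__":
    # Route a small, short-lived query and a large, long-running one.
    print(choose_platform((0.07, 0.08)))
    print(choose_platform((0.85, 0.80)))
```

A real selector would presumably be trained on measured response times and richer query features; the sketch only shows the routing step, where each incoming query is mapped to one platform before execution.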


2021 ◽  
Author(s):  
Anuja S. ◽  
Malathy C.

In today's world, most private and public sector organizations deal with massive amounts of raw data, which holds information and knowledge in its hidden layer. In addition, the format, scale, variety, and velocity of the generated data make it harder to apply algorithms efficiently. This complexity necessitates sophisticated methods, strategies, and algorithms to address the challenges of managing raw data. Big data query optimization (BDQO) requires businesses to define, diagnose, forecast, prescribe, and cognize hidden growth opportunities, and guides them toward achieving market value. BDQO uses advanced analytical methods to extract information from a rapidly growing volume of data, which reduces the difficulty of the decision-making process. Hadoop, Apache Hive, NoSQL, MapReduce, and HPCC are the technologies used in big data applications to manage large data. Because big data provides scalability, it is less costly to consume data for query processing. However, small businesses will never be able to query large databases. Joining tables with millions of tuples could take hours. Parallelism, which addresses the problem by using more processors, is a potential solution. Unfortunately, small businesses operating on a shoestring budget cannot afford it. There are many techniques to tackle the problem. The technologies used in the big data query optimization process are discussed in depth in this paper.

