A Survey on Big IoT Data Indexing: Potential Solutions, Recent Advancements, and Open Issues

Zineddine Kouahla; Ala-Eddine Benrazek; Mohamed Amine Ferrag; Brahim Farou; Hamid Seridi; Muhammet Kurulay; Adeel Anjum; Alia Asheralieva

doi:10.3390/fi14010019

Future Internet ◽

10.3390/fi14010019 ◽

2021 ◽

Vol 14 (1) ◽

pp. 19

Author(s):

Zineddine Kouahla ◽

Ala-Eddine Benrazek ◽

Mohamed Amine Ferrag ◽

Brahim Farou ◽

Hamid Seridi ◽

...

Keyword(s):

Data Storage ◽

Large Scale ◽

Search Time ◽

Large Data ◽

Open Problems ◽

Large Scale Data ◽

Indexing Techniques ◽

Efficient Retrieval ◽

Data Collections ◽

Scale Data

The past decade has been characterized by growing volumes of data due to the widespread use of Internet of Things (IoT) applications, which has introduced many challenges for efficient data storage and management. Thus, the efficient indexing and searching of large data collections is a topical and urgent issue. Efficient indexing solutions can provide users with valuable information about IoT data. However, efficient retrieval and management of such information in terms of index size and search time require optimization of indexing schemes, which is rather difficult to implement. The purpose of this paper is to examine and review existing indexing techniques for large-scale data. A taxonomy of indexing techniques is proposed to enable researchers to understand and select the techniques that will serve as a basis for designing a new indexing scheme. The real-world applications of the existing indexing techniques in different areas, such as health, business, scientific experiments, and social networks, are presented. Open problems and research challenges, e.g., privacy and large-scale data mining, are also discussed.

Efficient Indexing RDF Query Algorithm for Big Data

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.441.691 ◽

2013 ◽

Vol 441 ◽

pp. 691-694

Author(s):

Yi Qun Zeng ◽

Jing Bin Wang

Keyword(s):

Large Scale ◽

Rapid Development ◽

Large Data ◽

Index Structure ◽

Data Query ◽

Large Scale Data ◽

Tree Index ◽

Rdf Data ◽

Query Algorithm ◽

Scale Data

With the rapid development of information technology, data is growing explosively, and handling large-scale data is becoming more and more important. Based on the characteristics of RDF data, we propose to compress RDF data. We construct an index structure called the PAR-Tree Index, and then execute queries based on the MapReduce parallel computing framework and the PAR-Tree Index. Experimental results show that the algorithm can improve the efficiency of large data queries.
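
The abstract does not describe the PAR-Tree Index itself, so the following Python sketch does not reproduce it. It only illustrates the general idea of answering one RDF triple pattern in a map/reduce style over partitioned data, using a plain per-partition predicate index as a stand-in. All function names and the sample triples are hypothetical.

```python
from collections import defaultdict
from functools import reduce


def build_predicate_index(triples):
    """Group (subject, object) pairs by predicate for one partition."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[predicate].append((subject, obj))
    return index


def map_partition(index, predicate, obj=None):
    """Map step: subjects in one partition matching (?s, predicate, obj)."""
    return [s for s, o in index.get(predicate, []) if obj is None or o == obj]


def query(partitions, predicate, obj=None):
    """Run the map step on every partition, then merge results (reduce step)."""
    indexes = [build_predicate_index(part) for part in partitions]
    mapped = map(lambda idx: map_partition(idx, predicate, obj), indexes)
    merged = reduce(lambda acc, part: acc | set(part), mapped, set())
    return sorted(merged)


# Hypothetical data: two partitions of (subject, predicate, object) triples.
partitions = [
    [("alice", "knows", "bob"), ("alice", "type", "Person")],
    [("carol", "knows", "bob"), ("bob", "knows", "dave")],
]
subjects_who_know_bob = query(partitions, "knows", "bob")
```

In a real MapReduce deployment the map step would run on separate workers; here it runs sequentially in one process, which keeps the example self-contained.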

Learning to classify short and sparse text & web with hidden topics from large-scale data collections

Proceeding of the 17th international conference on World Wide Web - WWW '08 ◽

10.1145/1367497.1367510 ◽

2008 ◽

Cited By ~ 313

Author(s):

Xuan-Hieu Phan ◽

Le-Minh Nguyen ◽

Susumu Horiguchi

Keyword(s):

Large Scale ◽

Large Scale Data ◽

Data Collections ◽

Scale Data

A Hybrid Shared-Nothing/Shared-Data Storage Scheme for Large-Scale Data Processing

2011 IEEE Ninth International Symposium on Parallel and Distributed Processing with Applications ◽

10.1109/ispa.2011.43 ◽

2011 ◽

Author(s):

Huaiming Song ◽

Xian-He Sun ◽

Yong Chen

Keyword(s):

Data Processing ◽

Data Storage ◽

Large Scale ◽

Shared Data ◽

Large Scale Data ◽

Storage Scheme ◽

Large Scale Data Processing ◽

Scale Data

Large-Scale Data Storage and Management Scheme Based on Distributed Database Systems

Proceedings of the 2017 International Conference on Information Technology and Intelligent Manufacturing (ITIM 2017) ◽

10.2991/itim-17.2017.4 ◽

2017 ◽

Author(s):

Qiao Sun ◽

Buqiao Deng ◽

Lanwei Fu ◽

Zhiqiang Wang ◽

Xubin Pei ◽

...

Keyword(s):

Data Storage ◽

Large Scale ◽

Database Systems ◽

Distributed Database ◽

Distributed Database Systems ◽

Large Scale Data ◽

Management Scheme ◽

Scale Data

Large-Scale Data Streaming in Fog Computing and Its Applications

Large-Scale Data Streaming, Processing, and Blockchain Security - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-7998-3444-1.ch003 ◽

2021 ◽

pp. 50-65

Author(s):

Oshin Sharma ◽

Anusha S.

Keyword(s):

Data Storage ◽

Large Scale ◽

Fog Computing ◽

Data Streaming ◽

Cloud Data ◽

Large Scale Data ◽

Emerging Trends ◽

Iot Devices ◽

Data Centres ◽

Scale Data

The emerging trends in fog computing have increased interest and focus in both industry and academia. Fog computing extends cloud computing facilities such as storage, networking, and computation towards the edge of the network, where it offloads the cloud data centres and reduces the latency of providing services to users. This paradigm is like the cloud in terms of data, storage, application, and computation services, with one fundamental difference: it is decentralized. Furthermore, fog systems can process huge amounts of data locally and can be installed on hardware of different types. These characteristics make fog suitable for time- and location-based applications such as Internet of Things (IoT) devices, which can process large amounts of data. In this chapter, the authors present fog data streaming, its architecture, and various applications.

Improvement of K-Means Algorithm for Accelerated Big Data Clustering

International Journal of Information Technologies and Systems Approach ◽

10.4018/ijitsa.2021070107 ◽

2021 ◽

Vol 14 (2) ◽

pp. 99-119

Author(s):

Chunqiong Wu ◽

Bingwen Yan ◽

Rongrui Yu ◽

Zhangshu Huang ◽

Baoqin Yu ◽

...

Keyword(s):

Data Mining ◽

Data Clustering ◽

Large Scale ◽

Rapid Development ◽

Large Data ◽

Data Retrieval ◽

Research Directions ◽

Large Scale Data ◽

Rich Information ◽

Scale Data

With the rapid development of computing in recent years, “Internet +,” cloud platforms, and similar technologies have been adopted in various industries, and data of many types have grown in large quantities. These large amounts of data often contain very rich information, and traditional data retrieval and analysis methods and data management models can no longer meet our needs for data acquisition and management. Therefore, data mining technology has become one of the solutions for quickly obtaining useful information in today's society. Effectively processing large-scale data clustering is one of the important research directions in data mining. The k-means algorithm is the simplest and most basic method for large-scale data clustering. It has the advantages of simple operation, fast speed, and good scalability in processing large data, but it also often exposes fatal defects in data processing. In view of some of these defects of the traditional k-means algorithm, this paper improves and analyzes the algorithm from two aspects.
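
For orientation, the traditional k-means procedure that the abstract takes as its baseline can be sketched as below. This is the textbook Lloyd iteration in plain Python, not the improved algorithm of the paper, whose two modifications the abstract does not specify. The function name, parameters, and sample points are illustrative assumptions.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Baseline Lloyd's k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k input points
    assignment = [None] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance; ties go to the lower index).
        new_assignment = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the result is stable
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignment


# Hypothetical data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignment = kmeans(points, k=2)
```

The sketch also hints at the weaknesses usually attributed to the baseline: the result depends on the random initial centroids, and every iteration scans all points against all centroids.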

DNA fingerprinting of Streptococcus uberis: a useful tool for epidemiology of bovine mastitis

Epidemiology and Infection ◽

10.1017/s0950268800030466 ◽

1989 ◽

Vol 103 (1) ◽

pp. 165-171 ◽

Cited By ~ 35

Author(s):

A. W. Hill ◽

J. A. Leigh

Keyword(s):

Data Storage ◽

Large Scale ◽

Fragment Size ◽

Bovine Mastitis ◽

Chromosomal Dna ◽

Restriction Patterns ◽

Large Scale Data ◽

Streptococcus Uberis ◽

Image Analyser ◽

Scale Data

SUMMARY: A simple and reproducible typing system based on restriction fragment size of chromosomal DNA was developed to compare isolates of Streptococcus uberis obtained from the bovine mammary gland. The endonuclease giving the most useful restriction patterns was HindIII, although seven other endonucleases (Bgl1, EcoR1, Not1, Pst1, Sfi1, Sma1, Xba1) were also tested in the system. An image analyser was used to obtain a densitometric scan and a graphic display of the restriction patterns. Such a system will allow large scale data storage for future computer-aided comparison.

Effect of Replica Placement on the Reliability of Large-Scale Data Storage Systems

2010 IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems ◽

10.1109/mascots.2010.17 ◽

2010 ◽

Cited By ~ 11

Author(s):

Vinodh Venkatesan ◽

Ilias Iliadis ◽

Xiao-Yu Hu ◽

Robert Haas ◽

Christina Fragouli

Keyword(s):

Data Storage ◽

Large Scale ◽

Storage Systems ◽

Replica Placement ◽

Large Scale Data ◽

Scale Data

Large-Scale Data Storage Scheme in Blockchain Ledger Using IPFS and NoSQL

Large-Scale Data Streaming, Processing, and Blockchain Security - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-7998-3444-1.ch005 ◽

2021 ◽

pp. 91-116

Author(s):

Randhir Kumar ◽

Rakesh Tripathi

Keyword(s):

Data Storage ◽

Real World ◽

Large Scale ◽

Original Data ◽

Size Reduction ◽

Storage Model ◽

Large Scale Data ◽

Efficient Storage ◽

Storage Scheme ◽

Scale Data

The future applications of blockchain are expected to serve millions of users. To provide a variety of services to users, the underlying technology has to account for large-scale storage and assessment behind the scenes. Most current blockchain applications run either on simulators or on small blockchain networks. However, the storage issue in the real world is unpredictable. To address the issue of large-scale data storage, the authors introduce the data storage scheme in blockchain (DSSB). The storage model executes behind the blockchain ledger to store large-scale data. DSSB uses a hybrid storage model based on IPFS and MongoDB (NoSQL) to provide efficient storage for large-scale data in blockchain. In this storage model, the content-addressed hash of the transactions is maintained on the blockchain network to ensure provenance, while the original (large-scale) data is stored in MongoDB and IPFS. The DSSB model not only provides efficient storage of large-scale data but also reduces the storage size of the blockchain ledger.
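
The split between on-chain hashes and off-chain data can be illustrated with a minimal Python sketch. It is not the DSSB implementation: an in-memory dictionary stands in for IPFS and MongoDB, a plain SHA-256 hex digest stands in for an IPFS content identifier, and the class names `OffChainStore` and `Ledger` are hypothetical.

```python
import hashlib
import json


class OffChainStore:
    """In-memory stand-in for off-chain storage (assumption: not IPFS/MongoDB)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Store data under its SHA-256 digest and return that digest."""
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        """Fetch data and check that it still matches its content hash."""
        data = self._blobs[digest]
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored data does not match its content hash")
        return data


class Ledger:
    """Toy append-only ledger that records content hashes, not payloads."""

    def __init__(self):
        self.blocks = []

    def append(self, content_hash: str) -> dict:
        prev_hash = self.blocks[-1]["block_hash"] if self.blocks else "0" * 64
        body = {
            "index": len(self.blocks),
            "prev_hash": prev_hash,
            "content_hash": content_hash,
        }
        # Hash a canonical JSON encoding so the block hash is reproducible.
        encoded = json.dumps(body, sort_keys=True).encode()
        block = {**body, "block_hash": hashlib.sha256(encoded).hexdigest()}
        self.blocks.append(block)
        return block


store, ledger = OffChainStore(), Ledger()
payload = b"x" * 1_000_000  # hypothetical large record
first = ledger.append(store.put(payload))
second = ledger.append(store.put(b"another record"))
```

Each ledger entry holds a fixed-length digest regardless of payload size, which is the intuition behind the ledger size reduction the abstract reports. Anyone holding the digest can verify that the retrieved data is unmodified.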

A Survey of Cloud-Based Services Leveraged by Big Data Applications

Web Services ◽

10.4018/978-1-5225-7501-6.ch088 ◽

2019 ◽

pp. 1706-1716

Author(s):

S. ZerAfshan Goher ◽

Barkha Javed ◽

Peter Bloodsworth

Keyword(s):

Big Data ◽

Data Storage ◽

Data Analytics ◽

Large Scale ◽

Future Trends ◽

Advantages And Disadvantages ◽

Large Scale Data ◽

Big Data Applications ◽

Big Data Storage ◽

Scale Data

Due to the growing interest in harnessing the hidden significance of data, more and more enterprises are moving to data analytics. Data analytics requires the analysis and management of large-scale data to find hidden patterns among various data components and gain useful insight. The derived information is then used to predict future trends that can help a business flourish, such as customers' likes and dislikes, the reasons behind customer churn, and more. In this paper, several techniques for big data analysis are investigated along with their advantages and disadvantages. The significance of cloud computing for big data storage is also discussed. Finally, techniques for robust and efficient usage of big data are discussed.
