Design and Implementation of Efficient Storage and Retrieval Technology of Traffic Big Data

2019 ◽  
Vol 4 (2) ◽  
pp. 207-220
Author(s):  
김기수 ◽  
Yukun Hahm ◽  
장유림 ◽  
Jaejin Yi ◽  
HONGHOI KIM
2019 ◽  
Vol 35 (4) ◽  
pp. 893-903 ◽  
Author(s):  
Seemu Sharma ◽  
Seema Bawa

Abstract: Cultural data and information on the web are continuously growing, evolving, and taking the form of big data as a result of globalization, digitization, and wide exploration, with the general public increasingly recognizing the importance of ancient values. Integrating this data into big data repositories is therefore essential before it becomes unwieldy and too complex to manage. This article analyzes the complexity of the growing cultural data and presents a Cultural Big Data Repository as an efficient way to store and retrieve cultural big data. The repository is highly scalable and provides integrated high-performance methods for big data analytics in cultural heritage. Experimental results demonstrate that the proposed repository performs better in terms of space, as well as storage and retrieval time, for cultural big data.


10.28945/2192 ◽  
2015 ◽  
Author(s):  
Rogério Rossi ◽  
Kechi Hirama

[The final form of this paper was published in the journal Issues in Informing Science and Information Technology.] Big data is a reality for an increasing number of organizations in many areas, and its management poses a set of challenges involving big data modeling, storage and retrieval, analysis, and visualization. Technological resources, people, and processes are crucial dimensions for managing big data in any organization, allowing information and knowledge drawn from a large volume of data to support decision-making. This article therefore discusses these three dimensions: the technologies for storage, analysis, and visualization of big data; the human aspects of big data; and the process management involved in a technological and business approach to big data management.


2018 ◽  
Vol 18 (03) ◽  
pp. e23 ◽  
Author(s):  
María José Basgall ◽  
Waldo Hasperué ◽  
Marcelo Naiouf ◽  
Alberto Fernández ◽  
Francisco Herrera

The volume of data in today's applications has changed the way Machine Learning problems are addressed. Indeed, the Big Data scenario imposes scalability constraints that can only be met through intelligent model design and the use of distributed technologies. In this context, solutions based on the Spark platform have established themselves as a de facto standard. In this contribution, we focus on a very important topic within Big Data Analytics, namely classification with imbalanced datasets. The main characteristic of this problem is that one of the classes is underrepresented, so it is usually harder to find a model that identifies it correctly. For this reason, it is common to apply preprocessing techniques such as oversampling to balance the distribution of examples across classes. In this work we present SMOTE-BD, a fully scalable preprocessing approach for imbalanced classification in Big Data. It is based on one of the most widespread preprocessing solutions for imbalanced classification, the SMOTE algorithm, which creates new synthetic instances according to the neighborhood of each example of the minority class. Our novel development is designed to be independent of the number of partitions or processes created, in order to achieve a higher degree of efficiency. Experiments conducted on different standard and Big Data datasets show the quality of the proposed design and implementation.
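
As a rough illustration of the neighborhood-based interpolation that the abstract attributes to SMOTE, the following is a minimal single-machine sketch in plain Python. It is not the SMOTE-BD implementation and does not use Spark; the function name `smote`, its parameters (`n_per_sample`, `k`, `seed`), and the brute-force neighbor search are assumptions made only for this example.

```python
import math
import random


def smote(minority, n_per_sample=1, k=5, seed=0):
    """Illustrative SMOTE-style oversampling (hypothetical helper, not SMOTE-BD).

    For every minority sample, create n_per_sample synthetic points by
    interpolating toward one of its k nearest minority-class neighbors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    # Brute-force k-nearest-neighbor search within the minority class
    # (Euclidean distance); fine for a sketch, not for large datasets.
    neighbors = []
    for i, point in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(point, minority[j]))
        neighbors.append(others[:k])

    synthetic = []
    for i, base in enumerate(minority):
        for _ in range(n_per_sample):
            neighbor = minority[rng.choice(neighbors[i])]
            gap = rng.random()  # interpolation factor in [0, 1)
            # New point lies on the segment between base and neighbor.
            synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


if __name__ == "__main__":
    minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in smote(minority_class, n_per_sample=2, k=2, seed=42):
        print(row)
```

Because every synthetic point lies on a segment between two existing minority examples, the new instances stay inside the region the minority class already occupies. A distributed variant such as the one the abstract describes would additionally have to handle neighbor search across partitions, which this sketch does not attempt.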

