Extracting-Transforming-Loading Modeling Approach for Big Data Analytics

2016 ◽  
Vol 8 (4) ◽  
pp. 50-69 ◽  
Author(s):  
Mahfoud Bala ◽  
Omar Boussaid ◽  
Zaia Alimazighi

Due to their widespread use, the Internet, Web 2.0 and digital sensors create data in non-traditional volumes (at terabyte and petabyte scale). Big data, characterized by the four V's, has brought new challenges given the limited capabilities of traditional computing systems. This paper aims to provide solutions that can cope with very large data in Decision-Support Systems (DSSs). Specifically, for the data integration phase, the authors propose a conceptual modeling approach for parallel and distributed Extracting-Transforming-Loading (ETL) processes. Among the complexity dimensions of big data, this study focuses on the volume of data in order to ensure good performance for ETL processes. The authors' approach makes it possible to anticipate parallelization and distribution issues at an early stage of Data Warehouse (DW) projects. They have implemented an ETL platform called Parallel-ETL (P-ETL for short) and conducted some experiments. Their performance analysis reveals that the proposed approach speeds up ETL processes by up to 33%, with the improvement rate being linear.
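The idea of a parallel ETL process summarized above can be illustrated with a minimal sketch. It is not the authors' P-ETL platform: the function names, the record layout and the thread-based worker pool are all assumptions made for illustration, and a real deployment would more likely distribute partitions across processes or cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def extract(rows, n_partitions):
    """Split the source rows into n_partitions chunks (round-robin)."""
    return [rows[i::n_partitions] for i in range(n_partitions)]


def transform(partition):
    """Hypothetical transformation: tidy names, convert amounts to integer cents."""
    return [
        {"name": r["name"].strip().title(), "cents": round(r["amount"] * 100)}
        for r in partition
    ]


def load(partitions):
    """Merge transformed partitions into one list (stand-in for a warehouse table)."""
    return [row for part in partitions for row in part]


def run_parallel_etl(rows, n_partitions=4):
    """Run extract, then transform each partition concurrently, then load."""
    parts = extract(rows, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        transformed = list(pool.map(transform, parts))
    return load(transformed)


# Illustrative data only; not taken from the paper.
source = [
    {"name": " alice ", "amount": 12.5},
    {"name": "BOB", "amount": 3.0},
    {"name": "carol", "amount": 7.25},
]
warehouse = run_parallel_etl(source, n_partitions=2)
```

Because each partition is transformed independently, the number of partitions is the main tuning knob in this sketch. Note that the output order follows the partitioning rather than the source order.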

2021 ◽  
Vol 13 ◽  
pp. 175628722199813
Author(s):  
B. M. Zeeshan Hameed ◽  
Aiswarya V. L. S. Dhavileswarapu ◽  
Nithesh Naik ◽  
Hadis Karimi ◽  
Padmaraj Hegde ◽  
...  

Artificial intelligence (AI) has a proven record of application in the field of medicine and is used in various areas of urology such as oncology, urolithiasis, paediatric urology, urogynaecology, infertility and reconstruction. Data is the driving force of AI, and the past decades have undoubtedly witnessed an upsurge in healthcare data. Urology is a specialty that has always been at the forefront of innovation and research and has rapidly embraced technologies to improve patient outcomes and experience. Advancements made in Big Data Analytics have raised expectations about the future of urology. This review aims to investigate the role of big data and its blend with AI in urology, including current trends and uses. We explore the different sources of big data in urology and explicate their current and future applications. The advent and implementation of AI in urology, with data available from several databases, has shown a positive trend. The extensive use of big data for the diagnosis and treatment of urological disorders is still in its early stage and under validation. In the future, however, big data will no doubt play a major role in the management of urological conditions.


2017 ◽  
pp. 83-99
Author(s):  
Sivamathi Chokkalingam ◽  
Vijayarani S.

The term Big Data refers to large-scale information management and analysis technologies that exceed the capability of traditional data processing technologies. Big Data is differentiated from traditional technologies in three ways: volume, velocity and variety of data. Big data analytics is the process of analyzing large data sets that contain a variety of data types to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. Since Big Data is a newly emerging field, there is a need to develop new technologies and algorithms for handling big data. The main objective of this paper is to provide knowledge about various research challenges of Big Data analytics. The paper gives a brief overview of the various types of Big Data analytics. For each type of analytics, it describes the process steps and tools and gives a banking application. Some research challenges of big data analytics, and possible solutions to them, are also discussed.


2019 ◽  
Vol 8 (S3) ◽  
pp. 35-40
Author(s):  
S. Mamatha ◽  
T. Sudha

In this digital world, as organizations evolve rapidly around data-centric assets, the volume of data and the size of databases have been growing exponentially. Data is generated from different sources such as business processes, transactions, social networking sites and web servers, and exists in structured as well as unstructured form. The term "Big data" is used for large data sets whose size is beyond the ability of commonly used software tools to capture, manage and process within a tolerable elapsed time. Big data varies in size, ranging from a few dozen terabytes to many petabytes in a single data set. Difficulties include capture, storage, search, sharing, analytics and visualization. Big data is available in structured, unstructured and semi-structured formats, and relational databases fail to store this multi-structured data. Apache Hadoop is an efficient, robust, reliable and scalable framework to store, process, transform and extract big data. The Hadoop framework is open-source, free software available from the Apache Software Foundation. In this paper we present Hadoop, HDFS, MapReduce and a c-means big data algorithm to minimize the effort of big data analysis using MapReduce code. The objective of this paper is to summarize the state-of-the-art efforts in clinical big data analytics and highlight what might be needed to enhance the outcomes of clinical big data analytics tools and related fields.
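The MapReduce programming model mentioned above can be sketched in a few lines of plain Python. This is an illustrative, single-machine word count only: it does not use Hadoop or HDFS, it is not the c-means algorithm the paper presents, and all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    """Run the map, shuffle and reduce steps in sequence on one machine."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


counts = map_reduce(["big data", "Big Data analytics", "data"])
```

In a real framework the map and reduce calls run on different nodes and the shuffle moves data across the network; the sketch only shows how the three steps fit together.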


Due to technological improvements in the healthcare industry and clinical medicine, new software techniques and tools need to be adopted to predict, diagnose and analyze disease patterns so that decisions can be made at an early stage of disease. Parkinson's disease (PD) is a neurodegenerative disorder. PD damages motor skills, may cause speech problems and can also affect the decision-making process. Many people all over the world have suffered from PD for many years. The volume of PD data increases day by day, so existing data mining predictive methods and tools do not give accurate results early enough for doctors to make decisions that save and extend patients' lives. Early PD symptoms can be detected by Big Data Analytics, so that proper medicine can be provided at the right time. In this paper, we survey predictive methods and Big Data Analytics techniques, and also present the results of earlier researchers.


Author(s):  
Ezz El-Din Hemdan ◽  
Manjaiah D. H.

Big Data Analytics has become an important paradigm that can help digital investigators to investigate cybercrimes as well as provide solutions for malware and threat prediction, detection and prevention at an early stage. Big Data Analytics techniques can be used to analyze the enormous amount of data generated by new technologies such as Social Networks, Cloud Computing and the Internet of Things, in order to understand committed crimes and to predict severe attacks and crimes in the future. This chapter introduces the principles of Digital Forensics and Big Data and explores the benefits of Big Data Analytics and Deep Learning. These can help digital investigators to develop and propose new techniques and methods, based on Big Data Analytics using Deep Learning, that can be adapted to the unique context of Digital Forensics and that support performing the digital investigation process in a forensically sound and timely manner.


Author(s):  
Mamata Rath

Big data analytics is a refined approach to combining large data sets that include a collection of data elements in order to expose hidden patterns, undetected associations, business logic, client inclinations and other helpful business information. Big data analytics involves challenging techniques to mine and extract relevant data, including accessing a database, effectively mining the data, and querying and inspecting data in order to enhance the technical execution of various task segments. The capacity to synthesize a lot of data can enable an organization to manage significant data that can influence the business. In this way, the primary goal of big data analytics is to help businesses gain an enhanced comprehension of data and, subsequently, make proficient and informed decisions.


Author(s):  
Pushpa Mannava

Data mining is considered a vital procedure, as it is used for locating new, legitimate, useful and reasonable kinds of data. The integration of data mining methods in cloud computing gives a versatile and scalable design that can be used for reliable mining of significant quantities of data from virtually integrated data sources, with the goal of producing information that is useful in decision making. The procedure of extracting hidden, useful patterns and information from big data is called big data analytics. This is done by applying advanced analytics techniques to large data collections. This paper provides information about big data analytics in intra-data center networks, the components of data mining and the techniques of data mining.

