Big Data Mining and Knowledge Discovery

Author(s):  
Tomas Ruzgas ◽  
Kristina Jakubėlienė ◽  
Aistė Buivytė

The article deals with exploration methods and tools for big data. It identifies the challenges encountered in the analysis of big data, defines the notion of big data, and describes the technology for big data analysis. The article also provides an overview of tools designed for big data analytics.

Author(s):  
Cerene Mariam Abraham ◽  
Mannathazhathu Sudheep Elayidom ◽  
Thankappan Santhanakrishnan

Background: Machine learning is one of the most popular research areas today. It relates closely to the field of data mining, which extracts information and trends from large datasets. Aims: The objective of this paper is to (a) illustrate big data analytics for the Indian derivative market and (b) identify trends in the data. Methods: Based on input from experts in the equity domain, the data are verified statistically using data mining techniques. Specifically, ten years of daily derivative data are used for training and testing purposes. The methods adopted for this research include model generation using ARIMA and the Hadoop framework, which comprises map and reduce steps for big data analysis. Results: The results of this work are the observation of a trend that indicates the rise and fall of price in derivatives, the generation of a time-series similarity graph, and the plotting of the frequency of temporal data. Conclusion: Big data analytics is an underexplored topic in the Indian derivative market, and the results from this paper can be used by investors to earn both short-term and long-term benefits.
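The map and reduce steps mentioned in this abstract can be illustrated with a minimal single-machine sketch in Python. It is not the authors' Hadoop job: the record layout `(date, previous_close, close)`, the function names `map_phase` and `reduce_phase`, and the sample prices are all assumptions made for illustration. The sketch counts, per month, how many trading days closed higher or lower, which is one simple way to tabulate the frequency of rises and falls in temporal data.

```python
from collections import defaultdict

def map_phase(records):
    """Map step: emit ((year-month, direction), 1) for each daily record.

    Each record is a hypothetical (date, previous_close, close) tuple.
    """
    for date, prev_close, close in records:
        if close > prev_close:
            direction = "rise"
        elif close < prev_close:
            direction = "fall"
        else:
            direction = "flat"
        yield (date[:7], direction), 1

def reduce_phase(pairs):
    """Reduce step: sum the emitted values for each key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

# Made-up sample data, not taken from the paper.
records = [
    ("2015-01-02", 100.0, 101.5),
    ("2015-01-05", 101.5, 100.2),
    ("2015-01-06", 100.2, 102.0),
    ("2015-02-02", 102.0, 101.0),
]
counts = reduce_phase(map_phase(records))
for key in sorted(counts):
    print(key, counts[key])
```

In a real Hadoop deployment the framework would shuffle the emitted pairs across machines between the two steps; here the generator output is passed directly to the reducer.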


2018 ◽  
Vol 20 (1) ◽  
Author(s):  
Tiko Iyamu

Background: Over the years, big data analytics has been statically carried out in a programmed way, which does not allow for translation of data sets from a subjective perspective. This approach affects an understanding of why and how data sets manifest themselves into various forms in the way that they do. This has a negative impact on the accuracy, redundancy and usefulness of data sets, which in turn affects the value of operations and the competitive effectiveness of an organisation. Also, the current single approach lacks a detailed examination of data sets, which big data deserve in order to improve purposefulness and usefulness. Objective: The purpose of this study was to propose a multilevel approach to big data analysis. This includes examining how a sociotechnical theory, the actor network theory (ANT), can be complementarily used with analytic tools for big data analysis. Method: In the study, the qualitative methods were employed from the interpretivist approach perspective. Results: From the findings, a framework that offers big data analytics at two levels, micro- (strategic) and macro- (operational) levels, was developed. Based on the framework, a model was developed, which can be used to guide the analysis of heterogeneous data sets that exist within networks. Conclusion: The multilevel approach ensures a fully detailed analysis, which is intended to increase accuracy, reduce redundancy and put the manipulation and manifestation of data sets into perspectives for improved organisations’ competitiveness.


2019 ◽  
Vol 26 (2) ◽  
pp. 981-998 ◽  
Author(s):  
Kenneth David Strang ◽  
Zhaohao Sun

The goal of the study was to identify big data analysis issues that can impact empirical research in the healthcare industry. To accomplish that, the author analyzed big data-related keywords from a literature review of peer-reviewed journal articles published since 2011. Topics, methods and techniques were summarized along with their strengths and weaknesses. A panel of subject matter experts was interviewed to validate the intermediate results and synthesize the key problems that would likely impact researchers conducting quantitative big data analysis in healthcare studies. The systems thinking action research method was applied to identify and describe the hidden issues. The findings were similar to the extant literature, but three hidden fatal issues were detected. Methodical and statistical control solutions were proposed to overcome the three fatal healthcare big data analysis issues.


2014 ◽  
Vol 484-485 ◽  
pp. 922-926
Author(s):  
Xiang Ju Liu

This paper introduces the operational characteristics and current challenges of the big data era, and presents the research and design of a big data analytics platform based on cloud computing, including the platform's system architecture, software architecture, network architecture, and the features of a unified big data analysis solution. The paper also analyzes the competitive advantages of a unified cloud-based big data analysis solution and the role it may play in the future business development of telecom operators.


2020 ◽  
Author(s):  
Elham Nazari ◽  
Maryam Edalati Khodabandeh ◽  
Ali Dadashi ◽  
Marjan Rasoulian ◽  
Hamed Tabesh

Abstract. Introduction: Today, with the advent of new technologies and the production of huge amounts of data, Big Data analytics has received much attention, especially in healthcare. Understanding this field and recognizing its benefits, applications and challenges provide useful background for conducting efficient research. Therefore, the purpose of this study was to evaluate the familiarity of students from different universities in Mashhad with the benefits, applications and challenges of Big Data analysis. Method: This is a cross-sectional study that was conducted on students of Medical Engineering, Medical Informatics, Medical Records and Health Information Management in Mashhad, Iran. A questionnaire was designed based on a literature review in the PubMed, Google Scholar, ScienceDirect and EMBASE databases, using the Delphi method with 10 experts from different fields of study. The questionnaire evaluated the opinions of students regarding the benefits, challenges and applications of Big Data analytics. 200 students participated in the study and completed the questionnaire. Participants' opinions were evaluated descriptively and analytically. Result: Most students were between 20 and 30 years old. 63% of them were male and 43.5% had no work experience. The current and previous fields of study of most of the students were HIT, HIM, and Medical Records. Most of the participants in this study were undergraduates. 61.5% were economically active and 54.5% had been exposed to Big Data. The mean scores of participants in the benefits, applications, and challenges sections were 3.71, 3.68, and 3.71, respectively. Process management differed significantly between age groups (p=0.046); information, modeling, research, and health informatics differed significantly across fields of study (p=0.015, 0.033, 0.001, 0.024); information and research were significantly different between groups (p=0.043 and 0.019); research differed significantly between groups with and without economic activity (p=0.017); and information differed significantly between groups with and without exposure to Big Data (p=0.02). Conclusion: Despite the importance and benefits of Big Data analytics, students' lack of familiarity with the necessity and importance of these analytics in industries and research is significant. The field of study and level of study do not appear to have an effect on the degree of knowledge of individuals regarding Big Data analysis. The design of technical training courses in this field may increase the level of knowledge of individuals regarding Big Data analysis.


This chapter explores the intersection of cloud computing with big data. Big data analysis, mining, and privacy concerns are discussed. First, the chapter deals with MapReduce™, the software framework commonly used for performing Big Data analysis in the cloud. In addition, some of the most widely used techniques for Big Data mining are detailed: clustering, co-clustering, and association rules. In particular, the k-center problem is described, while for association rules, beyond the basic definitions, the Apriori algorithm is outlined and illustrated with numerical examples. These techniques are also described with reference to their MapReduce-based versions. Finally, a description of some real applications concludes the chapter.
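As an illustration of the Apriori algorithm named in this abstract, the following is a minimal in-memory sketch in Python. It is not taken from the chapter: the function name `apriori`, the use of an absolute support count as `min_support`, and the sample transactions are assumptions. The sketch alternates the two classic steps, counting the support of candidate itemsets and then joining and pruning frequent k-itemsets to form candidates of size k+1.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    min_support is an absolute count of transactions (an assumption;
    many presentations use a fraction instead).
    """
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates are the individual items.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count the transactions that contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every k-item subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

# Made-up sample transactions for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = apriori(transactions, min_support=2)
for itemset in sorted(result, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset), result[itemset])
```

In a MapReduce setting, the counting step is the natural part to distribute: each mapper could emit a count of one for every candidate found in a transaction, and reducers could sum the counts per candidate. The sketch above keeps everything in one process for clarity.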


Author(s):  
Adeel Shiraz Hashmi ◽  
Tanvir Ahmad

We are now in the Big Data era, and there is a growing demand for tools that can process and analyze such data. Big data analytics deals with extracting valuable information from complex data that cannot be handled by traditional data mining tools. This paper surveys the available tools that can handle large volumes of data as well as evolving data streams. The data mining tools and algorithms that can handle big data are also summarized, and one of the tools is used to mine large datasets using distributed algorithms.


Author(s):  
В.Т. Чая ◽  
Н.И. Чупахина

With the development of digital economy technologies, the volume of digitized information is growing exponentially. But information has value only if it is analyzed in a certain way, and large volumes of information cannot be analyzed with conventional methods. This is the domain of big data and big data technologies. The article describes the features of big data and reviews methods and tools for big data analysis. Machine learning is examined in detail as a method for solving problems based on big data.

