BIG: a large-scale data integration tool for renal physiology

Continued technological advancements of the 21st Century afford massive data generation in sectors of our economy to include the domains of agriculture, manufacturing, and education. However, harnessing such large-scale data, using modern technologies for effective decision-making appears to be an evolving science that requires knowledge of Big Data management and analytics. Big data in agriculture, manufacturing, and education are varied such as voluminous text, images, and graphs. Applying Big data science techniques (e.g., functional algorithms) for extracting intelligence data affords decision markers quick response to productivity, market resilience, and student enrollment challenges in today's unpredictable markets. This chapter serves to employ data science for potential solutions to Big Data applications in the sectors of agriculture, manufacturing and education to a lesser extent, using modern technological tools such as Hadoop, Hive, Sqoop, and MongoDB.

Download Full-text

Affordances of Data Science in Agriculture, Manufacturing, and Education

Privacy and Security Policies in Big Data - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-5225-2486-1.ch002 ◽

2017 ◽

pp. 14-40 ◽

Cited By ~ 2

Author(s):

Krishnan Umachandran ◽

Debra Sharon Ferdinand-James

Keyword(s):

Big Data ◽

Large Scale ◽

Data Science ◽

Data Generation ◽

Large Scale Data ◽

Big Data Applications ◽

Effective Decision ◽

Effective Decision Making ◽

Text Images ◽

Scale Data

Continued technological advancements of the 21st Century afford massive data generation in sectors of our economy to include the domains of agriculture, manufacturing, and education. However, harnessing such large-scale data, using modern technologies for effective decision-making appears to be an evolving science that requires knowledge of Big Data management and analytics. Big data in agriculture, manufacturing, and education are varied such as voluminous text, images, and graphs. Applying Big data science techniques (e.g., functional algorithms) for extracting intelligence data affords decision markers quick response to productivity, market resilience, and student enrollment challenges in today's unpredictable markets. This chapter serves to employ data science for potential solutions to Big Data applications in the sectors of agriculture, manufacturing and education to a lesser extent, using modern technological tools such as Hadoop, Hive, Sqoop, and MongoDB.

Download Full-text

Neural Network for Big Data Sets

10.4018/978-1-6684-2408-7.ch003 ◽

2022 ◽

pp. 41-67

Author(s):

Vo Ngoc Phu ◽

Vo Thi Ngoc Tran

Keyword(s):

Neural Network ◽

Big Data ◽

Computer Science ◽

Large Scale ◽

Massive Data ◽

Data Sets ◽

Massive Data Sets ◽

Large Scale Data ◽

Commercial Applications ◽

Novel Model

Machine learning (ML), neural network (NN), evolutionary algorithm (EA), fuzzy systems (FSs), as well as computer science have been very famous and very significant for many years. They have been applied to many different areas. They have contributed much to developments of many large-scale corporations, massive organizations, etc. Lots of information and massive data sets (MDSs) have been generated from these big corporations, organizations, etc. These big data sets (BDSs) have been the challenges of many commercial applications, researches, etc. Therefore, there have been many algorithms of the ML, the NN, the EA, the FSs, as well as computer science which have been developed to handle these massive data sets successfully. To support for this process, the authors have displayed all the possible algorithms of the NN for the large-scale data sets (LSDSs) successfully in this chapter. Finally, they have presented a novel model of the NN for the BDS in a sequential environment (SE) and a distributed network environment (DNE).

Download Full-text

On the Effectiveness of Hybrid Canopy with Hoeffding Adaptive Naive Bayes Trees

International Journal of Applied Evolutionary Computation ◽

10.4018/ijaec.2017040102 ◽

2017 ◽

Vol 8 (2) ◽

pp. 30-43

Author(s):

Mrutyunjaya Panda

Keyword(s):

Big Data ◽

Clustering Analysis ◽

Large Scale ◽

Data Sets ◽

Recent Past ◽

Large Scale Data ◽

Huge Data ◽

With Memory ◽

Memory Constraints ◽

Scale Data

The Big Data, due to its complicated and diverse nature, poses a lot of challenges for extracting meaningful observations. This sought smart and efficient algorithms that can deal with computational complexity along with memory constraints out of their iterative behavior. This issue may be solved by using parallel computing techniques, where a single machine or a multiple machine can perform the work simultaneously, dividing the problem into sub problems and assigning some private memory to each sub problems. Clustering analysis are found to be useful in handling such a huge data in the recent past. Even though, there are many investigations in Big data analysis are on, still, to solve this issue, Canopy and K-Means++ clustering are used for processing the large-scale data in shorter amount of time with no memory constraints. In order to find the suitability of the approach, several data sets are considered ranging from small to very large ones having diverse filed of applications. The experimental results opine that the proposed approach is fast and accurate.

Download Full-text

Influencing Factors of e-Commerce Enterprise Development Based on Mobile Computing Big Data Analysis

Wireless Communications and Mobile Computing ◽

10.1155/2021/8750111 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Yixue Zhu ◽

Boyue Chai

Keyword(s):

Big Data ◽

Data Analysis ◽

Large Scale ◽

Big Data Analysis ◽

Support Vector ◽

Data Sets ◽

Large Scale Data ◽

Vector Machines ◽

Physical Information ◽

Scale Data

With the development of increasingly advanced information technology and electronic technology, especially with regard to physical information systems, cloud computing systems, and social services, big data will be widely visible, creating benefits for people and at the same time facing huge challenges. In addition, with the advent of the era of big data, the scale of data sets is getting larger and larger. Traditional data analysis methods can no longer solve the problem of large-scale data sets, and the hidden information behind big data is digging out, especially in the field of e-commerce. We have become a key factor in competition among enterprises. We use a support vector machine method based on parallel computing to analyze the data. First, the training samples are divided into several working subsets through the SOM self-organizing neural network classification method. Compared with the ever-increasing progress of information technology and electronic equipment, especially the related physical information system finally merges the training results of each working set, so as to quickly deal with the problem of massive data prediction and analysis. This paper proposes that big data has the flexibility of expansion and quality assessment system, so it is meaningful to replace the double-sidedness of quality assessment with big data. Finally, considering the excellent performance of parallel support vector machines in data mining and analysis, we apply this method to the big data analysis of e-commerce. The research results show that parallel support vector machines can solve the problem of processing large-scale data sets. The emergence of data dirty problems has increased the effective rate by at least 70%.

Download Full-text

Analysis of Economic Development Trend in Postepidemic Era Based on Improved Clustering Algorithm

Scientific Programming ◽

10.1155/2021/4467001 ◽

2021 ◽

Vol 2021 ◽

pp. 1-14

Author(s):

Li Guo ◽

Kunlin Zhu ◽

Ruijun Duan

Keyword(s):

Economic Development ◽

Large Scale ◽

Clustering Algorithm ◽

Development Trend ◽

Data Sets ◽

Analysis Model ◽

Data Set ◽

Clustering Problem ◽

Large Scale Data ◽

Compressed Data

In order to explore the economic development trend in the postepidemic era, this paper improves the traditional clustering algorithm and constructs a postepidemic economic development trend analysis model based on intelligent algorithms. In order to solve the clustering problem of large-scale nonuniform density data sets, this paper proposes an adaptive nonuniform density clustering algorithm based on balanced iterative reduction and uses the algorithm to further cluster the compressed data sets. For large-scale data sets, the clustering results can accurately reflect the class characteristics of the data set as a whole. Moreover, the algorithm greatly improves the time efficiency of clustering. From the research results, we can see that the improved clustering algorithm has a certain effect on the analysis of economic development trends in the postepidemic era and can continue to play a role in subsequent economic analysis.

Download Full-text

Big Data Analysis with Hadoop on Personalized Incentive Model with Statistical Hotel Customer Data

International Journal of Software Innovation ◽

10.4018/ijsi.2016070101 ◽

2016 ◽

Vol 4 (3) ◽

pp. 1-21 ◽

Cited By ~ 4

Author(s):

Sungchul Lee ◽

Eunmin Hwang ◽

Ju-Yeon Jo ◽

Yoohwan Kim

Keyword(s):

Big Data ◽

Data Analysis ◽

Large Scale ◽

Big Data Analysis ◽

Data Sets ◽

New Approach ◽

Customer Data ◽

Customer Information ◽

Large Scale Data ◽

Incentive Model

Due to the advancement of Information Technology (IT), the hospitality industry is seeing a great value in gathering various kinds of and a large amount of customers' data. However, many hotels are facing a challenge in analyzing customer data and using it as an effective tool to understand the hospitality customers better and, ultimately, to increase the revenue. The authors' research attempts to resolve the current challenges of analyzing customer data in hospitality by utilizing the big data analysis tools, especially Hadoop and R. Hadoop is a framework for processing large-scale data. With the integration of new approach, their study demonstrates the ways of aggregating and analyzing the hospitality customer data to find meaningful customer information. Multiple decision trees are constructed from the customer data sets with the intention of classifying customers' needs and customers' clusters. By analyzing the customer data, the study suggests three strategies to increase the total expenditure of the customers within a limited amount of time during their stay.

Download Full-text

Neurosharing: large-scale data sets (spike, LFP) recorded from the hippocampal-entorhinal system in behaving rats

F1000Research ◽

10.12688/f1000research.3895.1 ◽

2014 ◽

Vol 3 ◽

pp. 98 ◽

Cited By ~ 28

Author(s):

Kenji Mizuseki ◽

Kamran Diba ◽

Eva Pastalkova ◽

Jeff Teeters ◽

Anton Sirota ◽

...

Keyword(s):

Hippocampal Neurons ◽

Large Scale ◽

Dorsal Hippocampus ◽

Field Potential ◽

Data Sets ◽

Data Set ◽

Large Scale Data ◽

Silicon Based ◽

Spike Waveforms ◽

Wave Shapes

Using silicon-based recording electrodes, we recorded neuronal activity of the dorsal hippocampus and dorsomedial entorhinal cortex from behaving rats. The entorhinal neurons were classified as principal neurons and interneurons based on monosynaptic interactions and wave-shapes. The hippocampal neurons were classified as principal neurons and interneurons based on monosynaptic interactions, wave-shapes and burstiness. The data set contains recordings from 7,736 neurons (6,100 classified as principal neurons, 1,132 as interneurons, and 504 cells that did not clearly fit into either category) obtained during 442 recording sessions from 11 rats (a total of 204.5 hours) while they were engaged in one of eight different behaviours/tasks. Both original and processed data (time stamp of spikes, spike waveforms, result of spike sorting and local field potential) are included, along with metadata of behavioural markers. Community-driven data sharing may offer cross-validation of findings, refinement of interpretations and facilitate discoveries.

Download Full-text

GeoBoost: An Incremental Deep Learning Approach toward Global Mapping of Buildings from VHR Remote Sensing Images

Remote Sensing ◽

10.3390/rs12111794 ◽

2020 ◽

Vol 12 (11) ◽

pp. 1794

Author(s):

Naisen Yang ◽

Hong Tang

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Large Scale ◽

Semantic Segmentation ◽

Data Sets ◽

Data Set ◽

Large Scale Data ◽

Learning Tasks ◽

Global Mapping ◽

Scale Data

Modern convolutional neural networks (CNNs) are often trained on pre-set data sets with a fixed size. As for the large-scale applications of satellite images, for example, global or regional mappings, these images are collected incrementally by multiple stages in general. In other words, the sizes of training datasets might be increased for the tasks of mapping rather than be fixed beforehand. In this paper, we present a novel algorithm, called GeoBoost, for the incremental-learning tasks of semantic segmentation via convolutional neural networks. Specifically, the GeoBoost algorithm is trained in an end-to-end manner on the newly available data, and it does not decrease the performance of previously trained models. The effectiveness of the GeoBoost algorithm is verified on the large-scale data set of DREAM-B. This method avoids the need for training on the enlarged data set from scratch and would become more effective along with more available data.

Download Full-text

Explanations as Discourse

Australasian Journal of Information Systems ◽

10.3127/ajis.v24i0.2519 ◽

2020 ◽

Vol 24 ◽

Author(s):

Sadaf Afrashteh ◽

Ida Someh ◽

Michael Davern

Keyword(s):

Decision Making ◽

Big Data ◽

Data Analytics ◽

Large Scale ◽

Big Data Analytics ◽

Customer Engagement ◽

Data Sets ◽

Research Directions ◽

Large Scale Data ◽

Scale Data

Big data analytics uses algorithms for decision-making and targeting of customers. These algorithms process large-scale data sets and create efficiencies in the decision-making process for organizations but are often incomprehensible to customers and inherently opaque in nature. Recent European Union regulations require that organizations communicate meaningful information to customers on the use of algorithms and the reasons behind decisions made about them. In this paper, we explore the use of explanations in big data analytics services. We rely on discourse ethics to argue that explanations can facilitate a balanced communication between organizations and customers, leading to transparency and trust for customers as well as customer engagement and reduced reputation risks for organizations. We conclude the paper by proposing future empirical research directions.

Download Full-text