massive data processing
Recently Published Documents


TOTAL DOCUMENTS

64
(FIVE YEARS 20)

H-INDEX

4
(FIVE YEARS 2)

2021 ◽  
Author(s):  
QIN Jun ◽  
SONG Yanyan ◽  
ZONG Ping

With the rapid development and popularization of information technology, cloud computing provides a good environment for massive data processing. Hadoop is an open-source implementation of MapReduce and can process large amounts of data. To address the shortcomings of the fault-tolerance mechanism in the MapReduce programming model, this paper proposes a reliability-aware task scheduling strategy that introduces a failure recovery mechanism: it evaluates the trustworthiness of resource nodes in the cloud environment, establishes a trustworthiness model, and avoids assigning tasks to low-reliability nodes, which would otherwise force re-execution and waste time and resources. Finally, simulations on the CloudSim platform verify the validity and stability of the proposed task scheduling algorithm and scheduling model.
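As a rough illustration of the idea (not the paper's actual trust model), the sketch below estimates each node's trust from its success/failure history and dispatches tasks only to nodes that clear a threshold; the smoothing rule, threshold value, and node names are all assumptions:

```python
# Illustrative reliability-aware scheduling: each resource node carries a
# trust score derived from its task history; the scheduler skips
# low-reliability nodes that would force re-execution.

class Node:
    def __init__(self, name):
        self.name = name
        self.successes = 0
        self.failures = 0

    @property
    def trust(self):
        # Laplace-smoothed success ratio as a simple trust estimate
        # (an assumed update rule, not the paper's model).
        return (self.successes + 1) / (self.successes + self.failures + 2)

    def record(self, ok):
        if ok:
            self.successes += 1
        else:
            self.failures += 1


def schedule(nodes, threshold=0.5):
    # Dispatch to the most trusted node above the threshold; return None
    # if no node is reliable enough.
    candidates = [n for n in nodes if n.trust >= threshold]
    return max(candidates, key=lambda n: n.trust) if candidates else None


nodes = [Node("n1"), Node("n2")]
nodes[0].record(True); nodes[0].record(True)    # reliable node
nodes[1].record(False); nodes[1].record(False)  # repeatedly failing node
chosen = schedule(nodes)
```

With this history, `n1` has trust 0.75 and `n2` only 0.25, so the scheduler picks `n1` and never hands work to the failing node.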


2021 ◽  
Author(s):  
Mengxi Tan ◽  
Xingyuan Xu ◽  
David Moss

Abstract Optical artificial neural networks (ONNs) have significant potential for ultra-high computing speed and energy efficiency. We report a novel approach to ONNs that uses integrated Kerr optical micro-combs. This approach is programmable and scalable and is capable of reaching ultra-high speeds. We demonstrate the basic building block of ONNs, a single-neuron perceptron, by mapping synapses onto 49 wavelengths to achieve an operating speed of 11.9 × 10⁹ operations per second (Giga-OPS) at 8 bits per operation, which equates to 95.2 gigabits/s (Gbps). We test the perceptron on handwritten-digit recognition and cancer-cell detection, achieving over 90% and 85% accuracy, respectively. By scaling the perceptron to a deep learning network using off-the-shelf telecom technology, we can achieve high-throughput matrix multiplication for real-time massive data processing.
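The quoted figures are internally consistent, as a quick arithmetic check shows:

```python
# Check: 11.9 giga-operations per second at 8 bits per operation
# should equal the quoted 95.2 Gbps line rate.
ops_per_second = 11.9e9
bits_per_op = 8
throughput_gbps = ops_per_second * bits_per_op / 1e9  # 95.2
```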


Author(s):  
Mengxi Tan ◽  
Xingyuan Xu ◽  
David Moss

Optical artificial neural networks (ONNs) have significant potential for ultra-high computing speed and energy efficiency. We report a novel approach to ONNs that uses integrated Kerr optical micro-combs. This approach is programmable and scalable and is capable of reaching ultra-high speeds. We demonstrate the basic building block of ONNs, a single-neuron perceptron, by mapping synapses onto 49 wavelengths to achieve an operating speed of 11.9 × 10⁹ operations per second (Giga-OPS) at 8 bits per operation, which equates to 95.2 gigabits/s (Gbps). We test the perceptron on handwritten-digit recognition and cancer-cell detection, achieving over 90% and 85% accuracy, respectively. By scaling the perceptron to a deep learning network using off-the-shelf telecom technology, we can achieve high-throughput matrix multiplication for real-time massive data processing.


2021 ◽  
Author(s):  
David Moss

Optical artificial neural networks (ONNs) have significant potential for ultra-high computing speed and energy efficiency. We report a new approach to ONNs based on integrated Kerr micro-combs that is programmable, highly scalable, and capable of reaching ultra-high speeds. We demonstrate the building block of the ONN, a single-neuron perceptron, by mapping synapses onto 49 wavelengths to achieve a single-unit throughput of 11.9 Giga-OPS at 8 bits per OP, or 95.2 Gbps. We test the perceptron on handwritten-digit recognition and cancer-cell detection, achieving over 90% and 85% accuracy, respectively. By scaling the perceptron to a deep learning network using off-the-shelf telecom technology, we can achieve high-throughput matrix multiplication for real-time massive data processing.


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Hua Shen ◽  
Mingwu Zhang ◽  
Hao Wang ◽  
Fuchun Guo ◽  
Willy Susilo

2021 ◽  
Vol 314 ◽  
pp. 06003
Author(s):  
Aniss Moumen ◽  
Hajar Slimani ◽  
Nezha Mejjad ◽  
Mohamed Ben-Daoud

Nowadays, big data technologies are becoming increasingly important in the modernization of organizations' information systems. Indeed, producers and users of water and climatology data deal with massive data processing daily, and these actors need new technology to overcome difficulties in data integration, processing, and visualization. This paper presents an exploratory study of the intention of water stakeholders in Morocco to use big data technology; we also present an exploratory review of technology acceptance model theory, a theoretical framework that explains the factors in users' adoption of new technologies.


2020 ◽  
Vol 6 ◽  
pp. e321
Author(s):  
Mozamel M. Saeed ◽  
Zaher Al Aghbari ◽  
Mohammed Alsharidah

A popular unsupervised learning method known as clustering is extensively used in data mining, machine learning, and pattern recognition. The procedure groups data points so that points in the same cluster are similar to one another and dissimilar to points in other clusters. Traditional clustering methods are greatly challenged by the recent massive growth of data. Therefore, several research works have proposed novel designs for clustering methods that leverage the benefits of Big Data platforms, such as Apache Spark, which is designed for fast, distributed massive data processing. However, Spark-based clustering research is still in its early days. In this systematic survey, we investigate the existing Spark-based clustering methods in terms of their support for the characteristics of Big Data. Moreover, we propose a new taxonomy for Spark-based clustering methods. To the best of our knowledge, no survey has been conducted on Spark-based clustering of Big Data. Therefore, this survey aims to present a comprehensive summary of previous studies in the field of Big Data clustering using Apache Spark during the span of 2010–2020. It also highlights new research directions in the field of clustering massive data.
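To make the connection between clustering and Spark's execution model concrete, the toy sketch below mimics the map/reduce-by-key pattern that distributed k-means (e.g. in Spark MLlib) parallelizes; plain Python lists stand in for RDDs, and the 1-D data and starting centers are invented for illustration:

```python
# Toy k-means step in map/reduce style: "map" assigns each point to its
# nearest center, "reduce by key" averages the points in each cluster.

def assign(point, centers):
    # map phase: emit the index of the nearest center for this point
    return min(range(len(centers)), key=lambda i: abs(point - centers[i]))

def kmeans_step(points, centers):
    # reduce phase: group points by assigned center, average each group
    buckets = {i: [] for i in range(len(centers))}
    for p in points:
        buckets[assign(p, centers)].append(p)
    return [sum(b) / len(b) if b else centers[i] for i, b in buckets.items()]

points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]   # two obvious 1-D clusters
centers = [0.0, 10.0]                      # arbitrary initial centers
for _ in range(5):
    centers = kmeans_step(points, centers)
```

In a real Spark job the map phase runs on partitioned data across executors and the averaging is a `reduceByKey`; here the same two-phase logic converges the centers to roughly 1.0 and 9.0.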


Author(s):  
J. Chen ◽  
W. Feng ◽  
Y. Huang

Abstract. Optimal discretization of continuously valued attributes is an uncertainty problem. The uncertainty of discretization is propagated and accumulated during data mining, which directly influences the usability and operation of the mining results. To address the limitations of existing discretization evaluation indices in describing accuracy and operational efficiency, this work proposes a discretization uncertainty index based on individuals. The method takes the local standard score as a general similarity measure within and between intervals and evaluates discretization reliability according to the relative position of individuals in each interval. Experiments show the new evaluation index is consistent with commonly used metrics. While guaranteeing the validity of the discretization evaluation, the proposed method offers greater descriptive accuracy and operational efficiency than extant approaches; it is also better suited to massive data processing and special distribution detection.
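Reading "local standard score" as a per-interval z-score (an assumption on our part, since the abstract does not give the formula), the relative position of each individual within its discretization interval can be sketched as follows; the interval values are invented for illustration:

```python
# Per-interval z-scores: each value's relative position inside its own
# discretization interval, measured against that interval's mean and
# population standard deviation.
from statistics import mean, pstdev

def local_scores(values):
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

interval = [2.0, 4.0, 6.0]      # values falling inside one interval
scores = local_scores(interval) # symmetric around the interval mean
```

A value at the interval mean scores 0, and values equally far above and below it get scores of equal magnitude and opposite sign, which is the kind of relative-position information the index aggregates.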


Author(s):  
Ammar Odeh

Objective: In the last decade, with the advancement of big data technology and the Internet of Things, Wireless Sensor Networks (WSNs) have become fundamental to the success of a wide range of applications, especially those demanding massive data processing. Methods: This paper investigates several tracking methods to introduce a novel cluster-based target tracking analysis model. Results: Some crucial factors of cluster-based routing protocols are demonstrated, and the different methods are compared according to our taxonomy: cluster formation, predictive/proactive behavior, target speed, single- or multi-object tracking, the boundary problem, scalability, energy efficiency, and communication cost. This can help the research community by providing clear information for further study. Conclusion: The paper compares the differences and similarities between the available approaches across categories in terms of cluster construction, clustering method, object speed, number of objects, the boundary problem, and scalability. Finally, we identify some open issues that have so far received little attention or remain unexplored.

