A novel approach for big data classification based on hybrid parallel dimensionality reduction using Spark cluster

2019
Vol 20 (4)
Author(s):
Ahmed Hussein Ali
Mahmood Zaki Abdullah

The big data concept has elicited studies on how to accurately and efficiently extract valuable information from such huge datasets. A major problem in big data mining is high dimensionality, that is, the large number of features in such datasets. High dimensionality degrades the accuracy of machine learning (ML) classifiers and wastes time because of the many redundant features in the dataset. A fast feature reduction method can mitigate this problem. Hence, this study presents HP-PL, a new hybrid parallel feature reduction framework that uses Apache Spark to perform fast feature reduction on shared/distributed-memory clusters. Evaluation of the proposed HP-PL on the KDD99 dataset showed the algorithm to be significantly faster than conventional feature reduction techniques. The proposed technique required 1 minute to select 4 features from over 79 features and 3,000,000 samples on a 3-node cluster (21 cores in total), whereas the comparative algorithm required more than 2 hours for the same task. In the proposed system, the Hadoop Distributed File System (HDFS) provides distributed storage, and Apache Spark serves as the computing engine. The model was developed as a parallel design with full consideration of the high performance and throughput of distributed computing. In conclusion, the proposed HP-PL method achieves good accuracy with less memory and time than conventional feature reduction methods. The tool is publicly available at https://github.com/ahmed/Fast-HP-PL.
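
As a minimal illustrative sketch of Spark-based feature reduction (not the authors' HP-PL implementation), the same end-to-end flow can be expressed in PySpark with its built-in chi-squared selector; the HDFS path and label column below are assumptions:

```python
# Illustrative sketch only: Spark-based parallel feature selection,
# not the authors' HP-PL algorithm. Assumes a running Spark cluster
# and a labeled CSV dataset on HDFS with a numeric "label" column
# (string labels would first need a StringIndexer).
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler, ChiSqSelector

spark = SparkSession.builder.appName("feature-reduction-sketch").getOrCreate()

# Hypothetical HDFS path; replace with the real dataset location.
df = spark.read.csv("hdfs:///data/kdd99.csv", header=True, inferSchema=True)

feature_cols = [c for c in df.columns if c != "label"]
assembled = VectorAssembler(inputCols=feature_cols,
                            outputCol="features").transform(df)

# Keep the 4 most informative features, mirroring the paper's setting.
selector = ChiSqSelector(numTopFeatures=4, featuresCol="features",
                         outputCol="selected", labelCol="label")
reduced = selector.fit(assembled).transform(assembled)
reduced.select("selected", "label").show(5)
```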

2019
Vol 6 (1)
Author(s):
Saad Ahmed Dheyab
Mohammed Najm Abdullah
Buthainah Fahran Abed

The analysis and processing of big data is one of the most important challenges researchers face in the search for approaches that combine high performance, low cost, and high accuracy. In this paper, a novel approach to big data processing and management is proposed that differs from existing ones: the proposed method employs not only main memory to read and handle big data but also memory-mapped space that extends main memory onto storage. From a methodological viewpoint, the novelty of this paper is the segmentation stage, which partitions big data using memory mapping and broadcasts all segments to a number of processors through a parallel message-passing interface. From an application viewpoint, the paper presents a high-performance approach based on a homogeneous network that encrypts and decrypts big data in parallel using the AES algorithm. The approach is implemented on the Windows operating system using .NET libraries.
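
The paper's implementation is in .NET; as a hedged sketch of the same idea (memory-mapped segmentation plus parallel AES), the following Python fragment uses mmap, multiprocessing, and the pycryptodome library. The segment size, key handling, and file name are illustrative assumptions:

```python
# Sketch of memory-mapped segmentation with parallel AES encryption,
# not the authors' .NET implementation. Requires pycryptodome.
import mmap
import os
from multiprocessing import Pool
from Crypto.Cipher import AES

SEGMENT = 16 * 1024 * 1024  # 16 MiB per segment (illustrative choice)

def encrypt_segment(args):
    path, key, offset, length, index = args
    # Memory-map the file read-only and slice out this worker's segment.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            data = mm[offset:offset + length]
    # CTR mode needs no padding; a distinct nonce per segment keeps
    # counter streams from overlapping.
    cipher = AES.new(key, AES.MODE_CTR, nonce=index.to_bytes(8, "big"))
    return index, cipher.encrypt(data)

def encrypt_file(path, key):
    size = os.path.getsize(path)
    tasks = [(path, key, off, min(SEGMENT, size - off), i)
             for i, off in enumerate(range(0, size, SEGMENT))]
    with Pool() as pool:                     # one worker per CPU core
        results = pool.map(encrypt_segment, tasks)
    return [ct for _, ct in sorted(results)]

if __name__ == "__main__":
    key = os.urandom(32)  # AES-256 key; real systems need key management
    segments = encrypt_file("bigdata.bin", key)  # hypothetical input file
```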


Sensors
2020
Vol 20 (7)
pp. 2039
Author(s):
Hwajeong Seo
Hyeokdong Kwon
Yongbeen Kwon
Kyungho Kim
Seungju Choi
...  

In this paper, we optimize Number Theoretic Transform (NTT) and random sampling operations on low-end 8-bit AVR microcontrollers. We focus on optimized modular multiplication with a secure countermeasure (i.e., constant timing), which ensures high performance and prevents timing attacks and simple power analysis. In particular, we present a combined Look-Up Table (LUT)-based fast reduction technique that operates in a regular fashion. This novel approach requires only two LUT accesses to perform the whole modular reduction routine. The implementation is carefully written in assembly language, which reduces the number of memory accesses and function calls. With the LUT-based optimization, the proposed NTT implementations outperform the previous best results by 9.0% and 14.6% for the 128-bit and 256-bit security levels, respectively. Furthermore, we adopt the most optimized AES software implementation to accelerate the pseudorandom number generation used for random sampling. AES-256 in counter (CTR) mode, used as the random number generator, requires only 3,184 clock cycles per 128-bit input, which is 9.5% faster than the previous state-of-the-art result. Finally, the proposed methods are applied to the whole Ring-LWE key scheduling and encryption operations, which require only 524,211 and 659,603 clock cycles, respectively, at the 128-bit security level. At the 256-bit security level, key generation requires 1,325,171 and 1,775,475 clock cycles for the hardware- and software-based AES implementations, respectively, while encryption requires 1,430,601 and 2,042,474 clock cycles.
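
To illustrate the two-lookup idea functionally (the paper's version is hand-written AVR assembly, and its exact table layout is not reproduced here), a reduction for the common Ring-LWE modulus q = 7681 can be sketched as follows; the table split at 16 and 8 bits is an assumption for readability:

```python
# Functional sketch of a two-lookup modular reduction, not the
# authors' AVR assembly. Uses q = 7681, a common Ring-LWE modulus.
Q = 7681

# Precomputed tables: residues of the high parts shifted back in.
LUT16 = [(h << 16) % Q for h in range(1 << 10)]  # products a*b < 2^26
LUT8  = [(h << 8)  % Q for h in range(1 << 9)]   # intermediate < 2^17

def mod_reduce(x):
    """Reduce x = a*b (a, b < Q) modulo Q with two table lookups."""
    r = LUT16[x >> 16] + (x & 0xFFFF)  # first lookup: r < Q + 2^16
    r = LUT8[r >> 8] + (r & 0xFF)      # second lookup: r < 2*Q
    return r - Q * (r >= Q)            # constant-time final correction

# Quick self-check against Python's built-in modulo.
for a, b in [(1234, 4567), (7680, 7680), (0, 5)]:
    assert mod_reduce(a * b) == (a * b) % Q
```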


Author(s):  
Shangzhu Jin
Jun Peng

Big data and its applications have become prominent topics. To deal with uncertainty in data sets, fuzzy system-based models have been explored and stand out in many applications. However, in classical fuzzy inference, when a given observation does not overlap with any rule antecedent, no rule can be invoked; rules invoked with missing values, which can also occur in big data environments, likewise yield no consequence. Fortunately, fuzzy rule interpolation techniques can support inference in such cases, and combining traditional fuzzy reasoning with fuzzy interpolation may improve the accuracy of the inferred conclusions. Therefore, this chapter reports an initial investigation into a framework that combines MapReduce with dynamic fuzzy inference/interpolation for big data applications (BigData-DFRI). The results of an experimental investigation of the method are presented, demonstrating the potential and efficacy of the proposed approach.
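
As a minimal sketch of the interpolation step (not the BigData-DFRI implementation), a KH-style linear interpolation between the two rules nearest to an observation, with triangular fuzzy sets encoded as (left, peak, right) triples, might look like this:

```python
# Minimal sketch of linear (KH-style) fuzzy rule interpolation for
# triangular fuzzy sets; illustrative only, not the chapter's code.
# A triangle is a (left, peak, right) tuple on the variable's domain.

def interpolate(a1, b1, a2, b2, obs):
    """Given rules A1 -> B1 and A2 -> B2 and an observation lying
    between A1 and A2, interpolate the consequent triangle B*."""
    # Relative position of the observation between the two antecedents,
    # measured at the peaks (representative points).
    lam = (obs[1] - a1[1]) / (a2[1] - a1[1])
    # Interpolate each defining point of the consequent linearly.
    return tuple((1 - lam) * p1 + lam * p2 for p1, p2 in zip(b1, b2))

# Example: rules "low -> slow" and "high -> fast" with a gap between
# them that no classical rule covers.
A1, B1 = (0, 1, 2), (0, 10, 20)
A2, B2 = (6, 7, 8), (60, 70, 80)
print(interpolate(A1, B1, A2, B2, obs=(3, 4, 5)))  # -> (30.0, 40.0, 50.0)
```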


Author(s):  
Mohammed R. Elkobaisi
Fadi Al Machot

The use of IoT-based Emotion Recognition (ER) systems is in increasing demand in many domains, such as active and assisted living (AAL), health care, and industry. Combining emotion and context in a unified system could widen the scope of human support, but this is currently challenging due to the lack of a common interface capable of providing such a combination. We therefore aim to provide a novel approach based on a modeling language that can be used even by caregivers or non-experts to model human emotion with respect to context for human support services. The proposed approach is based on a Domain-Specific Modeling Language (DSML), which helps integrate different IoT data sources in an AAL environment. Consequently, it provides a conceptual support level related to the current emotional state of the observed subject. For the evaluation, we apply the well-validated System Usability Scale (SUS) and show that the proposed modeling language achieves high usability and learnability. Furthermore, we evaluate the runtime performance of model instantiation by measuring execution time on well-known IoT services.
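
For reference, the standard SUS computation behind such usability figures is straightforward; a sketch (not code from the paper) is:

```python
# Standard System Usability Scale (SUS) scoring; illustrative sketch,
# not code from the paper. Responses are ten Likert ratings from
# 1 (strongly disagree) to 5 (strongly agree).

def sus_score(responses):
    """Compute the 0-100 SUS score from ten 1-5 Likert responses."""
    assert len(responses) == 10
    total = 0
    for i, r in enumerate(responses, start=1):
        # Odd-numbered items are positively worded, even-numbered
        # items negatively worded, hence the two normalizations.
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

print(sus_score([5, 1, 5, 2, 4, 1, 5, 1, 4, 2]))  # -> 90.0
```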


Author(s):  
Denys Rozumnyi
Jan Kotera
Filip Šroubek
Jiří Matas

Objects moving at high speed along complex trajectories often appear in videos, especially videos of sports. Such objects travel a considerable distance during the exposure time of a single frame, and therefore their position in the frame is not well defined. They appear as semi-transparent streaks due to motion blur and cannot be reliably tracked by general trackers. We propose a novel approach called Tracking by Deblatting, based on the observation that motion blur is directly related to the intra-frame trajectory of an object. Blur is estimated by solving two intertwined inverse problems, blind deblurring and image matting, which we call deblatting. By postprocessing, non-causal Tracking by Deblatting estimates continuous, complete, and accurate object trajectories for the whole sequence; tracked objects are precisely localized with higher temporal resolution than by conventional trackers. Energy minimization by dynamic programming is used to detect abrupt changes of motion, called bounces. High-order polynomials are then fitted to the smooth trajectory segments between bounces. The output is a continuous trajectory function that assigns a location to every real-valued time stamp from zero to the number of frames. The proposed algorithm was evaluated on a newly created dataset of videos from a high-speed camera, using a novel Trajectory-IoU metric that generalizes the traditional Intersection over Union and measures the accuracy of the intra-frame trajectory. The proposed method outperforms the baselines in both recall and trajectory accuracy. Additionally, we show that precise physical quantities, such as object radius, gravity, and sub-frame velocity, can be computed from the trajectory function. Velocity estimates are compared with high-speed camera and radar measurements, and the results show high performance of the proposed method in terms of Trajectory-IoU, recall, and velocity estimation.
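
As an illustrative sketch of the trajectory-fitting step (not the authors' code), per-axis polynomials can be fitted to a segment between bounces and differentiated to obtain sub-frame velocity; the detections below are hypothetical:

```python
# Sketch of piecewise polynomial trajectory fitting and sub-frame
# velocity estimation; illustrative only, not the paper's pipeline.
import numpy as np

def fit_segment(t, x, y, degree=3):
    """Fit per-axis polynomials to one smooth segment between bounces."""
    return np.polyfit(t, x, degree), np.polyfit(t, y, degree)

def speed(cx, cy, t):
    """Speed at a real-valued time stamp from the fitted derivatives."""
    vx = np.polyval(np.polyder(cx), t)
    vy = np.polyval(np.polyder(cy), t)
    return np.hypot(vx, vy)

# Hypothetical detections (frame index, x, y) within one segment.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
x = np.array([10.0, 30.0, 48.0, 64.0, 78.0])
y = np.array([5.0, 9.0, 17.0, 29.0, 45.0])
cx, cy = fit_segment(t, x, y)
print(speed(cx, cy, 2.5))  # velocity at a sub-frame time stamp
```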

