An Enhanced Unsupervised Fuzzy Expectation Maximization Clustering for Deduplication of Records in Big Data

The main issue when handling records in a data warehouse or cloud storage is the presence of duplicate records, which unnecessarily tax storage capacity and increase computational complexity. This is a particular problem when integrating multiple databases. This paper focuses on discovering records, entirely or partly replicated, before storing them in cloud storage. The work converts the whole content of the data to numeric values using a radix method before applying deduplication. Fuzzy Expectation Maximization (FEM) is used to cluster the numerals, so that the time taken for comparisons between records is reduced. To discover and eliminate the duplicate records, the paper uses a divide-and-conquer algorithm to match records within each cluster, which further enhances the performance of the model. The simulation results show that the proposed model achieves a higher detection rate of duplicate records.
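The pipeline described above can be sketched as follows. This is a minimal illustration only: the radix base, the bucketing step (a crude stand-in for FEM clustering), and the exact-match test (where a fuzzy similarity measure would sit) are assumptions, not details taken from the paper.

```python
from itertools import combinations

def radix_encode(record, base=256):
    """Map a record's text to a single integer using positional (radix) weighting."""
    value = 0
    for ch in record:
        value = value * base + ord(ch)
    return value

def bucket(records, n_buckets=4):
    """Stand-in for FEM clustering: group records by encoded-value range,
    so comparisons are only needed inside each bucket."""
    encoded = [(radix_encode(r), r) for r in records]
    lo = min(v for v, _ in encoded)
    hi = max(v for v, _ in encoded)
    span = (hi - lo) or 1
    buckets = [[] for _ in range(n_buckets)]
    for v, r in encoded:
        idx = min(n_buckets - 1, (v - lo) * n_buckets // span)
        buckets[idx].append(r)
    return buckets

def find_duplicates(records):
    """Pairwise comparison restricted to intra-bucket pairs."""
    dups = []
    for b in bucket(records):
        for a, c in combinations(b, 2):
            if a == c:              # exact match; a fuzzy similarity
                dups.append((a, c)) # test could be substituted here
    return dups
```

Restricting comparisons to intra-bucket pairs is what cuts the quadratic comparison cost: with k balanced buckets, roughly n²/k pairs are checked instead of n².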

2020
Author(s):
Anusha Ampavathi
Vijaya Saradhi T

Big data and its approaches are broadly helpful to the healthcare and biomedical sectors in predicting disease. For trivial symptoms, it is difficult to meet a doctor in the hospital at any time; big data can instead provide essential information about diseases on the basis of a patient's symptoms. For many medical organizations, disease prediction is important for making the best feasible health care decisions. Conversely, the conventional medical care model works on structured input, which requires more accurate and consistent prediction. This paper develops multi-disease prediction using an improved deep learning concept. Datasets pertaining to "Diabetes, Hepatitis, lung cancer, liver tumor, heart disease, Parkinson's disease, and Alzheimer's disease" are gathered from the benchmark UCI repository for conducting the experiments. The proposed model involves three phases: (a) data normalization, (b) weighted normalized feature extraction, and (c) prediction. Initially, the dataset is normalized in order to bring the attributes' ranges to a common level. Then weighted feature extraction is performed, in which a weight function is multiplied with each attribute value to enlarge the deviation between values. The weight function is optimized using a combination of two meta-heuristic algorithms, termed the Jaya Algorithm-based Multi-Verse Optimization (JA-MVO) algorithm. The optimally extracted features are fed to hybrid deep learning algorithms, the "Deep Belief Network (DBN) and Recurrent Neural Network (RNN)". As a modification to the hybrid deep learning architecture, the weights of both the DBN and the RNN are optimized using the same hybrid optimization algorithm. A comparative evaluation of the proposed prediction against existing models certifies its effectiveness through various performance measures.
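Phases (a) and (b) can be sketched as below. The weight values here are fixed placeholders purely for illustration; in the paper they would be produced by the JA-MVO optimizer.

```python
def min_max_normalize(column):
    """Phase (a): scale one attribute's values into [0, 1]."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in column]

def weighted_features(rows, weights):
    """Phase (b): multiply each normalized attribute by its weight,
    enlarging the deviation between attribute values."""
    columns = list(zip(*rows))
    normalized = [min_max_normalize(list(c)) for c in columns]
    weighted = [[w * v for v in col] for w, col in zip(weights, normalized)]
    return [list(row) for row in zip(*weighted)]
```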


Sensors
2019
Vol 19 (6)
pp. 1339
Author(s):
Hasan Islam
Dmitrij Lagutin
Antti Ylä-Jääski
Nikos Fotiou
Andrei Gurtov

The Constrained Application Protocol (CoAP) is a specialized web transfer protocol intended for constrained networks and devices. CoAP and its extensions (e.g., CoAP observe and group communication) provide the potential for developing novel applications in the Internet of Things (IoT). However, a full-fledged CoAP-based application may require significant computing capability, power, and storage capacity in IoT devices. To address these challenges, we present the design, implementation, and experimental evaluation of a CoAP handler which provides transparent CoAP services through an Information-Centric Networking (ICN) core network. In addition, we demonstrate how carrying CoAP traffic over an ICN network can unleash the full potential of CoAP, shifting both overhead and complexity from the (constrained) endpoints to the ICN network. The experiments show that the CoAP handler decreases the required computational complexity, communication overhead, and state management of the CoAP server.


2017
Vol 117 (9)
pp. 1866-1889
Author(s):
Vahid Shokri Kahi
Saeed Yousefi
Hadi Shabanpour
Reza Farzipoor Saen

Purpose
The purpose of this paper is to develop a novel network and dynamic data envelopment analysis (DEA) model for evaluating the sustainability of supply chains. In the proposed model, all links can be considered in the calculation of the efficiency score.
Design/methodology/approach
A dynamic DEA model is proposed to evaluate sustainable supply chains whose networks have a series structure. The nature of free links is defined and subsequently applied in calculating the relative efficiency of supply chains. An additive network DEA model is developed to evaluate the sustainability of supply chains over several periods. A case study demonstrates the applicability of the proposed approach.
Findings
This paper assists managers in identifying inefficient supply chains and taking proper remedial actions for performance optimization. Moreover, the overall efficiency scores of supply chains show less fluctuation. By utilizing the proposed model and determining dual-role factors, managers can plan their supply chains properly and more accurately.
Research limitations/implications
In the real world, managers are faced with big data; an approach therefore needs to be developed to deal with big data.
Practical implications
The proposed model offers useful managerial implications, along with means for managers to monitor and measure the efficiency of their production processes. It can be applied to real-world problems in which decision makers face multi-stage processes, such as supply chains and production systems.
Originality/value
For the first time, the authors present an additive model of network-dynamic DEA, and outline the links such that the carry-overs of networks are connected across different periods rather than different stages.
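For context, the textbook single-period, single-stage additive DEA model, on which network-dynamic extensions such as the one above build, evaluates a decision-making unit (DMU) o by maximizing total input and output slacks. This is the standard formulation with conventional DEA notation (x inputs, y outputs, λ intensity weights), not the authors' network-dynamic model:

```latex
\max_{\lambda,\, s^-,\, s^+} \; \sum_{i=1}^{m} s_i^- + \sum_{r=1}^{s} s_r^+
\quad \text{s.t.} \quad
\sum_{j=1}^{n} \lambda_j x_{ij} + s_i^- = x_{io} \;\; (i = 1,\dots,m),
\qquad
\sum_{j=1}^{n} \lambda_j y_{rj} - s_r^+ = y_{ro} \;\; (r = 1,\dots,s),
\qquad
\lambda_j,\, s_i^-,\, s_r^+ \ge 0 .
```

A DMU is efficient exactly when all optimal slacks are zero; the network-dynamic extension additionally links stages within a period and carries variables over between periods.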


Author(s):  
Adam Barylski
Mariusz Deja

Silicon wafers are the most widely used substrates for fabricating integrated circuits. A sequence of processes is needed to turn a silicon ingot into silicon wafers. One of these processes is flattening by lapping or by grinding to achieve a high degree of flatness and parallelism of the wafer [1, 2, 3]. Lapping can effectively remove or reduce the waviness induced by preceding operations [2, 4]. The main aim of this paper is to compare simulation results with experimental lapping data obtained from the Polish producer of silicon wafers, the company Cemat Silicon from Warsaw (www.cematsil.com). The proposed model is to be implemented by this company for tool wear prediction. It can be applied to lapping or grinding with single- or double-disc lapping kinematics [5, 6, 7]. Geometrical and kinematical relations, together with the simulations, are presented in this work. Results generated for a given workpiece diameter and for different kinematical parameters are studied using models programmed in the Matlab environment.


2021
Vol 316
pp. 661-666
Author(s):  
Nataliya V. Mokrova

Current cobalt processing practices are described. This article discusses the advantages of the group method of data handling (GMDH, also known as the method of group accounting of arguments) for the mathematical modeling of the leaching of cobalt solutions. An identification of the mathematical model of the cascade of reactors in cobalt production is presented. The group method of data handling makes it possible to eliminate the need to calculate chemical kinetics quantities, to take into account the results of mixed experiments, and to exclude the influence of random interference on the simulation results. The proposed model confirms the capabilities of the group method of data handling for describing multistage processes.
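As a rough sketch of the GMDH idea (not the authors' model of the leaching cascade): each candidate model is fitted on a training split and ranked by its error on a separate validation split, the "external criterion" that lets GMDH suppress random interference. For brevity the candidates here are simple linear pair models; real GMDH layers use quadratic polynomials and stack selected candidates into further layers.

```python
def _lstsq_3(X, y):
    """Solve the 3x3 normal equations for y ~ a + b*u + c*v (Gaussian elimination)."""
    n = len(y)
    A = [[1.0, u, v] for u, v in X]
    M = [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(3)] for i in range(3)]
    b = [sum(A[k][i] * y[k] for k in range(n)) for i in range(3)]
    for col in range(3):                      # elimination with partial pivoting
        piv = max(range(col, 3), key=lambda row: abs(M[row][col]))
        M[col], M[piv] = M[piv], M[col]
        b[col], b[piv] = b[piv], b[col]
        for row in range(col + 1, 3):
            f = M[row][col] / M[col][col]
            for c in range(col, 3):
                M[row][c] -= f * M[col][c]
            b[row] -= f * b[col]
    coef = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):                     # back substitution
        coef[row] = (b[row] - sum(M[row][c] * coef[c] for c in range(row + 1, 3))) / M[row][row]
    return coef

def gmdh_layer(train, validate):
    """One GMDH selection step: fit y ~ a + b*xi + c*xj for every feature pair
    on the training split, rank candidates by squared error on the validation
    split (the external criterion). Returns (error, (i, j), (a, b, c))."""
    Xt, yt = train
    Xv, yv = validate
    n_feat = len(Xt[0])
    best = None
    for i in range(n_feat):
        for j in range(i + 1, n_feat):
            a, bb, c = _lstsq_3([(row[i], row[j]) for row in Xt], yt)
            err = sum((a + bb * row[i] + c * row[j] - t) ** 2
                      for row, t in zip(Xv, yv))
            if best is None or err < best[0]:
                best = (err, (i, j), (a, bb, c))
    return best
```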


NANO
2009
Vol 04 (03)
pp. 171-176
Author(s):
Davood Fathi
Behjat Forouzandeh

This paper introduces a new technique for analyzing the behavior of global interconnects in FPGAs at nanoscale technology nodes. Using this enhanced modeling method, new accurate expressions for calculating the propagation delay of global interconnects in nano-FPGAs have been derived. To verify the proposed model, we have performed delay simulations at the 45 nm, 65 nm, 90 nm, and 130 nm technology nodes with our modeling method and with the conventional Pi-model technique, and compared the results of both against HSPICE simulations. The results show that our proposed model matches the HSPICE propagation delays for global interconnects more closely than conventional techniques such as the Pi-model: across the mentioned technology nodes, the difference between our model and HSPICE simulations is 0.29–22.92%, whereas it is 11.13–38.29% for the other model.
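For context, the classical lumped-element estimate that Pi-style interconnect models refine is the Elmore delay of an RC ladder. The sketch below is the generic textbook formula, not the paper's enhanced expressions:

```python
def elmore_delay(ladder):
    """Elmore delay (seconds) at the far end of an RC ladder.

    ladder: list of (R, C) segments from driver to load. Each capacitor
    sees the total upstream resistance, so
        tau = sum_i C_i * (R_1 + ... + R_i).
    """
    tau = 0.0
    r_upstream = 0.0
    for r, c in ladder:
        r_upstream += r
        tau += c * r_upstream
    return tau
```

For example, two segments of 1 kOhm and 1 pF each give tau = 1e-12 * 1e3 + 1e-12 * 2e3 = 3 ns; splitting a wire into more segments makes the estimate approach the distributed-RC limit.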


2018
Vol 14 (3)
pp. 44-68
Author(s):
Fatma Abdelhedi
Amal Ait Brahim
Gilles Zurfluh

Nowadays, most organizations need to improve their decision-making processes using Big Data. To achieve this, they have to store Big Data, perform analyses, and transform the results into useful and valuable information. Doing so requires dealing with new challenges in designing and creating data warehouses. Traditionally, creating a data warehouse followed a well-governed process based on relational databases. The influence of Big Data has challenged this traditional approach, primarily due to the changing nature of data, and as a result, using NoSQL databases has become a necessity for handling Big Data challenges. In this article, the authors show how to create a data warehouse on NoSQL systems. They propose the Object2NoSQL process, which generates column-oriented physical models starting from a UML conceptual model. To ensure an efficient automatic transformation, they propose a logical model that exhibits a sufficient degree of independence to enable its mapping to one or more column-oriented platforms. The authors evaluate their approach using a case study in the health care field.
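The abstract does not give the actual Object2NoSQL transformation rules. Purely to illustrate the general shape of such a class-to-column-family mapping (all names and rules below are invented, not the authors'):

```python
def to_column_family(class_name, attributes, relations):
    """Map a UML-like class to a column-oriented physical description:
    one column family per class, simple attributes become columns,
    and relations become reference columns holding target row keys."""
    schema = {
        "column_family": class_name,
        "row_key": f"{class_name.lower()}_id",
        "columns": {name: typ for name, typ in attributes.items()},
    }
    for rel_name, target in relations.items():
        schema["columns"][f"{rel_name}_ref"] = f"rowkey<{target}>"
    return schema
```

Keeping this mapping at the logical level, independent of any one engine, is what allows the same model to target several column-oriented platforms.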

