Recurrent credibilistic fuzzy clustering of big data based on an objective function of a special type

2020 ◽  
Vol 2 (95) ◽  
pp. 77-81
Author(s):  
E.V. Bodyansky ◽  
A.Yu. Shafronenko ◽  
І. М. Klimova

A credibilistic fuzzy clustering method is proposed for problems in which data arrive sequentially, in online mode, and form large arrays (Big Data). The introduced procedures are essentially gradient algorithms for optimizing an objective function of a special type, and they have a number of advantages over known probabilistic and possibilistic approaches, above all robustness to anomalous observations. The approach is based on a similarity measure whose parameters are determined automatically during self-learning. The proposed procedures generalize known methods, are characterized by high speed, and are simple in numerical implementation.
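The gradient step behind online fuzzy clustering of this kind can be sketched as follows. This is a minimal illustration assuming a standard fuzzy c-means membership rule and Euclidean distance; the function name, learning rate, and update form are illustrative, not taken from the paper.

```python
import numpy as np

def online_fuzzy_update(centroids, x, lr=0.05, m=2.0):
    """One online gradient step of fuzzy clustering for a single sample x.

    Memberships follow the standard fuzzy c-means rule; each centroid is
    nudged toward x in proportion to its fuzzified membership.
    """
    d = np.linalg.norm(centroids - x, axis=1) + 1e-12   # distances to all centroids
    u = d ** (-2.0 / (m - 1.0))
    u /= u.sum()                                        # memberships sum to 1
    centroids += lr * (u ** m)[:, None] * (x - centroids)
    return centroids, u
```

Because each sample is processed once and then discarded, the memory footprint stays constant regardless of how large the stream grows.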

2021 ◽  
pp. 1-30
Author(s):  
Lisa Grace S. Bersales ◽  
Josefina V. Almeda ◽  
Sabrina O. Romasoc ◽  
Marie Nadeen R. Martinez ◽  
Dannela Jann B. Galias

With the advancement of technology, digitalization, and the Internet of Things, large amounts of complex data are being produced daily. This vast quantity of varied data produced at high speed is referred to as Big Data. Big Data is being used successfully in the private sector, yet the public sector seems to be falling behind despite the many potentials Big Data has already demonstrated. In this regard, this paper explores ways in which the government can harness Big Data for official statistics. It begins by gathering and presenting Big Data-related initiatives and projects across the globe, covering various types and sources of Big Data. Further, this paper discusses the opportunities, challenges, and risks associated with using Big Data, particularly in official statistics. This paper also assesses the current utilization of Big Data in the country through focus group discussions and key informant interviews. Based on desk review, discussions, and interviews, the paper concludes with a proposed framework for how the government may use Big Data to augment official statistics.


2021 ◽  
pp. 016555152110137
Author(s):  
N.R. Gladiss Merlin ◽  
Vigilson Prem. M

Large and complex data have become a valuable resource in biomedical discovery, greatly facilitating the retrieval of helpful information from scientific resources. However, indexing and retrieving patient information from disparate sources of big data is challenging in biomedical research. Indexing and retrieval of patient information from big data are performed using the MapReduce framework. In this research, indexing and retrieval are performed using the proposed Jaya-Sine Cosine Algorithm (Jaya–SCA)-based MapReduce framework. Initially, the input big data is distributed randomly to the mappers. The average of each mapper's data is calculated, and these averages are forwarded to the reducer, where the representative data are stored. For each user query, the query is first matched against the reducer and then switched over to the corresponding mapper to retrieve the best-matching result. This bilevel matching is performed while retrieving data from the mapper, based on the distance between the query and the stored data. The similarity measure is computed based on the parametric-enabled similarity measure (PESM), cosine similarity, and the proposed Jaya–SCA, which is the integration of the Jaya algorithm and the SCA. Moreover, the proposed Jaya–SCA algorithm attained maximum F-measure, recall, and precision values of 0.5323, 0.4400, and 0.6867, respectively, on the StatLog Heart Disease dataset.
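The map–reduce indexing and bilevel retrieval described above can be sketched roughly as follows. This illustration uses random mapper assignment, mapper averages as reducer representatives, and cosine similarity only; the PESM and the Jaya–SCA weighting are omitted, and all names are assumptions, not the paper's implementation.

```python
import numpy as np

def map_phase(records, n_mappers, seed=0):
    """Distribute records randomly across mappers; drop any empty mapper."""
    rng = np.random.default_rng(seed)
    assignment = rng.integers(0, n_mappers, size=len(records))
    mapped = [records[assignment == i] for i in range(n_mappers)]
    return [m for m in mapped if len(m)]

def reduce_phase(mapped):
    """Each reducer entry stores the average (representative) of one mapper's data."""
    return np.array([m.mean(axis=0) for m in mapped])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def retrieve(query, representatives, mapped):
    """Bilevel matching: pick the best reducer representative first, then the
    closest record within that representative's mapper."""
    best = max(range(len(representatives)),
               key=lambda i: cosine(query, representatives[i]))
    recs = mapped[best]
    j = max(range(len(recs)), key=lambda k: cosine(query, recs[k]))
    return recs[j]
```

The representative averages keep the first-level match cheap: only one mapper's records are scanned per query instead of the whole collection.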


Author(s):  
Kun-Yung Chen ◽  
Te-Wen Tu

Abstract An inverse methodology is proposed to estimate a time-varying heat transfer coefficient (HTC) for a hollow cylinder with time-dependent boundary conditions of different kinds on the inner and outer surfaces. The temperatures at both the inner surface and the interior domain are measured, while the time history of the HTC on the outer surface is determined inversely. This work first expresses the unknown HTC as a general function with unknown coefficients, and then treats these coefficients as parameters to be searched randomly and found by the self-learning particle swarm optimization (SLPSO) method. The objective function to be minimized is formed from the absolute errors between the measured and estimated temperatures at several measurement times. If the objective function converges toward zero, the inverse solution for the estimated HTC is eventually found. Numerical experiments show that when the HTC function is of exponential type, its unknown coefficients can be estimated accurately. In contrast, when the HTC function is of a general type, the unknown coefficients are estimated poorly. However, the estimated coefficients of a general-type HTC function can be regarded as equivalent coefficients for the real HTC function.
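The structure of this inverse problem can be illustrated with a toy sketch. The real forward model solves transient conduction in the hollow cylinder; here a hypothetical linear stand-in is used, and plain random search stands in for SLPSO. Every name and value below is an assumption for illustration only.

```python
import numpy as np

def htc_exponential(t, a, b):
    """Assumed exponential-type HTC: h(t) = a * exp(b * t)."""
    return a * np.exp(b * t)

def objective(coeffs, t_meas, temp_meas, forward_model):
    """Sum of absolute errors between measured and estimated temperatures;
    the inverse method drives this toward zero."""
    a, b = coeffs
    temp_est = forward_model(htc_exponential(t_meas, a, b))
    return float(np.abs(temp_meas - temp_est).sum())

def random_search(objective_fn, bounds, n_iter=2000, seed=0):
    """Crude stand-in for SLPSO: sample coefficients and keep the best."""
    rng = np.random.default_rng(seed)
    best, best_f = None, np.inf
    for _ in range(n_iter):
        cand = [rng.uniform(lo, hi) for lo, hi in bounds]
        f = objective_fn(cand)
        if f < best_f:
            best, best_f = cand, f
    return best, best_f
```

When the assumed functional form matches the true HTC (as with the exponential type here), the objective can in principle reach zero; with a mismatched general form, the search settles on equivalent coefficients instead.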


2021 ◽  
pp. 1-12
Author(s):  
Li Qian

To overcome the low classification accuracy of traditional methods, this paper proposes a new classification method for complex-attribute big data based on an iterative fuzzy clustering algorithm. First, principal component analysis and kernel local Fisher discriminant analysis are used to reduce the dimensionality of the complex-attribute big data. Next, the Bloom filter data structure is introduced to eliminate redundancy from the dimensionality-reduced data. The de-duplicated complex-attribute big data is then classified in parallel by the iterative fuzzy clustering algorithm, completing the classification. Finally, simulation results show that the accuracy, the normalized mutual information index, and the Rand index of the proposed method are close to 1, the classification accuracy is high, and the RDV value is low, indicating that the proposed method is highly effective and converges quickly.
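A Bloom filter used for this kind of redundancy elimination can be sketched as follows. This is a minimal, illustrative implementation with SHA-256-derived bit positions; the bit-array size, hash count, and function names are assumptions, not the paper's design.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: a bit array plus k derived hash positions.
    Membership answers may be false positives but never false negatives."""
    def __init__(self, n_bits=8192, n_hashes=4):
        self.n_bits, self.n_hashes = n_bits, n_hashes
        self.bits = bytearray(n_bits // 8)

    def _positions(self, item):
        for i in range(self.n_hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:4], "big") % self.n_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

def deduplicate(records):
    """Keep only records the filter has (probably) not seen before."""
    bf, kept = BloomFilter(), []
    for r in records:
        if not bf.might_contain(r):
            bf.add(r)
            kept.append(r)
    return kept
```

The appeal for big data is that the filter's memory cost is fixed by the bit-array size, not by the number of records, at the price of a small false-positive rate.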


Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-13
Author(s):  
Cristina Sánchez-Rebollo ◽  
Cristina Puente ◽  
Rafael Palacios ◽  
Claudia Piriz ◽  
Juan P. Fuentes ◽  
...  

Social networks are being used by terrorist organizations to distribute messages with the intention of influencing people and recruiting new members. The research presented in this paper focuses on the analysis of Twitter messages to detect the leaders orchestrating terrorist networks and their followers. A big data architecture is proposed to analyze messages in real time in order to classify users according to different parameters, such as level of activity, ability to influence other users, and the content of their messages. Graphs have been used to analyze how messages propagate through the network, which involves a study of followers based on retweets and general impact on other users. Fuzzy clustering techniques were then used to classify users into profiles, with the advantage over other classification techniques of providing a probability for each profile instead of a binary categorization. The algorithms were tested using a public database from Kaggle and other Twitter extraction techniques. The resulting profiles detected automatically by the system were analyzed manually, and the parameters describing each profile correspond to the type of information an expert would expect. Future applications are not limited to detecting terrorist activism: human resources departments can apply profile identification to classify candidates automatically, security teams can detect undesirable clients in the financial or insurance sectors, and immigration officers can extract additional insights with these techniques.
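The probabilistic profile assignment can be illustrated with a fuzzy-membership sketch. The profile centers, feature names, and values below are invented for illustration; the paper's actual features and clustering details differ.

```python
import numpy as np

def profile_memberships(x, profile_centers, m=2.0):
    """Fuzzy-clustering-style memberships: a probability-like degree of
    belonging to each profile, rather than a single hard label."""
    d = np.linalg.norm(profile_centers - x, axis=1) + 1e-12
    u = d ** (-2.0 / (m - 1.0))
    return u / u.sum()

# Hypothetical features: [activity level, influence score, message intensity]
profiles = np.array([
    [0.9, 0.9, 0.9],   # hypothetical "leader" profile
    [0.7, 0.2, 0.6],   # hypothetical "active follower" profile
    [0.1, 0.1, 0.1],   # hypothetical "bystander" profile
])
user = np.array([0.8, 0.7, 0.8])
print(profile_memberships(user, profiles))
```

The memberships sum to one, so an analyst sees how strongly a user matches every profile at once instead of a single yes/no verdict.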

