Multiple Frame CT Image Sequencing Big Data Batch Clustering Method

Author(s):  
Xiao-yan Wang ◽  
Guo-hui Wei ◽  
Zheng-wei Gu ◽  
Ming Li ◽  
Jin-gang Ma


Author(s):
Alexander Troussov ◽  
Sergey Maruev ◽  
Sergey Vinogradov ◽  
Mikhail Zhizhin

Techno-social systems generate data that are rather different from the data traditionally studied in social network analysis and other fields. In massive social networks, agents simultaneously participate in several contexts and in different communities. Network models of many real datasets from techno-social systems reflect the various dimensionalities and rationales of actors' actions and interactions. The data are inherently multidimensional, where "everything is deeply intertwingled". The multidimensional nature of Big Data, together with the emergence of typical network characteristics in Big Data, makes it reasonable to address the challenges of structure detection in network models, including a) the development of novel methods for local overlapping clustering with outliers, b) near-linear performance, and c) preferably a combination with the computation of the structural importance of nodes. In this chapter a spreading-connectivity-based clustering method is introduced. The viability of the approach and its advantages are demonstrated on data from VK, the largest European social network.
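The abstract gives no pseudocode, so the following is a minimal, illustrative sketch of spreading-activation-style clustering on a graph, which is one way a "spreading connectivity" method can produce local, overlapping clusters with outliers. The function names, decay parameter and threshold are assumptions for illustration, not the authors' implementation.

```python
from collections import defaultdict

def spread_activation(graph, seeds, decay=0.5, iters=3):
    # graph: dict node -> list of neighbour nodes
    # Returns, per seed, the activation each reached node accumulated.
    activation = {s: defaultdict(float) for s in seeds}
    for seed in seeds:
        frontier = {seed: 1.0}
        for _ in range(iters):
            next_frontier = defaultdict(float)
            for node, act in frontier.items():
                activation[seed][node] += act
                if not graph.get(node):
                    continue
                share = act * decay / len(graph[node])
                for nb in graph[node]:
                    next_frontier[nb] += share
            frontier = next_frontier
    return activation

def overlapping_clusters(graph, seeds, threshold=0.05):
    # A node joins every cluster whose seed activated it above the
    # threshold, so clusters may overlap; nodes reached by no seed
    # remain unassigned, i.e. outliers.
    activation = spread_activation(graph, seeds)
    return {s: {n for n, a in activation[s].items() if a >= threshold}
            for s in seeds}
```

Near-linear behaviour in this sketch comes from each iteration touching only the edges on the expanding frontier rather than all node pairs.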


Author(s):  
Mohamed Aymen Ben HajKacem ◽  
Chiheb-Eddine Ben N′Cir ◽  
Nadia Essoussi

Big Data clustering has become an important challenge in data analysis, since several applications require scalable clustering methods to organize such data into groups of similar objects. Given the computational cost of most existing clustering methods, we propose in this paper a new clustering method, referred to as STiMR k-means, able to provide a good tradeoff between scalability and clustering quality. The proposed method is based on the combination of three acceleration techniques: sampling, triangle inequality and MapReduce. Sampling is used to reduce the number of data points when building cluster prototypes, the triangle inequality is used to reduce the number of comparisons when looking for nearest clusters, and MapReduce is used to configure a parallel framework for running the proposed method. Experiments performed on simulated and real datasets have shown the effectiveness of the proposed method, compared with existing ones, in terms of running time, scalability and internal validity measures.
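As a rough illustration of how two of the three accelerations fit together (the MapReduce layer is omitted), here is a hedged sketch: prototypes are trained on a random sample, and the standard triangle-inequality bound d(x, c_j) >= d(c_i, c_j) - d(x, c_i) is used to skip centers that cannot beat the current best. Parameter names and the sampling scheme are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def sampled_kmeans_ti(X, k, sample_frac=0.2, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    n_s = max(k, int(len(X) * sample_frac))
    sample = X[rng.choice(len(X), n_s, replace=False)]
    centers = sample[rng.choice(n_s, k, replace=False)].copy()

    for _ in range(iters):
        # Pairwise center distances enable the pruning rule:
        # if d(c_best, c_j) >= 2 * d(x, c_best), the triangle
        # inequality guarantees c_j is no closer to x than c_best.
        cc = np.linalg.norm(centers[:, None] - centers[None, :], axis=2)
        labels = np.empty(len(sample), dtype=int)
        for n, x in enumerate(sample):
            best = 0
            best_d = np.linalg.norm(x - centers[0])
            for j in range(1, k):
                if cc[best, j] >= 2.0 * best_d:
                    continue  # pruned without computing d(x, c_j)
                d = np.linalg.norm(x - centers[j])
                if d < best_d:
                    best, best_d = j, d
            labels[n] = best
        centers = np.array([sample[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])

    # One full pass assigns the whole dataset to the sampled prototypes.
    full_labels = np.linalg.norm(
        X[:, None] - centers[None, :], axis=2).argmin(axis=1)
    return centers, full_labels
```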


2021 ◽  
Vol 27 (11) ◽  
pp. 1203-1221
Author(s):  
Amal Rekik ◽  
Salma Jamoussi

Clustering data streams in order to detect trending topics on social networks is a challenging task that interests researchers in the big data field. In fact, analyzing such data requires several requirements to be addressed, due to their large volume and evolving nature. For this purpose, we propose, in this paper, a new evolving clustering method which takes into account the incremental nature of the data and meets its principal requirements. Our method explores a deep learning technique to learn incrementally from unlabelled examples generated at high speed, which need to be clustered instantly. To evaluate the performance of our method, we have conducted several experiments using the Sanders, HCR and Terr-Attacks datasets.
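The abstract states requirements rather than the architecture, so the sketch below shows only the streaming skeleton such a method must satisfy: one pass over the stream, instant assignment, and the ability to open new clusters as the data evolve. The class name, the radius rule and the incremental-mean update are illustrative assumptions; the paper's deep-learning representation step is not reproduced here.

```python
import numpy as np

class OnlineClusterer:
    """One-pass (evolving) clustering sketch: each arriving point
    either updates its nearest centroid incrementally or, if it is
    farther than `radius` from every centroid, opens a new cluster."""

    def __init__(self, radius=1.0):
        self.radius = radius
        self.centroids = []   # running means, one per cluster
        self.counts = []      # points absorbed per cluster

    def partial_fit(self, x):
        x = np.asarray(x, dtype=float)
        if self.centroids:
            d = [np.linalg.norm(x - c) for c in self.centroids]
            j = int(np.argmin(d))
            if d[j] <= self.radius:
                # incremental mean update: c += (x - c) / n
                self.counts[j] += 1
                self.centroids[j] += (x - self.centroids[j]) / self.counts[j]
                return j
        self.centroids.append(x.copy())
        self.counts.append(1)
        return len(self.centroids) - 1

stream = [[0.1, 0.2], [0.15, 0.1], [5.0, 5.1], [5.2, 4.9]]
clf = OnlineClusterer(radius=1.0)
print([clf.partial_fit(x) for x in stream])  # [0, 0, 1, 1]
```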


Author(s):  
Maitham D Naeemi ◽  
Johnny Ren ◽  
Nathan Hollcroft ◽  
Adam M Alessio ◽  
Sohini Roychowdhury

Author(s):  
Vidadi Akhundov

In this study, attention is drawn to the under-explored area of strategic content analysis and the development of strategic vision for managers, with the supporting role of interpreting visualized big data to apply appropriate knowledge management strategies in regional companies. The study suggests improved models that can be used to process data and apply solutions to Big Data. The paper proposes a model of business processes in the region in the context of information clusters, which become the object of analysis under conditions of active accumulation of big data about the external and internal environment. Research has shown that traditional econometric and data collection techniques cannot be applied directly to Big Data analysis because of computational volatility or computational complexity. The paper briefly describes the essence of associative and causal data analysis methods and the problems that complicate their application to Big Data. A scheme for the accelerated search of a set of causal relationships is described. The use of semantically structured models, cause-effect models and the K-clustering method for decision making on big data is practical and ensures the adequacy of the results. The article explains the stages of applying these models in practice. In the course of the study, content analysis was carried out using the main methods of processing structured data, taking the countries of the world as an example and using synthetic indicators that show Industry 4.0 trends. When assessing Industry 4.0 technologies by region, the diversity of country grouping attributes should be considered; therefore, during the analysis the countries of the world were compared in two groups. For the first group, developed countries, the results are presented in tabular form; for the second group, the results are presented in explanatory form. In assessing Industry 4.0 technologies, the following statistical indicators were used: "The share of medium and high-tech activities", "Competitiveness indicators", "Results in the field of knowledge and technology", "The share of medium and high-tech production in the total value added in the manufacturing industry" and "Industrial Competitiveness Index (CIP score)". As a result, a rating of the countries was determined based on the analysis of these indicators. The reasons for the computational difficulties of processing Big Data are given in the concluding part of the article.
Keywords: K-clustering method, causal links, data point, Euclidean distance
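To make the K-clustering step with Euclidean distance concrete, the following is a minimal k-means sketch over country-level indicator vectors. The indicator values, country labels and variable names are placeholders for illustration only, not the article's data.

```python
import numpy as np

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means with Euclidean distance: assign each data point
    to its nearest center, then move each center to the mean of its
    cluster, repeating until the centers stop changing."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Euclidean distance from every point to every center
        d = np.linalg.norm(points[:, None] - centers[None, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([points[labels == j].mean(axis=0)
                        if np.any(labels == j) else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

# Placeholder indicator vectors per country (medium/high-tech share,
# competitiveness, knowledge output, CIP score) -- illustrative only.
countries = ["A", "B", "C", "D"]
X = np.array([[0.62, 5.1, 48.0, 0.31],
              [0.58, 4.9, 44.0, 0.28],
              [0.21, 3.2, 17.0, 0.09],
              [0.19, 3.0, 15.0, 0.08]])
labels, _ = kmeans(X, k=2)
print(dict(zip(countries, labels)))  # groups similar countries together
```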

