Analysis of Artistic Modeling of Opera Stage Clothing Based on Big Data Clustering Algorithm

2021 · Vol 2021 · pp. 1-9
Author(s): Weiwei Luo

To address the problem that traditional methods for analyzing the artistry of stage costumes cannot correct big data clustering results, which leads to deviations in the extraction of costume artistry features, this paper proposes a clothing artistic modeling method based on a big data clustering algorithm. The proposed method provides a database for big data clustering by constructing the attribute set of the big data feature-sequence training set and, at the same time, constructs a second-order cone programming model to correct the clustering results. On this basis, the costume elements of the opera stage are segmented, initialized, and transformed into a binary function. Finally, a convolutional neural network combines the element segmentation results with the big data clustering space state vector to construct a feature extraction model of stage costume artistry. Experimental results show that the model converges well, runs quickly, achieves high accuracy, and has strong feature recognition capability.
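The abstract does not specify how the segmented elements are "transformed into a binary function"; the following minimal Java sketch shows one plausible reading, thresholding a grayscale element map into a binary mask. The class name, input format, and threshold value are illustrative assumptions, not details from the paper.

public final class ElementBinarizer {
    // Thresholds a grayscale element map (values 0-255) into a 0/1 mask.
    // The threshold of 128 is an assumption for illustration only.
    public static int[][] binarize(int[][] gray, int threshold) {
        int[][] mask = new int[gray.length][];
        for (int r = 0; r < gray.length; r++) {
            mask[r] = new int[gray[r].length];
            for (int c = 0; c < gray[r].length; c++) {
                // 1 = pixel belongs to a costume element, 0 = background.
                mask[r][c] = gray[r][c] >= threshold ? 1 : 0;
            }
        }
        return mask;
    }

    public static void main(String[] args) {
        int[][] gray = {{200, 30}, {140, 90}};
        System.out.println(java.util.Arrays.deepToString(binarize(gray, 128)));
        // prints [[1, 0], [1, 0]]
    }
}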

Author(s): B. K. Tripathy, Hari Seetha, M. N. Murty

Data clustering plays a very important role in data mining, machine learning, and image processing. Because modern databases carry inherent uncertainty, many uncertainty-based clustering algorithms have been developed, including fuzzy c-means, rough c-means, intuitionistic fuzzy c-means, and hybrid models such as rough fuzzy c-means and rough intuitionistic fuzzy c-means. Many variants improve these algorithms in different directions, such as their kernelised versions, possibilistic versions, and possibilistic kernelised versions. However, for various reasons, none of these algorithms is effective on big data, so researchers have been working for the past few years to adapt them so that they can be applied to cluster big data; such adapted algorithms remain relatively few compared with those for datasets of moderate size. Our aim in this chapter is to present the uncertainty-based clustering algorithms developed so far and to propose a few new algorithms that can be developed further.
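As a reference point for the algorithms surveyed above, here is a minimal single-machine fuzzy c-means sketch on one-dimensional data, assuming Euclidean distance and fuzzifier m = 2; the big data variants discussed in the chapter distribute these same membership and centre updates rather than changing them. The toy data and iteration count are invented for illustration.

import java.util.Arrays;
import java.util.Random;

public final class FuzzyCMeans {
    public static void main(String[] args) {
        double[] x = {1.0, 1.2, 0.8, 8.0, 8.3, 7.9};   // toy data, two obvious groups
        int c = 2, n = x.length;
        double m = 2.0;                                 // fuzzifier
        double[][] u = new double[c][n];                // membership matrix
        Random rnd = new Random(42);
        for (int j = 0; j < n; j++) {                   // random init, columns sum to 1
            double a = rnd.nextDouble();
            u[0][j] = a;
            u[1][j] = 1 - a;
        }
        double[] centres = new double[c];
        for (int iter = 0; iter < 50; iter++) {
            for (int i = 0; i < c; i++) {               // centre update: weighted mean
                double num = 0, den = 0;
                for (int j = 0; j < n; j++) {
                    double w = Math.pow(u[i][j], m);
                    num += w * x[j];
                    den += w;
                }
                centres[i] = num / den;
            }
            for (int j = 0; j < n; j++) {               // membership update
                for (int i = 0; i < c; i++) {
                    double dij = Math.abs(x[j] - centres[i]) + 1e-12;
                    double sum = 0;
                    for (int k = 0; k < c; k++) {
                        double dkj = Math.abs(x[j] - centres[k]) + 1e-12;
                        sum += Math.pow(dij / dkj, 2.0 / (m - 1));
                    }
                    u[i][j] = 1.0 / sum;
                }
            }
        }
        System.out.println("centres = " + Arrays.toString(centres));
    }
}

Rough c-means and the hybrid models replace the membership update with lower/upper approximation rules, but the overall alternating structure stays the same.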


2016 · pp. 1220-1243
Author(s): Ilias K. Savvas, Georgia N. Sofianidou, M-Tahar Kechadi

Big data refers to data sets whose size is beyond the capabilities of most current hardware and software technologies. The Apache Hadoop software library is a framework for distributed processing of large data sets; HDFS is a distributed file system that provides high-throughput access for data-driven applications, and MapReduce is a software framework for distributed computation over large data sets. Huge collections of raw data require fast and accurate mining processes in order to extract useful knowledge. One of the most popular data mining techniques is the K-means clustering algorithm. In this study, the authors develop a distributed version of the K-means algorithm using the MapReduce framework on the Hadoop Distributed File System. Theoretical and experimental results demonstrate the technique's efficiency; thus, HDFS and MapReduce can be applied to big data with very promising results.
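The paper's code is not shown; the sketch below illustrates how one K-means iteration maps onto Hadoop's Java MapReduce API. The current centroids are assumed to be passed through a hypothetical job configuration property ("kmeans.centroids"); points are lines of comma-separated 2-D coordinates.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class KMeansIteration {

  public static class AssignMapper
      extends Mapper<LongWritable, Text, IntWritable, Text> {
    private double[][] centroids;

    @Override
    protected void setup(Context ctx) {
      // Hypothetical property "x1,y1;x2,y2;..." set by the driver program.
      String[] rows = ctx.getConfiguration().get("kmeans.centroids").split(";");
      centroids = new double[rows.length][];
      for (int i = 0; i < rows.length; i++) {
        String[] p = rows[i].split(",");
        centroids[i] = new double[]{Double.parseDouble(p[0]), Double.parseDouble(p[1])};
      }
    }

    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      String[] p = value.toString().split(",");
      double x = Double.parseDouble(p[0]), y = Double.parseDouble(p[1]);
      int best = 0;
      double bestDist = Double.MAX_VALUE;
      for (int i = 0; i < centroids.length; i++) {      // find nearest centroid
        double dx = x - centroids[i][0], dy = y - centroids[i][1];
        double d = dx * dx + dy * dy;
        if (d < bestDist) { bestDist = d; best = i; }
      }
      ctx.write(new IntWritable(best), value);          // (cluster id, point)
    }
  }

  public static class RecomputeReducer
      extends Reducer<IntWritable, Text, IntWritable, Text> {
    @Override
    protected void reduce(IntWritable id, Iterable<Text> points, Context ctx)
        throws IOException, InterruptedException {
      double sx = 0, sy = 0;
      long n = 0;
      for (Text t : points) {                           // mean of assigned points
        String[] p = t.toString().split(",");
        sx += Double.parseDouble(p[0]);
        sy += Double.parseDouble(p[1]);
        n++;
      }
      ctx.write(id, new Text((sx / n) + "," + (sy / n)));
    }
  }
}

A driver program (omitted here) would run this job once per iteration, feeding the reducer's output centroids back into the next job's configuration until the centroids stabilize.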


Author(s): Hind Bangui, Mouzhi Ge, Barbora Buhnova

Due to the massive increase of data in different Internet of Things (IoT) domains, such as healthcare IoT and Smart City IoT, Big Data technologies have emerged as critical analytics tools for analyzing IoT data. Among these technologies, data clustering is one of the essential approaches to processing IoT data. However, how to select a suitable clustering algorithm for IoT data is still unclear. Furthermore, since Big Data technology is still at an early stage in many IoT domains, it is valuable to propose and structure the research challenges that lie between Big Data and IoT. Therefore, this article starts by reviewing and comparing the data clustering algorithms that can be applied to IoT datasets, and then extends the discussion to broader IoT contexts such as IoT dynamics and IoT mobile networks. Finally, the article identifies a set of research challenges that yield a research roadmap for Big Data research in IoT domains. The proposed roadmap aims to bridge the research gaps between Big Data and various IoT contexts.


Author(s): Padmanathan Anantharaman, H.V. Ramakrishan

As data volumes continue to grow, they quickly exceed the capacity of data warehouses and application databases, forcing IT organizations into costly upgrades of expensive database and data warehouse hardware appliances. At the same time, an enormous amount of data is being generated through the Internet of Things (IoT) as technologies advance and people use them in day-to-day activities; such data is termed Big Data, with its own characteristics and challenges. Frequent itemset mining algorithms aim to uncover frequent itemsets in a transactional database, but as dataset size increases, traditional frequent itemset mining can no longer handle it. The MapReduce programming model solves the problem of large datasets but has a large communication cost, which reduces execution efficiency. This paper proposes a new pre-processing technique: k-means clustering applied before the BigFIM algorithm. ClustBigFIM uses a hybrid approach, clustering with the k-means algorithm to generate clusters from huge datasets, and then using Apriori and Eclat to mine frequent itemsets from the generated clusters under the MapReduce programming model. Results show that the execution efficiency of the ClustBigFIM algorithm increases when k-means clustering is applied before BigFIM as a pre-processing step.
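For reference, this is what the level-wise Apriori mining at the heart of ClustBigFIM looks like in a compact, single-machine Java sketch; in ClustBigFIM this mining would run per k-means cluster under MapReduce. The transactions and support threshold are invented for illustration.

import java.util.*;

public final class AprioriSketch {
    public static void main(String[] args) {
        List<Set<String>> tx = List.of(
            Set.of("bread", "milk"),
            Set.of("bread", "milk", "eggs"),
            Set.of("milk", "eggs"),
            Set.of("bread", "eggs"));
        int minSupport = 2;                              // minimum support count

        // Level 1: candidate singletons drawn from all transactions.
        Set<Set<String>> candidates = new HashSet<>();
        for (Set<String> t : tx)
            for (String item : t) candidates.add(Set.of(item));

        while (!candidates.isEmpty()) {
            // Count support and keep candidates meeting the threshold.
            Map<Set<String>, Integer> counts = new HashMap<>();
            for (Set<String> c : candidates)
                for (Set<String> t : tx)
                    if (t.containsAll(c)) counts.merge(c, 1, Integer::sum);
            List<Set<String>> frequent = new ArrayList<>();
            counts.forEach((c, n) -> {
                if (n >= minSupport) {
                    frequent.add(c);
                    System.out.println(c + " support=" + n);
                }
            });
            // Join step: merge frequent k-itemsets into (k+1)-candidates.
            candidates = new HashSet<>();
            for (int i = 0; i < frequent.size(); i++)
                for (int j = i + 1; j < frequent.size(); j++) {
                    Set<String> merged = new HashSet<>(frequent.get(i));
                    merged.addAll(frequent.get(j));
                    if (merged.size() == frequent.get(i).size() + 1)
                        candidates.add(merged);
                }
        }
    }
}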


2019 · Vol 16 (9) · pp. 3824-3829
Author(s): Deepak Ahlawat, Deepali Gupta

Due to advances in the technological world, there has been a great surge in data, generated mainly by social websites, Internet sites, and similar sources. Large data files are combined to create a big data architecture, and managing data files at such volume is not easy, so modern techniques have been developed to manage bulk data. To organize and utilize such big data, the Hadoop Distributed File System (HDFS) architecture was presented in early 2015; this architecture is used when traditional methods are insufficient to manage the data. In this paper, a novel clustering algorithm is implemented to manage a large amount of data. The concepts and frameworks of Big Data are studied, and a novel algorithm is developed using K-means together with cosine-similarity-based clustering. The developed clustering algorithm is evaluated using precision and recall, and the results obtained successfully address the big data management issue.
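The abstract combines K-means with cosine-based similarity; a minimal sketch of the assignment step under that reading, with hypothetical helper names and toy vectors, might look as follows: each vector joins the centroid of highest cosine similarity.

public final class CosineAssign {
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb) + 1e-12);  // guard zero vectors
    }

    // Returns the index of the centroid most similar to the point.
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        double bestSim = -1;
        for (int i = 0; i < centroids.length; i++) {
            double s = cosine(point, centroids[i]);
            if (s > bestSim) { bestSim = s; best = i; }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centroids = {{1, 0}, {0, 1}};
        System.out.println(nearest(new double[]{0.9, 0.1}, centroids)); // prints 0
    }
}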


Author(s): Yao Wu, Long Zheng, Brian Heilig, Guang R Gao

As the attention given to big data grows, cluster computing systems for distributed processing of large data sets have become mainstream and a critical requirement in high-performance distributed systems research. One of the most successful systems is Hadoop, which uses MapReduce as its programming/execution model and uses disks as intermediate storage for processing huge volumes of data. Spark, an in-memory computing engine, solves iterative and interactive problems more efficiently. However, there is now a consensus that neither is the final solution for big data, owing to the MapReduce-like programming model, the synchronous execution model, the restriction to batch processing, and so on. A new solution, indeed a fundamental evolution, is needed to bring big data processing into a new era. In this paper, we introduce a new cluster computing system called HAMR, which supports both batch and stream processing. To achieve better performance, HAMR integrates high-performance computing approaches, i.e., dataflow fundamentals, into a big data solution. More specifically, HAMR is designed entirely around in-memory computing to avoid unnecessary disk access overhead; task scheduling and memory management are fine-grained to expose more parallelism; and asynchronous execution improves the efficiency of compute resource usage while balancing the workload better across the whole cluster. Experimental results show that HAMR can outperform Hadoop MapReduce and Spark by up to 19x and 7x, respectively, in the same cluster environment. Furthermore, HAMR can scale to data sizes well beyond the capabilities of Spark.
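HAMR's internals are not given in the abstract; as a language-level illustration of asynchronous, dataflow-style execution as opposed to stage-by-stage barriers, the sketch below chains stages with Java's CompletableFuture so that each partition flows downstream as soon as it is ready. All names are invented; this is not HAMR's API.

import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class AsyncDataflowSketch {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<String> partitions = List.of("part-0", "part-1", "part-2");

        // Each partition runs parse -> transform -> sink independently;
        // no partition waits for the others between stages.
        List<CompletableFuture<Void>> pipelines = partitions.stream()
            .map(p -> CompletableFuture
                .supplyAsync(() -> p + ":parsed", pool)        // stage 1
                .thenApplyAsync(s -> s + ":transformed", pool) // stage 2
                .thenAcceptAsync(System.out::println, pool))   // stage 3
            .toList();

        // Only the final join waits; the stages themselves never barrier.
        CompletableFuture.allOf(pipelines.toArray(new CompletableFuture[0])).join();
        pool.shutdown();
    }
}

Contrast this with a synchronous MapReduce-style model, where every map task must finish before any reduce task starts.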


2021 · Vol 2021 · pp. 1-10
Author(s): Wang Zhouhuo

To solve the problem of classifying large-scale human resource data, this study proposes a new parallel classification algorithm for human resource big data based on the Spark platform. On the Spark platform, the algorithm performs the distance calculations and cluster-centre updates for human resource big data and defines the big data clustering process. On this basis, the K-means clustering method is introduced to mine frequent itemsets in the data and to improve the aggregation of similar data, and a fuzzy genetic algorithm is used to assess the balance of the data. The study adopts a selective ensemble method for classifiers of imbalanced human resource data in transmission, and introduces a decision contour matrix to construct an anomaly support model for the set of such classifiers. It identifies the features of human resource big data in parallel, repairs the relevance of the data, and introduces an improved ant colony algorithm, finally realizing the parallel classification algorithm. Experimental results show that the proposed algorithm has low time cost, a good classification effect, and acceptable complexity of the parallel classification rules.
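The paper's Spark code is not shown; a minimal sketch of the distance calculation and cluster-centre update on Spark's Java RDD API, with invented data and centres, could look like this: points are assigned to their nearest centre, and centres are recomputed as per-cluster means via reduceByKey.

import java.util.Arrays;
import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public final class SparkKMeansStep {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("kmeans-step").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            List<double[]> pts = Arrays.asList(
                new double[]{1, 1}, new double[]{1.2, 0.9},
                new double[]{8, 8}, new double[]{7.9, 8.2});
            double[][] centres = {{0, 0}, {10, 10}};     // current centres
            JavaRDD<double[]> points = sc.parallelize(pts);

            points
                // (cluster id, (point, count 1)) for each point
                .mapToPair(p -> new Tuple2<>(nearest(p, centres),
                                             new Tuple2<>(p, 1L)))
                // sum coordinates and counts per cluster
                .reduceByKey((a, b) -> new Tuple2<>(
                    new double[]{a._1[0] + b._1[0], a._1[1] + b._1[1]},
                    a._2 + b._2))
                // divide sums by counts: the updated centres
                .mapValues(s -> new double[]{s._1[0] / s._2, s._1[1] / s._2})
                .collect()
                .forEach(t -> System.out.println(
                    "centre " + t._1 + " -> " + Arrays.toString(t._2)));
        }
    }

    static int nearest(double[] p, double[][] centres) {
        int best = 0;
        double bestD = Double.MAX_VALUE;
        for (int i = 0; i < centres.length; i++) {
            double dx = p[0] - centres[i][0], dy = p[1] - centres[i][1];
            double d = dx * dx + dy * dy;
            if (d < bestD) { bestD = d; best = i; }
        }
        return best;
    }
}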


2017 · Vol 7 (1.1) · pp. 237
Author(s): MD. A R Quadri, B. Sruthi, A. D. SriRam, B. Lavanya

Java is one of the finest languages for big data because of its write-once, run-anywhere nature. Java 8 introduced features such as lambda expressions and streams, which are helpful for parallel computing. Although these features help with extracting, sorting, and filtering data from collections and arrays, problems remain: streams cannot properly process very large data sets such as big data, and there are further problems when executing in a distributed environment. The streams introduced in Java are restricted to computations inside a single system; there is no mechanism for distributed computing over multiple systems. Moreover, streams hold their data in memory and therefore cannot support huge data sets. This paper addresses the use of Java 8 for massive data in a distributed environment by extending the programming model with distributed streams. Distributed computing over large data sets can then be accomplished by introducing distributed stream frameworks.
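For concreteness, this is the kind of lambda-and-streams pipeline the abstract refers to, written against the standard Java 8 API with invented sample data; the comments note the limitation the paper targets, namely that the parallelism stays inside a single JVM.

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public final class StreamsDemo {
    public static void main(String[] args) {
        List<String> events = Arrays.asList(
            "login:alice", "login:bob", "error:disk", "login:alice", "error:net");

        // parallelStream() splits work across local cores only; the data
        // must already fit in this JVM's memory.
        Map<String, Long> countsByType = events.parallelStream()
            .filter(e -> e.contains(":"))                 // lambda predicate
            .map(e -> e.substring(0, e.indexOf(':')))     // extract event type
            .collect(Collectors.groupingBy(Function.identity(),
                                           Collectors.counting()));

        System.out.println(countsByType);                 // e.g. {error=2, login=3}
    }
}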


2019 · Vol 8 (2) · pp. 2490-2494

Big data is a new technology defined by a large amount of data, from which value can be extracted through capture and analysis. Big data faces many challenges due to features such as volume, velocity, variety, value, complexity, and performance. Many organizations face challenges in devising test strategies for structured and unstructured data validation, establishing a proper testing environment, working with non-relational databases, and maintaining functional testing; these challenges lead to low-quality data in production, delays in execution, and increased cost. MapReduce provides a parallel and scalable programming model for data-intensive business and scientific applications. The performance of a big data application is defined in terms of its response time, maximum online user data capacity, and a certain maximum processing capacity. This work proposes testing healthcare big data, which contains text, image, audio, and video files. Big data documents are tested using two concepts: preprocessing testing and post-processing testing. An SVM algorithm classifies the data from unstructured to structured format. Preprocessing testing checks all the data for accuracy and includes file size testing, file extension testing, and de-duplication testing. Post-processing applies the MapReduce concept so that the data can be fetched easily.
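The abstract names three preprocessing tests; a minimal Java sketch of such checks might look as follows, where the size limit, allowed extensions, and file names are illustrative assumptions rather than values from the paper.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public final class PreprocessingTests {
    private static final long MAX_SIZE_BYTES = 100L * 1024 * 1024;   // assumed limit
    private static final Set<String> ALLOWED_EXT =
        Set.of("txt", "jpg", "png", "wav", "mp4");                   // assumed types

    public static void main(String[] args) throws Exception {
        // Placeholder paths; point these at real files to run the checks.
        List<Path> files = List.of(Path.of("note.txt"), Path.of("scan.jpg"));
        Set<String> seenDigests = new HashSet<>();
        for (Path f : files) {
            System.out.printf("%s size=%b ext=%b dup=%b%n", f,
                sizeOk(f), extensionOk(f), !isNew(f, seenDigests));
        }
    }

    static boolean sizeOk(Path f) throws IOException {
        return Files.size(f) <= MAX_SIZE_BYTES;          // file size testing
    }

    static boolean extensionOk(Path f) {                 // file extension testing
        String name = f.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot >= 0 && ALLOWED_EXT.contains(name.substring(dot + 1).toLowerCase());
    }

    static boolean isNew(Path f, Set<String> seen) throws Exception {
        // De-duplication testing: identical content hashes to the same digest.
        byte[] hash = MessageDigest.getInstance("SHA-256").digest(Files.readAllBytes(f));
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) hex.append(String.format("%02x", b));
        return seen.add(hex.toString());
    }
}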

