DETECTION AND CLASSIFICATION OF CHANGES IN EVOLVING DATA STREAMS

Author(s):  
MOHAMED MEDHAT GABER ◽  
PHILIP S. YU

Data stream mining has attracted considerable attention over the past few years owing to the significance of its applications. Streaming data often evolves over time. Capturing these changes can be used to detect an event or a phenomenon in various applications, including weather conditions, economic changes, and astronomical and other scientific phenomena. Because of the high volume and speed of data streams, it is computationally hard to capture these changes from raw data in real time. In this paper, we propose a novel algorithm, which we term STREAM-DETECT, to capture these changes in data stream distribution and/or domain using clustering result deviation. STREAM-DETECT is followed by an offline classification process, CHANGE-CLASS. This classification is concerned with associating the history of change characteristics with the observed event or phenomenon. Experimental results show the efficiency of the proposed framework in both change detection and classification accuracy.
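To make the idea of "clustering result deviation" concrete, the following is a minimal sketch and not the STREAM-DETECT algorithm itself, whose details the abstract does not give. It assumes a one-dimensional stream, fixed non-overlapping windows, a tiny k-means, and a hypothetical `threshold` on how far centroids may move between consecutive windows.

```python
from statistics import mean


def kmeans_1d(values, k, iters=20):
    """Tiny 1-D k-means with deterministic, evenly spaced initial centroids."""
    ordered = sorted(values)
    # Spread the initial centroids evenly across the sorted window.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            groups[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
    return sorted(centroids)


def detect_changes(stream, window=50, k=2, threshold=1.0):
    """Return indices of windows whose centroids moved more than `threshold`
    relative to the previous window (an assumed deviation measure)."""
    changed = []
    previous = None
    for start in range(0, len(stream) - window + 1, window):
        current = kmeans_1d(stream[start:start + window], k)
        if previous is not None:
            deviation = max(abs(a - b) for a, b in zip(previous, current))
            if deviation > threshold:
                changed.append(start // window)
        previous = current
    return changed
```

Comparing only cluster summaries between windows, rather than raw points, is what keeps the per-window cost low; the choice of deviation measure and threshold here is illustrative.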

2012 ◽  
Vol 235 ◽  
pp. 9-14
Author(s):  
Chun Hua Ju ◽  
Li Li Mao

Data stream mining has been applied in many domains, but concept drift in data streams poses a major obstacle to data mining. Current research on classification algorithms for streaming data with concept drift has achieved many successes, but it pays little attention to the recurrence of data streams, namely the situation in which a historical concept reappears. To address this characteristic, this paper proposes calculating concept similarity and reusing the classifier model of the historical concept, or of a highly similar concept, to classify and predict. In this way, no further training is needed. The approach also reduces the cost of updating the model, speeds up classification, and improves prediction efficiency.
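The reuse of historical classifiers can be illustrated with a small sketch. It is not the authors' method: the abstract does not say how concepts are represented or compared, so the numeric "signature" vector, the cosine similarity measure, the `ConceptRepository` class, and the 0.95 threshold are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ConceptRepository:
    """Stores (signature, model) pairs for previously seen concepts."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold  # hypothetical similarity cut-off
        self.entries = []

    def add(self, signature, model):
        self.entries.append((signature, model))

    def lookup(self, signature):
        """Return the most similar stored model, or None if none is
        similar enough (in which case a new model would be trained)."""
        best_model, best_score = None, self.threshold
        for stored, model in self.entries:
            score = cosine_similarity(stored, signature)
            if score >= best_score:
                best_model, best_score = model, score
        return best_model
```

When `lookup` returns a model, the incoming window can be classified immediately without retraining; a `None` result signals a concept that has not been seen before.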


2020 ◽  
Vol 8 (4) ◽  
pp. 63-73
Author(s):  
Sikha Bagui ◽  
Katie Jin

This survey performs a thorough enumeration and analysis of existing methods for data stream processing, organized around the challenges facing streaming data. The challenges addressed are preprocessing of streaming data, detecting and dealing with concept drift in streaming data, data reduction for data streams, and approximate queries and blocking operations in streaming data.


2021 ◽  
Author(s):  
Christian Nordahl ◽  
Veselka Boeva ◽  
Håkan Grahn ◽  
Marie Persson Netz

Data has become an integral part of our society in the past years, arriving faster and in larger quantities than before. Traditional clustering algorithms rely on the availability of entire datasets to model them correctly and efficiently. Such requirements cannot be met in the data stream clustering scenario, where data arrives and needs to be analyzed continuously. This paper proposes a novel evolutionary clustering algorithm, entitled EvolveCluster, capable of modeling evolving data streams. We compare EvolveCluster against two other evolutionary clustering algorithms, PivotBiCluster and Split-Merge Evolutionary Clustering, by conducting experiments on three different datasets. Furthermore, we perform additional experiments on EvolveCluster to further evaluate its capabilities on clustering evolving data streams. Our results show that EvolveCluster manages to capture evolving data stream behaviors and adapts accordingly.


Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5829 ◽  
Author(s):  
Jen-Wei Huang ◽  
Meng-Xun Zhong ◽  
Bijay Prasad Jaysawal

Outlier detection in data streams is crucial to successful data mining. However, this task is made increasingly difficult by the enormous growth in the quantity of data generated by the expansion of the Internet of Things (IoT). Recent advances in outlier detection based on density-based local outlier factor (LOF) algorithms do not consider variations in data that change over time. For example, a new cluster of data points may appear in the data stream over time. To overcome this issue, we present a novel algorithm for streaming data, referred to as time-aware density-based incremental local outlier detection (TADILOF). In addition, we have developed a means for estimating the LOF score, termed "approximate LOF," based on historical information following the removal of outdated data. The results of experiments demonstrate that TADILOF outperforms current state-of-the-art methods in terms of AUC while achieving similar performance in terms of execution time. Moreover, we present an application of the proposed scheme to the development of an air-quality monitoring system.


Author(s):  
Yu.M. Parkhomenko ◽  
G.V. Donchenko

The book describes the history of the discovery of vitamins, presents modern ideas about the properties of vitamins and their importance for humans as essential nutritional factors. General information is provided about the modern classification of vitamins, physicochemical and biological properties of water- and fat-soluble vitamins and vitamin-like compounds, their role in metabolism and, in general, in human health. The causes of hypovitaminosis are analyzed, advice is given on their prevention and storage of vitamins in food. The book is intended for specialists in the field of biology, medicine, as well as for a wide range of readers, including teachers, students and other people interested in health issues.


Author(s):  
Taegong Kim ◽  
Cheong Hee Park

Anomaly pattern detection in a data stream aims to detect a time point where outliers begin to occur abnormally. Recently, a method for anomaly pattern detection has been proposed based on binary classification of outliers and statistical tests on the data stream of binary labels (normal or outlier). It showed that an anomaly pattern can be detected accurately even when outlier detection performance is relatively low. However, since the anomaly pattern detection method is based on binary classification of outliers, most well-known outlier detection methods, which output real-valued outlier scores, cannot be used directly. In this paper, we propose an anomaly pattern detection method in a data stream that transforms real-valued outlier scores into multiple binary-valued data streams. The proposed method is tested on artificial and real data sets using three outlier detection methods: Isolation Forest (IF), autoencoder-based outlier detection, and local outlier factor (LOF). The experimental results show that anomaly pattern detection using Isolation Forest gives the best performance.
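One simple way to picture the transformation from real-valued scores to several binary streams is to apply a set of cut-offs to the same score sequence. The sketch below is an assumption about how such a transformation could look, not the procedure from the paper; the function names, the thresholds, and the windowed outlier rate are all hypothetical.

```python
def to_binary_streams(scores, thresholds):
    """Map each threshold to a 0/1 stream: 1 where the score exceeds it."""
    return {t: [1 if s > t else 0 for s in scores] for t in thresholds}


def outlier_rate(binary_stream, window):
    """Fraction of flagged points in each non-overlapping window.
    A statistical test on these rates could then look for the point
    where outliers start to occur abnormally often."""
    return [
        sum(binary_stream[i:i + window]) / window
        for i in range(0, len(binary_stream) - window + 1, window)
    ]
```

For example, with scores `[0.1, 0.2, 0.9, 0.4, 0.8, 0.95]` and thresholds `0.5` and `0.85`, the two resulting streams flag three and two points respectively, so any score-producing detector can feed a label-based test without committing to a single cut-off.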


2013 ◽  
Vol 10 (5) ◽  
pp. 1580-1586
Author(s):  
V.sidda Reddy ◽  
Dr T.V. Rao ◽  
Dr A. Govardhan

Data stream mining algorithms operate under constraints on the space used and the time taken, which arise from the streaming property. The relaxation of these constraints is inversely proportional to the streaming speed of the data. Since caching and mining streaming data is sensitive to these constraints, this paper devises a scalable, memory-efficient caching and frequent itemset mining model. The proposed model is an incremental approach that builds single-level, multi-node trees called bushes from each window of the streaming data; henceforth we refer to the proposed algorithm as Tree (bush) based Incremental Frequent Itemset Mining (TIFIM) over data streams.
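The window-by-window, incremental flavour of frequent itemset mining can be sketched as follows. This does not implement the bush structure of TIFIM, which the abstract does not specify; it uses a plain counter over small itemsets, and the class name, `max_size`, and `min_support` are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


class WindowedItemsetCounter:
    """Accumulates itemset counts one window of transactions at a time."""

    def __init__(self, max_size=2):
        self.max_size = max_size  # largest itemset size that is tracked
        self.counts = Counter()
        self.transactions = 0

    def add_window(self, window):
        """Update counts from one window without revisiting older data."""
        for transaction in window:
            items = sorted(set(transaction))
            for size in range(1, self.max_size + 1):
                self.counts.update(combinations(items, size))
            self.transactions += 1

    def frequent(self, min_support):
        """Itemsets whose relative support is at least `min_support`."""
        cutoff = min_support * self.transactions
        return {s for s, c in self.counts.items() if c >= cutoff}
```

A plain counter like this grows with the number of distinct itemsets, which is exactly the memory pressure that a compact per-window tree structure is meant to relieve.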


Author(s):  
Adil Markhaba ◽  
Islam Zhemeney ◽  
Aman K. Rakhmetullin ◽  
Kalamkas B. Bolatova ◽  
...  

The relevance of this topic lies in the analysis of the study of medieval Kazakh history. After independence was gained, the revival of national identity and the reinstatement of primitive spiritual and moral values and of the human mentality, which had been sharply suppressed during the period of the Soviet totalitarian system, became widespread. In this context, the widely discussed national-historical structure of the population, the knowledge of ethnic roots, the restoration of traditions and customs, which served as a connecting link, as well as the specificity and originality of the approach are of particular importance. Currently, the problem of objective reading, coverage, and popularisation of the ancient and medieval Kazakh history and culture is acute. Rejecting one-sided interpretations of historical events and established clichés requires impartial, academic analysis based on evidence drawn from a wide range of sources. The purpose of this study is to identify the problems of the history of Kazakhstan in the 13th-14th centuries, the general laws of world historical development and the features of the historical process, and folk traditions by using a scientific and systematic approach. Based on the systematisation and classification of data from the geographical and Arab historical records of the 13th-14th centuries, the analysis of written monuments is performed, their interdependence is established, and the degree of completeness and reliability of the data in the works of the narrative is determined in an integral system. Through scientific expeditions and research trips to Mongolia, China, and Germany, Kazakh orientalists analysed and performed the first systematic processing of archival materials and historical evidence of the early history of resettlement based on the ancient Turkic manuscript, ancient Indian, and Chinese sources that formed a picture of the proto and ancient history.
For example, the features of stone figures give an idea of the military hierarchy, military operations, the settlement of ethnic groups (ethnogeography), the worldview of the Turks, etc.


Author(s):  
Jonathan R. Eller

This book completes the biography trilogy begun in Becoming Ray Bradbury and continued in Ray Bradbury Unbound. Bradbury Beyond Apollo begins in the early 1970s, as Bradbury found himself fully established as a witness and celebrant of the Space Age. His storytelling powers were turning to stage, screen, and television adaptations of his classic midcentury titles, including The Martian Chronicles, Fahrenheit 451, Dandelion Wine, and Something Wicked This Way Comes. Although he was no longer producing a high volume of masterful tales, Bradbury Beyond Apollo chronicles how the last four decades of his life produced the playful fantasies of The Halloween Tree, his award-winning television series The Ray Bradbury Theater, a collaboration with Disney Imagineers on EPCOT’s Spaceship Earth, and significant essays on the common ground between science and religion represented by humanity’s Space Age achievements. The book also documents how Bradbury’s influential lectures, interviews, and essays explored the history of ideas, the nature of creativity, and his own evolving work ethic of optimal behaviorism. Mid-book chapters analyze Bradbury’s significant late-life achievements in fictionalized autobiography and his completion of books that originated decades earlier, including Somewhere a Band Is Playing, perhaps his most significant late-life reflection on time and memory. The book’s overarching contention is that Bradbury’s wide range of ventures were largely sustained by his ever-increasing prominence as a Space Age visionary.


2018 ◽  
Vol 8 (8) ◽  
pp. 1248 ◽  
Author(s):  
Haiqing Yao ◽  
Xiuwen Fu ◽  
Yongsheng Yang ◽  
Octavian Postolache

Outlier detection has attracted wide attention for its broad applications, such as fault diagnosis and intrusion detection; outlier analysis in data streams, which are highly uncertain and unbounded, is especially challenging. Recent major work on outlier detection has focused on the underlying principles of the local outlier factor, and there are few studies on incremental updating strategies, which are vital to outlier detection in data streams. In this paper, a novel incremental local outlier detection approach is introduced to dynamically evaluate local outliers in the data stream. An extended local neighborhood consisting of k nearest neighbors, reverse nearest neighbors and shared nearest neighbors is estimated for each data point. A theoretical analysis of the algorithm's complexity for inserting new data and deleting old data in the composite neighborhood shows that the amount of affected data in the incremental calculation is finite. Finally, experiments performed on both synthetic and real datasets verify its scalability and outlier detection accuracy. All results show that the proposed approach has comparable performance with state-of-the-art k nearest neighbor-based methods.
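The composite neighborhood described above can be illustrated with a brute-force sketch. It is not the incremental algorithm from the paper: it recomputes every neighbor list from scratch, and the definition of "shared nearest neighbors" used here (points whose k-nearest-neighbor set overlaps that of the query point) is an assumption, as are the function names.

```python
import math


def knn(points, i, k):
    """Indices of the k nearest neighbors of points[i], excluding itself.
    Ties in distance are broken by the smaller index."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: (math.dist(points[i], points[j]), j))
    return set(others[:k])


def extended_neighborhood(points, i, k):
    """Union of k nearest, reverse nearest, and shared nearest neighbors."""
    all_knn = [knn(points, j, k) for j in range(len(points))]
    nearest = all_knn[i]
    # Reverse neighbors: points that list i among their own k nearest.
    reverse = {j for j, nbrs in enumerate(all_knn) if j != i and i in nbrs}
    # Shared neighbors (assumed definition): k-NN sets that overlap i's.
    shared = {j for j, nbrs in enumerate(all_knn) if j != i and nbrs & nearest}
    return nearest | reverse | shared
```

An incremental method would update only the neighbor lists touched by an inserted or deleted point; this version costs quadratic time per call and is meant only to show which points the extended neighborhood contains.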

