Improved Macro-clusters generation using Top-k shared Micro-clusters in Data Streams

Author(s):  
LAKSHMI PRANEETHA

Data streams today are massive and change quickly. Their applications range from basic logical and scientific uses to vital business and financial ones. In the online phase, useful information is abstracted from the stream and represented as micro-clusters; in the offline phase, the micro-clusters are merged to form macro-clusters. The DBSTREAM technique captures the density between micro-clusters by means of a shared density graph in the online phase. The density data in this graph is then used in reclustering to improve cluster formation, but DBSTREAM takes more time to handle corrupted data points. In this paper, an early pruning algorithm is applied before pre-processing of the information, and a Bloom filter is used to recognize corrupted information. Our experiments on real-time datasets show that this approach improves the efficiency of macro-clusters by 90% and generates more micro-clusters within a short time.
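The abstract does not say how the Bloom filter is keyed or sized, so the following is only a minimal sketch of the general idea: known-corrupted record keys are added to a filter, and incoming points that test positive are pruned before clustering. The class, the `prune_stream` helper, and the parameters `num_bits` and `num_hashes` are hypothetical names chosen for illustration, not the authors' implementation.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive num_hashes positions by salting a SHA-256 digest with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


def prune_stream(points, known_corrupted):
    """Drop points whose key is (probably) in the filter of corrupted records."""
    # Assumption: each point is represented by a hashable string key.
    return [p for p in points if p not in known_corrupted]
```

Because a Bloom filter can report false positives, a few clean points may be pruned as well; a larger `num_bits` reduces that risk at the cost of memory.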

Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1924
Author(s):  
Patrick Seeling ◽  
Martin Reisslein ◽  
Frank H. P. Fitzek

The Tactile Internet will require ultra-low latencies for combining machines and humans in systems where humans are in the control loop. Real-time and perceptual coding in these systems commonly require content-specific approaches. We present a generic approach based on deliberately reduced number accuracy and evaluate the trade-off between savings achieved and errors introduced with real-world data for kinesthetic movement and tele-surgery. Our combination of bitplane-level accuracy adaptability with perceptual threshold-based limits allows for great flexibility in broad application scenarios. Combining the attainable savings with the relatively small introduced errors enables the optimal selection of a working point for the method in actual implementations.
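The abstract does not specify the number format or the perceptual limit, so the sketch below makes two assumptions: samples are IEEE 754 float32 values, and a relative-error threshold stands in for the perceptual threshold. It zeroes the least-significant mantissa bitplanes and searches for the largest reduction that stays within the threshold. Both function names are hypothetical.

```python
import struct


def drop_low_bitplanes(value, dropped_bits):
    """Zero the `dropped_bits` least-significant mantissa bits of a float32."""
    if not 0 <= dropped_bits <= 23:
        raise ValueError("float32 has 23 explicit mantissa bits")
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    mask = (0xFFFFFFFF << dropped_bits) & 0xFFFFFFFF
    (reduced,) = struct.unpack("<f", struct.pack("<I", raw & mask))
    return reduced


def max_droppable_bits(value, threshold):
    """Largest bit count whose relative error stays within `threshold`."""
    best = 0
    for bits in range(24):
        reduced = drop_low_bitplanes(value, bits)
        # Assumption: relative error is the stand-in for a perceptual limit.
        if value != 0 and abs(reduced - value) / abs(value) > threshold:
            break
        best = bits
    return best
```

Dropping more bitplanes saves more bits per sample but truncates the value further toward zero, which is the savings-versus-error trade-off the abstract describes.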


Entropy ◽  
2021 ◽  
Vol 23 (7) ◽  
pp. 859
Author(s):  
Abdulaziz O. AlQabbany ◽  
Aqil M. Azmi

We are living in the age of big data, a majority of which is stream data. The real-time processing of this data requires careful consideration from different perspectives. Concept drift is a change in the data’s underlying distribution, a significant issue, especially when learning from data streams. It requires learners to be adaptive to dynamic changes. Random forest is an ensemble approach that is widely used in classical non-streaming settings of machine learning applications. At the same time, the Adaptive Random Forest (ARF) is a stream learning algorithm that showed promising results in terms of its accuracy and ability to deal with various types of drift. The incoming instances’ continuity allows for their binomial distribution to be approximated by a Poisson(1) distribution. In this study, we propose a mechanism to increase such streaming algorithms’ efficiency by focusing on resampling. Our measure, resampling effectiveness (ρ), fuses the two most essential aspects in online learning: accuracy and execution time. We use six different synthetic data sets, each having a different type of drift, to empirically select the parameter λ of the Poisson distribution that yields the best value for ρ. By comparing the standard ARF with its tuned variations, we show that ARF performance can be enhanced by tackling this important aspect. Finally, we present three case studies from different contexts to test our proposed enhancement method and demonstrate its effectiveness in processing large data sets: (a) Amazon customer reviews (written in English), (b) hotel reviews (in Arabic), and (c) real-time aspect-based sentiment analysis of COVID-19-related tweets in the United States during April 2020. Results indicate that our proposed method of enhancement exhibited considerable improvement in most of the situations.
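The resampling step that λ controls can be sketched as follows. This is a generic illustration of Poisson-based online resampling, not the ARF implementation or the authors' tuning procedure: each base learner receives an incoming instance with a weight drawn from Poisson(λ), so a larger λ means more training work per instance. The function names are hypothetical, and the sampler is Knuth's multiplication method.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def resampling_weights(num_learners, lam, rng):
    """Per-learner training weight for a single incoming instance."""
    # A weight of 0 means the learner skips this instance entirely.
    return [poisson(lam, rng) for _ in range(num_learners)]
```

A call such as `resampling_weights(10, 1.0, random.Random(0))` yields ten non-negative integers whose average approaches λ over many instances, which is why λ trades accuracy against execution time.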


2021 ◽  
Author(s):  
Ahmed Al-Sabaa ◽  
Hany Gamal ◽  
Salaheldin Elkatatny

The formation porosity of drilled rock is an important parameter that determines the formation storage capacity. The common industrial technique for acquiring rock porosity is the downhole logging tool. Logging while drilling or wireline porosity logging usually provides a complete porosity log for the section of interest; however, operational constraints on the logging tool, in addition to the job cost, might preclude the logging job. The objective of this study is to provide an intelligent model that predicts porosity from the drilling parameters. An artificial neural network (ANN), a tool of artificial intelligence (AI), was employed in this study to build the porosity prediction model based on drilling parameters such as the weight on bit (WOB), drill string rotating speed (RS), drilling torque (T), stand-pipe pressure (SPP), and mud pumping rate (Q). The novel contribution of this study is a rock porosity model for complex lithology formations that uses drilling parameters in real time. The model was built using 2,700 data points from Well (A) with a 74:26 training-to-testing ratio. Many sensitivity analyses were performed to optimize the ANN model. The model was validated using an unseen data set (1,000 data points) from Well (B), which is located in the same field and was drilled across the same complex lithology. The results showed high model performance in the training, testing, and validation processes. The overall accuracy of the model was determined in terms of the correlation coefficient (R) and the average absolute percentage error (AAPE). Overall, R was higher than 0.91 and AAPE was less than 6.1% for model building and validation. Predicting rock porosity in real time while drilling will save the logging cost and will also provide a guide for formation storage capacity and interpretation analysis.
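The abstract names the two accuracy measures but does not give their formulas. The sketch below assumes the standard definitions: Pearson's correlation coefficient for R, and the mean of |actual − predicted| / |actual|, expressed in percent, for AAPE. The function names and the sample porosity values in the usage note are illustrative only.

```python
import math


def correlation_coefficient(actual, predicted):
    """Pearson correlation coefficient R between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def aape(actual, predicted):
    """Average absolute percentage error in percent; actual values must be nonzero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

For example, `aape([0.10, 0.20], [0.11, 0.18])` averages two 10% errors and so evaluates to about 10.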


1998 ◽  
Vol 88 (1) ◽  
pp. 95-106 ◽  
Author(s):  
Mitchell Withers ◽  
Richard Aster ◽  
Christopher Young ◽  
Judy Beiriger ◽  
Mark Harris ◽  
...  

Digital algorithms for robust detection of phase arrivals in the presence of stationary and nonstationary noise have a long history in seismology and have been exploited primarily to reduce the amount of data recorded by data logging systems to manageable levels. In the present era of inexpensive digital storage, however, such algorithms are increasingly being used to flag signal segments in continuously recorded digital data streams for subsequent processing by automatic and/or expert interpretation systems. In the course of our development of an automated, near-real-time, waveform correlation event-detection and location system (WCEDS), we have surveyed the abilities of such algorithms to enhance seismic phase arrivals in teleseismic data streams. Specifically, we have considered envelopes generated by energy transient (STA/LTA), Z-statistic, frequency transient, and polarization algorithms. The WCEDS system requires a set of input data streams that have a smooth, low-amplitude response to background noise and seismic coda and that contain peaks at times corresponding to phase arrivals. The algorithm used to generate these input streams from raw seismograms must perform well under a wide range of source, path, receiver, and noise scenarios. Present computational capabilities allow the application of considerably more robust algorithms than have been historically used in real time. However, highly complex calculations can still be computationally prohibitive for current workstations when the number of data streams becomes large. While no algorithm was clearly optimal under all source, receiver, path, and noise conditions tested, an STA/LTA algorithm incorporating adaptive window lengths controlled by nonstationary seismogram spectral characteristics was found to provide an output that best met the requirements of a global correlation-based event-detection and location system.
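The energy-transient idea can be sketched with the classic fixed-window STA/LTA ratio: the short-term average of signal energy divided by the long-term average, which stays near 1 on steady noise and peaks at an arrival. This is only the basic form under assumed window lengths; it does not implement the adaptive, spectrally controlled window lengths that the abstract reports as performing best. The function name and parameters are hypothetical.

```python
def sta_lta(samples, sta_len, lta_len):
    """Ratio of short-term to long-term average energy, one value per sample.

    Assumes 0 < sta_len <= lta_len; both windows end at the current sample.
    """
    energy = [s * s for s in samples]
    ratios = []
    for i in range(len(energy)):
        if i + 1 < lta_len:
            ratios.append(0.0)  # not enough history for the long window yet
            continue
        sta = sum(energy[i + 1 - sta_len : i + 1]) / sta_len
        lta = sum(energy[i + 1 - lta_len : i + 1]) / lta_len
        ratios.append(sta / lta if lta > 0 else 0.0)
    return ratios
```

On a trace of low-amplitude noise with a short high-amplitude burst, the ratio rises sharply during the burst, and a detector would flag the samples where it exceeds a chosen trigger threshold.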


2019 ◽  
Vol 6 (2) ◽  
pp. 2651-2668 ◽  
Author(s):  
Sefki Kolozali ◽  
Daniel Kuemper ◽  
Ralf Tonjes ◽  
Maria Bermudez-Edo ◽  
Nazli Farajidavar ◽  
...  