anomalous data
Recently Published Documents


Total documents: 120 (five years: 51)
H-index: 17 (five years: 3)

2022 ◽  
Vol 3 (1) ◽  
pp. 1-23
Author(s):  
Mao V. Ngo ◽  
Tie Luo ◽  
Tony Q. S. Quek

Advances in deep neural networks (DNNs) have significantly enhanced real-time detection of anomalous data in IoT applications. However, the complexity-accuracy-delay dilemma persists: complex DNN models offer higher accuracy, but typical IoT devices can barely afford the computation load, and the remedy of offloading the load to the cloud incurs long delays. In this article, we address this challenge by proposing an adaptive anomaly detection scheme with hierarchical edge computing (HEC). Specifically, we first construct multiple anomaly detection DNN models of increasing complexity and associate each of them with a corresponding HEC layer. Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved using a reinforcement learning policy network. We also incorporate a parallelism policy training method to accelerate the training process by taking advantage of distributed models. We build an HEC testbed using real IoT devices, and implement and evaluate our contextual-bandit approach with both univariate and multivariate IoT datasets. In comparison with both baseline and state-of-the-art schemes, our adaptive approach strikes the best accuracy-delay tradeoff on the univariate dataset and achieves the best accuracy and F1-score on the multivariate dataset with only negligibly longer delay than the best (but inflexible) scheme.
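
The model-selection idea in the abstract above can be illustrated with a much simpler stand-in. The sketch below is not the authors' policy network: it swaps in a tabular epsilon-greedy contextual bandit, and the accuracy table, delay costs, context labels, and all names (`ACCURACY`, `DELAY_COST`, `EpsilonGreedyBandit`, `train`) are hypothetical values chosen only to show how a reward that trades accuracy against delay can steer each input to a different HEC layer.

```python
import random

# Hypothetical toy numbers, NOT taken from the article.
# Rows: context (0 = "easy" input, 1 = "hard" input).
# Columns: HEC layer (0 = IoT device, 1 = edge server, 2 = cloud).
ACCURACY = [[0.95, 0.96, 0.97],
            [0.60, 0.85, 0.97]]
DELAY_COST = [0.00, 0.05, 0.20]


def reward(context, layer):
    """Accuracy minus a delay penalty: the trade-off the agent learns."""
    return ACCURACY[context][layer] - DELAY_COST[layer]


class EpsilonGreedyBandit:
    """Tabular stand-in for a policy: one value estimate per (context, layer)."""

    def __init__(self, n_contexts, n_actions, epsilon=0.1, seed=0):
        # Optimistic initial values (above any achievable reward) make the
        # greedy rule try every action at least once per context.
        self.q = [[1.0] * n_actions for _ in range(n_contexts)]
        self.counts = [[0] * n_actions for _ in range(n_contexts)]
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def select(self, context, explore=True):
        # With probability epsilon pick a random layer, otherwise the best one.
        if explore and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q[context]))
        row = self.q[context]
        return max(range(len(row)), key=row.__getitem__)

    def update(self, context, action, r):
        # Incremental sample average of the observed rewards.
        self.counts[context][action] += 1
        step = 1.0 / self.counts[context][action]
        self.q[context][action] += step * (r - self.q[context][action])


def train(rounds=500):
    bandit = EpsilonGreedyBandit(n_contexts=2, n_actions=3)
    for t in range(rounds):
        context = t % 2  # alternate easy and hard inputs
        layer = bandit.select(context)
        bandit.update(context, layer, reward(context, layer))
    return bandit


bandit = train()
print(bandit.select(0, explore=False), bandit.select(1, explore=False))
```

Under these toy numbers the greedy choice keeps easy inputs on the device and sends hard inputs to the edge layer, because the cloud's extra accuracy does not pay for its delay penalty. A learned policy network would replace the lookup table when contexts are continuous feature vectors.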


2021 ◽  
Vol 32 (1) ◽  
Author(s):  
Umberto Amato ◽  
Anestis Antoniadis ◽  
Italia De Feis ◽  
Irène Gijbels

This article studies M-type estimators for fitting robust additive models in the presence of anomalous data. The components in the additive model are allowed to have different degrees of smoothness. We introduce a new class of wavelet-based robust M-type estimators for performing simultaneous additive component estimation and variable selection in such inhomogeneous additive models. Each additive component is approximated by a truncated series expansion of wavelet bases, making it feasible to apply the method to nonequispaced data and sample sizes that are not necessarily a power of 2. Sparsity of the additive components, together with sparsity of the wavelet coefficients within each component (group), results in a bi-level group variable selection problem. In this framework, we discuss robust estimation and variable selection. A two-stage computational algorithm, consisting of a fast accelerated proximal gradient algorithm of coordinate descent type, and thresholding, is proposed. When using nonconvex redescending loss functions, and appropriate nonconvex penalty functions at the group level, we establish optimal convergence rates of the estimates. We prove variable selection consistency under a weak compatibility condition for sparse additive models. The theoretical results are complemented with some simulations and real data analysis, as well as a comparison to other existing methods.
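
Two building blocks named in the abstract above, a redescending loss and thresholding at the coefficient and group levels, can be sketched in a few lines. This is an illustration under assumptions rather than the authors' estimator: the Tukey biweight is used as one standard example of a redescending score function, and the thresholding operators shown are the convex (lasso and group-lasso) proximal maps, whereas the article works with nonconvex penalties. The function names are hypothetical.

```python
import math


def tukey_psi(residual, c=4.685):
    """Tukey biweight score function: redescends to 0 for |residual| > c,
    so gross outliers have no influence on the fit."""
    if abs(residual) > c:
        return 0.0
    u = residual / c
    return residual * (1.0 - u * u) ** 2


def soft_threshold(x, lam):
    """Proximal map of lam * |x|: shrink toward zero, exactly zero if |x| <= lam."""
    return math.copysign(max(abs(x) - lam, 0.0), x)


def group_soft_threshold(coeffs, lam):
    """Proximal map of lam * ||coeffs||_2: shrinks or removes a whole group,
    which is how an entire additive component can be dropped."""
    norm = math.sqrt(sum(c * c for c in coeffs))
    if norm <= lam:
        return [0.0] * len(coeffs)
    scale = 1.0 - lam / norm
    return [scale * c for c in coeffs]
```

Applying `group_soft_threshold` to the coefficients of one component and `soft_threshold` to individual coefficients within a surviving group mirrors the bi-level selection described above, with the convex operators standing in for their nonconvex counterparts.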


2021 ◽  
Vol 11 (22) ◽  
pp. 10861
Author(s):  
Lucas A. da Silva ◽  
Eulanda M. dos Santos ◽  
Leo Araújo ◽  
Natalia S. Freire ◽  
Max Vasconcelos ◽  
...  

Data-driven methods—particularly machine learning techniques—are expected to play a key role in the headway of Industry 4.0. One increasingly popular application in this context is when anomaly detection is employed to test manufactured goods in assembly lines. In this work, we compare supervised, semi/weakly-supervised, and unsupervised strategies to detect anomalous sequences in video samples which may be indicative of defective televisions assembled in a factory. We compare 3D autoencoders, convolutional neural networks, and generative adversarial networks (GANs) with data collected in a laboratory. Our methodology to simulate anomalies commonly found in TV devices is discussed in this paper. We also propose an approach to generate anomalous sequences similar to those produced by a defective device as part of our GAN approach. Our results show that autoencoders perform poorly when trained with only non-anomalous data—which is important because class imbalance in industrial applications is typically skewed towards the non-anomalous class. However, we show that fine-tuning the GAN is a feasible approach to overcome this problem, achieving results comparable to those of supervised methods.


2021 ◽  
Vol 11 (10) ◽  
pp. 639
Author(s):  
Sabine Meister ◽  
Annette Upmeier zu Belzen

In this study, we investigated participants’ reactions to supportive and anomalous data in the context of population dynamics. Based on previous findings on conceptions about ecosystems and responses to anomalous data, we assumed a tendency to confirm the initial prediction after dealing with contradicting data. Our aim was to integrate a product-based analysis, operationalized as prediction group changes, with process-based analyses of individual data-based scientific reasoning processes to gain a deeper insight into the ongoing cognitive processes. Based on a theoretical framework describing a data-based scientific reasoning process, we developed an instrument assessing initial and subsequent predictions, confidence change toward these predictions, and the subprocesses data appraisal, data explanation, and data interpretation. We analyzed the data of twenty pre-service biology teachers applying a mixed-methods approach. Our results show that participants tend to maintain their initial prediction fully or change to predictions associated with a mix of different conceptions. Maintenance was observed even though most participants were able to use sophisticated conceptual knowledge during their processes of data-based scientific reasoning. Furthermore, our findings point to the role of confidence changes and the influence of test wiseness.


2021 ◽  
Vol 13 (19) ◽  
pp. 3869
Author(s):  
Lu Lee ◽  
Chunqiang Wu ◽  
Chengli Qi ◽  
Xiuqing Hu ◽  
Mingge Yuan ◽  
...  

The deep-space (DS) view spectra are used as a cold reference to calibrate the Hyperspectral Infrared Atmospheric Sounder (HIRAS) Earth scene (ES) observations. The stability of the DS spectra within the moving average window is crucial to the calibration accuracy of ES radiances. In the winter and spring seasons, however, the HIRAS detector-3 DS view is susceptible to solar stray light intrusion when the satellite flies towards the tail of every descending orbit. As a result, the measured DS spectra are contaminated by the stray light pseudo spectra, especially in the short-wave infrared (SWIR) band. The solar light intrusion issue was addressed on 13 December 2019, when the DS view angle of the scene selection mirror (SSM) was adjusted from −77.4° to −87°. For the historic contaminated data, a correction method is applied that detects the anomalous data by checking the continuity of the DS spectra and then replaces them with the proximate normal ones. The historic ES observations are recalibrated after the contaminated DS spectra correction. The effect of the correction is assessed by comparing the recalibrated HIRAS radiances with those measured by the Cross-track Infrared Sounder onboard the Suomi National Polar-orbiting Partnership Satellite (SNPP/CrIS) via the extended simultaneous nadir overpasses (SNOx) technique and by checking the consistency among the radiance data from different HIRAS detectors. The results show that the large biases of the radiance brightness temperature (BT) caused by the contamination are greatly reduced, to the levels observed under normal conditions.
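
The detect-and-replace step described above can be sketched generically. The abstract does not give the actual continuity test, so the version below is an assumption: each sample of a one-dimensional series (standing in for a per-scan DS signal) is compared with the median of its neighbours, and a flagged sample is replaced by the nearest unflagged one. The function name, window size, tolerance, and sample values are all hypothetical.

```python
from statistics import median


def correct_discontinuities(values, window=5, tol=3.0):
    """Flag samples that break continuity with their neighbours and replace
    each flagged sample with the nearest unflagged ("proximate normal") one."""
    n = len(values)
    half = window // 2
    flagged = []
    for i, v in enumerate(values):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        neighbours = values[lo:i] + values[i + 1:hi]
        # A sample with no neighbours cannot be tested, so it is kept.
        flagged.append(bool(neighbours) and abs(v - median(neighbours)) > tol)

    corrected = list(values)
    for i in range(n):
        if not flagged[i]:
            continue
        # Search outward (earlier sample first) for the nearest normal sample.
        for d in range(1, n):
            candidates = [j for j in (i - d, i + d)
                          if 0 <= j < n and not flagged[j]]
            if candidates:
                corrected[i] = values[candidates[0]]
                break
    return corrected, flagged


# Hypothetical series with one contaminated sample at index 3.
series = [10.0, 10.1, 9.9, 25.0, 10.0, 10.2, 9.8]
fixed, flags = correct_discontinuities(series)
print(fixed)
print(flags)
```

Using the median of the neighbours rather than their mean keeps a single contaminated sample from also causing its normal neighbours to be flagged.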


Author(s):  
Siriwan Phongsasiri ◽  
Suwanna Rasmequan

In this paper, the Probabilistic Mapped Mean-Shift Algorithm is proposed to detect anomalous data in public datasets and local hospital children’s wellness clinic databases. The proposed framework consists of two main parts. First, the Probabilistic Mapping step consists of k-NN instance acquisition, data distribution calculation, and data point repositioning. A Truncated Gaussian Distribution (TGD) was used to control the boundary of the mapped points. Second, the Outlier Detection step consists of outlier score calculation and outlier selection. Experimental results show that the proposed algorithm outperformed the existing algorithms on real-world benchmark datasets and a Children’s Wellness Clinic dataset (CWD). Outlier detection accuracy obtained from the proposed algorithm on the Wellness, Stamps, Arrhythmia, Pima, and Parkinson datasets was 93%, 94%, 80%, 75%, and 72%, respectively.


Author(s):  
Amar Prajapati ◽  
Airi Palva ◽  
Ingemar von Ossowski ◽  
Vengadesan Krishnan

Adhesion to host surfaces for bacterial survival and colonization involves a variety of molecular mechanisms. Ligilactobacillus ruminis, a strict anaerobe and gut autochthonous (indigenous) commensal, relies on sortase-dependent pili (LrpCBA) for adherence to the intestinal inner walls, thereby withstanding luminal content flow. Here, the LrpCBA pilus is a promiscuous binder to gut collagen, fibronectin and epithelial cells. Structurally, the LrpCBA pilus displays a representative hetero-oligomeric arrangement and consists of three types of pilin subunit, each with its own location and function, i.e. tip LrpC for adhesion, basal LrpB for anchoring and backbone LrpA for length. To provide further structural insights into the assembly, anchoring and functional mechanisms of sortase-dependent pili, each of the L. ruminis pilus proteins was produced recombinantly for crystallization and X-ray diffraction analysis. Crystals of LrpC, LrpB, LrpA and truncated LrpA generated by limited proteolysis were obtained and diffracted to resolutions of 3.0, 1.5, 2.2 and 1.4 Å, respectively. Anomalous data were also collected from crystals of selenomethionine-substituted LrpC and an iodide derivative of truncated LrpA. Successful strategies for protein production, crystallization and derivatization are reported.


Robotics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 93
Author(s):  
Luigi Pangione ◽  
Guy Burroughes ◽  
Robert Skilton

For robotic systems involved in challenging environments, it is crucial to be able to identify faults as early as possible. In challenging environments, it is not always possible to explore all of the fault space, thus anomalous data can act as a broader surrogate, where an anomaly may represent a fault or a predecessor to a fault. This paper proposes a method for identifying anomalous data from a robot, whilst using minimal nominal data for training. A Monte Carlo ensemble sampled Variational AutoEncoder was utilised to determine nominal and anomalous data through reconstructing live data. This was tested on simulated anomalies of real data, demonstrating that the technique is capable of reliably identifying an anomaly without any previous knowledge of the system. With the proposed system, we obtained an F1-score of 0.85 through testing.


Author(s):  
Sajid Nazir ◽  
Shushma Patel ◽  
Dilip Patel

Supervisory control and data acquisition (SCADA) systems are industrial control systems that are used to monitor critical infrastructures such as airports, transport, health, and public services of national importance. These are cyber-physical systems, which are increasingly integrated with networks and Internet of Things devices. However, this results in a larger attack surface for cyber threats, making it important to identify and thwart cyber-attacks by detecting anomalous network traffic patterns. Unlike other techniques, machine learning can detect new and evolving threats as well as known attack patterns. Autoencoders are a type of neural network that generates a compressed representation of its input data and, through the reconstruction loss of inputs, can help identify anomalous data. This paper proposes the use of autoencoders for unsupervised anomaly-based intrusion detection using an appropriate differentiating threshold from the loss distribution, and demonstrates improvements in results compared to other techniques on a SCADA gas pipeline dataset.
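
Choosing a differentiating threshold from the loss distribution can be sketched as follows. The abstract does not say how the threshold is derived, so this example assumes one common rule, the mean plus `k` standard deviations of the reconstruction losses observed on normal traffic. The autoencoder itself is left out, and the loss values and function names are hypothetical.

```python
from statistics import mean, pstdev


def loss_threshold(normal_losses, k=3.0):
    """Threshold = mean + k * std of reconstruction losses on normal data."""
    return mean(normal_losses) + k * pstdev(normal_losses)


def flag_anomalies(losses, threshold):
    """A sample is flagged when its reconstruction loss exceeds the threshold."""
    return [loss > threshold for loss in losses]


# Hypothetical reconstruction losses from an autoencoder trained on normal traffic.
normal_losses = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08, 0.11]
threshold = loss_threshold(normal_losses)

# Hypothetical losses for new traffic; the second sample reconstructs poorly.
new_losses = [0.11, 0.40, 0.09]
print(threshold, flag_anomalies(new_losses, threshold))
```

A percentile of the normal-loss distribution is a common alternative when the losses are heavily skewed. Either way, the threshold trades false alarms against missed attacks.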


2021 ◽  
Vol 56 (2) ◽  
pp. 178-216
Author(s):  
Ciaran B. Trace ◽  
Yan Zhang
