Periodicity Detection Method for Small-Sample Time Series Datasets

2010 ◽  
Vol 4 ◽  
pp. BBI.S5983 ◽  
Author(s):  
Daisuke Tominaga

Time series of gene expression often exhibit periodic behavior under the influence of multiple signal pathways, and are represented by a model that incorporates multiple harmonics and noise. Most of these data, which are observed using DNA microarrays, consist of only a few sampling points in time, yet most periodicity detection methods require a relatively large number of sampling points. We have previously developed a detection algorithm based on the discrete Fourier transform and Akaike's information criterion. Here we demonstrate the performance of the algorithm on small-sample time series data through a comparison with conventional and newly proposed periodicity detection methods based on a statistical analysis of the power of harmonics. We show that this method has higher sensitivity for data consisting of multiple harmonics, and is more robust against noise than other methods. Although “combinatorial explosion” occurs for large datasets, the computational time is not a problem for small-sample datasets. The MATLAB/GNU Octave script of the algorithm is available on the author's web site: http://www.cbrc.jp/%7Etominaga/piccolo/ .
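The model-selection idea behind such an algorithm can be sketched as follows — a minimal illustration of harmonics-plus-AIC selection, not the author's piccolo script: for each candidate number of harmonics k, fit an intercept plus the first k Fourier harmonics by least squares, then pick the k that minimizes AIC (k = 0 meaning no detectable periodicity).

```python
import numpy as np

def aic_periodicity(y, max_h):
    """Pick the number of Fourier harmonics by AIC (0 = no periodicity)."""
    n = len(y)
    t = np.arange(n)
    best_aic, best_k = np.inf, 0
    for k in range(max_h + 1):
        # design matrix: intercept + first k Fourier harmonics
        cols = [np.ones(n)]
        for h in range(1, k + 1):
            cols += [np.cos(2 * np.pi * h * t / n),
                     np.sin(2 * np.pi * h * t / n)]
        X = np.column_stack(cols)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ beta) ** 2))
        # AIC with 2k + 1 fitted coefficients (small floor avoids log 0)
        aic = n * np.log(rss / n + 1e-12) + 2 * (2 * k + 1)
        if aic < best_aic:
            best_aic, best_k = aic, k
    return best_k
```

For a pure sinusoid sampled over one full period this selects k = 1, while for a constant series the AIC penalty keeps k = 0, which is how the criterion trades fit quality against model size on short series.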

2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Hitoshi Iuchi ◽  
Michiaki Hamada

Abstract Time-course experiments using parallel sequencers have the potential to uncover gradual changes in cells over time that cannot be observed in a two-point comparison. An essential step in time-series data analysis is the identification of temporal differentially expressed genes (TEGs) under two conditions (e.g. control versus case). Model-based approaches, which are typical TEG detection methods, often set one parameter (e.g. polynomial degree or degrees of freedom) per dataset. This approach risks modeling linearly increasing genes with higher-order functions, or fitting cyclic gene expression with linear functions, thereby leading to false positives/negatives. Here, we present a Jonckheere–Terpstra–Kendall (JTK)-based non-parametric algorithm for TEG detection. Benchmarks using simulated data show that the JTK-based approach outperforms existing methods, especially in long time-series experiments. Additionally, applying JTK to time-series RNA-seq data from seven tissue types, across developmental stages in mouse and rat, suggested that JTK's TEG identification is driven by the wave pattern of expression rather than by differences in expression levels. This result suggests that JTK is a suitable algorithm when the focus is on expression patterns over time rather than expression levels, such as in comparisons between different species. These results show that JTK is an excellent candidate for TEG detection.
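The Jonckheere–Terpstra statistic at the core of this family of tests can be computed directly; the following is a minimal sketch of the statistic only (the actual JTK algorithm and its null-distribution handling are more involved):

```python
def jt_statistic(groups):
    """Jonckheere-Terpstra statistic for ordered groups (e.g. time points).

    Counts, over every ordered pair of groups i < j, the pairs
    (x in group i, y in group j) with x < y; ties count 0.5.
    Large values indicate a monotone increasing trend.
    """
    jt = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            for x in groups[i]:
                for y in groups[j]:
                    jt += (x < y) + 0.5 * (x == y)
    return jt
```

A perfectly increasing series attains the maximum (all cross-group pairs count), a perfectly decreasing one attains 0; being rank-based, the statistic depends only on the ordering, not on expression levels, which matches the pattern-over-level behavior described above.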


Author(s):  
Kei Ishida ◽  
Masato Kiyama ◽  
Ali Ercan ◽  
Motoki Amagasaki ◽  
Tongbi Tu

Abstract This study proposes two effective approaches to reducing the computational time required to train a recurrent neural network (RNN) for time-series modeling using multi-time-scale time-series data as input. One approach provides coarse and fine temporal resolutions of the input time series to the RNN in parallel. The other concatenates the coarse and fine temporal resolutions of the input time series over time before feeding them to the RNN. In both approaches, the finer-resolution data are first utilized to learn the fine temporal-scale behavior of the target data; the coarser-resolution data are then expected to capture long-duration dependencies between the input and target variables. The proposed approaches were implemented for hourly rainfall–runoff modeling at a snow-dominated watershed by employing a long short-term memory (LSTM) network, a type of RNN. The daily and hourly meteorological data were utilized as the input, and hourly flow discharge was considered as the target data. The results confirm that both proposed approaches significantly reduce the computational time required to train the RNN. Moreover, one of the approaches considerably improves estimation accuracy in addition to computational efficiency.
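The concatenation variant can be sketched as follows — a minimal illustration under assumed window sizes, not the authors' implementation: daily means summarize the older part of the record, raw hourly values cover the most recent part, and the two are joined into one shorter input sequence.

```python
import numpy as np

def serial_input(hourly, n_coarse_days, n_fine_hours):
    """Concatenate coarse (daily-mean) and fine (hourly) windows in time."""
    # coarse part: daily means over the older portion of the record
    days = hourly[:n_coarse_days * 24].reshape(n_coarse_days, 24)
    coarse = days.mean(axis=1)
    # fine part: raw hourly values over the most recent portion
    fine = hourly[-n_fine_hours:]
    return np.concatenate([coarse, fine])
```

The payoff is sequence length: n_coarse_days + n_fine_hours steps instead of 24 x n_coarse_days + n_fine_hours, which is what shortens backpropagation through time during training.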


2019 ◽  
pp. 147592171988711
Author(s):  
Wen-Jun Cao ◽  
Shanli Zhang ◽  
Numa J Bertola ◽  
I F C Smith ◽  
C G Koh

Train wheel flats are formed when wheels slip on rails. Crucial for passenger comfort and the safe operation of train systems, early detection and quantification of wheel-flat severity without interrupting railway operations is a desirable and challenging goal. Our method identifies the wheel-flat size using a model-updating strategy based on dynamic measurements. Although measurement and modelling uncertainties influence the identification results, most wheel-flat detection methods rarely take them into account. Another challenge is the interpretation of time-series data from multiple sensors. In this article, the size of the wheel flat is identified using a model-falsification approach that explicitly includes both measurement and modelling uncertainties. A two-step important-point selection method is proposed to interpret high-dimensional time series in the context of inverse identification. Perceptually important points, which are consistent with the human visual identification process, are extracted and further selected using joint entropy as an information-gain metric. The proposed model-based methodology is applied to a field train-track test in Singapore. The results show that the wheel-flat size identified using the proposed methodology is within the range of true observations. They also show that including measurement and modelling uncertainties is essential for accurately evaluating the wheel-flat size, because identification without uncertainties may lead to its underestimation.
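Perceptually important point (PIP) extraction, the first of the two selection steps, can be sketched as a greedy loop (a common formulation using vertical distance; the article's exact variant and the joint-entropy step are not reproduced here):

```python
import numpy as np

def pip(series, n_points):
    """Greedy perceptually-important-point selection.

    Starts from the two endpoints and repeatedly adds the sample
    farthest (in vertical distance) from the line joining its two
    neighbouring PIPs, mimicking how a human eye picks salient points.
    """
    n = len(series)
    keep = [0, n - 1]
    while len(keep) < n_points:
        keep.sort()
        best_i, best_d = None, -1.0
        for a, b in zip(keep[:-1], keep[1:]):
            for i in range(a + 1, b):
                t = (i - a) / (b - a)
                line = series[a] + t * (series[b] - series[a])
                d = abs(series[i] - line)
                if d > best_d:
                    best_i, best_d = i, d
        if best_i is None:
            break  # no interior samples left
        keep.append(best_i)
    keep.sort()
    return keep
```

On a flat signal with one spike, the spike is picked immediately after the endpoints, which is exactly the visually salient feature an impact from a wheel flat produces in a vibration trace.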


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Zhiwen Xiao ◽  
Jianbin Jiao

Fraud detection technology is an important means of ensuring financial security. Explainable fraud detection methods are needed to convey clear causal reasoning to the participants in a transaction. The main contribution of our work is an explainable classification method in the framework of multiple instance learning (MIL), which incorporates the AP clustering method into a self-training LSTM model to obtain a clear explanation. Based on a real-world dataset and a simulated dataset, we conducted two comparative studies to evaluate the effectiveness of the proposed method. Experimental results show that our method achieves predictive performance similar to that of the state-of-the-art method, while also generating clear causal explanations from only a few labeled time series. The significance of this work is that financial institutions can use the method to efficiently identify fraudulent behavior and readily give reasons for rejecting transactions, thereby reducing fraud losses and management costs.
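The MIL framing can be sketched in miniature as follows — an illustrative toy, not the authors' model: a transaction series is a bag of windows (instances), the bag is scored by its most suspicious window, and that window doubles as the explanation shown to the participant.

```python
import numpy as np

def score_bag(series, window, instance_scorer):
    """MIL scoring: bag = transaction series, instances = windows.

    The bag score is the maximum instance score; the arg-max window
    is returned as the (explanatory) evidence for the decision.
    """
    scores = [instance_scorer(series[s:s + window])
              for s in range(len(series) - window + 1)]
    k = int(np.argmax(scores))
    return scores[k], (k, k + window)
```

With a toy instance scorer such as the within-window range (np.ptp), a lone spike in an otherwise flat series is both the reason the bag is flagged and the evidence reported, which is the explainability payoff of the MIL framing.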


Cryptography ◽  
2021 ◽  
Vol 5 (4) ◽  
pp. 28
Author(s):  
Hossein Sayadi ◽  
Yifeng Gao ◽  
Hosein Mohammadi Makrani ◽  
Jessica Lin ◽  
Paulo Cesar Costa ◽  
...  

According to recent security analysis reports, malicious software (a.k.a. malware) is rising at an alarming rate in number, complexity, and harmful purpose, compromising the security of modern computer systems. Recently, malware detection based on low-level hardware features (e.g., Hardware Performance Counter (HPC) information) has emerged as an effective alternative that addresses the complexity and performance overheads of traditional software-based detection methods. Hardware-assisted Malware Detection (HMD) techniques rely on standard Machine Learning (ML) classifiers to detect signatures of malicious applications by monitoring built-in HPC registers during execution at run-time. Prior HMD methods, though effective, have limited their study to detecting malicious applications that are spawned as separate threads during application execution; detecting stealthy malware patterns at run-time therefore remains a critical challenge. Stealthy malware refers to harmful cyber attacks in which malicious code is hidden within benign applications and remains undetected by traditional malware detection approaches. In this paper, we first present a comprehensive review of recent advances in hardware-assisted malware detection studies that have used standard ML techniques to detect malware signatures. Next, to address the challenge of stealthy malware detection at the processor's hardware level, we propose StealthMiner, a novel specialized time-series machine learning approach that accurately detects traces of stealthy malware at run-time using branch instructions, the most prominent HPC feature. StealthMiner is based on a lightweight time-series Fully Convolutional Neural Network (FCN) model that automatically identifies potentially contaminated samples in HPC-based time-series data and utilizes them to accurately recognize the trace of stealthy malware.
Our analysis demonstrates that state-of-the-art ML-based malware detection methods are not effective at detecting stealthy malware samples, since the captured HPC data represent not only the malware but also the benign applications' microarchitectural behavior. The experimental results demonstrate that, with the aid of our novel intelligent approach, stealthy malware can be detected at run-time with 94% detection performance on average using only one HPC feature, outperforming the detection performance of state-of-the-art HMD and general time-series classification methods by up to 42% and 36%, respectively.
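The FCN idea can be sketched in miniature as follows — an illustrative toy with random weights on a synthetic branch-instruction series, not the trained StealthMiner model: stacked 1-D convolutions with ReLU, global average pooling, and a sigmoid producing a malware probability for a time series of any length.

```python
import numpy as np

def conv1d(x, w, b):
    """Valid-mode 1-D convolution (cross-correlation)."""
    k = len(w)
    return np.array([x[i:i + k] @ w + b for i in range(len(x) - k + 1)])

def fcn_score(x, layers):
    """Stacked conv+ReLU layers, global average pooling, sigmoid."""
    h = x
    for w, b in layers:
        h = np.maximum(conv1d(h, w, b), 0.0)
    z = h.mean()                      # global average pooling
    return 1.0 / (1.0 + np.exp(-z))  # "malware" probability

# toy run: random HPC-like series, random (untrained) filter weights
rng = np.random.default_rng(0)
series = rng.normal(size=50)
layers = [(rng.normal(size=5), 0.0), (rng.normal(size=3), 0.0)]
prob = fcn_score(series, layers)
```

Global average pooling is the design choice that lets a fully convolutional classifier consume HPC traces of varying length without fixed-size inputs or dense layers.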


Author(s):  
Daisuke Miki ◽  
Kazuyuki Demachi

Abstract Bearings are one of the main components of rotating machinery, and their failure is one of the most common causes of mechanical failure. Therefore, many fault detection methods based on artificial intelligence, such as machine learning and deep learning, have been proposed. In particular, with recent advances in deep learning, many anomaly detection methods based on deep neural networks (DNNs) have been proposed. DNNs provide high-performance recognition and are easy to implement; however, optimizing them requires large annotated datasets, and the annotation of time-series data, such as abnormal vibration signals, is time consuming. To solve these problems, we propose a method to automatically extract features of abnormal vibration signals from time-series data. In this research, we propose a new DNN training method and fault detection method inspired by multi-instance learning, together with a new loss function for optimizing the DNN model that identifies anomalies in time-series data. Furthermore, to evaluate the feasibility of automatic feature extraction from vibration signal data using the proposed method, we conducted experiments to determine whether anomalies could be detected, identified, and localized in published datasets.
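A multi-instance training objective of this kind can be sketched as follows (an illustrative formulation, not the authors' loss): each vibration record is a bag of segments, the bag's anomaly probability is the maximum segment score, and binary cross-entropy is applied at the bag level, so only record-level labels are needed.

```python
import numpy as np

def mil_bag_loss(segment_scores, bag_label):
    """Binary cross-entropy on the bag probability.

    The bag (whole vibration record) probability is the maximum
    per-segment anomaly score, so segment-level annotation is
    never required; clipping keeps the logs finite.
    """
    p = float(np.clip(np.max(segment_scores), 1e-7, 1 - 1e-7))
    return -(bag_label * np.log(p) + (1 - bag_label) * np.log(1 - p))
```

Because only the maximal segment carries the gradient for an abnormal bag, training pushes the network to localize which segment of the signal is anomalous — the automatic feature extraction the abstract describes.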


Author(s):  
Baoquan Wang ◽  
Tonghai Jiang ◽  
Xi Zhou ◽  
Bo Ma ◽  
Fan Zhao ◽  
...  

For anomaly detection in time-series data, supervised methods require labeled data. For existing semi-supervised methods, the range of the outlier factor varies with the data, model, and time, so the threshold for determining abnormality is difficult to obtain; moreover, computing outlier factors from the other data points in the dataset is computationally expensive. These issues make such methods difficult to apply in practice. This paper proposes a framework named LSTM-VE, which uses clustering combined with a visualization method to roughly label normal data, and then trains a long short-term memory (LSTM) neural network on the normal data for semi-supervised anomaly detection. The variance error (VE) of the normal-category classification probability sequence is used as the outlier factor. The framework makes deep-learning-based anomaly detection practically applicable, and using VE avoids the shortcomings of existing outlier factors while achieving better performance. In addition, the framework is easy to extend because the LSTM neural network can be replaced with other classification models. Experiments on labeled and real unlabeled datasets show that the framework outperforms replicator neural networks with reconstruction error (RNN-RS) and has good scalability as well as practicality.
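The VE outlier factor can be sketched as follows — a minimal illustration under an assumed sliding window; the classifier producing the probabilities (an LSTM in the paper) is trained only on the roughly-labelled normal data:

```python
import numpy as np

def ve_flags(prob_seq, window, threshold):
    """Flag windows whose variance error (VE) exceeds a threshold.

    prob_seq: per-step probability of the 'normal' class emitted by
    a classifier trained on roughly-labelled normal data. Anomalous
    stretches make the probabilities fluctuate, raising the
    windowed variance.
    """
    return [bool(np.var(prob_seq[s:s + window]) > threshold)
            for s in range(len(prob_seq) - window + 1)]
```

Because VE is a variance of probabilities, it is bounded and comparable across datasets, which is what makes a fixed threshold workable where raw outlier factors drift with data, model, and time.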


Author(s):  
W. Liu ◽  
J. Yang ◽  
J. Zhao ◽  
H. Shi ◽  
L. Yang

Most existing change detection methods using full polarimetric synthetic aperture radar (PolSAR) are limited to detecting change between two points in time. In this paper, a novel method is proposed to detect change based on time-series data from different sensors. First, the overall difference image of a time-series of PolSAR images is calculated by an omnibus statistic test. Second, difference images between any two images at different times are acquired by the R_j statistic test. Finally, a generalized Gaussian mixture model (GGMM) is used to obtain the time-series change detection maps. To verify the effectiveness of the proposed method, we carried out change detection experiments using time-series PolSAR images acquired by Radarsat-2 and Gaofen-3 over the city of Wuhan, China. The results show that the proposed method can detect time-series change from different sensors.
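The final mixture-model step can be sketched as follows — a crude illustration in which ordinary Gaussians stand in for the generalized Gaussians of a GGMM, fitted by EM to split difference-image values into no-change and change classes:

```python
import numpy as np

def em_two_component(x, iters=100):
    """Label values change/no-change with a 2-component mixture.

    x: flattened difference-image values. Ordinary Gaussians stand
    in for generalized Gaussians; means start at the data extremes
    so component 0 tracks low (no-change) and 1 high (change) values.
    """
    mu = np.array([x.min(), x.max()], dtype=float)
    var = np.array([x.var(), x.var()]) + 1e-6
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibilities of the two components
        d = -0.5 * (x[:, None] - mu) ** 2 / var
        p = pi * np.exp(d) / np.sqrt(2 * np.pi * var)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: update weights, means, variances
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return r.argmax(axis=1)  # 0 = no-change, 1 = change
```

A GGMM replaces each Gaussian density with a generalized Gaussian (an extra shape parameter), which fits the heavy-tailed statistics of SAR difference images better than the sketch above.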

