Periodicity Detection Method for Small-Sample Time Series Datasets

2010 ◽  
Vol 4 ◽  
pp. BBI.S5983 ◽  
Author(s):  
Daisuke Tominaga

Time series of gene expression often exhibit periodic behavior under the influence of multiple signal pathways, and are represented by a model that incorporates multiple harmonics and noise. Most of these data, which are observed using DNA microarrays, consist of only a few sampling points in time, yet most periodicity detection methods require a relatively large number of sampling points. We have previously developed a detection algorithm based on the discrete Fourier transform and Akaike's information criterion. Here we demonstrate the performance of the algorithm on small-sample time series data through a comparison with conventional and newly proposed periodicity detection methods based on a statistical analysis of the power of harmonics. We show that this method has higher sensitivity for data consisting of multiple harmonics, and is more robust against noise than other methods. Although “combinatorial explosion” occurs for large datasets, the computational time is not a problem for small-sample datasets. The MATLAB/GNU Octave script of the algorithm is available on the author's web site: http://www.cbrc.jp/%7Etominaga/piccolo/ .
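The model-selection idea behind such an algorithm can be sketched as follows — a minimal illustration of harmonics-plus-AIC selection, not the author's piccolo script: for each candidate number of harmonics k, fit an intercept plus the first k Fourier harmonics by least squares, then pick the k that minimizes AIC (k = 0 meaning no detectable periodicity).

```python
import numpy as np

def aic_periodicity(y, max_h):
    """Pick the number of Fourier harmonics by AIC (0 = no periodicity)."""
    n = len(y)
    t = np.arange(n)
    best_aic, best_k = np.inf, 0
    for k in range(max_h + 1):
        # design matrix: intercept + first k Fourier harmonics
        cols = [np.ones(n)]
        for h in range(1, k + 1):
            cols += [np.cos(2 * np.pi * h * t / n),
                     np.sin(2 * np.pi * h * t / n)]
        X = np.column_stack(cols)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ beta) ** 2))
        # AIC with 2k + 1 fitted coefficients (small floor avoids log 0)
        aic = n * np.log(rss / n + 1e-12) + 2 * (2 * k + 1)
        if aic < best_aic:
            best_aic, best_k = aic, k
    return best_k
```

For a pure sinusoid sampled over one full period this selects k = 1, while for a constant series the AIC penalty keeps k = 0, which is how the criterion trades fit quality against model size on short series.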

2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Hitoshi Iuchi ◽  
Michiaki Hamada

Abstract Time-course experiments using parallel sequencers have the potential to uncover gradual changes in cells over time that cannot be observed in a two-point comparison. An essential step in time-series data analysis is the identification of temporal differentially expressed genes (TEGs) under two conditions (e.g. control versus case). Model-based approaches, which are typical TEG detection methods, often set one parameter (e.g. polynomial degree or degrees of freedom) per dataset. This approach risks modeling linearly increasing genes with higher-order functions, or fitting cyclic gene expression with linear functions, thereby leading to false positives/negatives. Here, we present a Jonckheere–Terpstra–Kendall (JTK)-based non-parametric algorithm for TEG detection. Benchmarks using simulated data show that the JTK-based approach outperforms existing methods, especially in long time-series experiments. Additionally, applying JTK to time-series RNA-seq data from seven tissue types, across developmental stages in mouse and rat, suggested that JTK's TEG identification is driven by the wave pattern of expression rather than by differences in expression levels. This result suggests that JTK is a suitable algorithm when the focus is on expression patterns over time rather than expression levels, such as in comparisons between different species. These results show that JTK is an excellent candidate for TEG detection.
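The Jonckheere–Terpstra statistic at the core of this family of tests can be computed directly; the following is a minimal sketch of the statistic only (the actual JTK algorithm and its null-distribution handling are more involved):

```python
def jt_statistic(groups):
    """Jonckheere-Terpstra statistic for ordered groups (e.g. time points).

    Counts, over every ordered pair of groups i < j, the pairs
    (x in group i, y in group j) with x < y; ties count 0.5.
    Large values indicate a monotone increasing trend.
    """
    jt = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            for x in groups[i]:
                for y in groups[j]:
                    jt += (x < y) + 0.5 * (x == y)
    return jt
```

A perfectly increasing series attains the maximum (all cross-group pairs count), a perfectly decreasing one attains 0; being rank-based, the statistic depends only on the ordering, not on expression levels, which matches the pattern-over-level behavior described above.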


Author(s):  
Kei Ishida ◽  
Masato Kiyama ◽  
Ali Ercan ◽  
Motoki Amagasaki ◽  
Tongbi Tu

Abstract This study proposes two effective approaches to reducing the computational time required to train a recurrent neural network (RNN) for time-series modeling using multi-time-scale time-series data as input. One approach provides coarse and fine temporal resolutions of the input time series to the RNN in parallel. The other concatenates the coarse and fine temporal resolutions of the input time series over time before feeding them to the RNN. In both approaches, the finer-resolution data are first utilized to learn the fine temporal-scale behavior of the target data; the coarser-resolution data are then expected to capture long-duration dependencies between the input and target variables. The proposed approaches were implemented for hourly rainfall–runoff modeling at a snow-dominated watershed by employing a long short-term memory (LSTM) network, a type of RNN. The daily and hourly meteorological data were utilized as the input, and hourly flow discharge was considered as the target data. The results confirm that both proposed approaches significantly reduce the computational time required to train the RNN. Moreover, one of the approaches considerably improves estimation accuracy in addition to computational efficiency.
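The concatenation variant can be sketched as follows — a minimal illustration under assumed window sizes, not the authors' implementation: daily means summarize the older part of the record, raw hourly values cover the most recent part, and the two are joined into one shorter input sequence.

```python
import numpy as np

def serial_input(hourly, n_coarse_days, n_fine_hours):
    """Concatenate coarse (daily-mean) and fine (hourly) windows in time."""
    # coarse part: daily means over the older portion of the record
    days = hourly[:n_coarse_days * 24].reshape(n_coarse_days, 24)
    coarse = days.mean(axis=1)
    # fine part: raw hourly values over the most recent portion
    fine = hourly[-n_fine_hours:]
    return np.concatenate([coarse, fine])
```

The payoff is sequence length: n_coarse_days + n_fine_hours steps instead of 24 x n_coarse_days + n_fine_hours, which is what shortens backpropagation through time during training.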


2019 ◽  
pp. 147592171988711
Author(s):  
Wen-Jun Cao ◽  
Shanli Zhang ◽  
Numa J Bertola ◽  
I F C Smith ◽  
C G Koh

Train wheel flats are formed when wheels slip on rails. Crucial for passenger comfort and the safe operation of train systems, early detection and quantification of wheel-flat severity without interrupting railway operations is a desirable and challenging goal. Our method identifies the wheel-flat size using a model-updating strategy based on dynamic measurements. Although measurement and modelling uncertainties influence the identification results, most wheel-flat detection methods rarely take them into account. Another challenge is the interpretation of time-series data from multiple sensors. In this article, the size of the wheel flat is identified using a model-falsification approach that explicitly includes both measurement and modelling uncertainties. A two-step important-point selection method is proposed to interpret high-dimensional time series in the context of inverse identification. Perceptually important points, which are consistent with the human visual identification process, are extracted and further selected using joint entropy as an information-gain metric. The proposed model-based methodology is applied to a field train-track test in Singapore. The results show that the wheel-flat size identified using the proposed methodology is within the range of true observations. They also show that including measurement and modelling uncertainties is essential for accurately evaluating the wheel-flat size, because identification without uncertainties may lead to its underestimation.
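Perceptually important point (PIP) extraction, the first of the two selection steps, can be sketched as a greedy loop (a common formulation using vertical distance; the article's exact variant and the joint-entropy step are not reproduced here):

```python
import numpy as np

def pip(series, n_points):
    """Greedy perceptually-important-point selection.

    Starts from the two endpoints and repeatedly adds the sample
    farthest (in vertical distance) from the line joining its two
    neighbouring PIPs, mimicking how a human eye picks salient points.
    """
    n = len(series)
    keep = [0, n - 1]
    while len(keep) < n_points:
        keep.sort()
        best_i, best_d = None, -1.0
        for a, b in zip(keep[:-1], keep[1:]):
            for i in range(a + 1, b):
                t = (i - a) / (b - a)
                line = series[a] + t * (series[b] - series[a])
                d = abs(series[i] - line)
                if d > best_d:
                    best_i, best_d = i, d
        if best_i is None:
            break  # no interior samples left
        keep.append(best_i)
    keep.sort()
    return keep
```

On a flat signal with one spike, the spike is picked immediately after the endpoints, which is exactly the visually salient feature an impact from a wheel flat produces in a vibration trace.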


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Zhiwen Xiao ◽  
Jianbin Jiao

Fraud detection technology is an important means of ensuring financial security. Explainable fraud detection methods are needed to convey clear causal reasoning to the participants in a transaction. The main contribution of our work is an explainable classification method in the framework of multiple instance learning (MIL), which incorporates the AP clustering method into a self-training LSTM model to obtain a clear explanation. Based on a real-world dataset and a simulated dataset, we conducted two comparative studies to evaluate the effectiveness of the proposed method. Experimental results show that our method achieves predictive performance similar to that of the state-of-the-art method, while also generating clear causal explanations from only a few labeled time series. The significance of this work is that financial institutions can use the method to efficiently identify fraudulent behavior and readily give reasons for rejecting transactions, thereby reducing fraud losses and management costs.
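The MIL framing can be sketched in miniature as follows — an illustrative toy, not the authors' model: a transaction series is a bag of windows (instances), the bag is scored by its most suspicious window, and that window doubles as the explanation shown to the participant.

```python
import numpy as np

def score_bag(series, window, instance_scorer):
    """MIL scoring: bag = transaction series, instances = windows.

    The bag score is the maximum instance score; the arg-max window
    is returned as the (explanatory) evidence for the decision.
    """
    scores = [instance_scorer(series[s:s + window])
              for s in range(len(series) - window + 1)]
    k = int(np.argmax(scores))
    return scores[k], (k, k + window)
```

With a toy instance scorer such as the within-window range (np.ptp), a lone spike in an otherwise flat series is both the reason the bag is flagged and the evidence reported, which is the explainability payoff of the MIL framing.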


Cryptography ◽  
2021 ◽  
Vol 5 (4) ◽  
pp. 28
Author(s):  
Hossein Sayadi ◽  
Yifeng Gao ◽  
Hosein Mohammadi Makrani ◽  
Jessica Lin ◽  
Paulo Cesar Costa ◽  
...  

According to recent security analysis reports, malicious software (a.k.a. malware) is rising at an alarming rate in number, complexity, and harmful purpose, compromising the security of modern computer systems. Recently, malware detection based on low-level hardware features (e.g., Hardware Performance Counter (HPC) information) has emerged as an effective alternative that addresses the complexity and performance overheads of traditional software-based detection methods. Hardware-assisted Malware Detection (HMD) techniques rely on standard Machine Learning (ML) classifiers to detect signatures of malicious applications by monitoring built-in HPC registers during execution at run-time. Prior HMD methods, though effective, have limited their study to detecting malicious applications that are spawned as separate threads during application execution; detecting stealthy malware patterns at run-time therefore remains a critical challenge. Stealthy malware refers to harmful cyber attacks in which malicious code is hidden within benign applications and remains undetected by traditional malware detection approaches. In this paper, we first present a comprehensive review of recent advances in hardware-assisted malware detection studies that have used standard ML techniques to detect malware signatures. Next, to address the challenge of stealthy malware detection at the processor's hardware level, we propose StealthMiner, a novel specialized time-series machine learning approach that accurately detects traces of stealthy malware at run-time using branch instructions, the most prominent HPC feature. StealthMiner is based on a lightweight time-series Fully Convolutional Neural Network (FCN) model that automatically identifies potentially contaminated samples in HPC-based time-series data and utilizes them to accurately recognize the trace of stealthy malware.
Our analysis demonstrates that state-of-the-art ML-based malware detection methods are not effective at detecting stealthy malware samples, since the captured HPC data represent not only the malware but also the benign applications' microarchitectural behavior. The experimental results demonstrate that, with the aid of our novel intelligent approach, stealthy malware can be detected at run-time with 94% detection performance on average using only one HPC feature, outperforming the detection performance of state-of-the-art HMD and general time-series classification methods by up to 42% and 36%, respectively.
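The FCN idea can be sketched in miniature as follows — an illustrative toy with random weights on a synthetic branch-instruction series, not the trained StealthMiner model: stacked 1-D convolutions with ReLU, global average pooling, and a sigmoid producing a malware probability for a time series of any length.

```python
import numpy as np

def conv1d(x, w, b):
    """Valid-mode 1-D convolution (cross-correlation)."""
    k = len(w)
    return np.array([x[i:i + k] @ w + b for i in range(len(x) - k + 1)])

def fcn_score(x, layers):
    """Stacked conv+ReLU layers, global average pooling, sigmoid."""
    h = x
    for w, b in layers:
        h = np.maximum(conv1d(h, w, b), 0.0)
    z = h.mean()                      # global average pooling
    return 1.0 / (1.0 + np.exp(-z))  # "malware" probability

# toy run: random HPC-like series, random (untrained) filter weights
rng = np.random.default_rng(0)
series = rng.normal(size=50)
layers = [(rng.normal(size=5), 0.0), (rng.normal(size=3), 0.0)]
prob = fcn_score(series, layers)
```

Global average pooling is the design choice that lets a fully convolutional classifier consume HPC traces of varying length without fixed-size inputs or dense layers.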


Author(s):  
Daisuke Miki ◽  
Kazuyuki Demachi

Abstract Bearings are one of the main components of rotating machinery, and their failure is one of the most common causes of mechanical failure. Therefore, many fault detection methods based on artificial intelligence, such as machine learning and deep learning, have been proposed. In particular, with recent advances in deep learning, many anomaly detection methods based on deep neural networks (DNNs) have been proposed. DNNs provide high-performance recognition and are easy to implement; however, optimizing them requires large annotated datasets, and the annotation of time-series data, such as abnormal vibration signals, is time consuming. To solve these problems, we propose a method to automatically extract features of abnormal vibration signals from time-series data. In this research, we propose a new DNN training method and fault detection method inspired by multi-instance learning, together with a new loss function for optimizing the DNN model that identifies anomalies in time-series data. Furthermore, to evaluate the feasibility of automatic feature extraction from vibration signal data using the proposed method, we conducted experiments to determine whether anomalies could be detected, identified, and localized in published datasets.
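A multi-instance training objective of this kind can be sketched as follows (an illustrative formulation, not the authors' loss): each vibration record is a bag of segments, the bag's anomaly probability is the maximum segment score, and binary cross-entropy is applied at the bag level, so only record-level labels are needed.

```python
import numpy as np

def mil_bag_loss(segment_scores, bag_label):
    """Binary cross-entropy on the bag probability.

    The bag (whole vibration record) probability is the maximum
    per-segment anomaly score, so segment-level annotation is
    never required; clipping keeps the logs finite.
    """
    p = float(np.clip(np.max(segment_scores), 1e-7, 1 - 1e-7))
    return -(bag_label * np.log(p) + (1 - bag_label) * np.log(1 - p))
```

Because only the maximal segment carries the gradient for an abnormal bag, training pushes the network to localize which segment of the signal is anomalous — the automatic feature extraction the abstract describes.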


Author(s):  
Baoquan Wang ◽  
Tonghai Jiang ◽  
Xi Zhou ◽  
Bo Ma ◽  
Fan Zhao ◽  
...  

For anomaly detection in time-series data, supervised methods require labeled data. For existing semi-supervised methods, the range of the outlier factor varies with the data, model, and time, so the threshold for determining abnormality is difficult to obtain; moreover, computing outlier factors from the other data points in the dataset is computationally expensive. These issues make such methods difficult to apply in practice. This paper proposes a framework named LSTM-VE, which uses clustering combined with a visualization method to roughly label normal data, and then trains a long short-term memory (LSTM) neural network on the normal data for semi-supervised anomaly detection. The variance error (VE) of the normal-category classification probability sequence is used as the outlier factor. The framework makes deep-learning-based anomaly detection practically applicable, and using VE avoids the shortcomings of existing outlier factors while achieving better performance. In addition, the framework is easy to extend because the LSTM neural network can be replaced with other classification models. Experiments on labeled and real unlabeled datasets show that the framework outperforms replicator neural networks with reconstruction error (RNN-RS) and has good scalability as well as practicality.
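The VE outlier factor can be sketched as follows — a minimal illustration under an assumed sliding window; the classifier producing the probabilities (an LSTM in the paper) is trained only on the roughly-labelled normal data:

```python
import numpy as np

def ve_flags(prob_seq, window, threshold):
    """Flag windows whose variance error (VE) exceeds a threshold.

    prob_seq: per-step probability of the 'normal' class emitted by
    a classifier trained on roughly-labelled normal data. Anomalous
    stretches make the probabilities fluctuate, raising the
    windowed variance.
    """
    return [bool(np.var(prob_seq[s:s + window]) > threshold)
            for s in range(len(prob_seq) - window + 1)]
```

Because VE is a variance of probabilities, it is bounded and comparable across datasets, which is what makes a fixed threshold workable where raw outlier factors drift with data, model, and time.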


Author(s):  
W. Liu ◽  
J. Yang ◽  
J. Zhao ◽  
H. Shi ◽  
L. Yang

Most existing change detection methods using full polarimetric synthetic aperture radar (PolSAR) are limited to detecting change between two points in time. In this paper, a novel method is proposed to detect change based on time-series data from different sensors. First, the overall difference image of a time-series of PolSAR images is calculated by an omnibus statistic test. Second, difference images between any two images at different times are acquired by the R_j statistic test. Finally, a generalized Gaussian mixture model (GGMM) is used to obtain the time-series change detection maps. To verify the effectiveness of the proposed method, we carried out change detection experiments using time-series PolSAR images acquired by Radarsat-2 and Gaofen-3 over the city of Wuhan, China. The results show that the proposed method can detect time-series change from different sensors.
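The final mixture-model step can be sketched as follows — a crude illustration in which ordinary Gaussians stand in for the generalized Gaussians of a GGMM, fitted by EM to split difference-image values into no-change and change classes:

```python
import numpy as np

def em_two_component(x, iters=100):
    """Label values change/no-change with a 2-component mixture.

    x: flattened difference-image values. Ordinary Gaussians stand
    in for generalized Gaussians; means start at the data extremes
    so component 0 tracks low (no-change) and 1 high (change) values.
    """
    mu = np.array([x.min(), x.max()], dtype=float)
    var = np.array([x.var(), x.var()]) + 1e-6
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibilities of the two components
        d = -0.5 * (x[:, None] - mu) ** 2 / var
        p = pi * np.exp(d) / np.sqrt(2 * np.pi * var)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: update weights, means, variances
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return r.argmax(axis=1)  # 0 = no-change, 1 = change
```

A GGMM replaces each Gaussian density with a generalized Gaussian (an extra shape parameter), which fits the heavy-tailed statistics of SAR difference images better than the sketch above.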

