Big Machinery Data Preprocessing Methodology for Data-Driven Models in Prognostics and Health Management

Sensors ◽  
2021 ◽  
Vol 21 (20) ◽  
pp. 6841
Author(s):  
Sergio Cofre-Martel ◽  
Enrique Lopez Droguett ◽  
Mohammad Modarres

Sensor monitoring networks and advances in big data analytics have guided the reliability engineering landscape to a new era of big machinery data. Low-cost sensors, along with the evolution of the Internet of Things and Industry 4.0, have resulted in rich databases that can be analyzed through prognostics and health management (PHM) frameworks. Several data-driven models (DDMs) have been proposed and applied for diagnostics and prognostics purposes in complex systems. However, many of these models are developed using simulated or experimental data sets, and there is still a knowledge gap for applications in real operating systems. Furthermore, little attention has been given to the required data preprocessing steps compared to the training processes of these DDMs. To date, research works have not followed a formal and consistent data preprocessing guideline for PHM applications. This paper presents a comprehensive step-by-step pipeline for the preprocessing of monitoring data from complex systems aimed at DDMs. The importance of expert knowledge is discussed in the context of data selection and label generation. Two case studies are presented for validation, with the end goal of creating clean data sets with healthy and unhealthy labels that are then used to train machinery health state classifiers.
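A preprocessing pipeline of this kind can be sketched as follows. This is a minimal illustration with hypothetical column names, a synthetic fault time, and ad hoc thresholds, not the paper's actual methodology: it drops dead sensor channels, interpolates short gaps, standardizes each channel, and labels samples near a known fault as unhealthy.

```python
import numpy as np
import pandas as pd

def preprocess_phm(df, time_col, fault_time, horizon):
    """Sketch of a PHM preprocessing pipeline for monitoring data."""
    sensor_cols = [c for c in df.columns if c != time_col]
    # 1. Drop constant (dead) sensor channels
    keep = [c for c in sensor_cols if df[c].nunique() > 1]
    df = df[[time_col] + keep].copy()
    # 2. Fill short gaps by linear interpolation (at most 3 samples)
    df[keep] = df[keep].interpolate(limit=3)
    # 3. Standardize each sensor channel (z-score)
    df[keep] = (df[keep] - df[keep].mean()) / df[keep].std(ddof=0)
    # 4. Label samples within `horizon` time units before the fault as unhealthy
    df["label"] = np.where(
        (df[time_col] > fault_time - horizon) & (df[time_col] <= fault_time),
        "unhealthy", "healthy")
    return df

# Tiny synthetic run: two live sensors, one dead channel, fault at t = 10
raw = pd.DataFrame({
    "t": np.arange(12, dtype=float),
    "vib": np.linspace(0.1, 2.0, 12),
    "temp": [20.0] * 5 + [np.nan] + [21.0] * 6,
    "dead": [0.0] * 12,
})
clean = preprocess_phm(raw, "t", fault_time=10.0, horizon=3.0)
```

In a real application, the selection of channels, interpolation limits, and labeling horizon would come from the expert knowledge the paper emphasizes.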

Author(s):  
Zhimin Xi ◽  
Rong Jing ◽  
Pingfeng Wang ◽  
Chao Hu

This paper develops a Copula-based sampling method for data-driven prognostics and health management (PHM). The principal idea is to first build a statistical relationship between failure time and the time realizations at specified degradation levels on the basis of off-line training data sets, then identify possible failure times for on-line testing units based on the constructed statistical model and available on-line testing data. Specifically, three technical components are proposed to implement the methodology. First, a generic health index system is proposed to represent the health degradation of engineering systems. Next, a Copula-based model is proposed to capture the statistical relationship between failure time and the time realizations at specified degradation levels. Finally, a sampling approach is proposed to estimate the failure time and remaining useful life (RUL) of on-line testing units. Two case studies, including a bearing system in electric cooling fans and the 2008 IEEE PHM challenge problem, are employed to demonstrate the effectiveness of the proposed methodology.


2013 ◽  
Vol 2013 ◽  
pp. 1-11
Author(s):  
Dewang Chen ◽  
Long Chen

In order to obtain a decent trade-off between low-cost, low-accuracy Global Positioning System (GPS) receivers and the requirements of high-precision digital maps for modern railways, using the concept of constraint K-segment principal curves (CKPCS) and expert knowledge on railways, we propose three practical CKPCS generation algorithms with reduced computational complexity, making them more suitable for engineering applications. The three algorithms are named ALLopt, MPMopt, and DCopt, in which ALLopt exploits global optimization while MPMopt and DCopt apply local optimization with different initial solutions. We compare the three practical algorithms according to their performance on average projection error, stability, and fitness for simple and complex simulated trajectories with noisy data. It is found that ALLopt only works well for simple curves and small data sets. The other two algorithms work better for complex curves and large data sets. Moreover, MPMopt runs faster than DCopt, but DCopt works better for some curves with cross points. The three algorithms are also applied in generating GPS digital maps for two railway GPS data sets measured on the Qinghai-Tibet Railway (QTR). Results similar to those on the synthetic data are obtained. Because the trajectory of a railway is relatively simple and straight, we conclude that MPMopt works best according to comprehensive considerations of computation speed and the quality of the generated CKPCS. MPMopt can be used to obtain some key points to represent a large amount of GPS data. Hence, it can greatly reduce data storage requirements and increase positioning speed for real-time digital map applications.


2021 ◽  
Author(s):  
Tamim Ahmed ◽  
Kowshik Thopalli ◽  
Thanassis Rikakis ◽  
Pavan Turaga ◽  
Aisling Kelliher ◽  
...  

We are developing a system for long-term Semi-Automated Rehabilitation At the Home (SARAH) that relies on low-cost and unobtrusive video-based sensing. We present a cyber-human methodology used by the SARAH system for automated assessment of upper extremity stroke rehabilitation at the home. We propose a hierarchical model for automatically segmenting stroke survivors' movements and generating training task performance assessment scores during rehabilitation. The hierarchical model fuses expert therapist knowledge-based approaches with data-driven techniques. The expert knowledge is more observable in the higher layers of the hierarchy (task and segment) and therefore more accessible to algorithms incorporating high-level constraints relating to activity structure (i.e., the type and order of segments per task). We utilize an HMM and a Decision Tree model to connect these high-level priors to data-driven analysis. The lower layers (RGB images and raw kinematics) need to be addressed primarily through data-driven techniques. We use a transformer-based architecture operating on low-level action features (tracking of individual body joints and objects) and a Multi-Stage Temporal Convolutional Network (MS-TCN) operating on raw RGB images. We develop a sequence that combines these complementary algorithms effectively, thus encoding the information from different layers of the movement hierarchy. Through this combination, we produce robust segmentation and task assessment results on noisy, variable, and limited data, which is characteristic of low-cost video capture of rehabilitation at the home. Our proposed approach achieves 85% accuracy in per-frame labeling, 99% accuracy in segment classification, and 93% accuracy in task completion assessment.
Although the methodology proposed in this paper applies to upper extremity rehabilitation using the SARAH system, it can potentially be used, with minor alterations, to assist automation in many other movement rehabilitation contexts (e.g., lower-extremity training after neurological injury).
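The link between per-frame labeling and segment-level reasoning can be sketched in a few lines. The movement labels and window size below are hypothetical, and the majority-vote smoothing is a simple stand-in for the paper's HMM/Decision Tree priors; it only illustrates how contiguity constraints turn noisy per-frame output into segments.

```python
from itertools import groupby

def smooth(frame_labels, k=5):
    """Majority vote over a sliding window -- a simple way to impose
    the prior that segments are contiguous runs of one activity."""
    out = []
    for i in range(len(frame_labels)):
        win = frame_labels[max(0, i - k // 2): i + k // 2 + 1]
        out.append(max(set(win), key=win.count))
    return out

def frames_to_segments(frame_labels):
    """Collapse a per-frame label sequence into (label, start, end) runs,
    the segment-level representation used by the upper hierarchy layers."""
    segments, i = [], 0
    for label, run in groupby(frame_labels):
        n = len(list(run))
        segments.append((label, i, i + n - 1))
        i += n
    return segments

# Noisy per-frame output: spurious flickers inside the "reach" phase
noisy = ["reach"] * 10 + ["grasp"] + ["reach"] * 2 + ["grasp"] * 12 + ["release"] * 8
segs = frames_to_segments(smooth(noisy))
```

The actual system replaces the majority vote with learned models and adds task-level constraints on segment order.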


2019 ◽  
Vol 11 (1) ◽  
Author(s):  
Khaled Akkad

Remaining useful life (RUL) estimation is one of the most important aspects of prognostics and health management (PHM). Various deep learning (DL) based techniques have been developed and applied for the purposes of RUL estimation. One limitation of DL is the lack of physical interpretation, as DL models are purely data-driven. Another limitation is the need for an exceedingly large amount of data to arrive at acceptable pattern recognition performance for the purposes of RUL estimation. This research aims to overcome these limitations by developing physics-based DL techniques for RUL prediction and validating the method with real run-to-failure datasets. The contribution of the research lies in creating hybrid DL-based techniques as well as combining physics-based approaches with DL techniques for effective RUL prediction.
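One common way to combine physics with a data-driven RUL model is through the loss function. The sketch below is an assumption about how such a hybrid could look, not the method of this research: it adds to the usual data-fit term a physics penalty enforcing that predicted RUL decreases one-for-one with operating time (d(RUL)/dt = -1).

```python
import numpy as np

def hybrid_rul_loss(rul_pred, rul_true, t, lam=1.0):
    """Physics-informed RUL loss sketch.
    Data term: MSE against run-to-failure labels.
    Physics term: penalize deviation of finite differences of the
    prediction from the ideal slope d(RUL)/dt = -1."""
    data_loss = np.mean((rul_pred - rul_true) ** 2)
    drul_dt = np.diff(rul_pred) / np.diff(t)
    physics_loss = np.mean((drul_dt + 1.0) ** 2)
    return data_loss + lam * physics_loss

# A perfect linear degradation prediction incurs zero loss; a biased
# prediction with the right slope is penalized by the data term only.
t = np.linspace(0.0, 90.0, 10)
rul_true = 100.0 - t
loss_perfect = hybrid_rul_loss(rul_true, rul_true, t)
loss_noisy = hybrid_rul_loss(rul_true + 5.0, rul_true, t)
```

In a trained network the same composite loss would be applied to mini-batch predictions, with the physics weight lam tuned on validation data.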


2021 ◽  
Vol 9 ◽  
Author(s):  
Xingang Zhao ◽  
Junyung Kim ◽  
Kyle Warns ◽  
Xinyan Wang ◽  
Pradeep Ramuhalli ◽  
...  

In a carbon-constrained world, future uses of nuclear power technologies can contribute to climate change mitigation, as the installed electricity generating capacity and range of applications could be much greater and more diverse than with the current plants. To preserve the nuclear industry's competitiveness in the global energy market, prognostics and health management (PHM) of plant assets is expected to be important for supporting and sustaining improvements in the economics of operating nuclear power plants (NPPs) while maintaining their high availability. Of interest are long-term operation of the legacy fleet to 80 years through subsequent license renewals and economic operation of new builds of either light water reactors or advanced reactor designs. Recent advances in data-driven analysis methods—largely represented by those in artificial intelligence and machine learning—have enhanced applications ranging from robust anomaly detection to automated control and autonomous operation of complex systems. NPP equipment PHM is one area where the application of these algorithmic advances can significantly improve the ability to perform asset management. This paper provides an updated method-centric review of the full PHM suite in NPPs focusing on data-driven methods and advances since the last major survey article was published in 2015. The main approaches and the state of practice are described, including those for the tasks of data acquisition, condition monitoring, diagnostics, prognostics, and planning and decision-making. Research advances in non-nuclear power applications are also included to assess findings that may be applicable to the nuclear industry, along with the opportunities and challenges when adapting these developments to NPPs. Finally, this paper identifies key research needs regarding data availability and quality, verification and validation, and uncertainty quantification.


2020 ◽  
Vol 6 (5) ◽  
pp. eaaw0961 ◽  
Author(s):  
S. Gerber ◽  
L. Pospisil ◽  
M. Navandar ◽  
I. Horenko

Finding reliable discrete approximations of complex systems is a key prerequisite when applying many of the most popular modeling tools. Common discretization approaches (e.g., the very popular K-means clustering) are crucially limited in terms of quality, parallelizability, and cost. We introduce a low-cost, improved-quality scalable probabilistic approximation (SPA) algorithm, allowing for simultaneous data-driven optimal discretization, feature selection, and prediction. We prove its optimality, parallel efficiency, and a linear scalability of iteration cost. Cross-validated applications of SPA to a range of large realistic data classification and prediction problems reveal marked cost and performance improvements. For example, SPA allows data-driven next-day predictions of resimulated surface temperatures for Europe with a mean prediction error of 0.75°C on a common PC (being around 40% better in terms of errors and five to six orders of magnitude cheaper than with common computational instruments used by the weather services).
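The core object SPA optimizes is a factorization of the data matrix into K "box" coordinates and row-stochastic assignment weights. The sketch below illustrates that factorization on toy data using softmax soft assignments as a simple stand-in; it is not the scalable SPA optimizer proved optimal in the paper, and all data and parameters here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: three well-separated 2-D clusters, 70 points each
X = np.vstack([rng.normal(c, 0.3, size=(70, 2))
               for c in ([0.0, 0.0], [3.0, 0.0], [0.0, 3.0])])

def spa_like_discretization(X, S, iters=50, beta=5.0):
    """Approximate X (n, d) by G @ S, with G (n, K) row-stochastic
    assignment weights and S (K, d) the K box coordinates. Soft
    assignments via a softmax of negative squared distances -- an
    illustrative stand-in, not the SPA algorithm itself."""
    for _ in range(iters):
        d2 = ((X[:, None, :] - S[None, :, :]) ** 2).sum(-1)  # (n, K)
        G = np.exp(-beta * d2)
        G /= G.sum(axis=1, keepdims=True)                    # rows on the simplex
        S = (G.T @ X) / G.sum(axis=0)[:, None]               # re-fit box coords
    err = np.mean((G @ S - X) ** 2)                          # reconstruction error
    return S, G, err

# Seed one box in each region and alternate updates
S, G, err = spa_like_discretization(X, S=X[[0, 80, 150]].copy())
```

The probabilistic (row-stochastic) assignments are what distinguish this family of discretizations from hard K-means labels.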

