Abstract
Most industrial systems have supervisory control and data acquisition (SCADA) systems that collect and store process parameters. SCADA data is seen as a valuable source of insights into asset health condition and the associated maintenance operations. However, it is still unclear how applicable and valid the insights provided by SCADA data are. The purpose of this paper is to explore the potential benefits of SCADA data for maintenance purposes and to discuss its limitations from a machine learning perspective. Two years of SCADA data from a wind turbine generator are extracted and analysed using several machine learning algorithms in Azure Machine Learning Studio, namely a two-class boosted decision tree, a two-class decision forest, and k-means clustering. It is concluded that SCADA data can be useful for failure detection and prediction, provided that rich training data is available. In a failure prediction context, data richness means ensuring that fault features are present in the training data. Moreover, log files can be used as labelled data to supervise some algorithms, provided that they are recorded more rigorously (timing, description).