Solar farm voltage anomaly detection using high-resolution μPMU data-driven unsupervised machine learning

Background: Improving the knowledge and understanding of the environmental determinants of malaria vector abundance at fine spatiotemporal scales is essential to design locally tailored vector control interventions. This work aimed to explore the environmental determinants of human-biting activity in the main malaria vectors (Anopheles gambiae s.s., Anopheles coluzzii and Anopheles funestus) in the health district of Diébougou, rural Burkina Faso. Methods: Anopheles human-biting activity was monitored in 27 villages over 15 months (in 2017-2018), and environmental variables (meteorological and landscape) were extracted from high-resolution satellite imagery. A two-step data-driven modeling study was then carried out. Correlation coefficients between the biting rates of each vector species and the environmental variables, taken at various temporal lags and spatial distances from the biting events, were first calculated. Then, multivariate machine-learning models were generated and interpreted to i) pinpoint primary and secondary environmental drivers of variation in the biting rates of each species and ii) identify complex associations between the environmental conditions and the biting rates. Results: Meteorological and landscape variables were often significantly correlated with the vectors’ biting rates. Many nonlinear associations and thresholds were unveiled by the multivariate models, both for meteorological and landscape variables. From these results, several aspects of the bio-ecology of the main malaria vectors were clarified or hypothesized for the Diébougou area, including breeding site typologies, development and survival rates in relation to weather, flight ranges from breeding sites, and dispersal related to landscape openness. Conclusions: Using high-resolution data in an interpretable machine-learning modeling framework proved to be an efficient way to enhance the knowledge of the complex links between the environment and the malaria vectors at a local scale. More broadly, the emerging field of interpretable machine learning has significant potential to help improve our understanding of the complex processes leading to malaria transmission.
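The first modeling step described above, correlating biting rates with environmental variables at several temporal lags, can be sketched as follows. This is a minimal illustration only: the weekly series are made up, the Pearson coefficient is an assumption (the abstract does not name the coefficient used), and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lagged_correlations(driver, response, max_lag):
    """Correlate response[t] with driver[t - lag] for lag = 0..max_lag."""
    out = {}
    for lag in range(max_lag + 1):
        x = driver[: len(driver) - lag]  # driver values, shifted back by `lag`
        y = response[lag:]               # matching response values
        out[lag] = pearson(x, y)
    return out


# Made-up weekly series (assumption): biting follows rainfall two weeks later.
rainfall = [0, 5, 20, 40, 10, 0, 0, 30, 50, 5, 0, 0]
biting = [1, 1, 1, 3, 9, 17, 5, 1, 1, 13, 21, 3]

corrs = lagged_correlations(rainfall, biting, max_lag=4)
best_lag = max(corrs, key=corrs.get)  # lag with the strongest positive correlation
print(best_lag, round(corrs[best_lag], 3))
```

Scanning a range of lags in this way only highlights candidate delays; interpreting them still requires the multivariate step the abstract describes.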


A New Data-Driven Seismic Interpretation Workflow Using Unsupervised Machine Learning and Non-Local Trace Matching

10.3997/2214-4609.201901509 ◽ 2019

Author(s): A.J. Bugge ◽ J.E. Lie ◽ A.K. Evensen ◽ S. Clark

Keyword(s): Machine Learning ◽ Seismic Interpretation ◽ Data Driven ◽ Unsupervised Machine Learning ◽ Non Local


Anomaly Detection in Electrical Substation Circuits via Unsupervised Machine Learning

2016 IEEE 17th International Conference on Information Reuse and Integration (IRI) ◽ 10.1109/iri.2016.74 ◽ 2016 ◽ Cited By ~ 6

Author(s): Alfonso Valdes ◽ Richard Macwan ◽ Matthew Backes

Keyword(s): Machine Learning ◽ Anomaly Detection ◽ Unsupervised Machine Learning ◽ Electrical Substation


Unsupervised Learning for Product Use Activity Recognition: An Exploratory Study of a “Chatty Device”

Sensors ◽ 10.3390/s21154991 ◽ 2021 ◽ Vol 21 (15) ◽ pp. 4991

Author(s): Mike Lakoju ◽ Nemitari Ajienka ◽ M. Ahmadieh Khanesar ◽ Pete Burnap ◽ David T. Branson

Keyword(s): Machine Learning ◽ Activity Recognition ◽ Sampling Rate ◽ Machine Learning Algorithms ◽ Data Driven ◽ Sensor Data ◽ Unsupervised Machine Learning ◽ Fuzzy C Means ◽ Product Use ◽ Fuzzy C Means Algorithm

To create products that are better fit for purpose, manufacturers require new methods for gaining insights into product experience in the wild at scale. “Chatty Factories” is a concept that explores the transformative potential of placing IoT-enabled data-driven systems at the core of design and manufacturing processes, aligned to the Industry 4.0 paradigm. In this paper, we propose a model that enables new forms of agile engineering product development via “chatty” products. Products relay their “experiences” from the consumer world back to designers and product engineers through the mediation provided by embedded sensors, IoT, and data-driven design tools. Our model aims to identify product “experiences” to support insights into product use. To this end, we create an experiment to: (i) collect sensor data at a 100 Hz sampling rate from a “Chatty device” (a device with sensors) for six common everyday activities that drive product experience: standing, walking, sitting, dropping and picking up the device, placing the device stationary on a side table, and placing it on a vibrating surface; (ii) pre-process and manually label the product use activity data; (iii) compare a total of four unsupervised machine learning models (three classic algorithms and the fuzzy C-means algorithm) for product use activity recognition for each unique sensor; and (iv) present and discuss our findings. The empirical results demonstrate the feasibility of applying unsupervised machine learning algorithms for clustering product use activity. The highest F-measure obtained is 0.87, with an MCC of 0.84, when the fuzzy C-means algorithm is applied for clustering, outperforming the other three algorithms.
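For readers unfamiliar with fuzzy C-means, the sketch below shows the standard alternating update (soft memberships, then weighted centers) on a one-dimensional toy signal. It is not the authors' pipeline: the sensor values, the initial centers, and the fuzzifier `m = 2` are assumptions chosen for illustration.

```python
def memberships(points, centers, m=2.0):
    """Soft membership of each point in each cluster (each row sums to 1)."""
    exp = 2.0 / (m - 1.0)
    u = []
    for x in points:
        # Distance to every center, floored to avoid division by zero.
        d = [max(abs(x - c), 1e-12) for c in centers]
        u.append([1.0 / sum((dk / dj) ** exp for dj in d) for dk in d])
    return u


def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """1-D fuzzy C-means; returns (centers, memberships)."""
    for _ in range(iters):
        u = memberships(points, centers, m)
        # Each center becomes the mean of all points, weighted by u**m.
        centers = [
            sum((u[i][k] ** m) * x for i, x in enumerate(points))
            / sum(u[i][k] ** m for i in range(len(points)))
            for k in range(len(centers))
        ]
    return centers, memberships(points, centers, m)


# Hypothetical accelerometer magnitudes: three "stationary" and three
# "vibrating" readings (made-up values).
signal = [0.10, 0.20, 0.15, 2.00, 2.10, 1.90]
centers, u = fuzzy_c_means(signal, centers=[0.0, 1.0])

# Harden the soft memberships to compare against manual labels.
hard_labels = [row.index(max(row)) for row in u]
print([round(c, 2) for c in centers], hard_labels)
```

Unlike k-means, every reading keeps a degree of membership in both clusters; hardening is only needed when scoring against labels, as with the F-measure and MCC reported above.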


AERIAL PHOTOGRAMMETRY AND MACHINE LEARNING BASED REGIONAL LANDSLIDE SUSCEPTIBILITY ASSESSMENT FOR AN EARTHQUAKE PRONE AREA IN TURKEY

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽ 10.5194/isprs-archives-xliii-b3-2021-713-2021 ◽ 2021 ◽ Vol XLIII-B3-2021 ◽ pp. 713-720

Author(s): G. Karakas ◽ S. Kocaman ◽ C. Gokceoglu

Keyword(s): Machine Learning ◽ High Resolution ◽ Landslide Susceptibility ◽ Data Driven ◽ Natural Phenomenon ◽ Landslide Occurrence ◽ Conditioning Factors ◽ Aerial Photogrammetry ◽ Prone Area ◽ Using Data

Abstract. Landslides are a frequently observed natural phenomenon and a geohazard with destructive effects on economies, society and the environment. Producing up-to-date landslide susceptibility (LS) maps is essential for landslide hazard mitigation. Obtaining up-to-date and accurate data for the production of LS maps is also important, and this can be achieved with aerial photogrammetric techniques, which can produce high-resolution geospatial data. The resulting geospatial datasets can be integrated into data-driven methods to obtain accurate LS maps. In the present study, an LS map was produced using a data-driven machine learning (ML) method, namely random forest (RF). An earthquake- and landslide-prone area in the south-eastern part of Turkey was selected as the study area. Topographical derivatives were extracted from digital surface models (DSMs) produced from aerial photogrammetric datasets with a 30 cm ground sampling distance. The lithological parameters were employed in the study together with an accurate landslide inventory, which was also delineated using the high-resolution DSMs and orthophotos. The relationships between landslide occurrence and the pre-defined conditioning factors were analyzed using the frequency ratio (FR) method. The results show that the RF method exhibits high prediction performance in the study area, with an area under the curve (AUC) value of 0.92.
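The frequency ratio mentioned above is commonly computed per class of a conditioning factor as the share of landslide pixels in that class divided by the share of all pixels in that class. The sketch below applies this common definition to made-up slope classes; the pixel counts and class boundaries are assumptions, not values from the study.

```python
def frequency_ratio(class_pixels, landslide_pixels):
    """FR per class: (landslide share in class) / (area share of class)."""
    total = sum(class_pixels.values())
    total_ls = sum(landslide_pixels.values())
    return {
        c: (landslide_pixels.get(c, 0) / total_ls) / (class_pixels[c] / total)
        for c in class_pixels
    }


# Hypothetical pixel counts per slope class in degrees (made-up values).
slope_pixels = {"0-10": 5000, "10-20": 3000, "20-30": 1500, ">30": 500}
slope_landslides = {"0-10": 20, "10-20": 60, "20-30": 80, ">30": 40}

fr = frequency_ratio(slope_pixels, slope_landslides)
for cls, value in fr.items():
    print(cls, round(value, 2))
```

An FR above 1 means landslides are over-represented in that class relative to its area, and an FR below 1 means they are under-represented.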


Using country-level variables to classify countries according to the number of confirmed COVID-19 cases: An unsupervised machine learning approach

Wellcome Open Research ◽ 10.12688/wellcomeopenres.15819.2 ◽ 2020 ◽ Vol 5 ◽ pp. 56 ◽ Cited By ~ 1

Author(s): Rodrigo M. Carrillo-Larco ◽ Manuel Castillo-Cara

Keyword(s): Machine Learning ◽ Case Fatality Rate ◽ Learning Algorithms ◽ Case Fatality ◽ Machine Learning Algorithms ◽ Data Driven ◽ Mortality Data ◽ Fatality Rate ◽ Unsupervised Machine Learning ◽ Country Level

Background: The COVID-19 pandemic has attracted the attention of researchers and clinicians who have provided evidence about risk factors and clinical outcomes. Research on the COVID-19 pandemic benefiting from open-access data and machine learning algorithms is still scarce, yet it can produce relevant and pragmatic information. With country-level pre-COVID-19-pandemic variables, we aimed to cluster countries in groups with shared profiles of the COVID-19 pandemic. Methods: Unsupervised machine learning algorithms (k-means) were used to define data-driven clusters of countries; the algorithm was informed by disease prevalence estimates, metrics of air pollution, socio-economic status and health system coverage. Using the one-way ANOVA test, we compared the clusters in terms of number of confirmed COVID-19 cases, number of deaths, case fatality rate and order in which the country reported the first case. Results: The model to define the clusters was developed with 155 countries. The model with three principal component analysis parameters and five or six clusters showed the best ability to group countries in relevant sets. There was strong evidence that the model with five or six clusters could stratify countries according to the number of confirmed COVID-19 cases (p<0.001). However, the model could not stratify countries in terms of number of deaths or case fatality rate. Conclusions: A simple data-driven approach using global information available before the COVID-19 pandemic seemed able to classify countries in terms of the number of confirmed COVID-19 cases. The model was not able to stratify countries based on COVID-19 mortality data.
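The two-stage idea in the Methods (cluster countries on pre-pandemic indicators, then compare an outcome across clusters with one-way ANOVA) can be sketched roughly as below. Everything here is illustrative: the six "countries", their two indicator scores, the case counts, and the initial centers are made up, and the plain Lloyd's k-means and hand-written F statistic stand in for whatever software the authors used.

```python
def kmeans(points, centers, iters=20):
    """Plain Lloyd's k-means on tuples; returns (centers, labels)."""
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[k])  # leave an empty cluster in place
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers, labels


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of numbers."""
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (len(groups) - 1)) / (ss_within / (n - len(groups)))


# Hypothetical standardized indicator scores and confirmed-case counts.
scores = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.3), (2.0, 2.1), (2.2, 1.9), (1.9, 2.0)]
cases = [100, 120, 110, 900, 950, 1000]

centers, labels = kmeans(scores, centers=[(0.0, 0.0), (1.0, 1.0)])
groups = [[c for c, lab in zip(cases, labels) if lab == k] for k in range(2)]
f_stat = anova_f(groups)
print(labels, round(f_stat, 1))
```

A large F statistic indicates that mean case counts differ between clusters more than within them; turning it into a p-value requires the F distribution, which is omitted here.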
