Ferroelectret-based Hydrophone Employed in Oil Identification—A Machine Learning Approach

Sensors ◽  
2020 ◽  
Vol 20 (10) ◽  
pp. 2979 ◽  
Author(s):  
Daniel R. de Luna ◽  
T.T.C. Palitó ◽  
Y.A.O. Assagra ◽  
R.A.P. Altafim ◽  
J.P. Carmo ◽  
...  

This work focuses on acoustic analysis as a way of discriminating mineral oils, providing a robust technique that is immune to electromagnetic noise and, depending on the sensor applied, low-cost. We propose a new method for diagnosing the quality of mineral oil used in electrical transformers, integrating a ferroelectret-based hydrophone and an acoustic transducer. Our classification solution is based on a supervised machine learning technique applied to the signals generated by a hydrophone built in-house. Three statistical datasets were collected during the acoustic experiments on four types of oil. The first, second, and third datasets contain 180, 240, and 420 entries, respectively. Eighty-four features were considered from each dataset and applied to two classification approaches. The first approach distinguishes the oils among the four possible classes with a classification error of less than 2%, while the second classifies the oils without errors (i.e., with a score of 100%).

2017 ◽  
Author(s):  
Sabrina Jaeger ◽  
Simone Fulle ◽  
Samo Turk

Inspired by natural language processing techniques, we here introduce Mol2vec, an unsupervised machine learning approach to learn vector representations of molecular substructures. Similarly to the Word2vec models, where vectors of closely related words are in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can then be encoded as vectors by summing the vectors of their individual substructures and, for instance, fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pre-trained once, yields dense vector representations, and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as the reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment-independent and can thus also be easily used for proteins with low sequence similarities.
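
The encoding step described in the abstract above (a compound vector as the sum of its substructure vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the substructure identifiers, the toy two-dimensional embeddings, and the function names `embed_compound` and `cosine_similarity` are hypothetical, and skipping unknown substructures is an assumption made here for simplicity.

```python
import math
from typing import Dict, List, Sequence


def embed_compound(
    substructures: Sequence[str],
    vectors: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Sum the embedding vectors of a compound's substructures."""
    total = [0.0] * dim
    for identifier in substructures:
        vec = vectors.get(identifier)
        if vec is None:
            # Assumption: substructures without a learned vector are skipped.
            continue
        for i, value in enumerate(vec):
            total[i] += value
    return total


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical toy embeddings; real ones would come from a pre-trained model.
toy_vectors = {
    "sub_a": [1.0, 0.0],
    "sub_b": [0.5, 2.0],
}

compound_vector = embed_compound(["sub_a", "sub_b", "sub_unknown"], toy_vectors, dim=2)
```

The resulting `compound_vector` could then serve as a dense feature vector for a supervised model, and `cosine_similarity` gives one way to check whether two vectors point in similar directions.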


2021 ◽  
Vol 10 (7) ◽  
pp. 436
Author(s):  
Amerah Alghanim ◽  
Musfira Jilani ◽  
Michela Bertolotto ◽  
Gavin McArdle

Volunteered Geographic Information (VGI) is often collected by non-expert users. This raises concerns about the quality and veracity of such data. There has been much effort to understand and quantify the quality of VGI. Extrinsic measures, which compare VGI to authoritative data sources such as National Mapping Agencies, are common, but the cost and slow update frequency of such data hinder the task. On the other hand, intrinsic measures, which compare the data to heuristics or models built from the VGI data, are becoming increasingly popular. Supervised machine learning techniques are particularly suitable for intrinsic measures of quality, where they can infer and predict the properties of spatial data. In this article we are interested in assessing the quality of semantic information, such as the road type, associated with data in OpenStreetMap (OSM). We have developed a machine learning approach which utilises new intrinsic input features collected from the VGI dataset. Specifically, using our proposed approach we obtained an average classification accuracy of 84.12%. This result outperforms existing techniques on the same semantic inference task. The trustworthiness of the data used for developing and training machine learning models is important. To address this issue we have also developed a new measure of trustworthiness using direct and indirect characteristics of OSM data, such as its edit history, along with an assessment of the users who contributed the data. An evaluation of the impact of data determined to be trustworthy within the machine learning model shows that the trusted data collected with the new approach improves the prediction accuracy of our machine learning technique. Specifically, our results demonstrate that the classification accuracy of our developed model is 87.75% when applied to a trusted dataset and 57.98% when applied to an untrusted dataset. Consequently, such results can be used to assess the quality of OSM and suggest improvements to the data set.

