Line Faults Classification Using Machine Learning on Three Phase Voltages Extracted from Large Dataset of PMU Measurements

2022 ◽  
Author(s):  
Hussain Otudi ◽  
Tatjana Dokic ◽  
Taif Mohamed ◽  
Mladen Kezunovic ◽  
Yi Hu ◽  
...  
2020 ◽  
Vol 10 (14) ◽  
pp. 4965
Author(s):  
Yordanos Dametw Mamuya ◽  
Yih-Der Lee ◽  
Jing-Wen Shen ◽  
Md Shafiullah ◽  
Cheng-Chien Kuo

Locating faults as accurately as possible plays a significant role in expediting restoration after a fault occurs in a power distribution grid. This paper provides fault detection, classification, and location methods for a radial distribution grid using machine learning tools and advanced signal processing. The three-phase current signals, one cycle before and one cycle after fault inception, are measured at the sending end of the grid. A discrete wavelet transform (DWT) is applied to the three-phase current signals, and standard statistical techniques are then applied to the DWT coefficients to extract useful features. Among many features, the mean, standard deviation (SD), energy, skewness, kurtosis, and entropy are evaluated and fed into an artificial neural network (ANN), a multilayer perceptron (MLP), and an extreme learning machine (ELM) to identify the fault type and its location. During training, all fault types are considered, with variations in loading and fault resistance. The performance of the proposed fault-locating methods is evaluated in terms of mean absolute percentage error (MAPE), root mean squared error (RMSE), Willmott’s index of agreement (WIA), coefficient of determination (R²), and Nash-Sutcliffe model efficiency coefficient (NSEC). Training and testing times are also considered. The proposed method, which combines discrete wavelet transforms with machine learning, is very accurate and reliable for classifying and locating faults in both balanced and unbalanced radial systems. 100% fault detection accuracy is achieved for all types of faults. Except for slight confusion between three-line-to-ground (3LG) and three-line (3L) faults, 100% classification accuracy is also achieved. The performance measures show that both MLP and ELM are very accurate and comparable in locating faults.
The method can be further applied to meshed networks with multiple distributed generators. Renewable generation in the form of distributed generation units can also be studied.
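
The evaluation measures named in this abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so the textbook definitions are assumed here; in particular, R² is computed as the squared Pearson correlation so that it differs from the Nash-Sutcliffe coefficient, which is an assumption and may not match the paper's definition. The function name and the dictionary keys are hypothetical.

```python
import math

def fault_location_metrics(actual, predicted):
    """Return MAPE (%), RMSE, WIA, R^2 and NSE for two equal-length sequences.

    Textbook definitions are assumed; MAPE requires every actual value
    to be non-zero, and the actual values must not all be identical.
    """
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(actual, predicted))
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n

    sse = sum((a - p) ** 2 for a, p in pairs)      # sum of squared errors
    sst = sum((a - mean_a) ** 2 for a in actual)   # spread of the actual values

    mape = 100.0 / n * sum(abs((a - p) / a) for a, p in pairs)
    rmse = math.sqrt(sse / n)

    # Willmott's index of agreement: 1 means perfect agreement.
    wia_den = sum((abs(p - mean_a) + abs(a - mean_a)) ** 2 for a, p in pairs)
    wia = 1.0 - sse / wia_den

    # R^2 taken as the squared Pearson correlation (assumption).
    cov = sum((a - mean_a) * (p - mean_p) for a, p in pairs)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = cov ** 2 / (sst * var_p)

    # Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the mean predictor.
    nse = 1.0 - sse / sst

    return {"MAPE": mape, "RMSE": rmse, "WIA": wia, "R2": r2, "NSE": nse}

# Hypothetical fault distances (km): true locations vs. model estimates.
example = fault_location_metrics([2.0, 4.0, 6.0], [2.1, 3.8, 6.3])
```

With this choice of definitions, a model can have R² close to 1 while NSE is much lower, because the correlation ignores a constant offset or scaling in the predictions and NSE does not.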


Author(s):  
Xianping Du ◽  
Onur Bilgen ◽  
Hongyi Xu

Abstract: Machine learning for classification has been used widely in engineering design, for example, in feasible domain recognition and hidden pattern discovery. Training an accurate machine learning model requires a large dataset; however, high computational or experimental costs are major obstacles to obtaining a large dataset for real-world problems. One possible solution is to generate a large pseudo dataset with surrogate models, which are established with a smaller set of real training data. However, it is not well understood whether the pseudo dataset benefits the classification model by providing more information or degrades the machine learning performance due to the prediction errors and uncertainties introduced by the surrogate model. This paper presents a preliminary investigation of this research question. A classification-and-regression-tree model is employed to recognize the design subspaces to support design decision-making. It is implemented on the geometric design of a vehicle energy-absorbing structure based on finite element simulations. Based on a small set of real-world data obtained by simulations, a surrogate model based on Gaussian process regression is employed to generate pseudo datasets for training. The results showed that the tree-based method could help recognize feasible design domains efficiently. Furthermore, the additional information provided by the surrogate model enhances the accuracy of classification. One important conclusion is that the accuracy of the surrogate model determines the quality of the pseudo dataset and, hence, the improvement in the machine learning model.
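
The surrogate-to-pseudo-dataset workflow described in this abstract might be sketched as follows, under heavy simplifying assumptions. The design space is one-dimensional, `simulate` is a hypothetical stand-in for a finite element run, `LIMIT` is a hypothetical feasibility limit, only the Gaussian process posterior mean is used (with a fixed RBF length scale and no hyperparameter fitting), and a single Gini-based split stands in for a full classification-and-regression tree. None of these choices come from the paper.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential (RBF) kernel for scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_gp_mean(xs, ys, length=1.0, noise=1e-6):
    """Return the GP posterior-mean predictor for 1-D training data."""
    mean_y = sum(ys) / len(ys)
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    alpha = solve(K, [y - mean_y for y in ys])
    return lambda x: mean_y + sum(w * rbf(x, xi, length) for w, xi in zip(alpha, xs))

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def fit_stump(xs, labels):
    """Best single split threshold by weighted Gini impurity (depth-1 tree)."""
    pairs = sorted(zip(xs, labels))
    best_t, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_t, best_score = (pairs[i - 1][0] + pairs[i][0]) / 2.0, score
    return best_t

def simulate(x):
    """Hypothetical stand-in for one expensive simulation."""
    return x ** 2

LIMIT = 5.0  # hypothetical limit: a design is feasible if its response <= LIMIT

# Small "real" dataset, then a denser pseudo dataset labelled by the surrogate.
real_x = [0.0, 1.0, 2.0, 3.0, 4.0]
real_y = [simulate(x) for x in real_x]
surrogate = fit_gp_mean(real_x, real_y)

pseudo_x = [i / 10 for i in range(41)]
pseudo_labels = [1 if surrogate(x) <= LIMIT else 0 for x in pseudo_x]

real_threshold = fit_stump(real_x, [1 if y <= LIMIT else 0 for y in real_y])
pseudo_threshold = fit_stump(pseudo_x, pseudo_labels)
```

Comparing `real_threshold` with `pseudo_threshold` shows where each classifier places the feasibility boundary. Whether the pseudo dataset moves that boundary closer to the true one depends on how well the surrogate tracks the response between the real samples, which is the dependence the abstract's conclusion points to.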


2021 ◽  
Author(s):  
Mina Kwon ◽  
Hyeonjin Kim ◽  
Jaeyeong Yang ◽  
Jihyun Hur ◽  
Tae-Ho Lee ◽  
...  

Abstract: While the negative impacts of caffeinated soda on children’s physical health have been well documented, it remains unexplored whether habitual caffeinated soda intake is associated with intellectual capacities in children. Here, we investigated the behavioral and neural correlates of daily caffeinated soda consumption, focusing on neurocognitive functions including working memory, impulsivity, and reward processing. We rigorously tested the link between caffeinated soda intake and these neurocognitive functions by applying machine learning and hierarchical linear regression to a large dataset from the Adolescent Brain Cognitive Development (ABCD) Study (N=3,966; age=9-10 years). The results showed that daily consumption of caffeinated soda in children was associated with impaired working memory, higher impulsivity, and increased amygdala activation during the emotional working memory task. The machine learning results also showed hypoactivity in the nucleus accumbens and the posterior cingulate cortex during reward processing. These findings have significant implications for public health recommendations.

Statement of Relevance: Is caffeinated soda bad for children’s brain development? If so, which specific intellectual capacity is affected? Many parents and caregivers are asking this question, but surprisingly there is no clear guideline. Caffeinated soda is the most preferred route of caffeine intake in childhood and is known to have physical side effects on children, but the link between habitual drinking of caffeinated soda in children and intellectual capacities remains largely unknown. Here, by applying machine learning and hierarchical regression approaches to a large dataset, we demonstrate that daily intake of caffeinated soda is associated with neurocognitive deficits including impaired working memory and higher impulsivity. These results have significant implications for public health recommendations.


2021 ◽  
Author(s):  
Anton Goretsky ◽  
Anastasia Dmitrienko ◽  
Irene Tang ◽  
Nicolae Lari ◽  
Owen Kunhardt ◽  
...  

In 2010, the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) started the Nulliparous Pregnancy Outcomes Study: Monitoring Mothers-to-be (nuMoM2b), a prospective cohort study of a racially/ethnically/geographically diverse population of nulliparous women with singleton gestation. The nuMoM2b is a very large dataset, consisting of data for 10,038 patients with over 4,600 features per patient, spread out over 80 files. In this report, we share our experience preparing and working with this dataset. We present our preprocessing of the nuMoM2b dataset, which aims to gain a deeper understanding of the data, extract the most relevant features, make the fewest assumptions when filling in unknown values, and reduce the dimensionality of the data. We hope this report is useful to researchers interested in building machine learning and statistical models from the nuMoM2b dataset.


Author(s):  
Manjunath Aradhya ◽  
Jyothi VK ◽  
Sharath Kumar ◽  
Guru DS

Searching, recognizing and retrieving a video of interest from a large collection of video data is an immediate requirement. This requirement has been recognized as an active area of research in computer vision, machine learning and pattern recognition. Flower video recognition and retrieval is vital in the field of floriculture and horticulture. In this paper we propose a model for the retrieval of videos of flowers. Initially, videos are represented with keyframes, and the flowers in the keyframes are segmented from their background. The model is then analysed using features extracted from the flower regions of the keyframes. Linear Discriminant Analysis (LDA) is adapted for the extraction of discriminating features. A Multiclass Support Vector Machine (MSVM) classifier is applied to identify the class of the query video. Experiments have been conducted on a relatively large dataset of our own, consisting of 7788 videos of 30 different species of flowers captured with three different devices. Generally, retrieval of flower videos is addressed using a query video containing a flower of a single species. In this work, we attempt to develop a system that retrieves similar videos for a query video containing flowers of different species.


Author(s):  
Charan Lokku

Abstract: To reduce fraudulent job postings on the internet, we use a machine learning approach to predict the likelihood that a job posting is fake, so that candidates can stay alert and make informed decisions. The model uses NLP to analyze the sentiment and patterns in a job posting and a TF-IDF vectorizer for feature extraction. We use the Synthetic Minority Oversampling Technique (SMOTE) to balance the data and Random Forest for classification; Random Forest predicts with high accuracy, runs efficiently even on large datasets, and helps prevent overfitting. The final model takes in any relevant job posting data and determines whether the job is real or fake.
Keywords: Natural Language Processing (NLP), Term Frequency-Inverse Document Frequency (TF-IDF), Synthetic Minority Oversampling Technique (SMOTE), Random Forest.
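
The feature-extraction and balancing steps named in this abstract can be sketched in plain Python. This is an illustration under stated assumptions, not the author's pipeline: the postings are invented, TF-IDF uses the unsmoothed textbook form (term frequency times log(N/df)), and the `smote` function is a simplified version of the technique that interpolates between a minority sample and one of its nearest minority neighbours. The Random Forest classifier is omitted.

```python
import math
import random
from collections import Counter

def tfidf(docs):
    """Return (vocabulary, vectors) using tf = count/len and idf = log(N/df)."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n = len(docs)
    df = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[t] / len(doc) * math.log(n / df[t]) for t in vocab])
    return vocab, vectors

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by neighbour interpolation.

    Needs at least two minority samples; k is the neighbourhood size.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((m for m in minority if m is not base),
                            key=lambda m: math.dist(base, m))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Invented postings: four real (majority) and two fake (minority).
real_posts = ["software engineer python experience required",
              "nurse needed at city hospital",
              "accountant with audit experience",
              "teacher for primary school"]
fake_posts = ["earn money fast from home",
              "quick money no experience needed"]

vocab, vectors = tfidf(real_posts + fake_posts)
real_vecs = vectors[:len(real_posts)]
fake_vecs = vectors[len(real_posts):]

# Oversample the fake class until both classes have the same size.
synthetic = smote(fake_vecs, n_new=len(real_vecs) - len(fake_vecs))
balanced_fake = fake_vecs + synthetic
```

The balanced vectors would then be passed to a classifier such as Random Forest. In practice, oversampling is usually applied to the training split only, so that synthetic samples do not leak into the evaluation data.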

