Unsupervised classification of simulated magnetospheric regions

2021 ◽  
Vol 39 (5) ◽  
pp. 861-881
Author(s):  
Maria Elena Innocenti ◽  
Jorge Amaya ◽  
Joachim Raeder ◽  
Romain Dupuis ◽  
Banafsheh Ferdousi ◽  
...  

Abstract. In magnetospheric missions, burst-mode data sampling should be triggered in the presence of processes of scientific or operational interest. We present an unsupervised classification method for magnetospheric regions that could constitute the first step of a multistep method for the automatic identification of magnetospheric processes of interest. Our method is based on self-organizing maps (SOMs), and we test it preliminarily on data points from global magnetospheric simulations obtained with the OpenGGCM-CTIM-RCM code. The dimensionality of the data is reduced with principal component analysis before classification. The classification relies exclusively on local plasma properties at the selected data points, without information on their neighborhood or on their temporal evolution. We classify the SOM nodes into an automatically selected number of classes, and we obtain clusters that map to well-defined magnetospheric regions. We validate our classification results by plotting the classified data in the simulated space and by comparing with k-means classification. For the sake of result interpretability, we examine the SOM feature maps (magnetospheric variables are called features in the context of classification), and we use them to unlock information on the clusters. We repeat the classification experiments using different sets of features, we quantitatively compare different classification results, and we obtain insights on which magnetospheric variables make more effective features for unsupervised classification.
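As a rough illustration of the SOM step described in the abstract, the following is a minimal pure-Python sketch of training a small self-organizing map and assigning data points to their best-matching nodes. It is not the authors' implementation: the function names `train_som` and `best_matching_unit`, the 3x3 grid, the linear decay schedules, and all default parameters are assumptions chosen for illustration, and the PCA preprocessing and the later clustering of SOM nodes are left out.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate of the node whose weights are closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - xi) ** 2 for w, xi in zip(weights[node], x)),
    )


def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise every node with a copy of a randomly chosen data point.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = data[:]
        rng.shuffle(order)
        for x in order:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)              # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 0.3  # neighbourhood radius stays > 0
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Gaussian neighbourhood measured on the grid, centred on the BMU.
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights
```

After training, `best_matching_unit(som, x)` gives the node to which a data point `x` is assigned. In a workflow like the one in the abstract, the node weight vectors would then be grouped into a small number of classes.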

2021 ◽  
Author(s):  
Maria Elena Innocenti ◽  
Jorge Amaya ◽  
Joachim Raeder ◽  
Romain Dupuis ◽  
Banafsheh Ferdousi ◽  
...  

Abstract. In magnetospheric missions, burst-mode data sampling should be triggered in the presence of processes of scientific or operational interest. We present an unsupervised classification method for magnetospheric regions that could constitute the first step of a multi-step method for the automatic identification of magnetospheric processes of interest. Our method is based on self-organizing maps (SOMs), and we test it preliminarily on data points from global magnetospheric simulations obtained with the OpenGGCM-CTIM-RCM code. The classification relies exclusively on local plasma properties at the selected data points, without information on their neighborhood or on their temporal evolution. We classify the SOM nodes into an automatically selected number of classes, and we obtain clusters that map to well-defined magnetospheric regions. For the sake of result interpretability, we examine the SOM feature maps (magnetospheric variables are called features in the context of classification), and we use them to unlock information on the clusters. We repeat the classification experiments using different sets of features, and we obtain insights on which magnetospheric variables make more effective features for unsupervised classification.


Author(s):  
Mohsen Moshki ◽  
Mehran Garmehi ◽  
Peyman Kabiri

In this chapter, the application of Principal Component Analysis (PCA) and one of its extensions to intrusion detection is investigated. The extended version modifies traditional PCA to address one of its important shortcomings. The modifications are first shown mathematically to be beneficial, and the results are then verified experimentally on a known dataset, the DARPA99 dataset. As a baseline, traditional PCA is first used to preprocess the dataset, and the effectiveness of multiclass classification is studied with a simple classifier, KNN. In the reported work, a revised version of PCA named Weighted PCA (WPCA) is then used for feature extraction instead of traditional PCA. The results of applying this method to the DARPA99 dataset show that it achieves better accuracy than traditional PCA when the number of features is limited, the number of classes is large, and the class populations are unbalanced. In some situations WPCA outperforms traditional PCA by more than 1% in accuracy.
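The baseline pipeline mentioned in the abstract (traditional PCA for feature extraction followed by a KNN classifier) can be sketched as follows. This is only an illustrative sketch of plain PCA plus KNN, not of WPCA, whose weighting scheme the abstract does not specify. The function names `fit_pca`, `transform`, and `knn_predict`, the power-iteration approach, and all defaults are assumptions.

```python
import math
import random
from collections import Counter


def fit_pca(rows, k, iters=200, seed=0):
    """Return (mean, components) with the top-k principal directions.

    Uses power iteration with deflation on the sample covariance matrix.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        # Random unit start vector.
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left in the deflated matrix
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue; remove this direction.
        lam = sum(
            v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
        )
        for i in range(d):
            for j in range(d):
                cov[i][j] -= lam * v[i] * v[j]
        components.append(v)
    return mean, components


def transform(x, mean, components):
    """Project one sample onto the principal components."""
    return [sum((x[j] - mean[j]) * c[j] for j in range(len(x))) for c in components]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]
```

A typical use would fit `fit_pca` on the training features, project both training and test samples with `transform`, and call `knn_predict` on the projected vectors.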


Energies ◽  
2019 ◽  
Vol 12 (15) ◽  
pp. 2980 ◽  
Author(s):  
Bizhong Xia ◽  
Yadi Yang ◽  
Jie Zhou ◽  
Guanghao Chen ◽  
Yifan Liu ◽  
...  

Battery sorting is an important process in the production of lithium battery modules and battery packs for electric vehicles (EVs). Accurate battery sorting can ensure good consistency of batteries for grouping. This study investigates the mechanism of inconsistency in battery packs and the process of battery sorting on the lithium-ion battery module production line. Based on the static and dynamic characteristics of lithium-ion batteries, the battery parameters on the production line that can serve as a sorting basis are analyzed, and battery mass, volume, resistance, voltage, charge/discharge capacity and impedance characteristics are measured. The battery data are processed with principal component analysis (PCA), and the resulting principal components are used as sorting variables; a self-organizing map (SOM) neural network then clusters the batteries. Group experiments are carried out on the sorted batteries, and the state of charge (SOC) consistency achieved verifies that the sorting algorithm and the sorting results are accurate.


2018 ◽  
Vol 2018 ◽  
pp. 1-12 ◽  
Author(s):  
Emre Dandıl

Lung cancer is one of the most common cancer types. For the survival of the patient, early detection of lung cancer with the best treatment method is crucial. In this study, we propose a novel computer-aided pipeline on computed tomography (CT) scans for early diagnosis of lung cancer through the classification of benign and malignant nodules. The proposed pipeline is composed of four stages. In the preprocessing stage, CT images are enhanced, and lung volumes are extracted from the image with the help of a novel method called the lung volume extraction method (LUVEM). The significance of the proposed pipeline lies in its use of LUVEM for extracting the lung region. In the nodule detection stage, candidate nodules are determined with a circular Hough transform (CHT) based method. Then, lung nodules are segmented with self-organizing maps (SOM). In the feature computation stage, intensity, shape, texture, energy, and combined features are used for feature extraction, and principal component analysis (PCA) is used for feature reduction. In the final stage, a probabilistic neural network (PNN) classifies benign and malignant nodules. According to the experiments performed on our dataset, the proposed pipeline can classify benign and malignant nodules with 95.91% accuracy, 97.42% sensitivity, and 94.24% specificity. Even in cases of small-sized nodules (3–10 mm), the proposed system can determine the nodule type with 94.68% accuracy.


2000 ◽  
Vol 42 (7-8) ◽  
pp. 193-199 ◽  
Author(s):  
K. C. Yu ◽  
C. Y. Chang ◽  
L. J. Tsai ◽  
S. T. Ho

This study reports the amounts of heavy metals (Cu, Zn, Pb, Cr, Co, and Ni) bound to four geochemical compositions of sediments (carbonates, Mn oxides, Fe oxides, and organic matter), and the correlations between the various geochemical compositions and their heavy-metal complexes. Hundreds of data points, obtained from sediments of five main rivers in southern Taiwan, were analyzed using multivariate analysis methods. Among the four geochemical compositions, the total amount of the six heavy metals bound to organic matter is the highest. Zn is easily bound to various geochemical compositions, especially carbonates in sediments of the Yenshui and Potzu rivers (i.e., the sediments heavily polluted with heavy metals); Cr, Pb, and Ni are mainly bound to both Fe oxides and organic matter; Cu has a high affinity for organic matter. In the principal component analyses, the data points for organic matter and for Pb and Cu associated with organic matter cluster together in sediments of the Peikang, the Potzu, and the Yenshui rivers, which indicates that Pb and Cu might be discharged from the same pollution sources in these rivers. Moreover, correlations between any two binding fractions of heavy metals associated with Fe oxides are not consistent across rivers, which indicates that factors such as the binding sites of Fe oxides, the extent of heavy-metal pollution, and binding competition between heavy metals may affect the amounts of heavy metals bound to Fe oxides. Furthermore, the amount of Pb bound to Fe oxides is highly correlated with the amount of Fe oxides in sediments of the Peikang, the Potzu, and the Yenshui rivers.


2018 ◽  
Vol 13 (No. 2) ◽  
pp. 83-89 ◽  
Author(s):  
E. Salković ◽  
I. Djurović ◽  
M. Knežević ◽  
V. Popović-Bugarin ◽  
A. Topalović

This paper describes the process of digitizing Montenegro’s legacy soil data, and an initial attempt to use it for digital soil mapping (DSM) purposes. The handwritten legacy numerical records of physical and chemical properties for more than 10 000 soil profiles and semi-profiles covering the whole of Montenegro have been digitized, and, out of those, more than 3000 have been georeferenced. The problems and challenges of digitization addressed in the paper are: processing non-uniform handwritten numerical records, parsing a complex textual representation of those records, georeferencing the records using digitized (scanned) legacy soil maps, creating a single computer database containing all digitized records, and transforming, cleaning and validating the data. For an initial assessment of the suitability of these data for mapping purposes, inverse distance weighting (IDW), ordinary kriging (OK), multiple linear regression (LR), and regression-kriging (RK) interpolation models were applied to create thematic maps of soil phosphorus. The area chosen for mapping is a 400 km² area near the city of Cetinje, containing 125 data points. LR and RK models were developed using publicly available digital elevation model (DEM) data and satellite global land survey (GLS) data as predictor variables. The digitized phosphorus quantities were normalized and scaled. The predictor variables were scaled, and principal component analysis was performed. For the best performing RK model an R² value of 0.23 was obtained.
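Of the interpolation models listed in the abstract, inverse distance weighting is the simplest to illustrate. The following is a minimal sketch, assuming planar coordinates and a distance exponent of 2; the function name `idw` and the `power` parameter are hypothetical, and the abstract does not state which exponent or neighbourhood the authors used.

```python
import math


def idw(points, values, query, power=2.0):
    """Inverse distance weighted estimate at `query` from scattered samples.

    points: list of (x, y) sample locations
    values: measured value at each location
    power:  distance exponent (assumed 2 here for illustration)
    """
    num = 0.0
    den = 0.0
    for (x, y), v in zip(points, values):
        d = math.hypot(query[0] - x, query[1] - y)
        if d == 0.0:
            # The query coincides with a sample: return its value exactly.
            return v
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den
```

Each estimate is a weighted average of the samples, so it always lies between the smallest and largest measured values; nearer samples dominate more strongly as `power` increases.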

