JSOM: Jointly-evolving self-organizing maps for alignment of biological datasets and identification of related clusters

2021 ◽  
Vol 17 (3) ◽  
pp. e1008804
Author(s):  
Hong Seo Lim ◽  
Peng Qiu

With the rapid advances of various single-cell technologies, an increasing number of single-cell datasets are being generated, and computational tools for aligning these datasets, which make subsequent integration or meta-analysis possible, have become critical. Typically, single-cell datasets from different technologies cannot be directly combined or concatenated, due to innate differences in the data, such as the number of measured parameters and their distributions. Even datasets generated by the same technology are often affected by batch effects. A computational approach for aligning different datasets and hence identifying related clusters will be useful for data integration and interpretation in large-scale single-cell experiments. Our proposed algorithm, called JSOM, a variation of the self-organizing map, aligns two related datasets that contain similar clusters by constructing two maps (low-dimensional discretized representations of the datasets) that jointly evolve according to both datasets. Here we applied the JSOM algorithm to flow cytometry, mass cytometry, and single-cell RNA sequencing datasets. The resulting JSOM maps not only align the related clusters in the two datasets but also preserve the topology of the datasets, so that the maps can be used for further analysis, such as clustering.

2017 ◽  
Vol 2017 ◽  
pp. 1-11 ◽  
Author(s):  
Adeoluwa Akande ◽  
Ana Cristina Costa ◽  
Jorge Mateu ◽  
Roberto Henriques

The explosion of data in the information age has provided an opportunity to explore the possibility of characterizing the climate patterns using data mining techniques. Nigeria has a unique tropical climate with two precipitation regimes: low precipitation in the north leading to aridity and desertification and high precipitation in parts of the southwest and southeast leading to large scale flooding. In this research, four indices have been used to characterize the intensity, frequency, and amount of rainfall over Nigeria. A type of Artificial Neural Network called the self-organizing map has been used to reduce the multiplicity of dimensions and produce four unique zones characterizing extreme precipitation conditions in Nigeria. This approach allowed for the assessment of spatial and temporal patterns in extreme precipitation in the last three decades. Precipitation properties in each cluster are discussed. The cluster closest to the Atlantic has high values of precipitation intensity, frequency, and duration, whereas the cluster closest to the Sahara Desert has low values. A significant increasing trend has been observed in the frequency of rainy days at the center of the northern region of Nigeria.


2007 ◽  
Vol 19 (9) ◽  
pp. 2515-2535 ◽  
Author(s):  
Takaaki Aoki ◽  
Toshio Aoyagi

The self-organizing map (SOM) is an unsupervised learning method as well as a type of nonlinear principal component analysis that forms a topologically ordered mapping from the high-dimensional data space to a low-dimensional representation space. It has recently found wide applications in such areas as visualization, classification, and mining of various data. However, when the data sets to be processed are very large, a copious amount of time is often required to train the map, which seems to restrict the range of putative applications. One of the major culprits for this slow ordering time is that a kind of topological defect (e.g., a kink in one dimension or a twist in two dimensions) gets created in the map during training. Once such a defect appears in the map during training, the ordered map cannot be obtained until the defect is eliminated, for which the number of iterations required is typically several times larger than in the absence of the defect. In order to overcome this weakness, we propose that an asymmetric neighborhood function be used for the SOM algorithm. Compared with the commonly used symmetric neighborhood function, we found that an asymmetric neighborhood function accelerates the ordering process of the SOM algorithm, though this asymmetry tends to distort the generated ordered map. We demonstrate that the distortion of the map can be suppressed by improving the asymmetric neighborhood function SOM algorithm. The number of learning steps required for perfect ordering in the case of the one-dimensional SOM is numerically shown to be reduced from O(N^3) to O(N^2) with an asymmetric neighborhood function, even when the improved algorithm is used to get the final map without distortion.
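The ideas of a one-dimensional SOM and a "kink" defect can be illustrated with a minimal sketch. It is not the algorithm from the paper above: it uses the conventional symmetric Gaussian neighborhood, and the function names (`train_som_1d`, `count_kinks`), the scalar inputs, and the values of `lr` and `sigma` are assumptions chosen only for illustration. A kink is counted here as any reversal of direction in the weights along the chain, so an ordered map has zero kinks.

```python
import math
import random


def train_som_1d(data, n_units, n_steps, lr=0.1, sigma=2.0, seed=0):
    """Train a 1-D chain of units on scalar inputs (illustrative only).

    Uses a symmetric Gaussian neighborhood; lr and sigma are held fixed,
    which is a simplification (they are usually decayed over time).
    """
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(n_units)]
    for _ in range(n_steps):
        x = rng.choice(data)
        # Best-matching unit: the unit whose weight is closest to the input.
        bmu = min(range(n_units), key=lambda i: abs(weights[i] - x))
        for i in range(n_units):
            # Symmetric neighborhood: depends only on the distance |i - bmu|.
            h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
            weights[i] += lr * h * (x - weights[i])
    return weights


def count_kinks(weights):
    """Count direction reversals along the chain; 0 means the map is ordered."""
    diffs = [b - a for a, b in zip(weights, weights[1:])]
    return sum(1 for d0, d1 in zip(diffs, diffs[1:]) if d0 * d1 < 0)
```

Calling `count_kinks(train_som_1d(data, n_units, n_steps))` for increasing `n_steps` gives a rough view of how long the defects persist. An asymmetric variant would make `h` depend on the sign of `i - bmu` as well as its magnitude; the specific form used by the authors is not given in the abstract.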


Medicina ◽  
2021 ◽  
Vol 57 (3) ◽  
pp. 235
Author(s):  
Diego Galvan ◽  
Luciane Effting ◽  
Hágata Cremasco ◽  
Carlos Adam Conte-Junior

Background and objective: In the current pandemic scenario, data mining tools are fundamental to evaluate the measures adopted to contain the spread of COVID-19. In this study, unsupervised neural networks of the Self-Organizing Maps (SOM) type were used to assess the spatial and temporal spread of COVID-19 in Brazil, according to the number of cases and deaths in regions, states, and cities. Materials and methods: The SOM applied in this context does not evaluate which measures applied have helped contain the spread of the disease, but these datasets represent the repercussions of the country’s measures, which were implemented to contain the virus’ spread. Results: This approach demonstrated that the spread of the disease in Brazil does not have a standard behavior, changing according to the region, state, or city. The analyses showed that cities and states in the north and northeast regions of the country were the most affected by the disease, with the highest number of cases and deaths registered per 100,000 inhabitants. Conclusions: The SOM clustering was able to spatially group cities, states, and regions according to their coronavirus cases, with similar behavior. Thus, it is possible to benefit from the use of similar strategies to deal with the virus’ spread in these cities, states, and regions.


2021 ◽  
Vol 11 (4) ◽  
pp. 1933
Author(s):  
Hiroomi Hikawa ◽  
Yuta Ichikawa ◽  
Hidetaka Ito ◽  
Yutaka Maeda

In this paper, a real-time dynamic hand gesture recognition system with a gesture spotting function is proposed. In the proposed system, input video frames are converted to feature vectors, which are used to form a posture sequence vector that represents the input gesture. Then, gesture identification and gesture spotting are carried out in the self-organizing map (SOM)-Hebb classifier. The gesture spotting function detects the end of the gesture by using the vector distance between the posture sequence vector and the winner neuron’s weight vector. The proposed gesture recognition method was tested by simulation and by a real-time gesture recognition experiment. Results revealed that the system could recognize nine types of gesture with an accuracy of 96.6%, and that it successfully output the recognition result at the end of the gesture using the spotting result.
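The distance-based spotting described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the distance is turned into a decision, so the fixed `threshold`, the Euclidean metric, and the function names (`winner`, `spot_gesture`) are hypothetical, and the Hebb classification stage is omitted entirely.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def winner(weights, vector):
    """Return (index, distance) of the neuron whose weight vector is closest."""
    distances = [euclidean(w, vector) for w in weights]
    idx = min(range(len(weights)), key=distances.__getitem__)
    return idx, distances[idx]


def spot_gesture(weights, vector, threshold):
    """Return the winner index if it matches closely enough, else None.

    Assumption: a gesture is treated as complete when the distance to the
    winner neuron drops to or below a fixed threshold.
    """
    idx, dist = winner(weights, vector)
    return idx if dist <= threshold else None
```

In a streaming setting, `spot_gesture` would be called on the posture sequence vector after each new frame, and a recognition result would be emitted only when it returns an index rather than `None`.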


2019 ◽  
Vol 496 ◽  
pp. 572-591 ◽  
Author(s):  
Ameya Malondkar ◽  
Roberto Corizzo ◽  
Iluju Kiringa ◽  
Michelangelo Ceci ◽  
Nathalie Japkowicz

Author(s):  
Macario O. Cordel ◽  
Arnulfo P. Azcarraga

Several time-critical problems relying on large amounts of data, e.g., business trends, disaster response and disease outbreaks, require cost-effective, timely and accurate data summary and visualization in order to reach an efficient and effective decision. The self-organizing map (SOM) is a very effective data clustering and visualization tool, as it provides an intuitive display of data in a lower-dimensional space. However, with [Formula: see text] complexity, the SOM becomes inappropriate for large datasets. In this paper, we propose a force-directed visualization method that emulates the SOM's capability to display the data clusters with [Formula: see text] complexity. The main idea is to perform a force-directed fine-tuning of the 2D representation of data. To demonstrate the efficiency and the vast potential of the proposed method as a fast visualization tool, the methodology is used to do a 2D projection of the MNIST handwritten digits dataset.


2019 ◽  
Vol 1 (1) ◽  
pp. 194-202
Author(s):  
Adrian Costea

This paper assesses the financial performance of Romania’s non-banking financial institutions (NFIs) using a neural network training algorithm proposed by Kohonen, namely the Self-Organizing Maps algorithm. The algorithm takes the financial dataset and positions each observation into a self-organizing map (a two-dimensional map) which can later be used to visualize the trajectory of an individual NFI and explain it based on different performance dimensions, such as capital adequacy, assets’ quality and profitability. Further, we use the map as an early-warning system that would accurately forecast the NFIs’ future performance (whether they would stay or be eliminated from the NFI’s Special Register three quarters into the future). The results are promising: the model is able to correctly predict NFIs’ performance movements. Finally, we compared the results of our SOM-based model with those obtained by applying a multivariate logit-based model. The SOM model performed worse in discriminating the NFIs’ performance: the performance classes were not clearly defined and the model lacked interpretability of the results. On the contrary, the multivariate logit coefficients have nice interpretability and an individual default probability estimate is obtained for each new observation. However, we can benefit from the results of both techniques: the visualization capabilities of the SOM model and the interpretability of the multivariate logit-based model.


2009 ◽  
Vol 18 (04) ◽  
pp. 603-611 ◽  
Author(s):  
CHIH-FONG TSAI ◽  
YUAH-CHIAO LIN ◽  
YI-TING WANG

Stock trading activities are always very popular in many countries. Generally, investors with various backgrounds have different preferences over the stocks they trade. In the literature, a number of studies examine institutions' holding preferences for certain stock characteristics when choosing the security portfolio. However, very few studies investigate the stock trading preferences of individual investors. In this paper, we focus on two factors which affect the portfolio choices of investors: stock characteristics and investor features. In particular, a self-organizing map (SOM) is used to group a certain number of clusters based on a chosen dataset. Then, the decision tree model is used to extract useful rules from the clusters which contain the most trading records in the sample. We find that if the investors are female, less wealthy, and trade stocks less frequently, they will be more careful and conservative. On the other hand, if the investors are male, have a high level of wealth, and trade stocks very often, they tend to choose stocks with high EPS, high market-to-book, and high prices.


2019 ◽  
Vol 32 (22) ◽  
pp. 7747-7761 ◽  
Author(s):  
Leif M. Swenson ◽  
Richard Grotjahn

Extreme precipitation events have major societal impacts. These events are rare and can have small spatial scale, making statistical analysis difficult; both factors are mitigated by combining events over a region. A methodology is presented to objectively define “coherent” regions wherein data points have matching annual cycles. Regions are found by training self-organizing maps (SOMs) on the annual cycle of precipitation for each grid point across the contiguous United States (CONUS). Using the annual cycle for our intended application minimizes problems caused by consecutive dry periods and localized extreme events. Multiple criteria are applied to identify useful numbers of regions for our future application. Criteria assess these properties for each region: having many more events than experienced by a single grid point, good connectedness and compactness, and robustness to changing the number of regions. Our methodology is applicable across datasets and is tested here on both reanalysis and gridded observational data. Precipitation regions obtained align with large-scale geographical features and are readily interpretable. Useful numbers of regions balance two conflicting preferences: larger regions contain more events and thereby have more robust statistics, but more compact regions allow weather patterns associated with extreme events to be aggregated with confidence. For 6-h precipitation, 12–15 regions over the CONUS optimize our metrics. The regions obtained are compared against two existing region archetypes. For example, a popular set of regions, based on nine groups of states, has less coherent regions than defining the same number of regions with our SOM methodology.


Atmosphere ◽  
2019 ◽  
Vol 10 (8) ◽  
pp. 474 ◽  
Author(s):  
Min-Hee Lee ◽  
Joo-Hong Kim

The contribution of extra-tropical synoptic cyclones to the formation of mean summer atmospheric circulation patterns in the Arctic domain (≥60° N) was investigated by clustering dominant Arctic circulation patterns based on daily mean sea-level pressure using self-organizing maps (SOMs). Three SOM patterns were identified; one pattern had prevalent low-pressure anomalies in the Arctic Circle (SOM1), while two exhibited opposite dipoles with primary high-pressure anomalies covering the Arctic Ocean (SOM2 and SOM3). The time series of their occurrence frequencies demonstrated the largest inter-annual variation in SOM1, a slight decreasing trend in SOM2, and an abrupt upswing after 2007 in SOM3. Analyses of synoptic cyclone activity using the cyclone track data confirmed the vital contribution of synoptic cyclones to the formation of large-scale patterns. Arctic cyclone activity was enhanced in SOM1, which was consistent with the meridional temperature gradient increases over the land–Arctic Ocean boundaries co-located with major cyclone pathways. The composite daily synoptic evolution of each SOM revealed that all three SOMs persisted for less than five days on average. These evolutionary short-term weather patterns have substantial variability at inter-annual and longer timescales. Therefore, the synoptic-scale activity is central to forming the seasonal-mean climate of the Arctic.

