Projection of High-Dimensional Genome-Wide Expression on SOM Transcriptome Landscapes

2021 ◽  
Vol 2 (1) ◽  
pp. 62-76
Author(s):  
Maria Nikoghosyan ◽  
Henry Loeffler-Wirth ◽  
Suren Davidavyan ◽  
Hans Binder ◽  
Arsen Arakelyan

Self-organizing map (SOM) portrayal has proven to be a powerful approach for the analysis of transcriptomic, genomic, epigenetic, single-cell, and pathway-level data, as well as for “multi-omic” integrative analyses. However, the SOM method has a major disadvantage: it requires retraining on the entire dataset once a new sample is added, which can be resource- and time-demanding. Retraining also shifts the gene landscape, complicating the interpretation and comparison of results. To overcome this issue, we have developed two transfer-learning approaches that extend an existing SOM space with new samples while preserving its intrinsic structure. The extension SOM (exSOM) approach adds secondary data to the existing SOM space by “meta-gene adaptation”, while supervised SOM portrayal (supSOM) adds a support vector machine regression model on top of the original SOM algorithm to “predict” the portrait of a new sample. Both methods have been shown to combine existing and new data accurately. On simulated data, exSOM outperforms supSOM in accuracy, while supSOM significantly reduces computing time and outperforms exSOM on that criterion. Analysis of real datasets demonstrated the validity of the projection methods, with independent datasets mapped onto an existing SOM space. Moreover, both methods handle well the projection of samples with characteristics that were not present in the training datasets.
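
As a hedged illustration of the basic operation both methods build on, the sketch below maps a new sample onto an already-trained SOM without retraining; it assumes the third-party minisom package and toy data, and is not the authors' exSOM/supSOM implementation.

```python
# Minimal sketch (not the authors' exSOM/supSOM code): map a new sample onto
# an already-trained SOM without retraining, assuming the third-party
# `minisom` package (pip install minisom) and toy expression data.
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(0)
expression = rng.random((100, 50))           # 100 samples x 50 genes (toy)

som = MiniSom(10, 10, 50, sigma=1.5, learning_rate=0.5, random_seed=0)
som.train_random(expression, 5000)           # train once on the reference set

new_sample = rng.random(50)                  # a sample arriving later
bmu = som.winner(new_sample)                 # best-matching unit lookup only:
print(f"new sample maps to node {bmu}")      # the learned landscape is unchanged
```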

2021 ◽  
Vol 13 (5) ◽  
pp. 2426
Author(s):  
David Bienvenido-Huertas ◽  
Jesús A. Pulido-Arcas ◽  
Carlos Rubio-Bellido ◽  
Alexis Pérez-Fargallo

In recent years, studies on the accuracy of algorithms for predicting different aspects of energy use in the building sector have flourished, with energy poverty being one of the issues that has received considerable critical attention. Previous studies in this field have characterized energy poverty using different indicators, but they have failed to develop instruments to predict the risk of low-income households falling into it. This research explores how accurately six regression algorithms can forecast the risk of energy poverty by means of the fuel poverty potential risk index. Using data from the national survey of socioeconomic conditions of Chilean households and generating data for different typologies of social dwellings (e.g., form ratio or roof surface area), this study simulated 38,880 cases and compared the accuracy of the six algorithms. Multilayer perceptron, M5P, and support vector regression delivered the best accuracy, with correlation coefficients over 99.5%; in terms of computing time, M5P outperformed the rest. Although these results suggest that energy poverty can be accurately predicted using simulated data, the algorithms still need to be tested against real data. These results can be useful in devising policies to tackle energy poverty in advance.
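
A hedged sketch of this kind of algorithm comparison is shown below, on synthetic data rather than the Chilean survey; scikit-learn provides no M5P, so an ordinary decision tree stands in for it.

```python
# Illustrative sketch only: comparing regression algorithms on synthetic
# data, as the study does for its fuel-poverty risk index. scikit-learn has
# no M5P implementation, so a plain decision tree stands in for it here.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=2000, n_features=8, noise=5.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "MLP": MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000, random_state=0),
    "SVR": SVR(C=10.0),
    "Tree (M5P stand-in)": DecisionTreeRegressor(random_state=0),
}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    r = np.corrcoef(y_te, pred)[0, 1]   # correlation coefficient, as reported
    print(f"{name}: r = {r:.4f}")
```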


2022 ◽  
Vol 14 (2) ◽  
pp. 302
Author(s):  
Chunchao Li ◽  
Xuebin Tang ◽  
Lulu Shi ◽  
Yuanxi Peng ◽  
Yuhua Tang

Effective feature extraction (FE) has long been a focus of hyperspectral image (HSI) processing. For aerial remote-sensing HSI processing and land-cover classification, this article proposes an efficient two-stage hyperspectral FE method based on total variation (TV). In the first stage, an average fusion method reduces the spectral dimension. An anisotropic TV model with different regularization parameters then yields feature blocks of different smoothness, each containing multi-scale structure information, and these blocks are stacked as the input to the next stage. In the second stage, after a singular value transformation reduces the dimension again, an isotropic TV model based on the split Bregman algorithm performs further detail smoothing. Finally, the feature-extracted block is fed to a support vector machine for classification experiments. Results on three hyperspectral datasets demonstrate that the proposed method competitively outperforms state-of-the-art methods in classification accuracy and computing time, and a comprehensive parameter analysis shows that it is robust and stable.
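
The sketch below illustrates only the TV-smoothing idea, not the paper's exact two-stage pipeline: a band-averaged image is smoothed at several assumed regularization strengths with scikit-image's split-Bregman TV filter, and the results are stacked as feature channels.

```python
# A minimal sketch of the TV idea (not the paper's exact pipeline): smooth a
# band-averaged image at several regularization strengths with split-Bregman
# TV and stack the results as feature channels. Assumes scikit-image.
import numpy as np
from skimage import data
from skimage.restoration import denoise_tv_bregman

cube = np.stack([data.camera().astype(float) / 255.0] * 8, axis=-1)  # toy "HSI"
fused = cube.mean(axis=-1)                 # stage 1: average fusion over bands

# Different weights give feature blocks of different smoothness; stacking
# them preserves multi-scale structure for the downstream classifier.
weights = [2.0, 5.0, 10.0]
features = np.stack([denoise_tv_bregman(fused, weight=w) for w in weights],
                    axis=-1)
print(features.shape)                      # (H, W, 3) stacked feature block
```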


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Sonia Bansal ◽  
Vineet Mehan

Abstract. Objectives: The key challenge in content-based medical image retrieval (CBMIR) frameworks for MRI (magnetic resonance imaging) images is the semantic gap between the low-level visual information captured by the MRI machine and the high-level information perceived by the human evaluator. Methods: Conventional feature extraction strategies focus only on low-level or high-level features and use some handcrafted features to reduce this gap. It is necessary to design a feature extraction framework that reduces the gap without handcrafted features, by encoding and combining low-level and high-level features. Fuzzy clustering is the clustering technique applied here for feature description, together with an SVM (support vector machine). Because predefining the number of clusters and the membership matrix remains an open problem, a new predefinition step is developed in this paper, and a new CBMIR procedure is accordingly proposed and validated. Results: SVM and FCM (fuzzy c-means) are applied to the intensity features, so that the feature vector captures all the objects in the image. Retrieval of an image depends on the distance between the query and database images, called the similarity measure. Conclusions: Experiments are performed on a database of 200 images, and the experimental results are evaluated by recall and precision.
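
A minimal sketch of the clustering step is given below: a textbook fuzzy c-means in NumPy paired with a scikit-learn SVM on toy descriptors. The cluster count and fuzzifier are assumptions, and this is not the authors' CBMIR pipeline.

```python
# Hedged sketch: a textbook fuzzy c-means (FCM) in NumPy, the clustering
# step the abstract pairs with an SVM. Cluster count c and fuzzifier m are
# assumptions; the image descriptors and labels below are toy data.
import numpy as np
from sklearn.svm import SVC

def fcm(X, c=3, m=2.0, iters=100, tol=1e-5, seed=0):
    """Standard FCM: returns cluster centers and the membership matrix."""
    rng = np.random.default_rng(seed)
    u = rng.random((c, len(X)))
    u /= u.sum(axis=0)                            # columns sum to 1
    for _ in range(iters):
        um = u ** m
        centers = um @ X / um.sum(axis=1, keepdims=True)
        d = np.linalg.norm(X[None] - centers[:, None], axis=2) + 1e-12
        ratios = (d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0))
        u_new = 1.0 / ratios.sum(axis=1)          # u[i,k] = 1/sum_j(d_ik/d_jk)^p
        if np.abs(u_new - u).max() < tol:
            break
        u = u_new
    return centers, u

rng = np.random.default_rng(1)
feats = rng.random((200, 16))                     # toy image descriptors
labels = (feats[:, 0] > 0.5).astype(int)          # toy relevance labels
_, memberships = fcm(feats, c=3)
clf = SVC().fit(memberships.T, labels)            # memberships as features
print(clf.score(memberships.T, labels))
```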


Sensors ◽  
2019 ◽  
Vol 19 (20) ◽  
pp. 4523 ◽  
Author(s):  
Carlos Cabo ◽  
Celestino Ordóñez ◽  
Fernando Sánchez-Lasheras ◽  
Javier Roca-Pardiñas ◽  
Javier de Cos-Juez

We analyze the utility of multiscale supervised classification algorithms for object detection and extraction from laser scanning or photogrammetric point clouds. Only the geometric information (the point coordinates) was considered, making the method independent of the systems used to collect the data. A maximum of five features (input variables) was used, four of them related to the eigenvalues obtained from a principal component analysis (PCA). PCA was carried out at six scales, defined by the diameter of a sphere around each observation. Four multiclass supervised classification models were tested (linear discriminant analysis, logistic regression, support vector machines, and random forest) in two different scenarios, urban and forest, formed by artificial and natural objects, respectively. The results obtained were accurate (overall accuracy over 80% for the urban dataset and over 93% for the forest dataset), in the range of the best results found in the literature, regardless of the classification method. For both datasets, the random forest algorithm provided the best results when discrimination capacity, computing time, and the ability to estimate the relative importance of each variable were considered together.
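
A hedged sketch of this feature construction follows: covariance eigenvalues of spherical neighborhoods at several assumed scales (via a KD-tree), fed to a random forest on toy points. The authors' exact five features are not reproduced here.

```python
# Sketch of multiscale eigenvalue features (assumptions: spherical
# neighborhoods via a KD-tree, eigenvalues of the local covariance), fed to
# a random forest. Toy data; not the authors' exact feature definitions.
import numpy as np
from scipy.spatial import cKDTree
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
pts = rng.random((2000, 3))                  # toy point cloud
tree = cKDTree(pts)

def eigen_features(points, tree, radius):
    """Sorted covariance eigenvalues of each point's spherical neighborhood."""
    feats = np.zeros((len(points), 3))
    for i, p in enumerate(points):
        idx = tree.query_ball_point(p, r=radius)
        if len(idx) >= 3:
            cov = np.cov(points[idx].T)
            feats[i] = np.sort(np.linalg.eigvalsh(cov))[::-1]
    return feats

# One feature block per scale (sphere radius), stacked column-wise.
X = np.hstack([eigen_features(pts, tree, r) for r in (0.05, 0.1, 0.2)])
y = (pts[:, 2] > 0.5).astype(int)            # toy class labels
clf = RandomForestClassifier(random_state=0).fit(X, y)
print(clf.score(X, y))
```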


Animals ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 1040
Author(s):  
Glynn Tonsor ◽  
Jayson Lusk ◽  
Shauna Tonsor

Meat products represent a significant share of US consumer food expenditures. The COVID-19 pandemic directly impacted both demand and supply of US beef and pork products for a prolonged period, resulting in a myriad of economic impacts. These complex disruptions created significant challenges in isolating and inferring consumer-demand changes from lagged secondary data. Thus, we turn to novel household-level data from a continuous consumer tracking survey, the Meat Demand Monitor, launched in February 2020, just before the US pandemic. We find diverse impacts across US households related to “hoarding” behavior and financial confidence over the course of the pandemic. Combined, these insights extend our understanding of pandemic impacts on US consumers and provide a timely example of the knowledge enabled by ongoing and targeted household-level data collection and analysis.


2011 ◽  
Vol 2011 ◽  
pp. 1-28 ◽  
Author(s):  
Zhongqiang Chen ◽  
Zhanyan Liang ◽  
Yuan Zhang ◽  
Zhongrong Chen

Grayware encyclopedias collect known species to provide information for incident analysis; however, their lack of categorization and generalization capability renders them ineffective for developing defense strategies against clustered strains. A grayware categorization framework is therefore proposed here to classify grayware according to diverse taxonomic features and to facilitate evaluations of the risk grayware poses to cyberspace. Armed with Support Vector Machines, the framework builds learning models from training data extracted automatically from grayware encyclopedias and visualizes categorization results with Self-Organizing Maps. The features used in the learning models are selected by information gain, and the high dimensionality of the feature space is reduced by word stemming and stopword removal. The grayware categorizations on diversified features reveal that grayware typically attempts to improve its penetration rate by resorting to multiple installation mechanisms and reduced code footprints. The framework also shows that grayware evades detection by attacking victims' security applications and resists removal by enhancing its clotting capability with infected hosts. Our analysis further points out that species in the categories Spyware and Adware continue to dominate the grayware landscape and impose extremely critical threats to the Internet ecosystem.
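
A hedged sketch of the text-classification side of such a framework appears below, using mutual information as the information-gain-style criterion, stopword removal, and a linear SVM on toy strings (the real training data comes from the encyclopedias).

```python
# Hedged sketch of the framework's text pipeline: stopword removal, an
# information-gain-style feature selection (mutual information here), and a
# linear SVM. Toy documents and labels, not encyclopedia data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

docs = ["adware bundles toolbar and tracks browsing",
        "spyware logs keystrokes and exfiltrates passwords",
        "adware injects popup advertisements into pages",
        "spyware monitors webcam and reports to server"]
labels = ["Adware", "Spyware", "Adware", "Spyware"]

pipe = make_pipeline(
    TfidfVectorizer(stop_words="english"),   # stopword removal + weighting
    SelectKBest(mutual_info_classif, k=10),  # keep the most informative terms
    LinearSVC(),
)
pipe.fit(docs, labels)
print(pipe.predict(["toolbar shows popup ads"]))
```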


2021 ◽  
Vol 11 (9) ◽  
pp. 4008
Author(s):  
Hang-Lo Lee ◽  
Jin-Seop Kim ◽  
Chang-Ho Hong ◽  
Dong-Keun Cho

Monitoring crack-induced rock damage is an important task in underground spaces such as radioactive waste disposal repositories, civil tunnels, and mines. The acoustic emission (AE) technique is one method for monitoring rock damage and has been used by many researchers. Increasing the accuracy of rock damage evaluation and prediction requires considering various AE parameters, but this is difficult because of the complexity of the relationship between the AE parameters and rock damage. The purpose of this study is to propose a machine learning (ML)-based prediction model of quantitative rock damage that takes into account combined features of several AE parameters. To this end, 10 granite samples from KAERI (Korea Atomic Energy Research Institute) in Daejeon were prepared and subjected to uniaxial compression tests. A random forest (RF) model was constructed and compared with support vector regression (SVR). The results showed that the generalization performance of RF is higher than that of SVR with a radial basis function kernel (SVR-RBF). The R2, RMSE, and MAPE of the RF on the test data are 0.989, 0.032, and 0.014, respectively, which are acceptable for application at the laboratory scale. As complementary work, parameter analysis was conducted by means of Shapley additive explanations (SHAP) for model interpretability, confirming that cumulative absolute energy and initiation frequency are the main parameters at both high and low degrees of damage. This study suggests the possibility of extension to in situ application in subsequent research, and indicates that the RF algorithm is a suitable technique and which parameters should be considered for predicting the degree of damage. Future work will extend the research to the engineering scale and consider the attenuation characteristics of rocks for practical application.
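
A minimal sketch of the RF-plus-SHAP analysis follows, on synthetic stand-in data rather than the KAERI measurements, assuming the shap package.

```python
# Minimal sketch (synthetic stand-in data, not the KAERI measurements):
# a random-forest damage model explained with SHAP's TreeExplainer.
# Assumes the `shap` package (pip install shap).
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.random((500, 4))                     # toy AE parameters
y = 0.6 * X[:, 0] + 0.3 * X[:, 1] + 0.1 * rng.random(500)  # toy "damage"

model = RandomForestRegressor(random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)       # per-sample feature attributions
print(np.abs(shap_values).mean(axis=0))      # global importance ranking
```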


Author(s):  
Zepei Wu ◽  
Shuo Liu ◽  
Delong Zhao ◽  
Ling Yang ◽  
Zixin Xu ◽  
...  

Abstract. Cloud particles take different shapes in the atmosphere, and research on cloud particle shapes plays an important role in analyzing ice crystal growth and cloud microphysics. To achieve an accurate and efficient classification algorithm for ice crystal images, this study uses image-based morphological processing and principal component analysis to extract image features, and applies intelligent classification algorithms to Cloud Particle Imager (CPI) data. Currently, there are two main types of ice-crystal classification methods: mode parameterization schemes and artificial intelligence models. Combined with the extracted features, the dataset was tested on ten types of classifiers; the highest average accuracy was 99.07%, and the fastest processing speed in the real-time data-processing test was 2,000 images/s. In practice, the algorithm must also consider processing speed, because the images number in the millions; therefore, a support vector machine (SVM) classifier was used in this study. The SVM-based optimization algorithm classifies ice crystals into nine classes with an average accuracy of 95%, a blurred-frame accuracy of 100%, and a processing speed of 2,000 images/s. This method has relatively high accuracy and a faster classification speed than classic neural network models, and could also be applied to the physical parameter analysis of cloud microphysics.
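
An illustrative sketch, not the study's CPI pipeline: PCA features feeding an SVM, with throughput measured in images per second on toy data.

```python
# Illustrative sketch only: PCA features into an SVM classifier, with
# throughput measured in images/s. Toy random "images" and nine assumed
# shape classes stand in for the CPI dataset.
import time
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
images = rng.random((3000, 32 * 32))          # toy flattened particle images
labels = rng.integers(0, 9, size=3000)        # nine assumed shape classes

pipe = make_pipeline(PCA(n_components=20), SVC())
pipe.fit(images[:2000], labels[:2000])

start = time.perf_counter()
pipe.predict(images[2000:])
rate = 1000 / (time.perf_counter() - start)   # 1000 test images
print(f"throughput ~ {rate:.0f} images/s")
```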


Author(s):  
Ke Li ◽  
Yalei Wu ◽  
Shimin Song ◽  
Yi Sun ◽  
Jun Wang ◽  
...  

The measurement of spacecraft electrical characteristics and the associated multi-label classification problems generally involve processing large amounts of unlabeled test data, high-dimensional feature redundancy, time-consuming computation, and slow identification rates. In this paper, an offline fuzzy c-means (FCM) clustering algorithm and an online approximate weighted proximal support vector machine (WPSVM) recognition approach are proposed to reduce the feature size and increase the speed of classifying the electrical characteristics of the spacecraft. In addition, principal component feature extraction is applied to the complex signals for feature selection, and a threshold-based data-capture contribution approach resolves the component-selection problem of principal component analysis (PCA), effectively guaranteeing the validity and consistency of the data. Experimental results indicate that the proposed approach obtains better fault-diagnosis results on spacecraft electrical characteristics data, improves identification accuracy, and shortens computing time with high efficiency.
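
A hedged sketch of the dimensionality-reduction step: scikit-learn's PCA can keep enough components to reach a cumulative explained-variance threshold (a stand-in for the paper's contribution-threshold rule), with a standard SVM standing in for the WPSVM recognizer.

```python
# Hedged sketch: PCA keeps enough components to explain 95% of the variance
# (a stand-in for the paper's threshold-based contribution rule), then a
# standard SVM stands in for the WPSVM recognizer. Toy telemetry data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
signals = rng.random((400, 64))               # toy high-dimensional telemetry
faults = rng.integers(0, 3, size=400)         # toy fault labels

pipe = make_pipeline(PCA(n_components=0.95), SVC())  # 95% variance threshold
pipe.fit(signals, faults)
print("components kept:", pipe.named_steps["pca"].n_components_)
```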

