Orthogonal linear separation analysis: an approach to decompose the complex effects of a perturbagen

2018
Author(s):
Tadahaya Mizuno
Setsuo Kinoshita
Shotaro Maedera
Takuya Ito
Hiroyuki Kusuhara

Abstract Drugs have multiple effects, not single ones. Decomposing drug effects into basic components helps us understand the pharmacological properties of a drug and contributes to drug discovery. We have extended factor analysis and developed a novel profile data analysis method, orthogonal linear separation analysis (OLSA). OLSA contracted 11,911 genes to 118 factors from transcriptome data of MCF7 cells treated with 318 compounds in the Connectivity Map. Ontology analysis of the main genes constituting the factors detected significant enrichment in 65 of 118 factors, and similar results were obtained in two other data sets. One factor discriminated two Hsp90 inhibitors, geldanamycin and radicicol, which clustering analysis could not separate. Doxorubicin was estimated to inhibit Na+/K+-ATPase, one of the suggested mechanisms of doxorubicin-induced cardiotoxicity. Based on a factor reflecting PI3K/AKT/mTORC1 inhibition activity, five compounds were predicted to be novel autophagy inducers, and further analyses, including western blotting, revealed that four of the five actually induced autophagy. These findings indicate the potential of OLSA to decompose the effects of a drug and identify its basic components.
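As a rough illustration of the kind of decomposition described here, the sketch below contracts a hypothetical gene-expression matrix into a few orthogonal factors using a plain truncated SVD. OLSA itself extends factor analysis; the matrix sizes, data, and variable names below are illustrative assumptions, not the paper's method or data.

```python
import numpy as np

# Hypothetical profile matrix: rows = treated samples, columns = genes.
rng = np.random.default_rng(0)
n_samples, n_genes, n_factors = 50, 500, 10
profiles = rng.normal(size=(n_samples, n_genes))

# Center each gene, then factor the matrix: profiles ~ scores @ loadings.
centered = profiles - profiles.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
scores = U[:, :n_factors] * S[:n_factors]   # sample-by-factor matrix
loadings = Vt[:n_factors]                   # factor-by-gene matrix

# Each factor is a weighted gene signature; the genes with the largest
# absolute loadings are the ones one would submit to ontology enrichment.
top_genes = np.argsort(-np.abs(loadings[0]))[:20]
print(scores.shape, loadings.shape, top_genes.shape)
```

Because the SVD yields orthogonal components, each factor captures a distinct axis of variation across the compound treatments, which is the property that lets a single factor separate compounds that whole-profile clustering lumps together.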


2020
Vol 6
Author(s):
Jaime de Miguel Rodríguez
Maria Eugenia Villafañe
Luka Piškorec
Fernando Sancho Caparrini

Abstract This work presents a methodology for generating novel 3D objects that resemble wireframes of building types. These result from reconstructing interpolated locations within the learnt distribution of a variational autoencoder (VAE), a deep generative machine learning model based on neural networks. The data set used features a scheme for geometry representation based on a ‘connectivity map’ that is especially suited to express the wireframe objects that compose it. Additionally, the input samples are generated through ‘parametric augmentation’, a strategy proposed in this study that creates coherent variations among data by enabling a set of parameters to alter representative features of a given building type. In the experiments described in this paper, more than 150,000 input samples belonging to two building types were processed during the training of a VAE model. The main contribution of this paper is to explore parametric augmentation for the generation of large data sets of 3D geometries, showcasing its problems and limitations in the context of neural networks and VAEs. Results show that generating interpolated hybrid geometries is a challenging task. Despite the difficulty of the endeavour, promising advances are presented.
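The two VAE operations this abstract relies on can be sketched in plain NumPy: the reparameterization trick used during training, and linear interpolation between two latent codes, whose decoded results would be the hybrid geometries. The latent dimension, encoder outputs, and step count below are stand-in assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)
latent_dim = 16

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.normal(size=np.shape(mu))
    return mu + np.exp(0.5 * log_var) * eps

# Stand-in encoder outputs for one wireframe of each building type.
z_a = reparameterize(np.zeros(latent_dim), np.zeros(latent_dim))
z_b = reparameterize(np.ones(latent_dim), np.zeros(latent_dim))

# Interpolated latent codes between the two building types; decoding
# each code would yield one hybrid geometry along the path.
steps = 5
z_path = [(1 - t) * z_a + t * z_b for t in np.linspace(0.0, 1.0, steps)]
print(len(z_path), z_path[0].shape)
```

The endpoints of the path reproduce the two source codes exactly, so only the intermediate codes ask the decoder to generate geometry it has never seen, which is where the difficulty reported in the abstract arises.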



2019
Vol 9 (1)
Author(s):
Tadahaya Mizuno
Setsuo Kinoshita
Takuya Ito
Shotaro Maedera
Hiroyuki Kusuhara




2014
Vol 11 (4)
pp. 597-608
Author(s):
Dragan Antic
Miroslav Milovanovic
Stanisa Peric
Sasa Nikolic
Marko Milojkovic

The aim of this paper is to present a method for neural network input parameter selection and preprocessing. The purpose of this network is to forecast foreign exchange rates using artificial intelligence. Two data sets are formed for two different economic systems. Each system is represented by six categories with 70 economic parameters, which are used in the analysis. These parameters were reduced within each category using the principal component analysis method. Component interdependencies are established and relations between them are formed. The newly formed relations were used to create the input vectors of a neural network. A multilayer feed-forward neural network is formed and trained using batch training. Finally, simulation results are presented, and it is concluded that the input data preparation method is an effective way to preprocess neural network data.
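The preprocessing pipeline described above can be sketched as follows: each category's parameters are reduced with PCA, and the leading components from all categories are concatenated into the network's input vector. The data, category sizes, and component count below are hypothetical; the paper's actual series and choices are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
n_obs, n_categories, params_per_cat, n_components = 200, 6, 12, 2

def pca_reduce(X, k):
    """Project X onto its first k principal components via SVD."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

# One matrix of observed economic parameters per category (stand-in data).
categories = [rng.normal(size=(n_obs, params_per_cat))
              for _ in range(n_categories)]
reduced = [pca_reduce(X, n_components) for X in categories]

# Network input: the reduced categories placed side by side.
inputs = np.hstack(reduced)
print(inputs.shape)
```

Reducing within each category, rather than across all 70 parameters at once, preserves the category structure in the input vector while still discarding redundant variation inside each category.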



Author(s):  
Junjie Wu
Jian Chen
Hui Xiong

Cluster analysis (Jain & Dubes, 1988) provides insight into data by dividing objects into groups (clusters), such that objects within a cluster are more similar to each other than to objects in other clusters. Cluster analysis has long played an important role in a wide variety of fields, such as psychology, bioinformatics, pattern recognition, information retrieval, machine learning, and data mining. Many clustering algorithms, such as K-means and the Unweighted Pair Group Method with Arithmetic Mean (UPGMA), are well-established. A recent research focus in clustering analysis is understanding the strengths and weaknesses of various clustering algorithms with respect to data factors. Indeed, researchers have identified some data characteristics that may strongly affect clustering analysis, including high dimensionality and sparseness, large size, noise, types of attributes and data sets, and scales of attributes (Tan, Steinbach, & Kumar, 2005). However, further investigation is needed to reveal whether and how data distributions affect the performance of clustering algorithms. Along this line, we study clustering algorithms by answering three questions:
1. What are the systematic differences between the distributions of the clusters produced by different clustering algorithms?
2. How does the distribution of the “true” cluster sizes affect the performance of clustering algorithms?
3. How should an appropriate clustering algorithm be chosen in practice?
The answers to these questions can guide us toward a better understanding and use of clustering methods. This is noteworthy because 1) in theory, it is seldom recognized that there are strong relationships between clustering algorithms and cluster-size distributions, and 2) in practice, choosing an appropriate clustering algorithm remains a challenging task, especially after the recent boom of algorithms in the data mining area. This chapter is an initial attempt to fill this void.
To this end, we carefully select two widely used categories of clustering algorithms, K-means and Agglomerative Hierarchical Clustering (AHC), as representative algorithms for illustration. In this chapter, we first show that K-means tends to generate clusters with a relatively uniform distribution of cluster sizes. We then demonstrate that UPGMA, one of the robust AHC methods, acts in the opposite way to K-means; that is, UPGMA tends to generate clusters with high variation in cluster sizes. Indeed, the experimental results indicate that the variations of the resultant cluster sizes produced by K-means and UPGMA, measured by the Coefficient of Variation (CV), fall in specific intervals, roughly [0.3, 1.0] and [1.0, 2.5] respectively. Finally, we compare K-means and UPGMA directly and propose some rules for the better choice of clustering schemes from the data-distribution point of view.
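The Coefficient of Variation used above is simply the standard deviation of the cluster sizes divided by their mean, computable with the standard library alone. The two size lists below are made up to illustrate the contrast the chapter draws, not taken from its experiments.

```python
from statistics import mean, pstdev

def cv(sizes):
    """Coefficient of Variation of a list of cluster sizes."""
    return pstdev(sizes) / mean(sizes)

# Moderately uniform sizes, as K-means tends to produce.
kmeans_like = [60, 160, 40, 140, 100]
# One dominant cluster plus small ones, as UPGMA tends to produce.
upgma_like = [450, 30, 10, 5, 5]

print(round(cv(kmeans_like), 2), round(cv(upgma_like), 2))
```

With these illustrative lists the first CV lands below 1.0 and the second above it, matching the direction of the intervals the chapter reports for K-means and UPGMA.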





2013
Vol 13 (8)
pp. 1251-1255
Author(s):
Yang Liu
Qin-Liang Li
Li-Yuan Dong
Bang-Chun Wen


2017
Vol 8 (2)
pp. 30-43
Author(s):  
Mrutyunjaya Panda

Big Data, due to its complicated and diverse nature, poses many challenges for extracting meaningful observations. This calls for smart and efficient algorithms that can cope with the computational complexity and memory constraints arising from iterative processing. The issue may be addressed with parallel computing techniques, in which a single machine or multiple machines perform the work simultaneously by dividing the problem into subproblems and assigning private memory to each subproblem. Clustering analysis has recently been found useful for handling such huge data. Although many investigations into Big Data analysis are under way, in this work Canopy and K-means++ clustering are used to process large-scale data in a shorter amount of time and without memory constraints. To assess the suitability of the approach, several data sets are considered, ranging from small to very large and spanning diverse fields of application. The experimental results indicate that the proposed approach is fast and accurate.
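Canopy clustering, the cheap pre-clustering step named above, can be sketched in pure Python: a loose threshold T1 decides canopy membership, while a tight threshold T2 removes points near a canopy center from further consideration (the scheme assumes T1 > T2). The thresholds and points below are illustrative assumptions, not the paper's configuration.

```python
import math

def canopy(points, t1, t2):
    """Group 2-D points into canopies using loose/tight thresholds."""
    remaining = list(points)
    canopies = []
    while remaining:
        center = remaining.pop(0)
        members = [center]
        survivors = []
        for p in remaining:
            d = math.dist(center, p)
            if d < t1:
                members.append(p)       # inside the loose threshold
            if d >= t2:
                survivors.append(p)     # outside the tight threshold:
                                        # stays a candidate center
        remaining = survivors
        canopies.append(members)
    return canopies

# Two well-separated groups of points fall into two canopies.
pts = [(0, 0), (0.5, 0.2), (0.2, 0.4), (10, 10), (10.3, 9.8)]
print(len(canopy(pts, t1=3.0, t2=1.0)))
```

Because only a cheap distance test is applied, canopies can be formed in a single pass; a more expensive method such as K-means++ then runs within each canopy, which is what makes the combination attractive for large data.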


