Spectral Embedded Deep Clustering

Entropy ◽  
2019 ◽  
Vol 21 (8) ◽  
pp. 795 ◽  
Author(s):  
Yuichiro Wada ◽  
Shugo Miyamoto ◽  
Takumi Nakagama ◽  
Léo Andéol ◽  
Wataru Kumagai ◽  
...  

We propose a new clustering method based on a deep neural network. Given an unlabeled dataset and the number of clusters, our method directly groups the dataset into the given number of clusters in the original space. We use a conditional discrete probability distribution defined by a deep neural network as a statistical model. Our strategy is first to estimate the cluster labels of unlabeled data points selected from a high-density region, and then to conduct semi-supervised learning to train the model using the estimated cluster labels and the remaining unlabeled data points. Lastly, using the trained model, we obtain the estimated cluster labels of all given unlabeled data points. The advantage of our method is that it does not require the restrictive conditions of existing deep-neural-network clustering methods, which assume that the cluster balance of a given dataset is uniform. Moreover, it can be applied to various data domains as long as the data are expressed as feature vectors. In addition, we observe that our method is robust against outliers. Therefore, the proposed method is expected to perform, on average, better than previous methods. We conducted numerical experiments on five commonly used datasets to confirm the effectiveness of the proposed method.
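The two-stage strategy above can be summarised in the following sketch. It is a schematic reconstruction, not the authors' code: the k-NN density proxy, the spectral pseudo-labelling step, and the entropy term used as the semi-supervised loss are all assumptions, and the helper names (select_high_density, cluster_dnn) are hypothetical.

```python
# Hedged sketch of the two-stage strategy: (1) pseudo-label points from a
# high-density region, (2) train a DNN classifier semi-supervised on the
# pseudo-labels plus the remaining unlabeled points.
import numpy as np
import torch
import torch.nn as nn
from sklearn.neighbors import NearestNeighbors
from sklearn.cluster import SpectralClustering

def select_high_density(X, frac=0.3, k=10):
    """Keep the fraction of points with the smallest mean k-NN distance (a density proxy)."""
    d, _ = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    density_score = d[:, 1:].mean(axis=1)            # small = dense
    return np.argsort(density_score)[: int(frac * len(X))]

def cluster_dnn(X, n_clusters, epochs=100, lam=0.1):
    dense_idx = select_high_density(X)
    pseudo = SpectralClustering(n_clusters=n_clusters,
                                affinity="nearest_neighbors").fit_predict(X[dense_idx])

    model = nn.Sequential(nn.Linear(X.shape[1], 64), nn.ReLU(),
                          nn.Linear(64, n_clusters))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    Xt = torch.tensor(X, dtype=torch.float32)
    yt = torch.tensor(pseudo, dtype=torch.long)
    di = torch.as_tensor(dense_idx, dtype=torch.long)

    for _ in range(epochs):
        logits = model(Xt)
        # supervised loss on the pseudo-labelled high-density subset
        sup = nn.functional.cross_entropy(logits[di], yt)
        # entropy minimisation on all points, one common semi-supervised surrogate
        p = logits.softmax(dim=1)
        ent = -(p * p.clamp_min(1e-8).log()).sum(dim=1).mean()
        loss = sup + lam * ent
        opt.zero_grad(); loss.backward(); opt.step()

    with torch.no_grad():
        return model(Xt).argmax(dim=1).numpy()        # labels for all points
```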

2017 ◽  
Author(s):  
Luís Dias ◽  
Rosalvo Neto

In November 2015, Google released TensorFlow, an open-source machine learning framework that can be used to implement deep neural network algorithms, a class of algorithms that shows great potential in solving complex problems. Considering the importance of usability to software success, this research performs a usability analysis of TensorFlow and compares it with another widely used framework, R. The evaluation was performed through usability tests with university students. The study led to indications that TensorFlow's usability is equal to or better than that of traditional frameworks used by the scientific community.


Author(s):  
Musa Mojarad ◽  
Hamid Parvin ◽  
Samad Nejatian ◽  
Vahideh Rezaie

In clustering ensembles, the goal is to combine several clustering outputs to obtain results that are better than those of the individual base clustering methods in terms of consistency, robustness, and performance. In this research, we present a clustering ensemble method with a new aggregation function, named Robust Clustering Ensemble based on Iterative Fusion of Base Clusters (RCEIFBC). The method takes into account two similarity criteria: (a) cluster-cluster similarity and (b) object-cluster similarity. It operates in two steps on a binary cluster representation of the given ensemble: before either step, the primary partitions are converted into a binary cluster representation, breaking the primary ensemble into a number of primary binary clusters. The first step merges the primary binary clusters with the highest cluster-cluster similarity; this step is repeated until the desired candidate clusters are obtained. The second step refines the merged clusters by reassigning the data points to them. The performance and robustness of the proposed method have been evaluated on different machine learning datasets. The experiments indicate the effectiveness of the proposed method compared with state-of-the-art clustering methods in terms of performance and robustness.
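A minimal sketch of the two-step fusion idea, under assumed definitions: each base cluster is encoded as a binary membership vector, cluster-cluster similarity is taken as the Jaccard index, and object-cluster similarity as the object's mean membership across a merged group's base clusters. These choices are illustrative, not the paper's exact formulas.

```python
# Illustrative sketch of iterative fusion of binary base clusters (assumed similarities).
import numpy as np

def jaccard(a, b):
    """Cluster-cluster similarity between two binary membership vectors."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def fuse_base_clusters(B, k):
    """B: list of binary membership vectors (one per base cluster); k: target cluster count.
    Step 1: repeatedly merge the most similar pair of groups until k groups remain."""
    groups = [[b] for b in B]
    while len(groups) > k:
        best, pair = -1.0, None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                s = jaccard(np.any(groups[i], axis=0), np.any(groups[j], axis=0))
                if s > best:
                    best, pair = s, (i, j)
        i, j = pair
        groups[i] += groups.pop(j)
    # Step 2: assign each object to the group with the highest object-cluster similarity,
    # here the mean membership of the object across the group's base clusters.
    scores = np.stack([np.mean(g, axis=0) for g in groups])    # shape (k, n_objects)
    return scores.argmax(axis=0)

# Example: three base clusterings of 6 objects, flattened into six binary base clusters.
base = [np.array(v) for v in ([1,1,1,0,0,0], [0,0,0,1,1,1],
                              [1,1,0,0,0,0], [0,0,1,1,1,1],
                              [1,1,1,1,0,0], [0,0,0,0,1,1])]
print(fuse_base_clusters(base, k=2))
```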


2010 ◽  
Vol 439-440 ◽  
pp. 605-610
Author(s):  
Xiao Yong Liu

In this paper, a new RBF neural network (RBFNN) algorithm, called ar-RBFNN, is presented. In traditional clustering-based RBFNNs, called oRBFNN in this paper, the width (radius) of the Gaussian basis function ignores the number of points in each cluster, i.e., the density of the data. The new algorithm takes into account the effect of the radius on performance in function approximation problems. Mean squared error is used to evaluate the performance of the two algorithms, oRBFNN and ar-RBFNN. Several function approximation experiments show that ar-RBFNN outperforms oRBFNN.
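The abstract does not give the exact width rule, so the sketch below shows the conventional clustering-based RBF construction together with one plausible density-adjusted radius (scaling each centre's width by its cluster size). The adjustment rule is an assumption for illustration, not necessarily the ar-RBFNN formula.

```python
# Sketch of a clustering-based RBF network with density-adjusted widths (assumed rule).
import numpy as np
from sklearn.cluster import KMeans

def fit_rbf(X, y, n_centers=10, adjust_by_density=True):
    km = KMeans(n_clusters=n_centers, n_init=10).fit(X)
    centers = km.cluster_centers_
    # baseline width: mean distance from each centre to its cluster members
    widths = np.array([
        np.linalg.norm(X[km.labels_ == c] - centers[c], axis=1).mean() + 1e-8
        for c in range(n_centers)
    ])
    if adjust_by_density:
        # assumed adjustment: denser (larger) clusters get relatively smaller radii
        counts = np.bincount(km.labels_, minlength=n_centers).astype(float)
        widths *= np.sqrt(counts.mean() / counts)
    # design matrix of Gaussian activations, then linear least-squares output weights
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    Phi = np.exp(-(d ** 2) / (2 * widths ** 2))
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return centers, widths, w

def predict_rbf(X, centers, widths, w):
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.exp(-(d ** 2) / (2 * widths ** 2)) @ w

# Example: approximate sin on [0, 2*pi] and report the mean squared error
X = np.linspace(0, 2 * np.pi, 200)[:, None]
y = np.sin(X).ravel()
params = fit_rbf(X, y)
print("MSE:", np.mean((predict_rbf(X, *params) - y) ** 2))
```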


Author(s):  
Seung-Geon Lee ◽  
Jaedeok Kim ◽  
Hyun-Joo Jung ◽  
Yoonsuck Choe

Estimating the relative importance of each sample in a training set has important practical and theoretical value, for example in importance sampling or curriculum learning. This focus on individual samples invokes the concept of sample-wise learnability: how easy is it to correctly learn each sample (cf. PAC learnability)? In this paper, we approach the sample-wise learnability problem within a deep learning context. We propose a measure of the learnability of a sample with a given deep neural network (DNN) model. The basic idea is to train the given model on the training set and, for each sample, aggregate the hits and misses over all training epochs. Our experiments show that the sample-wise learnability measure collected this way is highly linearly correlated across different DNN models (ResNet-20, VGG-16, and MobileNet), suggesting that such a measure can provide general insights into the data's properties. We expect our method to help develop better curricula for training and to help us better understand the data itself.
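A minimal sketch of the per-sample tally described above, assuming the measure is simply the fraction of epochs in which a sample is classified correctly during training; the exact aggregation in the paper may differ, and the model and data here are placeholders.

```python
# Sketch: per-sample learnability as the fraction of epochs a sample is predicted
# correctly while the model trains (assumed aggregation; toy model and data).
import torch
import torch.nn as nn

def samplewise_learnability(model, X, y, epochs=50, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    hits = torch.zeros(len(X))
    for _ in range(epochs):
        loss = nn.functional.cross_entropy(model(X), y)
        opt.zero_grad(); loss.backward(); opt.step()
        # record a "hit" for every sample the current model classifies correctly
        with torch.no_grad():
            hits += (model(X).argmax(dim=1) == y).float()
    return hits / epochs          # in [0, 1]; higher = easier to learn

# Toy usage with a small MLP on random data (stand-in for ResNet-20/VGG-16/MobileNet)
X = torch.randn(128, 20)
y = torch.randint(0, 3, (128,))
scores = samplewise_learnability(nn.Sequential(nn.Linear(20, 32), nn.ReLU(),
                                               nn.Linear(32, 3)), X, y)
print(scores[:5])
```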


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Yishu Qiu ◽  
Lanliang Lin ◽  
Lvqing Yang ◽  
Dingzhao Li ◽  
Runhan Song ◽  
...  

In this paper, we propose a multiscale, bidirectional-input model based on a convolutional neural network and a deep neural network, named MBCDNN. To address the problem of inconsistent activity segment lengths, a multiscale input module is constructed to compensate for the noise introduced by padding. Because a single input is not sufficient for extracting features from the original data, we manually design aggregation features that combine the forward and reverse sequences, and use five-fold cross-validation with stratified sampling to enhance the generalization ability of the model. Owing to the particularities of the task, we also design an evaluation index that weights both scene and action, which considerably enriches the model's learning ability. On 19 kinds of scene+action activity data, accuracy and robustness are significantly improved, outperforming other mainstream methods.
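A sketch of the input construction implied above: each activity segment is resampled at several scales and paired with its time-reversed copy before being fed to the network. The resampling scales and the simple index-based resampling are assumptions for illustration only.

```python
# Sketch: multiscale + bidirectional input construction for variable-length
# activity segments (assumed scales and resampling strategy).
import numpy as np

def multiscale_bidirectional_input(segment, scales=(32, 64, 128)):
    """segment: array of shape (T, C) with T time steps and C sensor channels.
    Returns one (forward, reverse) pair per scale, each of shape (L, C)."""
    views = []
    for L in scales:
        idx = np.linspace(0, len(segment) - 1, L).round().astype(int)  # resample to length L
        forward = segment[idx]
        views.append((forward, forward[::-1]))      # time-reversed copy as the second input
    return views

# Example: a 97-step, 3-channel segment produces three (forward, reverse) pairs
seg = np.random.randn(97, 3)
for f, r in multiscale_bidirectional_input(seg):
    print(f.shape, r.shape)
```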


Author(s):  
Yuchi Kanzawa

Clustering methods for relational data are often based on the assumption that the given relational data are Euclidean, and kernelized clustering methods are often based on the assumption that the given kernel is positive semidefinite. In practice, non-Euclidean relational data and indefinite kernels may arise, and the β-spread transformation was proposed for such cases; it modifies a given set of relational data or a given kernel Gram matrix using a β value that is common to all objects. In this paper, we propose an object-wise β-spread transformation for use in both relational and kernelized fuzzy c-means clustering. The proposed approach retains the given data better than conventional methods, and numerical examples show that our method is efficient for both relational and kernel fuzzy c-means.
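For reference, the conventional (uniform) β-spread is sketched below: the relational form adds β to every off-diagonal dissimilarity, and the kernel form adds β to the Gram matrix diagonal, with β chosen just large enough to remove negative eigenvalues. The paper's object-wise variant generalises this single β to per-object values and is not reproduced here.

```python
# Sketch of the conventional (uniform) beta-spread transformation; the paper's
# object-wise variant replaces the single beta with per-object values.
import numpy as np

def beta_spread_kernel(K):
    """Shift an indefinite kernel Gram matrix K to K + beta*I so that it becomes PSD."""
    beta = max(0.0, -np.linalg.eigvalsh(K).min())
    return K + beta * np.eye(len(K)), beta

def beta_spread_relational(D, beta):
    """Relational form: add beta to every off-diagonal dissimilarity."""
    n = len(D)
    return D + beta * (np.ones((n, n)) - np.eye(n))

# Example: repair a slightly indefinite "kernel"
K = np.array([[1.0, 0.9, 0.2],
              [0.9, 1.0, 0.95],
              [0.2, 0.95, 1.0]])
K_psd, beta = beta_spread_kernel(K)
print(beta, np.linalg.eigvalsh(K_psd).min() >= -1e-10)
```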


2020 ◽  
Vol 7 (4) ◽  
pp. 727
Author(s):  
Larasati Larasati ◽  
Wisnu Ananta Kusuma ◽  
Annisa Annisa

Drug repositioning is the reuse of an existing drug to treat a disease other than its original medical indication. It can be done by predicting which drug compounds interact positively with disease proteins. One of the challenges in predicting compound-protein interactions is imbalanced data. Deep semi-supervised learning can be an alternative for building prediction models on imbalanced data: its unsupervised pre-training stage can represent the input from unlabeled data (the majority of the data) well and optimize the initialization of the classifier's weights. This study implements a Deep Belief Network (DBN) for pre-training and a Deep Neural Network (DNN) as the classifier. The data used in this study are the ion channel, GPCR, and nuclear receptor datasets sourced from the KEGG BRITE, BRENDA, SuperTarget, and DrugBank databases. The results indicate that pre-training used as feature extraction has an optimizing effect, as seen in the DNN performance improvements in accuracy (3-4.5%), AUC (4.5%), precision (5.9-6%), and F-measure (3.8%).
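A minimal sketch of the pre-training-as-feature-extraction idea, using a single RBM layer followed by a neural classifier. The actual study stacks RBMs into a DBN and uses drug-target interaction features; the single layer, hyperparameters, and placeholder data below are assumptions.

```python
# Sketch: unsupervised RBM pre-training used as feature extraction before a
# neural-network classifier (a single-layer stand-in for the DBN in the study).
import numpy as np
from sklearn.neural_network import BernoulliRBM, MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Placeholder data: binary fingerprint-like features with an imbalanced label
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 128)).astype(float)
y = (rng.random(1000) < 0.1).astype(int)              # ~10% positives

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = make_pipeline(
    BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20, random_state=0),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
)
model.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, model.predict(X_te)))
```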


Author(s):  
Xianyun Wang ◽  
Changchun Bao

According to the encoding and decoding mechanism of binaural cue coding (BCC), in this paper the speech and noise are treated as the left-channel and right-channel signals of the BCC framework, respectively. Subsequently, the speech signal is estimated from noisy speech when the inter-channel level difference (ICLD) and inter-channel correlation (ICC) between speech and noise are given. In this paper, exact inter-channel cues and pre-enhanced inter-channel cues are used for speech restoration. The exact inter-channel cues are extracted from clean speech and noise, and the pre-enhanced inter-channel cues are extracted from the pre-enhanced speech and estimated noise. After that, they are combined one by one to form a codebook. Once the pre-enhanced cues are extracted from noisy speech, the exact cues are estimated by a mapping between the pre-enhanced cues and the prior codebook. Next, the estimated exact cues are used to obtain a time-frequency (T-F) mask for enhancing noisy speech based on the decoding of BCC. In addition, to further improve the accuracy of the T-F mask based on the inter-channel cues, a deep neural network (DNN)-based method is proposed to learn the mapping relationship between input features of noisy speech and the T-F masks. Experimental results show that the codebook-driven method can achieve better performance than conventional methods, and that the DNN-based method performs better than the codebook-driven method.
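As a reference for the cue definitions, the sketch below computes the ICLD per time-frequency bin from speech and noise spectrograms and converts it into a Wiener-like mask. These are commonly assumed definitions; the paper's cue quantisation, ICC handling, codebook mapping, and DNN mask estimator are not reproduced.

```python
# Sketch: inter-channel level difference (ICLD) per time-frequency bin and the
# soft mask it implies (assumed definitions, illustrative data).
import numpy as np

def icld(S, N, eps=1e-12):
    """ICLD in dB between speech and noise magnitude spectrograms S and N."""
    return 10.0 * np.log10((np.abs(S) ** 2 + eps) / (np.abs(N) ** 2 + eps))

def mask_from_icld(icld_db):
    """Map ICLD back to a soft T-F mask in [0, 1], equivalent to |S|^2 / (|S|^2 + |N|^2)."""
    ratio = 10.0 ** (icld_db / 10.0)
    return ratio / (1.0 + ratio)

# Toy usage with random spectrogram frames
S = np.abs(np.random.randn(257, 100))
N = np.abs(np.random.randn(257, 100))
mask = mask_from_icld(icld(S, N))
enhanced = mask * (S + N)       # apply the mask to the noisy magnitude (illustrative)
```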


2019 ◽  
Vol 214 ◽  
pp. 06013
Author(s):  
Anton Hawthorne-Gonzalvez ◽  
Martin Sevior

B-decay data from the Belle experiment at the KEKB collider have a substantial background from e+e− → qq̄ events. To suppress this we employ deep neural network algorithms, which provide improved discrimination of signal from background. However, the deep neural network develops a substantial correlation with the ∆E kinematic variable used to distinguish signal from background in the final fit, due to its relationship with the input variables. The effect of this correlation is reduced by deploying an adversarial neural network. Overall, the adversarial deep neural network performs better than a boosted decision tree algorithm and a commercial package, NeuroBayes, which employs a neural net with a single hidden layer.
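A minimal sketch of the adversarial decorrelation idea: a classifier is trained against an adversary that tries to predict ∆E from the classifier output, so the classifier is penalised for carrying ∆E information. The network sizes, the loss weight lam, and the toy data are assumptions, not the analysis configuration.

```python
# Sketch: adversarial training to decorrelate a classifier score from a kinematic
# variable (here called dE); architectures, lam, and data are illustrative only.
import torch
import torch.nn as nn

clf = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))   # signal vs background
adv = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))    # predicts dE from score

opt_clf = torch.optim.Adam(clf.parameters(), lr=1e-3)
opt_adv = torch.optim.Adam(adv.parameters(), lr=1e-3)
bce, mse, lam = nn.BCEWithLogitsLoss(), nn.MSELoss(), 10.0

# Toy data: features x, labels y, and a kinematic variable dE correlated with x
x = torch.randn(512, 10)
y = (x[:, 0] > 0).float().unsqueeze(1)
dE = x[:, 0:1] + 0.1 * torch.randn(512, 1)

for step in range(200):
    # 1) update the adversary to predict dE from the (detached) classifier score
    score = clf(x).detach()
    adv_loss = mse(adv(score), dE)
    opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()

    # 2) update the classifier: classify well, but make the adversary fail
    score = clf(x)
    loss = bce(score, y) - lam * mse(adv(score), dE)
    opt_clf.zero_grad(); loss.backward(); opt_clf.step()
```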

