A Strictly Unsupervised Deep Learning Method for HEp-2 Cell Image Classification

Classifying the images that portray the Human Epithelial cells of type 2 (HEp-2) represents one of the most important steps in the diagnosis procedure of autoimmune diseases. Performing this classification manually represents an extremely complicated task due to the heterogeneity of these cellular images. Hence, an automated classification scheme appears to be necessary. However, the majority of the available methods prefer to utilize the supervised learning approach for this problem. The need for thousands of images labelled manually can represent a difficulty with this approach. The first contribution of this work is to demonstrate that classifying HEp-2 cell images can also be done using the unsupervised learning paradigm. Unlike the majority of the existing methods, we propose here a deep learning scheme that performs both the feature extraction and the cells’ discrimination through an end-to-end unsupervised paradigm. We propose the use of a deep convolutional autoencoder (DCAE) that performs feature extraction via an encoding–decoding scheme. At the same time, we embed in the network a clustering layer whose purpose is to automatically discriminate, during the feature learning process, the latent representations produced by the DCAE. Furthermore, we investigate how the quality of the network’s reconstruction can affect the quality of the produced representations. We have investigated the effectiveness of our method on some benchmark datasets and we demonstrate here that the unsupervised learning, when done properly, performs at the same level as the actual supervised learning-based state-of-the-art methods in terms of accuracy.

Download Full-text

Research on Fault Diagnosis of Air Conditioner Based on Deep Learning

International Journal of Robotics and Control ◽

10.5430/ijrc.v2n1p18 ◽

2018 ◽

Vol 2 (1) ◽

pp. 18

Author(s):

Zhiting Liu ◽

Yuhua Wang ◽

Yuexia Zhou

Keyword(s):

Feature Extraction ◽

Deep Learning ◽

Fault Diagnosis ◽

Unsupervised Learning ◽

Supervised Learning ◽

Vibration Signal ◽

Learning Rate ◽

Air Conditioner ◽

Time Frequency ◽

Mode Decomposition

The essence of intelligent fault diagnosis is to classify the feature of faults by machine learning. It is difficult and key to extract fault characteristics of signals efficiently. The general feature extraction methods include time frequency domain feature extraction, Empirical Mode Decomposition (EMD), Wavelet Transform and Variational Mode Decomposition (VMD). However, these methods require a certain prior experience and require reasonable analysis and processing of the signals. In this paper, in order to effectively extract the fault characteristics of the air conditioner's vibration signal, the stacked automatic encoder (SAE) is used to extract the feature of air conditioner’s vibration signal, and the Softmax function is used to identify the air conditioner's working condition. The SAE performs unsupervised learning on the signal, and Softmax function performs supervised learning on the signal. The number of hidden layers and the number of hidden layer's nodes are determined through experiments. The effects of learning rate, learning rate decay, regularization, dropout, and batch size on the correct rate of the model in supervised learning and unsupervised learning are analyzed. Thereby realizing the fault diagnosis of the air conditioner. The recognition correct rate of deep learning model reached 99.92\%. The deep learning fault diagnosis method proposed in this paper is compared with EMD and SVM, VMD and SVM two kind of fault diagnosis methods.

Download Full-text

Understanding the relationship between clouds and surface downward radiation forecast errors with Unsupervised Deep Learning

10.5194/ems2021-471 ◽

2021 ◽

Author(s):

Matthias Zech ◽

Lueder von Bremen

Keyword(s):

Deep Learning ◽

Unsupervised Learning ◽

Prediction Models ◽

Weather Prediction ◽

Reanalysis Data ◽

Cloud Formation ◽

Trade Wind ◽

Forecast Errors ◽

Model Learning ◽

Unsupervised Deep Learning

Cloudiness is a difficult parameter to forecast and has improved relatively little over the last decade in numerical weather prediction models as the EMCWF IFS. However, surface downward solar radiation forecast (ssrd) errors are becoming more important with higher penetration of photovoltaics in Europe as forecasts errors induce power imbalances that might lead to high balancing costs. This study continues recent approaches to better understand clouds using satellite images with Deep Learning. Unlike other studies which focus on shallow trade wind cumulus clouds over the ocean, this study investigates the European land area. To better understand the clouds, we use the daily MODIS optical cloud thickness product which shows both water and ice phase of the cloud. This allows to consider both cloud structure and cloud formation during learning. It is also much easier to distinguish between snow and cloud in contrast to using visible bands. Methodologically, it uses the Unsupervised Learning approach tile2vec to derive a lower dimensional representation of the clouds. Three cloud regions with two similar neighboring tiles and one tile from a different time and location are sampled to learn lower-rank embeddings. In contrast to the initial tile2vec&#160;implementation, this study does not sample arbitrarily distant tiles but uses the fractal dimension of the clouds in a pseudo-random sampling fashion to improve model learning.The usefulness of the cloud segments is shown by applying them in a case study to investigate statistical properties of ssrd forecast errors over Europe which are derived from hourly ECMWF IFS forecasts and ERA5 reanalysis data. This study shows how Unsupervised Learning has high potential despite its relatively low usage compared to Supervised Learning in academia. It further shows, how the generated land cloud product can be used to better characterize ssrd forecast errors over Europe.

Download Full-text

Extracting a biologically latent space of lung cancer epigenetics with variational autoencoders

BMC Bioinformatics ◽

10.1186/s12859-019-3130-9 ◽

2019 ◽

Vol 20 (S18) ◽

Cited By ~ 2

Author(s):

Zhenxing Wang ◽

Yadong Wang

Keyword(s):

Lung Cancer ◽

Dna Methylation ◽

Deep Learning ◽

Malignant Tumors ◽

Methylation Data ◽

Cancer Epigenetics ◽

Latent Space ◽

Unsupervised Deep Learning ◽

Latent Representations ◽

Model Training

Abstract Background Lung cancer is one of the most malignant tumors, causing over 1,000,000 deaths each year worldwide. Deep learning has brought success in many domains in recent years. DNA methylation, an epigenetic factor, is used for model training in many studies. There is an opportunity for deep learning methods to analyze the lung cancer epigenetic data to determine their subtypes for appropriate treatment. Results Here, we employ variational autoencoders (VAEs), an unsupervised deep learning framework, on 450K DNA methylation data of TCGA-LUAD and TCGA-LUSC to learn latent representations of the DNA methylation landscape. We extract a biologically relevant latent space of LUAD and LUSC samples. It is showed that the bivariate classifiers on the further compressed latent features could classify the subtypes accurately. Through clustering of methylation-based latent space features, we demonstrate that the VAEs can capture differential methylation patterns about subtypes of lung cancer. Conclusions VAEs can distinguish the original subtypes from manually mixed methylation data frame with the encoded features of latent space. Further applications about VAEs should focus on fine-grained subtypes identification for precision medicine.

Download Full-text

Extracting Predictive Representations from Hundreds of Millions of Molecules

10.21203/rs.3.rs-745668/v1 ◽

2021 ◽

Author(s):

Dong Chen ◽

Guowei Wei ◽

Feng Pan

Keyword(s):

Image Analysis ◽

Deep Learning ◽

Virtual Screening ◽

Supervised Learning ◽

Learning Approach ◽

Optimal Model ◽

Molecular Sequences ◽

Benchmark Datasets ◽

Molecular Complexity

Abstract Although deep learning can automatically extract features in relatively simple tasks such as image analysis, the construction of appropriate representations remains essential for molecular predictions due to intricate molecular complexity. Additionally, it is often expensive, time-consuming, and ethically constrained to generate labeled data for supervised learning in molecular sciences, leading to challenging small and diverse datasets. In this work, we develop a self-supervised learning approach via a masking strategy to pre-train transformer models from over 700 million unlabeled molecules in multiple databases. The intrinsic chemical logic learned from this approach enables the extraction of predictive representations from task-specific molecular sequences in a fine-tuned process. To understand the importance of self-supervised learning from unlabeled molecules, we assemble three models with different combinations of databases. Moreover, we propose a new protocol based on data traits to automatically select the optimal model for a specific predictive task. To validate the proposed representation and protocol, we consider 10 benchmark datasets in addition to 38 ligand-based virtual screening datasets. Extensive validation indicates that the proposed representation and protocol show superb performance.

Download Full-text

Deep Learning-Based Automatic Clutter/Interference Detection for HFSWR

Remote Sensing ◽

10.3390/rs10101517 ◽

2018 ◽

Vol 10 (10) ◽

pp. 1517 ◽

Cited By ~ 11

Author(s):

Ling Zhang ◽

Wei You ◽

Q. Wu ◽

Shengbo Qi ◽

Yonggang Ji

Keyword(s):

Feature Extraction ◽

Deep Learning ◽

Interference Detection ◽

Target Feature ◽

Feature Design ◽

Wide Area Monitoring ◽

Random Occurrence ◽

Wave Radar ◽

Area Monitoring

High-frequency surface wave radar (HFSWR) plays an important role in wide area monitoring of the marine target and the sea state. However, the detection ability of HFSWR is severely limited by the strong clutter and the interference, which are difficult to be detected due to many factors such as random occurrence and complex distribution characteristics. Hence the automatic detection of the clutter and interference is an important step towards extracting them. In this paper, an automatic clutter and interference detection method based on deep learning is proposed to improve the performance of HFSWR. Conventionally, the Range-Doppler (RD) spectrum image processing method requires the target feature extraction including feature design and preselection, which is not only complicated and time-consuming, but the quality of the designed features is bound up with the performance of the algorithm. By analyzing the features of the target, the clutter and the interference in RD spectrum images, a lightweight deep convolutional learning network is established based on a faster region-based convolutional neural networks (Faster R-CNN). By using effective feature extraction combined with a classifier, the clutter and the interference can be automatically detected. Due to the end-to-end architecture and the numerous convolutional features, the deep learning-based method can avoid the difficulty and absence of uniform standard inherent in handcrafted feature design and preselection. Field experimental results show that the Faster R-CNN based method can automatically detect the clutter and interference with decent performance and classify them with high accuracy.

Download Full-text

PREDICTING LAND PRICES AND MEASURING UNCERTAINTY BY COMBINING SUPERVISED AND UNSUPERVISED LEARNING

International Journal of Strategic Property Management ◽

10.3846/ijspm.2021.14293 ◽

2021 ◽

Vol 25 (2) ◽

pp. 169-178

Author(s):

Changro Lee

Keyword(s):

Neural Network ◽

Principal Component Analysis ◽

Deep Learning ◽

Unsupervised Learning ◽

Supervised Learning ◽

Hybrid Approach ◽

Principal Component ◽

Component Analysis ◽

Land Prices ◽

Property Valuation

Despite the popularity deep learning has been gaining, measuring the uncertainty within the result has not met expectations in many deep learning applications and this includes property valuation. In real-world tasks, however, rather than simply requiring predictions, assurance of the certainty of the predictions is also demanded. In this study, supervised learning is combined with unsupervised learning to bridge this gap. A method based on principal component analysis, a popular tool of unsupervised learning, was developed and used to represent the uncertainty in property valuation. Then, a neural network, a representative algorithm to implement supervised learning, was constructed, and trained to predict land prices. Finally, the uncertainty that was measured using principal component analysis was incorporated into the price predicted by the neural network. This hybrid approach is shown to be likely to improve the credibility of the valuation work. The findings of this study are expected to generate interest in the integration of the two learning approaches, thereby promoting the rapid adoption of deep learning tools in the property valuation industry.

Download Full-text

KNN Loss and Deep KNN

Fundamenta Informaticae ◽

10.3233/fi-2021-2068 ◽

2021 ◽

Vol 182 (2) ◽

pp. 95-110

Author(s):

Linh Le ◽

Ying Xie ◽

Vijay V. Raghavan

Keyword(s):

Deep Learning ◽

Supervised Learning ◽

Nearest Neighbor ◽

Learning Strategy ◽

Feature Space ◽

Kernel Functions ◽

Training Data ◽

K Nearest Neighbor ◽

Map Data

The k Nearest Neighbor (KNN) algorithm has been widely applied in various supervised learning tasks due to its simplicity and effectiveness. However, the quality of KNN decision making is directly affected by the quality of the neighborhoods in the modeling space. Efforts have been made to map data to a better feature space either implicitly with kernel functions, or explicitly through learning linear or nonlinear transformations. However, all these methods use pre-determined distance or similarity functions, which may limit their learning capacity. In this paper, we present two loss functions, namely KNN Loss and Fuzzy KNN Loss, to quantify the quality of neighborhoods formed by KNN with respect to supervised learning, such that minimizing the loss function on the training data leads to maximizing KNN decision accuracy on the training data. We further present a deep learning strategy that is able to learn, by minimizing KNN loss, pairwise similarities of data that implicitly maps data to a feature space where the quality of KNN neighborhoods is optimized. Experimental results show that this deep learning strategy (denoted as Deep KNN) outperforms state-of-the-art supervised learning methods on multiple benchmark data sets.

Download Full-text

Comparison of Feature Extraction Methods for Physiological Signals for Heat-Based Pain Recognition

Sensors ◽

10.3390/s21144838 ◽

2021 ◽

Vol 21 (14) ◽

pp. 4838

Author(s):

Philip Gouverneur ◽

Frédéric Li ◽

Wacław M. Adamczyk ◽

Tibor M. Szikszay ◽

Kerstin Luedtke ◽

...

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Deep Learning ◽

Feature Learning ◽

Recognition System ◽

Extraction Methods ◽

Feature Engineering ◽

Learning Approaches ◽

Physiological Sensors ◽

Pain Recognition

While even the most common definition of pain is under debate, pain assessment has remained the same for decades. But the paramount importance of precise pain management for successful healthcare has encouraged initiatives to improve the way pain is assessed. Recent approaches have proposed automatic pain evaluation systems using machine learning models trained with data coming from behavioural or physiological sensors. Although yielding promising results, machine learning studies for sensor-based pain recognition remain scattered and not necessarily easy to compare to each other. In particular, the important process of extracting features is usually optimised towards specific datasets. We thus introduce a comparison of feature extraction methods for pain recognition based on physiological sensors in this paper. In addition, the PainMonit Database (PMDB), a new dataset including both objective and subjective annotations for heat-induced pain in 52 subjects, is introduced. In total, five different approaches including techniques based on feature engineering and feature learning with deep learning are evaluated on the BioVid and PMDB datasets. Our studies highlight the following insights: (1) Simple feature engineering approaches can still compete with deep learning approaches in terms of performance. (2) More complex deep learning architectures do not yield better performance compared to simpler ones. (3) Subjective self-reports by subjects can be used instead of objective temperature-based annotations to build a robust pain recognition system.

Download Full-text

Deep learning for constructing microblog behavior representation to identify social media user's personality

10.7287/peerj.preprints.1906 ◽

2016 ◽

Author(s):

Xiaoqian Liu ◽

Tingshao Zhu

Keyword(s):

Feature Extraction ◽

Deep Learning ◽

Prediction Model ◽

Learning Algorithm ◽

Rapid Development ◽

Feature Learning ◽

Extraction Methods ◽

Deep Learning Algorithm ◽

Personality Prediction ◽

Behavior Description

Due to the rapid development of information technology, Internet has become part of everyday life gradually. People would like to communicate with friends to share their opinions on social networks. The diverse social network behavior is an ideal users' personality traits reflection. Existing behavior analysis methods for personality prediction mostly extract behavior attributes with heuristic. Although they work fairly well, but it is hard to extend and maintain. In this paper, for personality prediction, we utilize deep learning algorithm to build feature learning model, which could unsupervised extract Linguistic Representation Feature Vector (LRFV) from text published on Sina Micro-blog actively. Compared with other feature extraction methods, LRFV, as an abstract representation of Micro-blog content, could describe use's semantic information more objectively and comprehensively. In the experiments, the personality prediction model is built using linear regression algorithm, and different attributes obtained through different feature extraction methods are taken as input of prediction model respectively. The results show that LRFV performs more excellently in micro-blog behavior description and improve the performance of personality prediction model.

Download Full-text