Adversarial Disentanglement with Grouped Observations

2020 ◽  
Vol 34 (06) ◽  
pp. 10243-10250
Author(s):  
Jozsef Nemeth

We consider the disentanglement of the representations of the relevant attributes of the data (content) from all other factors of variation (style) using Variational Autoencoders. Some recent works addressed this problem by utilizing grouped observations, where the content attributes are assumed to be common within each group, while no supervised information is available on the style factors. In many cases, however, these methods fail to prevent the models from using the style variables to encode content-related features as well. This work supplements these algorithms with a method that eliminates the content information in the style representations. For that purpose, the training objective is augmented to minimize an appropriately defined mutual information term in an adversarial way. Experimental results and comparisons on image datasets show that the resulting method can efficiently separate the content- and style-related attributes and generalizes to unseen data.

Author(s):  
Chung-Hsien Wu ◽  
Hung-Yu Su ◽  
Chao-Hong Liu

This chapter presents an efficient approach to personalized pronunciation assessment of Taiwanese-accented English. The main goal of this study is to detect frequently occurring mispronunciation patterns of Taiwanese-accented English instead of scoring English pronunciations directly. The proposed assessment helps quickly discover a student's personalized mispronunciations, so that English teachers can spend more time on teaching or correcting students’ pronunciations. In this approach, an unsupervised model adaptation method is applied to the universal acoustic models to recognize the speech of a specific speaker with mispronunciations and a Taiwanese accent. A dynamic sentence selection algorithm, which considers the mutual information of the related mispronunciations, is proposed to choose the sentence containing the most undetected mispronunciations in order to quickly extract personalized mispronunciations. The experimental results show that the proposed unsupervised adaptation approach obtains an accuracy improvement of about 2.1% on the recognition of Taiwanese-accented English speech.


2020 ◽  
Vol 309 ◽  
pp. 03030
Author(s):  
Yiwei Zhu

Natural image segmentation plays an important role in the fields of image processing and computer vision. Clustering-based segmentation is an important family of unsupervised image segmentation algorithms, but this type of approach has two problems. First, feature extraction is generally pixel-based, which results in poor segmentation and poor boundary fitting. To solve this problem, superpixel preprocessing of the image to be segmented is introduced. Second, the number of partitions is difficult to determine. To address this problem, an energy difference based on mutual information is proposed, which can automatically determine the number of partitions. The experimental results on the standard database show that the proposed algorithm overcomes the above problems and achieves better segmentation results.


2019 ◽  
Vol 14 (2) ◽  
pp. 108-114 ◽  
Author(s):  
Akın Özkan ◽  
Sultan Belgin İşgör ◽  
Gökhan Şengül ◽  
Yasemin Gülgün İşgör

Background: Dye-exclusion-based cell viability analysis has been broadly used in cell biology, including anticancer drug discovery studies. Viability analysis refers to the whole decision-making process for distinguishing dead cells from live ones. Basically, cell culture samples are dyed with a special stain called trypan blue, so that the dead cells are selectively colored dark. This distinction provides critical information that may be used to expose the influence of the studied drug on the cell culture under study, including cancer cells. The examiner’s experience and fatigue substantially affect the consistency of manual cell viability observation, and inconsistent viability results may in turn bias the experimental results. Therefore, a machine-learning-based automated decision-making procedure is needed to improve the consistency of cell viability analysis. Objective: In this study, we investigate various combinations of classifiers and feature extractors (i.e. classification models) to maximize the performance of computer-vision-based viability analysis. Method: The classification models are tested on novel hemocytometer image datasets which contain two types of cancer cell images, namely, caucasian promyelocytic leukemia (HL60) and chronic myelogenous leukemia (K562). Results: In the experiments, k-Nearest Neighbor (KNN) and Random Forest (RF) combined with Local Phase Quantization (LPQ) achieve the lowest misclassification rates, 0.031 and 0.082, respectively. Conclusion: The experimental results show that KNN and RF with LPQ can be powerful alternatives to conventional manual cell viability analysis. Also, the collected datasets are publicly released for academic studies at the “biochem.atilim.edu.tr/datasets/” web address.


2013 ◽  
Vol 278-280 ◽  
pp. 1174-1177 ◽  
Author(s):  
Jia Jia Miao ◽  
Guo You Chen ◽  
Le Wang ◽  
Xue Lin Fang

Microblogging has become a major tool for people not only to share information, but also to talk about current affairs. Its content has become a popular subject of analysis for interested companies and researchers. We focus on clustering microblogs, which are high-dimensional and highly sparse, and propose a new algorithm based on k-means and k frequent itemsets. In addition, we develop a method to capture long-term mutual information context knowledge in microblogging, and design algorithms to measure conversation similarity, in order to support the new microblog clustering algorithm. Experimental results show that the clustering algorithm has higher accuracy than standard k-means and bisecting k-means, and that it maintains good scalability on large-volume, highly sparse microblog data.


2017 ◽  
Vol 2017 ◽  
pp. 1-22 ◽  
Author(s):  
Muhammad Shafiq ◽  
Xiangzhan Yu

Accurate network traffic classification at an early stage is very important for 5G network applications. During the last few years, researchers have worked hard to propose effective machine learning models for classifying Internet traffic applications at an early stage from only a few packets. Nevertheless, this essential problem still needs to be studied thoroughly to find both the effective number of packets and an effective machine learning (ML) model. In this paper, we try to solve this problem. For this purpose, five Internet traffic datasets are utilized. Initially, we extract the packet sizes of the first 20 packets, and then mutual information analysis is carried out to find the mutual information of each packet on flow type. Thereafter, we execute 10 well-known machine learning algorithms using the crossover classification method. Two statistical analysis tests, the Friedman and Wilcoxon pairwise tests, are applied to the experimental results. Moreover, we also apply the statistical tests to the classifiers to find an effective ML classifier. Our experimental results show that 13–19 packets are the effective packet numbers for early-stage network traffic classification of the 5G IM WeChat application. We also find that Random Forest is an effective ML classifier for early-stage Internet traffic classification.
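The mutual information analysis described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the data layout (one list of discretized packet sizes per flow), the function names, and the toy values are all assumptions made for illustration, and the estimator is the plain plug-in (empirical frequency) estimate in bits.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_packet_positions(flows, labels):
    """Return (packet_position, MI) pairs, highest MI first.

    flows  : list of equal-length lists of packet sizes (hypothetical layout)
    labels : one flow-type label per flow
    """
    scores = []
    for pos in range(len(flows[0])):
        column = [flow[pos] for flow in flows]
        scores.append((pos, mutual_information(column, labels)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy data invented for illustration: 4 flows, 2 packets each.
toy_flows = [[60, 1500], [60, 40], [1500, 1500], [1500, 40]]
toy_labels = ["chat", "chat", "video", "video"]
ranking = rank_packet_positions(toy_flows, toy_labels)
```

In this toy case the size of the first packet determines the label exactly (1 bit of mutual information), while the second packet is independent of it (0 bits), so the first position ranks highest. Real packet sizes would typically need binning before such an estimate is meaningful.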


Optimization-based three-dimensional (3D) rigid image registration (RIR) is one of the most commonly used methods of image registration in radiotherapy. The interpolator and the similarity metric play a crucial role in the optimization-based registration process. In this paper, the efficiency of an image registration algorithm is analyzed using various combinations of interpolators and similarity metrics in terms of quantitative measures, and is compared with a commercially available image registration algorithm used in radiotherapy. Computed Tomography (CT) and Cone Beam Computed Tomography (CBCT) image datasets were registered by an image registration algorithm written in Python using the SimpleITK toolkit (SITK). Combinations of similarity metrics (mean square difference (MSD), mutual information (MI), demons) and interpolators (nearest neighbor (NN), linear, B-spline) were used in this study. The efficiency of the algorithm was quantified in terms of mean square error (MSE), structural similarity index (SSI), normalized cross correlation (NCC) and mutual information (MI). The image registration algorithm with the most efficient combination of similarity metric and interpolator was selected for comparison with the commercially available image registration algorithm. The algorithm for multimodal (CT-CBCT) 3D image registration with the NN interpolator and the MI similarity metric showed the highest values of SSI, NCC and MI, 0.865, 0.933 and 1.223 respectively, among the combinations of interpolator and similarity metric. Furthermore, when this algorithm was compared and statistically analyzed against the commercially available image registration algorithm of a Treatment Planning System (TPS, most commonly used for radiotherapy treatment), no significant difference was found in their quantitative measures (F values: NCC 3.18, MI 4.010, SSI 2.776). The present study is limited to 3D RIR and can be extended to deformable image registration.
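Two of the quantitative measures named in this abstract, MSE and NCC, can be sketched in plain Python. This is an illustrative sketch rather than the study's code: images are assumed to be flattened into equal-length lists of intensities, NCC is taken in its zero-mean form (one of several conventions in use), and the function names are hypothetical.

```python
from math import sqrt


def _check(a, b):
    """Validate that both flattened images are non-empty and the same size."""
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and of equal size")


def mean_square_error(a, b):
    """Mean of squared intensity differences; 0 means identical images."""
    _check(a, b)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def normalized_cross_correlation(a, b):
    """Zero-mean NCC in [-1, 1]; 1 means a perfect positive linear relation."""
    _check(a, b)
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        raise ValueError("NCC is undefined for a constant image")
    return sum(x * y for x, y in zip(da, db)) / denom


# Toy 2x2 "images", flattened; values invented for illustration.
fixed = [1.0, 2.0, 3.0, 4.0]
moving = [2.0, 4.0, 6.0, 8.0]
mse = mean_square_error(fixed, moving)
ncc = normalized_cross_correlation(fixed, moving)
```

For the toy pair, the intensities differ (MSE of 7.5) yet are perfectly linearly related (NCC of 1), which shows why the two measures can disagree when the images differ only by a linear intensity scaling.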


2007 ◽  
Vol 2007 ◽  
pp. 1-10 ◽  
Author(s):  
Jiangang Liu ◽  
Jie Tian

The traditional mutual information (MI) function aligns two multimodality images using intensity information only, without spatial information, so it usually presents many local maxima that can lead to inaccurate registration. Our paper proposes an algorithm based on an adaptive combination of intensity and gradient field mutual information (ACMI). Gradient code maps (GCM) are constructed by coding the gradient field information of the corresponding original images. The gradient field MI, calculated from the GCMs, can provide properties complementary to intensity MI. ACMI combines intensity MI and gradient field MI with a nonlinear weight function, which automatically adjusts the proportion between the two types of MI in the combination to improve registration. Experimental results demonstrate that ACMI outperforms the traditional MI and is much less sensitive to reduced resolution or overlap of images.


Sensors ◽  
2018 ◽  
Vol 19 (1) ◽  
pp. 4
Author(s):  
Álvaro García-Martín ◽  
Juan SanMiguel ◽  
José Martínez

Applying people detectors to unseen data is challenging since pattern distributions, such as viewpoints, motion, poses, backgrounds, occlusions and people sizes, may differ significantly from those of the training dataset. In this paper, we propose a coarse-to-fine framework to adapt people detectors frame by frame during runtime classification, without requiring any additional manually labeled ground truth apart from the offline training of the detection model. Such adaptation makes use of the mutual information of multiple detectors, i.e., similarities and dissimilarities of detectors estimated and agreed by pair-wise correlating their outputs. Globally, the proposed adaptation discriminates between relevant instants in a video sequence, i.e., identifies the representative frames for an adaptation of the system. Locally, the proposed adaptation identifies the best configuration (i.e., detection threshold) of each detector under analysis by maximizing the mutual information. The proposed coarse-to-fine approach does not require training the detectors for each new scenario and uses standard people detector outputs, i.e., bounding boxes. The experimental results demonstrate that the proposed approach outperforms state-of-the-art detectors whose optimal threshold configurations are previously determined and fixed from offline training data.


2020 ◽  
Vol 34 (07) ◽  
pp. 12402-12409 ◽  
Author(s):  
Tong Wu ◽  
Zhenzhen Lei ◽  
Bingqian Lin ◽  
Cuihua Li ◽  
Yanyun Qu ◽  
...  

Despite recent progress on the segmentation of high-resolution images, there exists an unsolved problem, i.e., the trade-off among segmentation accuracy, memory resources and inference speed. To date, GLNet has been introduced for high- or ultra-resolution image segmentation, and it reduces the computational memory of the segmentation network. However, it ignores the differing importance of the cropped patches and treats tiled patches equally for fusion with the whole image, resulting in high computational cost. To solve this problem, we introduce a patch proposal network (PPN) in this paper, which adaptively distinguishes the critical patches from the trivial ones to fuse with the whole image for refining segmentation. PPN is a classification network which alleviates the network training burden and improves segmentation accuracy. We further embed PPN in a global-local segmentation network, instructing the global branch and the refinement branch to work collaboratively. We implement our method on four image datasets: DeepGlobe, ISIC, CRAG and Cityscapes. The first two are ultra-resolution image datasets and the last two are high-resolution image datasets. The experimental results show that our method achieves almost the best segmentation performance compared with the state-of-the-art segmentation methods, and the inference speed is 12.9 fps on DeepGlobe and 10 fps on ISIC. Moreover, we embed PPN in a general semantic segmentation network, and the experimental results on Cityscapes, which contains more object classes, demonstrate its generalization ability on general semantic segmentation.

