Partial Multi-Label Learning with Noisy Label Identification

2020
Vol 34 (04)
pp. 6454-6461
Author(s):  
Ming-Kun Xie ◽  
Sheng-Jun Huang

Partial multi-label learning (PML) deals with problems where each instance is assigned a candidate label set that contains multiple relevant labels along with some noisy labels. Recent studies usually solve PML problems with the disambiguation strategy, which recovers ground-truth labels from the candidate label set by simply assuming that the noisy labels are generated randomly. In real applications, however, noisy labels are usually caused by ambiguous content of the example. Based on this observation, we propose a partial multi-label learning approach that simultaneously recovers the ground-truth information and identifies the noisy labels. The two objectives are formalized in a unified framework with trace norm and ℓ1 norm regularizers. Under the supervision of the observed noise-corrupted label matrix, the multi-label classifier and the noisy label identifier are jointly optimized by incorporating label correlation exploitation and a feature-induced noise model. Extensive experiments on synthetic as well as real-world data sets validate the effectiveness of the proposed approach.
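The abstract does not state the objective explicitly; a hedged sketch of the general form such a unified framework usually takes (the squared loss and the symbol names are assumptions, not the authors' exact formulation) is

\min_{U,V}\; \big\| X(U+V) - Y \big\|_F^2 \;+\; \lambda\,\|U\|_* \;+\; \beta\,\|V\|_1,

where Y is the observed noise-corrupted label matrix, the low-rank (trace norm regularized) component U plays the role of the multi-label classifier and captures label correlations, and the sparse (ℓ1 regularized) component V acts as the feature-induced noisy label identifier.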

Author(s):  
Nan Cao ◽  
Teng Zhang ◽  
Hai Jin

Partial multi-label learning deals with the circumstance in which the ground-truth labels are not directly available but hidden in a candidate label set. Due to the presence of other irrelevant labels, vanilla multi-label learning methods are prone to be misled and fail to generalize well on unseen data, so enabling them to get rid of the noisy labels becomes the core problem of partial multi-label learning. In this paper, we propose the Partial Multi-Label Optimal margin Distribution Machine (PML-ODM), which distinguishes the noisy labels by explicitly optimizing the distribution of the ranking margin, and exhibits better generalization performance than minimum-margin-based counterparts. In addition, we propose a novel feature prototype representation to further enhance the disambiguation ability, and non-linear kernels can also be applied to promote the generalization performance for linearly inseparable data. Extensive experiments on real-world data sets validate the superiority of our proposed method.
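For orientation, the ranking margin of instance x_i between a relevant label p and an irrelevant label q is \gamma_{i,p,q} = w_p^{\top}\phi(x_i) - w_q^{\top}\phi(x_i), and an optimal-margin-distribution objective optimizes its first two moments rather than only the worst case, roughly

\min_{\{w_k\}}\; \tfrac{1}{2}\sum_k \|w_k\|^2 \;+\; \lambda_1\,\widehat{\operatorname{Var}}(\gamma) \;-\; \lambda_2\,\bar{\gamma}.

This is only a schematic of the ODM idea; the actual PML-ODM formulation additionally has to disambiguate which candidate labels are treated as relevant and includes slack terms not shown here.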


Author(s):  
Jinpeng Chen ◽  
Yu Liu ◽  
Deyi Li

The recommender systems community is paying great attention to diversity as a key quality beyond accuracy in real recommendation scenarios. Multifarious diversity-increasing approaches have been developed in the related literature to enhance recommendation diversity while making personalized recommendations to users. In this work, we present the Gaussian Cloud Recommendation Algorithm (GCRA), a novel method designed to balance accuracy and diversity in personalized top-N recommendation lists in order to capture the user's complete spectrum of tastes. Our proposed algorithm does not require semantic information. Meanwhile, we propose a unified framework that extends traditional CF algorithms with GCRA to improve recommendation performance. Our work builds upon prior research on recommender systems. We show that, though detrimental to average accuracy, our method can capture the user's complete spectrum of interests. Systematic experiments on three real-world data sets have demonstrated the effectiveness of our proposed approach in achieving both accuracy and diversity.
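The abstract does not describe GCRA's internals; its name points to the Gaussian (normal) cloud model, whose standard forward generator is sketched below purely as background. The function name and parameter values are illustrative, not taken from the paper.

```python
import numpy as np

def forward_normal_cloud(Ex, En, He, n_drops=1000, seed=0):
    """Standard forward normal cloud generator: Ex is the expectation,
    En the entropy (spread), He the hyper-entropy (spread of the spread).
    Returns cloud drops and their certainty degrees."""
    rng = np.random.default_rng(seed)
    En_i = rng.normal(En, He, n_drops)                          # per-drop entropy
    drops = rng.normal(Ex, np.abs(En_i))                        # cloud drops
    mu = np.exp(-(drops - Ex) ** 2 / (2 * En_i ** 2 + 1e-12))   # certainty degrees
    return drops, mu

drops, certainty = forward_normal_cloud(Ex=3.5, En=0.8, He=0.1)
```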


Author(s):  
Xuan Wu ◽  
Qing-Guo Chen ◽  
Yao Hu ◽  
Dengbao Wang ◽  
Xiaodong Chang ◽  
...  

Multi-view multi-label learning serves as an important framework for learning from objects with diverse representations and rich semantics. Existing multi-view multi-label learning techniques focus on exploiting a shared subspace for fusing multi-view representations, while helpful view-specific information for discriminative modeling is usually ignored. In this paper, a novel multi-view multi-label learning approach named SIMM is proposed which leverages shared subspace exploitation and view-specific information extraction. For shared subspace exploitation, SIMM jointly minimizes a confusion adversarial loss and a multi-label loss to utilize shared information from all views. For view-specific information extraction, SIMM enforces an orthogonal constraint w.r.t. the shared subspace to utilize view-specific discriminative information. Extensive experiments on real-world data sets clearly show the favorable performance of SIMM against other state-of-the-art multi-view multi-label learning approaches.
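One common way to realize such an orthogonal constraint is a soft penalty on the cross-correlation between the shared and the view-specific embeddings; the sketch below shows this generic idea in PyTorch, with all names and the squared Frobenius norm being assumptions rather than SIMM's exact loss.

```python
import torch

def orthogonality_penalty(shared, specific):
    """Soft orthogonality between the shared-subspace embedding and a
    view-specific embedding (both of shape [batch, dim]): the squared
    Frobenius norm of their cross-correlation matrix."""
    return torch.norm(shared.t() @ specific, p='fro') ** 2

# would be added to the multi-label loss and the confusion adversarial loss
shared = torch.randn(32, 64, requires_grad=True)
specific = torch.randn(32, 64, requires_grad=True)
loss_orth = orthogonality_penalty(shared, specific)
```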


2019
Vol 13 (1)
pp. 120-126
Author(s):  
K. Bhavanishankar ◽  
M. V. Sudhamani

Objective: Lung cancer is proving to be one of the deadliest diseases haunting mankind in recent years. Timely detection of lung nodules would surely enhance the survival rate. This paper focuses on the classification of candidate lung nodules into nodules/non-nodules in a CT scan of the patient. A deep learning approach, the autoencoder, is used for the classification. Investigation/Methodology: Candidate lung nodule patches obtained as the result of lung segmentation serve as input to the autoencoder model. The ground truth data from the LIDC repository is prepared and submitted to the autoencoder training module. After a series of experiments, a 4-stacked autoencoder was chosen. The model is trained on over 600 LIDC cases and the trained module is tested on the remaining data sets. Results: The results of the classification are evaluated with respect to performance measures such as sensitivity, specificity, and accuracy. The results obtained are also compared with other related works, and the proposed approach was found to be better by 6.2% with respect to accuracy. Conclusion: In this paper, a deep learning approach, the autoencoder, has been used for the classification of candidate lung nodules into nodules/non-nodules. The performance of the proposed approach was evaluated with respect to sensitivity, specificity, and accuracy, and the obtained values are 82.6%, 91.3%, and 87.0%, respectively. This result is then compared with existing related works, and an improvement of 6.2% with respect to accuracy has been observed.
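A hedged sketch of what a 4-stacked autoencoder with a classification head can look like in PyTorch; the patch size, layer widths, sigmoid head, and absence of layer-wise pre-training are all assumptions, since the abstract does not give the exact architecture.

```python
import torch
import torch.nn as nn

class StackedAutoencoderClassifier(nn.Module):
    """Illustrative 4-stacked autoencoder used as a nodule/non-nodule
    classifier: four encoder stages, a mirrored decoder for
    reconstruction, and a sigmoid classification head."""
    def __init__(self, in_dim=32 * 32, hidden=(512, 256, 128, 64)):
        super().__init__()
        dims = (in_dim,) + hidden
        # four encoder stages, one per stacked autoencoder
        self.encoder = nn.Sequential(*[
            layer
            for i in range(4)
            for layer in (nn.Linear(dims[i], dims[i + 1]), nn.ReLU())
        ])
        # mirrored decoder used for unsupervised reconstruction
        self.decoder = nn.Sequential(*[
            layer
            for i in range(4, 0, -1)
            for layer in (nn.Linear(dims[i], dims[i - 1]), nn.ReLU())
        ])
        self.classifier = nn.Sequential(nn.Linear(hidden[-1], 1), nn.Sigmoid())

    def forward(self, x):
        z = self.encoder(x)
        return self.classifier(z), self.decoder(z)

model = StackedAutoencoderClassifier()
prob, recon = model(torch.rand(8, 32 * 32))  # 8 candidate nodule patches
```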


2019
Vol 38 (12)
pp. 934-942
Author(s):  
Xing Zhao ◽  
Ping Lu ◽  
Yanyan Zhang ◽  
Jianxiong Chen ◽  
Xiaoyang Li

Noise attenuation for ordinary images using machine learning technology has achieved great success in the computer vision field. However, directly applying these models to seismic data would not be effective, since the evaluation criteria of the geophysical domain require a high-quality visualized image and the ability to preserve the original seismic signals within the contaminated wavelets. This paper introduces an approach equipped with a specially designed deep learning model that can effectively attenuate swell noise with different intensities and characteristics from shot gathers, with a relatively simple workflow applicable to marine seismic data sets. The proposed deep learning model brings three significant benefits. First, our deep learning model does not require a pure swell-noise model; instead, a contaminated swell-noise model derived from field data sets (which may contain other noise or primary signals) can be used for training. Second, inspired by conventional algorithms for coherent noise attenuation, our neural network model is designed to learn and detect the swell noise rather than to infer the attenuated seismic data. Third, several comparisons (signal-to-noise ratio, mean squared error, and intensity of residual swell noise) indicate that the deep learning approach is capable of removing swell noise without harming the primary signals. The proposed deep learning-based approach can be considered an alternative that combines and takes advantage of both conventional and data-driven methods to better serve swell-noise attenuation. The comparable results also indicate that the deep learning method has strong potential to solve other coherent noise-attenuation tasks for seismic data.
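Because the network targets the noise itself, the attenuation step reduces to subtracting the predicted swell noise from the input gather. A minimal sketch follows; the stand-in network and tensor shapes are illustrative assumptions, not the paper's architecture.

```python
import torch

def attenuate_swell_noise(model, shot_gather):
    """Apply a noise-learning network: the model predicts the swell
    noise present in the gather, and the attenuated result is the
    input minus that prediction."""
    with torch.no_grad():
        predicted_noise = model(shot_gather)
    return shot_gather - predicted_noise

# stand-in image-to-image network and gather, for illustration only
model = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1)
gather = torch.randn(1, 1, 512, 256)   # (batch, channel, time samples, traces)
clean = attenuate_swell_noise(model, gather)
```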


2021
Vol 22 (1)
Author(s):  
Audrey Hulot ◽  
Denis Laloë ◽  
Florence Jaffrézic

Abstract. Background: Integrating data from different sources is a recurring question in computational biology. Much effort has been devoted to the integration of data sets of the same type, typically multiple numerical data tables. However, data types are generally heterogeneous: it is commonplace to gather data in the form of trees, networks or factorial maps, as these representations all have an appealing visual interpretation that helps to study grouping patterns and interactions between entities. The question we aim to answer in this paper is that of the integration of such representations. Results: To this end, we provide a simple procedure to compare data of various types, in particular trees or networks, that relies essentially on two steps: the first step projects the representations into a common coordinate system; the second step then uses a multi-table integration approach to compare the projected data. We rely on efficient and well-known methodologies for each step: the projection step is achieved by retrieving a distance matrix for each representation form and then applying multidimensional scaling to provide a new set of coordinates from all the pairwise distances. The integration step is then achieved by applying a multiple factor analysis to the multiple tables of the new coordinates. This procedure provides tools to integrate and compare data available, for instance, as tree or network structures. Our approach is complementary to kernel methods, traditionally used to answer the same question. Conclusion: Our approach is evaluated on simulations and used to analyze two real-world data sets: first, we compare several clusterings for different cell types obtained from a transcriptomics single-cell data set in mouse embryos; second, we use our procedure to aggregate a multi-table data set from the TCGA breast cancer database, in order to compare several protein networks inferred for different breast cancer subtypes.
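A minimal sketch of this two-step procedure in Python, assuming every representation has already been converted to a pairwise-distance matrix over the same ordered set of entities; the multiple factor analysis step is approximated here by its usual recipe (per-table scaling by the first singular value, then a global PCA), and function and variable names are illustrative.

```python
import numpy as np
from sklearn.manifold import MDS
from sklearn.decomposition import PCA

def integrate_representations(distance_matrices, n_coords=2):
    """Step 1: project each representation (tree, network, ...) into a
    common coordinate system via metric MDS on its distance matrix.
    Step 2: integrate the resulting tables with an MFA-like analysis."""
    tables = []
    for D in distance_matrices:
        coords = MDS(n_components=n_coords,
                     dissimilarity="precomputed",
                     random_state=0).fit_transform(D)
        # MFA-style weighting: normalise each table by its largest singular value
        coords = coords / np.linalg.svd(coords, compute_uv=False)[0]
        tables.append(coords)
    # global analysis of the concatenated, weighted tables
    return PCA(n_components=n_coords).fit_transform(np.hstack(tables))
```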


2007
Vol 19 (12)
pp. 3369-3391
Author(s):  
Kiruthika Ramanathan ◽  
Sheng-Uei Guan

This letter proposes to use multiorder neurons for clustering irregularly shaped data arrangements. Multiorder neurons are an evolutionary extension of the use of higher-order neurons in clustering. Higher-order neurons parametrically model complex neuron shapes by replacing the classic synaptic weight with higher-order tensors. The multiorder neuron goes one step further and eliminates two problems associated with higher-order neurons. First, it uses evolutionary algorithms to select the best neuron order for a given problem. Second, it obtains more information about the underlying data distribution by identifying the correct order for a given cluster of patterns. Empirically, we observed that when the correlation of the clusters found with ground truth information is used to measure clustering accuracy, the proposed evolutionary multiorder neuron method outperforms other related clustering methods. The simulation results on the Iris, Wine, and Glass data sets show significant improvement when compared to the results obtained using self-organizing maps and higher-order neurons. The letter also proposes an intuitive model by which multiorder neurons can be grown, thereby determining the number of clusters in the data.
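As an illustration of the underlying idea (not the letter's exact formulation): a second-order neuron replaces the weight vector with a matrix, so its receptive field becomes an ellipsoid and a pattern is assigned to the neuron with the smallest quadratic-form distance; higher orders generalize the matrix to higher-order tensors.

```python
import numpy as np

def second_order_distance(x, center, M):
    """Quadratic-form distance of pattern x to a second-order neuron
    parameterized by a center and a positive semi-definite matrix M
    (an ellipsoidal receptive field). Illustrative only."""
    d = x - center
    return d @ M @ d

# assign a pattern to the closer of two hypothetical second-order neurons
x = np.array([1.0, 2.0])
neurons = [(np.zeros(2), np.eye(2)), (np.array([1.0, 1.5]), np.diag([2.0, 0.5]))]
label = int(np.argmin([second_order_distance(x, c, M) for c, M in neurons]))
```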


Author(s):  
Abramo Agosti ◽  
Enea Shaqiri ◽  
Matteo Paoletti ◽  
Francesca Solazzo ◽  
Niels Bergsland ◽  
...  

Abstract. Objective: In this study we address the automatic segmentation of selected muscles of the thigh and leg through a supervised deep learning approach. Material and methods: The application of quantitative imaging in neuromuscular diseases requires the availability of regions of interest (ROI) drawn on muscles to extract quantitative parameters. Up to now, manual drawing of ROIs has been considered the gold standard in clinical studies, with no clear and universally accepted standardized procedure for segmentation. Several automatic methods, based mainly on machine learning and deep learning algorithms, have recently been proposed to discriminate between skeletal muscle, bone, subcutaneous and intermuscular adipose tissue. We develop a supervised deep learning approach based on a unified framework for ROI segmentation. Results: The proposed network generates segmentation maps with high accuracy, with Dice scores ranging from 0.89 to 0.95 with respect to “ground truth” manually segmented labelled images, and also shows high average performance in both mild and severe cases of disease involvement (i.e. extent of fatty replacement). Discussion: The presented results are promising and potentially translatable to different skeletal muscle groups and other MRI sequences with different contrast and resolution.
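For reference, the reported accuracy metric is the Dice similarity coefficient between the predicted mask and the manually segmented mask; a minimal implementation (array names are illustrative):

```python
import numpy as np

def dice_score(pred_mask, true_mask, eps=1e-7):
    """Dice similarity coefficient between a predicted and a manually
    segmented ("ground truth") binary muscle mask; the reported 0.89-0.95
    scores are per-muscle values of this quantity."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)
```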


Author(s):  
K Sobha Rani

Collaborative filtering suffers from the problems of data sparsity and cold start, which dramatically degrade recommendation performance. To help resolve these issues, we propose TrustSVD, a trust-based matrix factorization technique. By analyzing social trust data from four real-world data sets, we conclude that not only the explicit but also the implicit influence of both ratings and trust should be taken into consideration in a recommendation model. Hence, we build on top of a state-of-the-art recommendation algorithm, SVD++, which inherently involves the explicit and implicit influence of rated items, by further incorporating both the explicit and implicit influence of trusted users on the prediction of items for an active user. To our knowledge, the work reported is the first to extend SVD++ with social trust information. Experimental results on the four data sets demonstrate that our approach, TrustSVD, achieves better accuracy than ten other counterparts and can better handle the concerned issues.
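If the underlying model follows the original TrustSVD formulation (Guo et al.), the predicted rating of item j for an active user u takes roughly the form

\hat{r}_{uj} = \mu + b_u + b_j + q_j^{\top}\Big( p_u + |I_u|^{-1/2} \sum_{i \in I_u} y_i + |T_u|^{-1/2} \sum_{v \in T_u} w_v \Big),

where I_u is the set of items rated by u (implicit influence of ratings) and T_u the set of users trusted by u (implicit influence of trust); during training, the trust relations themselves are also factorized using the same user vectors w_v. This is a schematic of the published TrustSVD model rather than a statement of this particular paper's implementation.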


2021
Vol 0 (0)
Author(s):  
Nick Le Large ◽  
Frank Bieder ◽  
Martin Lauer

Abstract. For the application of an automated, driverless race car, we aim to ensure high map and localization quality for successful driving on previously unknown, narrow race tracks. To achieve this goal, it is essential to choose an algorithm that fulfills the requirements in terms of accuracy, computational resources and run time. We propose both a filter-based and a smoothing-based Simultaneous Localization and Mapping (SLAM) algorithm and evaluate them using real-world data collected by a Formula Student Driverless race car. The accuracy is measured by comparing the SLAM-generated map to a ground truth map which was acquired using high-precision Differential GPS (DGPS) measurements. The results of the evaluation show that both algorithms meet the required time constraints thanks to a parallelized architecture, with GraphSLAM draining the computational resources much faster than Extended Kalman Filter (EKF) SLAM. However, the analysis of the maps generated by the algorithms shows that GraphSLAM outperforms EKF SLAM in terms of accuracy.
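One straightforward way to turn the DGPS ground-truth comparison into a single number is nearest-neighbour landmark matching; the sketch below is an assumption about the metric, not the paper's exact evaluation procedure (which may also include a map-alignment step).

```python
import numpy as np

def map_accuracy(slam_landmarks, dgps_landmarks):
    """Match every SLAM-estimated landmark (e.g. track cone) to its
    nearest DGPS-surveyed ground-truth landmark and report the RMSE of
    those distances. Both inputs are (N, 2) positions in a common frame."""
    diffs = slam_landmarks[:, None, :] - dgps_landmarks[None, :, :]
    nearest = np.min(np.linalg.norm(diffs, axis=2), axis=1)
    return np.sqrt(np.mean(nearest ** 2))
```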

