Graph Self-Supervised Learning: the BT, the HSIC, and the VICReg

2021
Author(s):  
Sayan Nag

Self-supervised learning and pre-training strategies have developed over the last few years, especially for Convolutional Neural Networks (CNNs). Recently, such methods have also been applied to Graph Neural Networks (GNNs). In this paper, we use a graph-based self-supervised learning strategy with different loss functions (Barlow Twins, HSIC, VICReg) that have previously shown promising results when applied with CNNs. We also propose a hybrid loss function combining the advantages of VICReg and HSIC, which we call VICRegHSIC. The performance of these methods is compared on two datasets, namely MUTAG and PROTEINS. Moreover, the impact of different batch sizes, projector dimensions and data augmentation strategies is also explored. The results are preliminary, and we will continue to explore other datasets.
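To make the loss family concrete, the following is a minimal NumPy sketch of a VICReg-style objective with its invariance, variance and covariance terms. The weighting coefficients below are illustrative assumptions, and the hybrid VICRegHSIC combination proposed in the paper is not reproduced here.

```python
import numpy as np

def vicreg_loss(za, zb, inv_w=25.0, var_w=25.0, cov_w=1.0, eps=1e-4):
    """VICReg-style loss on two batches of embeddings za, zb of shape (N, D):
    invariance (MSE between views), variance (hinge on per-dimension std),
    and a covariance penalty that decorrelates embedding dimensions."""
    n, d = za.shape
    # Invariance: mean squared error between the two augmented views.
    inv = np.mean((za - zb) ** 2)
    # Variance: encourage each dimension's std to stay above 1 (avoids collapse).
    std_a = np.sqrt(za.var(axis=0) + eps)
    std_b = np.sqrt(zb.var(axis=0) + eps)
    var = np.mean(np.maximum(0.0, 1.0 - std_a)) + np.mean(np.maximum(0.0, 1.0 - std_b))
    # Covariance: penalise off-diagonal entries of each view's covariance matrix.
    def off_diag_sq(z):
        zc = z - z.mean(axis=0)
        cov = zc.T @ zc / (n - 1)
        return (cov ** 2).sum() - (np.diag(cov) ** 2).sum()
    cov_pen = (off_diag_sq(za) + off_diag_sq(zb)) / d
    return inv_w * inv + var_w * var + cov_w * cov_pen
```

Identical views incur no invariance cost, so the loss is driven only by the collapse-prevention terms; two unrelated batches pay the full invariance penalty.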

2019
Vol 128 (8-9)
pp. 2126-2145
Author(s):  
Zhen-Hua Feng ◽  
Josef Kittler ◽  
Muhammad Awais ◽  
Xiao-Jun Wu

Efficient and robust facial landmark localisation is crucial for the deployment of real-time face analysis systems. This paper presents a new loss function, namely the Rectified Wing (RWing) loss, for regression-based facial landmark localisation with Convolutional Neural Networks (CNNs). We first systematically analyse different loss functions, including L2, L1 and smooth L1. The analysis suggests that the training of a network should pay more attention to samples with small-medium errors. Motivated by this finding, we design a piecewise loss that amplifies the impact of samples with small-medium errors. In addition, we rectify the loss function for very small errors to mitigate the impact of inaccurate manual annotation. The use of our RWing loss boosts performance significantly for regression-based CNNs in facial landmarking, especially for lightweight network architectures. To address the under-representation of samples with large pose variations, we propose a simple but effective boosting strategy, referred to as pose-based data balancing. In particular, we deal with the data imbalance problem by duplicating the minority training samples and perturbing them with random image rotation, bounding-box translation and other data augmentation strategies. Finally, the proposed approach is extended to a coarse-to-fine framework for robust and efficient landmark localisation, which also deals effectively with the small-sample-size problem. Experimental results on several well-known benchmark datasets demonstrate the merits of our RWing loss and the superiority of the proposed method over state-of-the-art approaches.
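The piecewise design described above (zero response for very small errors, logarithmic amplification for small-medium errors, linear growth for large ones) can be sketched as follows. The threshold constants `r`, `w` and `eps` are illustrative assumptions, not the paper's values.

```python
import numpy as np

def rwing_loss(err, r=0.5, w=10.0, eps=2.0):
    """Piecewise loss in the spirit of Rectified Wing (RWing):
    zero for |err| < r (rectified region, absorbs annotation noise),
    logarithmic for r <= |err| < w (amplifies small-medium errors),
    linear beyond w (stays robust to outliers)."""
    x = np.abs(err)
    # Continuity constant so the log and linear pieces join at |err| = w.
    c = w - w * np.log(1.0 + (w - r) / eps)
    return np.where(
        x < r, 0.0,
        np.where(x < w, w * np.log(1.0 + (x - r) / eps), x - c),
    )
```

The constant `c` is chosen so the two outer pieces meet continuously at `|err| = w`, mirroring how the original Wing loss joins its log and linear regions.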


Author(s):  
Pengyong Li ◽  
Jun Wang ◽  
Ziliang Li ◽  
Yixuan Qiao ◽  
Xianggen Liu ◽  
...  

Self-supervised learning has gradually emerged as a powerful technique for graph representation learning. However, transferable, generalizable, and robust representation learning on graph data remains a challenge for pre-training graph neural networks. In this paper, we propose a simple and effective self-supervised pre-training strategy, named Pairwise Half-graph Discrimination (PHD), that explicitly pre-trains a graph neural network at the graph level. PHD is designed as a simple binary classification task: discriminate whether two half-graphs come from the same source. Experiments demonstrate that PHD is an effective pre-training strategy that offers comparable or superior performance on 13 graph classification tasks compared with state-of-the-art strategies, and achieves notable improvements when combined with node-level strategies. Moreover, visualization of the learned representations reveals that the PHD strategy indeed empowers the model to learn graph-level knowledge such as molecular scaffolds. These results establish PHD as a powerful and effective self-supervised learning strategy for graph-level representation learning.
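The pair-construction step behind such a binary task can be sketched as a toy routine: split each graph's node set in half, then label half-graph pairs by whether they come from the same source graph. The splitting strategy below (a random node partition keeping intra-half edges) is an assumption for illustration, not the paper's exact procedure.

```python
import random

def split_half(nodes, edges):
    """Randomly partition the node set in half; keep only intra-half edges."""
    nodes = list(nodes)
    random.shuffle(nodes)
    half = set(nodes[: len(nodes) // 2])
    return half, [(u, v) for u, v in edges if u in half and v in half]

def make_phd_pairs(graphs):
    """Build binary-labelled pairs of half-graphs: label 1 if both halves
    come from the same source graph, 0 if they come from different graphs.
    Each graph is a (nodes, edges) tuple."""
    pairs = []
    for i, (nodes, edges) in enumerate(graphs):
        h1 = split_half(nodes, edges)
        h2 = split_half(nodes, edges)
        pairs.append((h1, h2, 1))  # positive: same source graph
        j = random.randrange(len(graphs))
        if j != i:
            hj = split_half(graphs[j][0], graphs[j][1])
            pairs.append((h1, hj, 0))  # negative: different source graphs
    return pairs
```

A GNN encoder plus a binary classifier head would then be trained on these pairs; that model is omitted here.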


Sensors
2021
Vol 21 (18)
pp. 6109
Author(s):  
Nkosikhona Dlamini ◽  
Terence L. van Zyl

Similarity learning using deep convolutional neural networks has been applied extensively to computer vision problems, an attraction supported by its success in one-shot and zero-shot classification. Advances in similarity learning are essential for smaller datasets, or for datasets in which few labelled samples exist per class, such as wildlife re-identification. Improving the performance of similarity learning models involves developing new sampling techniques and designing loss functions better suited to training similarity in neural networks. However, the impact of these advances is usually tested on larger datasets, with limited attention given to smaller, imbalanced datasets such as those found in wildlife re-identification. To this end, we test the advances in loss functions for similarity learning on several animal re-identification tasks. We add two new public datasets, Nyala and Lions, to the challenge of animal re-identification. Our results are state of the art on all public datasets tested except Pandas. The achieved Top-1 Recall is 94.8% on the Zebra dataset, 72.3% on the Nyala dataset, 79.7% on the Chimps dataset and 88.9% on the Tiger dataset. For the Lion dataset, we set a new benchmark at 94.8%. We find that the best-performing loss function across all datasets is generally the triplet loss; however, the improvement over Proxy-NCA models is only marginal. We demonstrate that no single combination of neural network architecture and loss function is best suited for all datasets, although VGG-11 may be the most robust first choice. Our results highlight the need for broader experimentation with loss functions and neural network architectures on the more challenging task, beyond classical benchmarks, of wildlife re-identification.
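For reference, the triplet loss discussed above has a standard form: pull the anchor embedding towards a positive example of the same identity and push it at least a margin away from a negative example.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss with Euclidean distances: max(0, d(a,p) - d(a,n) + margin).
    Zero when the negative is already farther than the positive by the margin."""
    d_ap = np.linalg.norm(anchor - positive, axis=-1)
    d_an = np.linalg.norm(anchor - negative, axis=-1)
    return np.maximum(0.0, d_ap - d_an + margin)
```

In similarity learning this is applied to embeddings produced by the network, and mining strategies choose which triplets to form; the margin value here is a common default, not one taken from the study.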


2021
Author(s):  
Gabriel Jonas Duarte ◽  
Tamara Arruda Pereira ◽  
Erik Jhones Nascimento ◽  
Diego Mesquita ◽  
Amauri Holanda Souza Junior

Graph neural networks (GNNs) have become the de facto approach for supervised learning on graph data. To train these networks, most practitioners employ the categorical cross-entropy (CE) loss. We can attribute this largely to the probabilistic interpretation of CE, since it corresponds to the negative log of the categorical/softmax likelihood. Nonetheless, loss functions are a modeling choice, and other training criteria can be employed, e.g., hinge loss and mean absolute error (MAE). Indeed, recent works have shown that deep learning models can benefit from adopting other loss functions; for instance, neural networks trained with symmetric losses (e.g., MAE) are robust to label noise. Perhaps surprisingly, the effect of using different losses on GNNs has not been explored. In this preliminary work, we gauge the impact of different loss functions on the performance of GNNs for node classification under i) noisy labels and ii) different sample sizes. In contrast to findings on Euclidean domains, our results for GNNs show no significant difference between models trained with CE and other classical loss functions in either scenario.
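The contrast between CE and a symmetric loss such as MAE can be made concrete: CE is unbounded and dominated by confidently mislabeled samples, while MAE on the softmax output is bounded by 2, which underlies its robustness to label noise. A minimal NumPy sketch:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def ce_loss(logits, y):
    """Categorical cross-entropy: negative log of the true-class probability.
    Unbounded; a confident wrong prediction yields an arbitrarily large loss."""
    p = softmax(logits)
    return -np.log(p[np.arange(len(y)), y])

def mae_loss(logits, y):
    """Mean absolute error between the softmax output and the one-hot label.
    Bounded by 2, so no single noisy label can dominate the gradient."""
    p = softmax(logits)
    onehot = np.eye(logits.shape[-1])[y]
    return np.abs(p - onehot).sum(axis=-1)
```

For a sample the model classifies confidently but whose label is wrong, CE grows with the logit gap while MAE saturates near 2, illustrating the noise-robustness argument above.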


2020
Vol 6 (1)
Author(s):  
Malte Seemann ◽  
Lennart Bargsten ◽  
Alexander Schlaefer

Deep learning methods produce promising results when applied to a wide range of medical imaging tasks, including segmentation of the artery lumen in computed tomography angiography (CTA) data. However, to perform well, neural networks have to be trained on large amounts of high-quality annotated data. In the realm of medical imaging, annotations are not only scarce but also often not entirely reliable. To tackle both challenges, we developed a two-step approach for generating realistic synthetic CTA data for the purpose of data augmentation. In the first step, moderately realistic images are generated in a purely numerical fashion. In the second step, these images are improved by applying neural domain adaptation. We evaluated the impact of the synthetic data on lumen segmentation via convolutional neural networks (CNNs) by comparing the resulting performances. Improvements of up to 5% in terms of Dice coefficient and 20% for Hausdorff distance represent a proof of concept that the proposed augmentation procedure can enhance deep learning-based segmentation of the artery lumen in CTA images.
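For reference, the Dice coefficient reported above measures the overlap between a predicted and a ground-truth segmentation mask; a minimal sketch for binary masks:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|).
    Ranges from 0 (no overlap) to 1 (identical masks); eps avoids 0/0."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
```

The Hausdorff distance, the other metric cited, instead measures the largest boundary deviation between the two masks and is typically computed with a spatial-distance library rather than by hand.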


2021
pp. 1-29
Author(s):  
Yanhong Chen

In this paper, we study the optimal reinsurance contracts that minimize the convex combination of the Conditional Value-at-Risk (CVaR) of the insurer’s loss and the reinsurer’s loss over the class of ceded loss functions such that the retained loss function is increasing and the ceded loss function satisfies the Vajda condition. Among a general class of reinsurance premium principles that satisfy the properties of risk loading and convex-order preservation, the optimal solutions are obtained. Our results show that the optimal ceded loss functions take the form of five interconnected segments for general reinsurance premium principles, and they can be further simplified to four interconnected segments if more properties are imposed on the reinsurance premium principles. Finally, we derive optimal parameters for the expected value premium principle and give a numerical study to analyze the impact of the weighting factor on the optimal reinsurance.
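In standard notation, the CVaR risk measure and the weighted objective described above can be written as follows; here $X$ is the insurer's underlying loss, $f$ the ceded loss function, $\pi(f)$ the reinsurance premium, $\lambda \in [0,1]$ the weighting factor, and $\mathcal{C}$ the constraint set (increasing retained loss, Vajda condition). The confidence levels $\alpha, \beta$ and the exact loss decomposition are a standard formulation assumed for illustration, not taken verbatim from the paper.

```latex
\mathrm{CVaR}_{\alpha}(X) \;=\; \inf_{t \in \mathbb{R}}
  \Big\{\, t + \tfrac{1}{1-\alpha}\,\mathbb{E}\big[(X - t)_{+}\big] \,\Big\},
\qquad
\min_{f \in \mathcal{C}} \;\;
  \lambda\,\mathrm{CVaR}_{\alpha}\big(X - f(X) + \pi(f)\big)
  \;+\; (1-\lambda)\,\mathrm{CVaR}_{\beta}\big(f(X) - \pi(f)\big).
```

The insurer retains $X - f(X)$ and pays the premium $\pi(f)$, while the reinsurer bears $f(X)$ net of the premium received.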


2021
Vol 11 (15)
pp. 7046
Author(s):  
Jorge Francisco Ciprián-Sánchez ◽  
Gilberto Ochoa-Ruiz ◽  
Lucile Rossi ◽  
Frédéric Morandini

Wildfires are among the most consequential natural disasters worldwide, all the more so due to climate change and its societal and environmental impact. A significant amount of research has addressed this issue, deploying a wide variety of technologies and following a multi-disciplinary approach. Notably, computer vision has played a fundamental role: it can be used to extract and combine information from several imaging modalities for fire detection, characterization and wildfire spread forecasting. In recent years, work on Deep Learning (DL)-based fire segmentation has shown very promising results. However, it is currently unclear whether the architecture of a model, its loss function, or the image type employed (visible, infrared, or fused) has the greatest impact on fire segmentation results. In the present work, we evaluate different combinations of state-of-the-art (SOTA) DL architectures, loss functions, and image types to identify the parameters most relevant to improving the segmentation results. We benchmark them to identify the top-performing combinations and compare them to traditional fire segmentation techniques. Finally, we evaluate whether adding attention modules to the best-performing architecture can further improve the segmentation results. To the best of our knowledge, this is the first work that evaluates the impact of the architecture, loss function, and image type on the performance of DL-based wildfire segmentation models.


Data
2022
Vol 7 (1)
pp. 10
Author(s):  
Davide Buffelli ◽  
Fabio Vandin

Graph Neural Networks (GNNs) rely on the graph structure to define an aggregation strategy where each node updates its representation by combining information from its neighbours. A known limitation of GNNs is that, as the number of layers increases, information gets smoothed and squashed and node embeddings become indistinguishable, negatively affecting performance. Therefore, practical GNN models employ few layers and only leverage the graph structure in terms of limited, small neighbourhoods around each node. Inevitably, practical GNNs do not capture information depending on the global structure of the graph. While there have been several works studying the limitations and expressivity of GNNs, the question of whether practical applications on graph structured data require global structural knowledge or not remains unanswered. In this work, we empirically address this question by giving access to global information to several GNN models, and observing the impact it has on downstream performance. Our results show that global information can in fact provide significant benefits for common graph-related tasks. We further identify a novel regularization strategy that leads to an average accuracy improvement of more than 5% on all considered tasks.

