Large-Context Conversational Representation Learning: Self-Supervised Learning For Conversational Documents

Author(s):  
Ryo Masumura ◽  
Naoki Makishima ◽  
Mana Ihori ◽  
Akihiko Takashima ◽  
Tomohiro Tanaka ◽  
...  
2019 ◽  
Vol 86 ◽  
pp. 112-133 ◽  
Author(s):  
Mounîm A. El-Yacoubi ◽  
Sonia Garcia-Salicetti ◽  
Christian Kahindo ◽  
Anne-Sophie Rigaud ◽  
Victoria Cristancho-Lacroix

Author(s):  
Chenrui Zhang ◽  
Yuxin Peng

Video representation learning is a vital problem for classification tasks. Recently, a promising unsupervised paradigm termed self-supervised learning has emerged, which exploits inherent supervisory signals in massive data for feature learning by solving auxiliary tasks. However, existing methods in this regard suffer from two limitations when extended to video classification. First, they focus on only a single task, ignoring the complementarity among different task-specific features and thus yielding suboptimal video representations. Second, their high computational and memory cost hinders application in real-world scenarios. In this paper, we propose a graph-based distillation framework to address these problems: (1) we propose a logits graph and a representation graph to transfer knowledge from multiple self-supervised tasks, where the former distills classifier-level knowledge by solving a multi-distribution joint matching problem, and the latter distills internal feature knowledge from pairwise ensembled representations while tackling the challenge of heterogeneity among different features; (2) the proposal adopts a teacher-student framework that dramatically reduces the redundancy of knowledge learned from the teachers, leading to a lighter student model that solves the classification task more efficiently. Experimental results on three video datasets validate that our proposal not only helps learn better video representations but also compresses the model for faster inference.
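A minimal sketch of the classifier-level (logits) distillation idea described above: the student's soft predictions are matched against those of several self-supervised teacher networks via an averaged KL divergence. The tiny tensors, the temperature value, and the averaging scheme are illustrative assumptions, not the authors' multi-distribution joint matching formulation.

```python
# Hedged sketch: distill soft logits from several self-supervised teachers
# into one lightweight student (not the paper's exact objective).
import torch
import torch.nn.functional as F

def multi_teacher_distillation_loss(student_logits, teacher_logits_list, temperature=4.0):
    """Average the KL divergence between the student and each teacher."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    loss = 0.0
    for teacher_logits in teacher_logits_list:
        p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
        loss = loss + F.kl_div(log_p_student, p_teacher, reduction="batchmean")
    # Scale by T^2, as is conventional in distillation, and average over teachers.
    return loss / len(teacher_logits_list) * temperature ** 2

# Toy usage: a batch of 8 clips, 10 classes, 3 self-supervised teachers.
student = torch.randn(8, 10, requires_grad=True)
teachers = [torch.randn(8, 10) for _ in range(3)]
print(multi_teacher_distillation_loss(student, teachers))
```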


Author(s):  
Pengyong Li ◽  
Jun Wang ◽  
Ziliang Li ◽  
Yixuan Qiao ◽  
Xianggen Liu ◽  
...  

Self-supervised learning has gradually emerged as a powerful technique for graph representation learning. However, transferable, generalizable, and robust representation learning on graph data remains a challenge for pre-training graph neural networks. In this paper, we propose a simple and effective self-supervised pre-training strategy, named Pairwise Half-graph Discrimination (PHD), that explicitly pre-trains a graph neural network at the graph level. PHD is designed as a simple binary classification task that discriminates whether two half-graphs come from the same source. Experiments demonstrate that PHD is an effective pre-training strategy that offers comparable or superior performance on 13 graph classification tasks compared with state-of-the-art strategies, and achieves notable improvements when combined with node-level strategies. Moreover, visualization of the learned representations reveals that the PHD strategy indeed empowers the model to learn graph-level knowledge such as molecular scaffolds. These results establish PHD as a powerful and effective self-supervised learning strategy for graph-level representation learning.
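A hedged sketch of the PHD pretext task as described above: a graph's nodes are split into two halves, and a binary classifier decides whether a given pair of half-graphs comes from the same source graph. The mean-pooling "encoder", the random node split, and all dimensions are placeholder assumptions; the paper uses a real graph neural network.

```python
# Illustrative sketch of pairwise half-graph discrimination (not the paper's code).
import torch
import torch.nn as nn

class HalfGraphDiscriminator(nn.Module):
    def __init__(self, node_dim, hidden_dim=64):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(node_dim, hidden_dim), nn.ReLU())
        self.classify = nn.Linear(2 * hidden_dim, 1)  # same source vs. different

    def forward(self, half_a, half_b):
        # half_a, half_b: (num_nodes, node_dim) node features of two half-graphs.
        ha = self.encode(half_a).mean(dim=0)
        hb = self.encode(half_b).mean(dim=0)
        return self.classify(torch.cat([ha, hb]))

def split_in_half(node_features):
    """Randomly partition a graph's nodes into two half-graphs."""
    perm = torch.randperm(node_features.size(0))
    mid = node_features.size(0) // 2
    return node_features[perm[:mid]], node_features[perm[mid:]]

# Toy usage: a positive pair from one graph, a negative pair across two graphs.
g1, g2 = torch.randn(12, 16), torch.randn(10, 16)
model = HalfGraphDiscriminator(node_dim=16)
pos_logit = model(*split_in_half(g1))
neg_logit = model(split_in_half(g1)[0], split_in_half(g2)[1])
loss = nn.functional.binary_cross_entropy_with_logits(
    torch.stack([pos_logit, neg_logit]).squeeze(-1), torch.tensor([1.0, 0.0]))
print(loss)
```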


2019 ◽  
Author(s):  
Roshan Rao ◽  
Nicholas Bhattacharya ◽  
Neil Thomas ◽  
Yan Duan ◽  
Xi Chen ◽  
...  

Protein modeling is an increasingly popular area of machine learning research. Semi-supervised learning has emerged as an important paradigm in protein modeling due to the high cost of acquiring supervised protein labels, but the current literature is fragmented when it comes to datasets and standardized evaluation techniques. To facilitate progress in this field, we introduce the Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology. We curate tasks into specific training, validation, and test splits to ensure that each task tests biologically relevant generalization that transfers to real-life scenarios. We benchmark a range of approaches to semi-supervised protein representation learning, which span recent work as well as canonical sequence learning techniques. We find that self-supervised pretraining is helpful for almost all models on all tasks, more than doubling performance in some cases. Despite this increase, in several cases features learned by self-supervised pretraining still lag behind features extracted by state-of-the-art non-neural techniques. This gap in performance suggests a huge opportunity for innovative architecture design and improved modeling paradigms that better capture the signal in biological sequences. TAPE will help the machine learning community focus effort on scientifically relevant problems. Toward this end, all data and code used to run these experiments are available at https://github.com/songlab-cal/tape.
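A hedged sketch of the kind of self-supervised pretraining TAPE evaluates: masked amino-acid prediction over protein sequences. The tiny Transformer, the 20-letter vocabulary, the masking rate, and the example sequence are illustrative assumptions, not TAPE's benchmark models or data splits.

```python
# Illustrative masked amino-acid pretraining sketch (not the TAPE baselines).
import torch
import torch.nn as nn

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
VOCAB = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
MASK_ID = len(VOCAB)  # extra token id used for masking

class TinyProteinLM(nn.Module):
    def __init__(self, d_model=64):
        super().__init__()
        self.embed = nn.Embedding(len(VOCAB) + 1, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, len(VOCAB))

    def forward(self, tokens):
        return self.head(self.encoder(self.embed(tokens)))

def mask_tokens(tokens, rate=0.15):
    """Replace a random subset of positions with the mask id."""
    mask = torch.rand(tokens.shape) < rate
    corrupted = tokens.clone()
    corrupted[mask] = MASK_ID
    return corrupted, mask

# Toy usage on a single placeholder sequence.
seq = torch.tensor([[VOCAB[aa] for aa in "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"]])
corrupted, mask = mask_tokens(seq)
logits = TinyProteinLM()(corrupted)
loss = nn.functional.cross_entropy(logits[mask], seq[mask])
print(loss)
```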


2019 ◽  
Author(s):  
Ruimin Ma ◽  
Zeyu Liu ◽  
Quanwei Zhang ◽  
Zhiyu Liu ◽  
Tengfei Luo

Machine learning techniques are being applied to quantify structure-property relationships for a wide variety of materials, where properly representing the materials plays a key role. Although algorithms for representation learning are extensively studied, their application to domain-specific areas such as polymers is limited, largely due to the lack of benchmark databases. In this work, we investigate different types of polymer representations, including the Morgan Fingerprint (MF), molecular embedding (ME), and molecular graph (MG), based on a benchmark database drawn from a subset of PolyInfo. We evaluate the quality of the different polymer representations by quantifying the relationships between the representations and polymer properties, including density, melting temperature, and glass transition temperature. Different representation learning schemes, such as supervised learning (SL), semi-supervised learning (SSL), and transfer learning (TL), are investigated. We find that ME outperforms the other representations for structure-property relationship quantification in all cases studied, and that MG is much inferior to ME and MF, likely due to the relatively small volume of training data available. For MEs, we find that the similarities of substructure MEs are estimated differently under the different learning schemes (SL, SSL, and TL), leading to different performance in structure-property relation quantification. Several ME mixtures are shown to outperform the single MEs in the corresponding regression tasks, which we attribute to the information gained when mixing different MEs.
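A minimal sketch (an assumed workflow, not the authors' pipeline) of the Morgan Fingerprint (MF) representation mentioned above: featurize monomer SMILES with RDKit and fit a simple regressor on a property such as glass transition temperature. The SMILES strings and property values below are placeholders, not entries from PolyInfo.

```python
# Illustrative MF featurization + property regression sketch.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def morgan_fingerprint(smiles, radius=2, n_bits=2048):
    """Convert a SMILES string into a fixed-length Morgan fingerprint vector."""
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)
    arr = np.zeros((n_bits,), dtype=np.int8)
    DataStructs.ConvertToNumpyArray(fp, arr)
    return arr

smiles_list = ["CC(C)C(=O)OC", "c1ccccc1C=C", "CC=C"]  # placeholder monomers
tg_values = [378.0, 373.0, 260.0]                      # placeholder Tg values in K

X = np.stack([morgan_fingerprint(s) for s in smiles_list])
model = RandomForestRegressor(n_estimators=100).fit(X, tg_values)
print(model.predict(X[:1]))
```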


2020 ◽  
Author(s):  
Xiaoqi Wang ◽  
Yaning Yang ◽  
Xiangke Liao ◽  
Lenli Li ◽  
Fei Li ◽  
...  

Predicting potential links in heterogeneous biomedical networks (HBNs) can greatly benefit various important biomedical problems. However, self-supervised representation learning for link prediction in HBNs has been only lightly explored in previous research. Therefore, this study proposes a two-level self-supervised representation learning method, named selfRL, for link prediction in heterogeneous biomedical networks. A meta-path detection-based self-supervised learning task is proposed to learn representation vectors that capture the global-level structural and semantic features in HBNs. A vertex entity mask-based self-supervised learning mechanism is designed to enhance the local association of vertices. Finally, the representations from the two tasks are concatenated to generate high-quality representation vectors. Link prediction results on six datasets show that selfRL outperforms 25 state-of-the-art methods. In particular, selfRL performs remarkably well, with AUC and AUPR close to 1 on the NeoDTI-net dataset. In addition, PubMed publications demonstrate that nine of the ten drugs screened by selfRL can inhibit the cytokine storm in COVID-19 patients. In summary, selfRL provides a general framework that develops self-supervised learning tasks with unlabeled data to obtain promising representations for improving link prediction.
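A hedged sketch of selfRL's final step as described above: concatenate the representation vectors produced by the two self-supervised tasks (meta-path detection and vertex entity masking) for each vertex and score candidate links. The two encoders are omitted, and the embeddings, dimensions, and MLP scorer below are placeholder assumptions.

```python
# Illustrative link scorer over concatenated two-level representations.
import torch
import torch.nn as nn

class LinkScorer(nn.Module):
    def __init__(self, dim_global, dim_local, hidden=64):
        super().__init__()
        in_dim = 2 * (dim_global + dim_local)  # two endpoints, two views each
        self.mlp = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, g_u, l_u, g_v, l_v):
        # Concatenate global-level and local-level views for each endpoint.
        z_u = torch.cat([g_u, l_u], dim=-1)
        z_v = torch.cat([g_v, l_v], dim=-1)
        return self.mlp(torch.cat([z_u, z_v], dim=-1)).squeeze(-1)

# Toy usage: 4 candidate drug-target pairs, 32-d global and 16-d local views.
g_u, l_u = torch.randn(4, 32), torch.randn(4, 16)
g_v, l_v = torch.randn(4, 32), torch.randn(4, 16)
scores = torch.sigmoid(LinkScorer(32, 16)(g_u, l_u, g_v, l_v))
print(scores)
```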


Information ◽  
2021 ◽  
Vol 12 (8) ◽  
pp. 328
Author(s):  
Ruobin Qi ◽  
Craig Rasband ◽  
Jun Zheng ◽  
Raul Longoria

Smart grids integrate advanced information and communication technologies (ICTs) into traditional power grids for more efficient and resilient power delivery and management, but they also introduce new security vulnerabilities that adversaries can exploit to launch cyber attacks, causing severe consequences such as massive blackouts and infrastructure damage. Existing machine learning-based methods for detecting cyber attacks in smart grids are mostly based on supervised learning, which requires instances of both normal and attack events for training. In addition, supervised learning requires that the training dataset include representative instances of the various types of attack events to train a good model, which is sometimes difficult if not impossible. This paper presents a new method for detecting cyber attacks in smart grids using PMU data, based on semi-supervised anomaly detection and deep representation learning. Semi-supervised anomaly detection employs only instances of normal events to train detection models, making it suitable for finding unknown attack events. A number of popular semi-supervised anomaly detection algorithms were investigated in our study using publicly available power system cyber attack datasets to identify the best-performing ones. The performance comparison with popular supervised algorithms demonstrates that semi-supervised algorithms are more capable of finding attack events than supervised algorithms. Our results also show that the performance of semi-supervised anomaly detection algorithms can be further improved by augmenting them with deep representation learning.
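A minimal sketch (illustrative, not the paper's exact pipeline) of the semi-supervised anomaly detection setting described above: a One-Class SVM is trained only on normal-event feature vectors and then flags attack events as anomalies. The synthetic feature values stand in for PMU-derived features.

```python
# Illustrative semi-supervised anomaly detection on placeholder PMU features.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
normal_events = rng.normal(0.0, 1.0, size=(500, 20))            # normal operation only
test_events = np.vstack([rng.normal(0.0, 1.0, size=(50, 20)),   # first 50 = normal
                         rng.normal(4.0, 1.0, size=(50, 20))])  # last 50 = "attacks"

scaler = StandardScaler().fit(normal_events)
detector = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale")
detector.fit(scaler.transform(normal_events))

# +1 = predicted normal, -1 = predicted anomaly (potential attack event).
predictions = detector.predict(scaler.transform(test_events))
print("flagged as attacks:", int((predictions == -1).sum()))
```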


2021 ◽  
Author(s):  
Changhan Wang ◽  
Morgane Riviere ◽  
Ann Lee ◽  
Anne Wu ◽  
Chaitanya Talnikar ◽  
...  

Author(s):  
Yuexuan An ◽  
Hui Xue ◽  
Xingyu Zhao ◽  
Lu Zhang

How to learn a transferable feature representation from limited examples is a key challenge for few-shot classification. Self-supervision as an auxiliary task to the main supervised few-shot task is considered a promising way to address this problem, since self-supervision can provide additional structural information that the main task easily ignores. However, learning a good representation with traditional self-supervised methods usually depends on large numbers of training samples. In few-shot scenarios, owing to the lack of sufficient samples, these self-supervised methods may learn a biased representation, which is more likely to misguide the main task and ultimately degrade performance. In this paper, we propose conditional self-supervised learning (CSS), which uses auxiliary information to guide the representation learning of self-supervised tasks. Specifically, CSS leverages supervised information as prior knowledge to shape and improve the feature manifold learned by self-supervision without auxiliary unlabeled data, so as to reduce representation bias and mine more effective semantic information. Moreover, CSS exploits more meaningful information through supervised learning and the improved self-supervised learning respectively, and integrates the information into a unified distribution, which further enriches and broadens the original representation. Extensive experiments demonstrate that, without any fine-tuning, our proposed method achieves a significant accuracy improvement in few-shot classification scenarios compared to state-of-the-art few-shot learning methods.
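A much-simplified, hedged sketch of the general idea above: the supervised few-shot loss is combined with an auxiliary self-supervised loss (here, rotation prediction) computed on the same labeled episode, so the auxiliary task is guided by supervised data rather than extra unlabeled images. The toy backbone, the rotation pretext, and the weighting factor are assumptions, not the CSS formulation.

```python
# Illustrative joint supervised + auxiliary self-supervised loss (not CSS itself).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvBackbone(nn.Module):
    def __init__(self, num_classes, num_rotations=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.cls_head = nn.Linear(32, num_classes)    # main supervised task
        self.rot_head = nn.Linear(32, num_rotations)  # auxiliary SSL task

    def forward(self, x):
        h = self.features(x)
        return self.cls_head(h), self.rot_head(h)

def joint_loss(model, images, labels, aux_weight=0.5):
    # Rotate each image by a random multiple of 90 degrees for the SSL target.
    rot_targets = torch.randint(0, 4, (images.size(0),))
    rotated = torch.stack([torch.rot90(img, k.item(), dims=(1, 2))
                           for img, k in zip(images, rot_targets)])
    cls_logits, _ = model(images)
    _, rot_logits = model(rotated)
    return (F.cross_entropy(cls_logits, labels)
            + aux_weight * F.cross_entropy(rot_logits, rot_targets))

# Toy episode: 10 labeled images from 5 classes.
model = ConvBackbone(num_classes=5)
loss = joint_loss(model, torch.randn(10, 3, 32, 32), torch.randint(0, 5, (10,)))
print(loss)
```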

