Tencent ML-Images: A Large-Scale Multi-Label Image Database for Visual Representation Learning

Background: Computational prediction of drug-target interactions (DTIs) is an important step in drug discovery and drug repositioning. Potential DTIs identified by machine learning methods can guide biochemical and clinical experiments. Objective: This article aims to apply recent network representation learning methods to drug-target prediction in order to improve predictive performance and support new drug development. Methods: We use the large-scale information network embedding (LINE) method to extract network topology features of drugs, targets, diseases, and other entities, integrate the features obtained from heterogeneous networks, construct binary classification samples, and use the random forest (RF) method to predict DTIs. Results: The experiments compare the common classifiers RF, LR, and SVM, as well as the typical network representation learning methods LINE, Node2Vec, and DeepWalk. The combined method LINE-RF achieves the best results, reaching an AUC of 0.9349 and an AUPR of 0.9016. Conclusion: The LINE-based learning method can effectively learn hidden features of drugs, targets, diseases, and other entities from the network topology. Combining features learned from multiple networks can enhance representational power. RF is an effective supervised learning method. Therefore, the LINE-RF combination is a widely applicable method.
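The abstract mentions constructing binary classification samples from learned features but does not say how. The following is a minimal sketch of one common approach, assuming each drug and target already has an embedding vector, that a sample is the concatenation of the two vectors, and that negatives are drawn at random from pairs with no recorded interaction. The function name `build_samples`, the toy embeddings, and the one-negative-per-positive ratio are all hypothetical and are not taken from the paper.

```python
import random


def build_samples(drug_vecs, target_vecs, known_pairs, seed=0):
    """Return (features, labels) with one sampled negative per known pair."""
    rng = random.Random(seed)
    known = set(known_pairs)
    # Candidate negatives: every drug-target pair with no recorded interaction.
    unknown = sorted(
        (d, t) for d in drug_vecs for t in target_vecs if (d, t) not in known
    )
    negatives = rng.sample(unknown, min(len(known), len(unknown)))
    features, labels = [], []
    for label, pairs in ((1, sorted(known)), (0, negatives)):
        for d, t in pairs:
            # Concatenate the drug embedding and the target embedding.
            features.append(list(drug_vecs[d]) + list(target_vecs[t]))
            labels.append(label)
    return features, labels


# Toy 2-dimensional embeddings (made-up values, not from the paper).
drug_vecs = {"d1": [0.1, 0.9], "d2": [0.8, 0.2]}
target_vecs = {"t1": [0.3, 0.7], "t2": [0.6, 0.4]}
known_pairs = [("d1", "t1"), ("d2", "t2")]

features, labels = build_samples(drug_vecs, target_vecs, known_pairs)
```

The resulting `features` and `labels` lists could then be passed to any binary classifier, such as a random forest. Treating unrecorded pairs as negatives is itself an assumption, since some of them may be undiscovered interactions.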

CASIA-Face-Africa: A Large-scale African Face Image Database

IEEE Transactions on Information Forensics and Security ◽

10.1109/tifs.2021.3080496 ◽

2021 ◽

pp. 1-1

Author(s):

Jawad Muhammad ◽

Yunlong Wang ◽

Caiyong Wang ◽

Kunbo Zhang ◽

Zhenan Sun

Keyword(s):

Large Scale ◽

Image Database ◽

Face Image

A Novel Method to Predict Drug-Target Interactions Based on Large-Scale Graph Representation Learning

Cancers ◽

10.3390/cancers13092111 ◽

2021 ◽

Vol 13 (9) ◽

pp. 2111

Author(s):

Bo-Wei Zhao ◽

Zhu-Hong You ◽

Lun Hu ◽

Zhen-Hao Guo ◽

Lei Wang ◽

...

Keyword(s):

Drug Target ◽

Large Scale ◽

Computational Models ◽

Structural Information ◽

Characteristic Curve ◽

Representation Learning ◽

Graph Representation ◽

Convolutional Network ◽

Novel Method

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with time-consuming and labor-intensive in vivo experimental methods, computational models can provide high-quality DTI candidates quickly. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI can capture both the local and global structural information of the graph. Specifically, the first-order neighbor information of nodes is aggregated by a graph convolutional network (GCN), while the high-order neighbor information of nodes is learned by the graph embedding method DeepWalk. Finally, the two kinds of features are fed into a random forest classifier to train and predict potential DTIs. The results show that our method obtained an area under the receiver operating characteristic curve (AUROC) of 0.9455 and an area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. Moreover, we compare the presented method with some existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. The proposed model is also expected to bring new inspiration and provide novel perspectives to relevant researchers.
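The two metrics reported above can be illustrated with a small sketch. It assumes binary labels (1 for an interaction, 0 otherwise) and scores without ties. AUROC is computed as the fraction of positive-negative pairs ranked correctly, and the area under the precision-recall curve is approximated by average precision, which is one common estimator and may differ from the one the authors used. The function names and toy values are hypothetical.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count correctly ordered pairs; a tied pair counts as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy predictions (made-up values, not from the paper).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]

roc_value = auroc(labels, scores)
ap_value = average_precision(labels, scores)
```

Under 5-fold cross-validation, such metrics would typically be computed on each held-out fold and then averaged.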

Classification of large-scale image database of various skin diseases using deep learning

International Journal of Computer Assisted Radiology and Surgery ◽

10.1007/s11548-021-02440-y ◽

2021 ◽

Author(s):

Masaya Tanaka ◽

Atsushi Saito ◽

Kosuke Shido ◽

Yasuhiro Fujisawa ◽

Kenshi Yamasaki ◽

...

Keyword(s):

Deep Learning ◽

Large Scale ◽

Skin Diseases ◽

Image Database

A Large-Scale Compressed 360-Degree Spherical Image Database: From Subjective Quality Evaluation to Objective Model Comparison

2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP) ◽

10.1109/mmsp.2018.8547102 ◽

2018 ◽

Cited By ~ 8

Author(s):

Wei Sun ◽

Ke Gu ◽

Siwei Ma ◽

Wenhan Zhu ◽

Ning Liu ◽

...

Keyword(s):

Large Scale ◽

Quality Evaluation ◽

Model Comparison ◽

Image Database ◽

Subjective Quality ◽

Spherical Image ◽

Objective Model

X-MOL: large-scale pre-training for molecular understanding and diverse molecular analysis

10.1101/2020.12.23.424259 ◽

2020 ◽

Author(s):

Dongyu Xue ◽

Han Zhang ◽

Dongling Xiao ◽

Yukang Gong ◽

Guohui Chuai ◽

...

Keyword(s):

Molecular Analysis ◽

In Silico ◽

Large Scale ◽

De Novo ◽

Representation Learning ◽

Training Data ◽

Fine Tuning ◽

Model Interpretation ◽

Unlabelled Data ◽

Super Computing

Abstract: In silico modelling and analysis of small molecules substantially accelerates the process of drug development. Representing and understanding molecules is the fundamental step for various in silico molecular analysis tasks. Traditionally, these molecular analysis tasks have been investigated individually and separately. In this study, we presented X-MOL, which applies large-scale pre-training technology on 1.1 billion molecules for molecular understanding and representation, and then, carefully designed fine-tuning was performed to accommodate diverse downstream molecular analysis tasks, including molecular property prediction, chemical reaction analysis, drug-drug interaction prediction, de novo generation of molecules and molecule optimization. As a result, X-MOL was proven to achieve state-of-the-art results on all these molecular analysis tasks with good model interpretation ability. Collectively, taking advantage of super large-scale pre-training data and super-computing power, our study practically demonstrated the utility of the idea of “mass makes miracles” in molecular representation learning and downstream in silico molecular analysis, indicating the great potential of using large-scale unlabelled data with carefully designed pre-training and fine-tuning strategies to unify existing molecular analysis tasks and substantially enhance the performance of each task.

Comparing Data Sources and Architectures for Deep Visual Representation Learning in Semantics

10.18653/v1/d16-1043 ◽

2016 ◽

Cited By ~ 2

Author(s):

Douwe Kiela ◽

Anita Lilla Verő ◽

Stephen Clark

Keyword(s):

Visual Representation ◽

Representation Learning ◽

Data Sources

An Applicative Survey on Few-Shot Learning

Recent Patents on Engineering ◽

10.2174/1872212115666210715121344 ◽

2021 ◽

Vol 15 ◽

Author(s):

Jianwei Zhang ◽

Xubin Zhang ◽

Lei Lv ◽

Yining Di ◽

Wei Chen

Keyword(s):

Large Scale ◽

Representation Learning ◽

Language Models ◽

Data Sets ◽

Research Directions ◽

Large Scale Data ◽

Cross Domain ◽

Meta Learning ◽

Definition Of ◽

Future Work

Background: Learning discriminative representations from large-scale data sets has produced breakthroughs in recent decades. However, generating representative embeddings from limited examples, for example a class containing only one image, remains a thorny problem. Recently, deep learning-based Few-Shot Learning (FSL) has been proposed. It tackles this problem by leveraging prior knowledge in various ways. Objective: In this work, we review recent advances in FSL from the perspective of high-dimensional representation learning. The results of the analysis can provide insights and directions for future work. Methods: We first present the definition of general FSL. Then we propose a general framework for the FSL problem and give a taxonomy under the framework. We survey two FSL directions: learning policy and meta-learning. Results: We review advanced applications of FSL, including image classification, object detection, image segmentation, and other tasks, as well as the corresponding benchmarks, to provide an overview of recent progress. Conclusion: FSL needs further study in medical images, language models, and reinforcement learning. In addition, cross-domain FSL, successive FSL, and associated FSL are more challenging and valuable research directions.
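To make the few-shot setting concrete, here is a minimal sketch of a nearest-prototype classifier, a metric-based idea in the style of prototypical networks. The abstract does not name this method, so it serves only as an illustration. It assumes embeddings have already been produced by some pretrained encoder, which is where prior knowledge would enter. The names `prototypes` and `classify` and the toy vectors are hypothetical.

```python
import math


def prototypes(support):
    """Average the few labelled vectors of each class into one prototype."""
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return protos


def classify(query, protos):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(protos, key=lambda label: math.dist(query, protos[label]))


# One labelled example for "cat" (one-shot) and two for "dog"; made-up values.
support = {"cat": [[1.0, 0.0]], "dog": [[0.0, 1.0], [0.2, 0.8]]}
protos = prototypes(support)
prediction = classify([0.9, 0.2], protos)
```

With a single example per class, the prototype is simply that example's embedding, so the quality of the result depends almost entirely on the encoder.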

Multi-modal transportation recommendation with unified route representation learning

Proceedings of the VLDB Endowment ◽

10.14778/3430915.3430924 ◽

2020 ◽

Vol 14 (3) ◽

pp. 342-350

Author(s):

Hao Liu ◽

Jindong Han ◽

Yanjie Fu ◽

Jingbo Zhou ◽

Xinjiang Lu ◽

...

Keyword(s):

Large Scale ◽

Transportation Networks ◽

Representation Learning ◽

Transportation Systems ◽

Graph Representation ◽

Dynamic Graph ◽

Arbitrary Length ◽

Task Learning ◽

Semantic Coherence ◽

Spatio Temporal

Multi-modal transportation recommendation aims to provide the most appropriate travel route with various transportation modes according to certain criteria. After analyzing large-scale navigation data, we find that route representations exhibit two patterns: spatio-temporal autocorrelations within transportation networks and the semantic coherence of route sequences. However, few studies consider both patterns when developing multi-modal transportation systems. To this end, we study multi-modal transportation recommendation with unified route representation learning by exploiting both spatio-temporal dependencies in transportation networks and the semantic coherence of historical routes. Specifically, we propose to unify dynamic graph representation learning and hierarchical multi-task learning for multi-modal transportation recommendations. Along this line, we first transform the multi-modal transportation network into time-dependent multi-view transportation graphs and propose a spatiotemporal graph neural network module to capture the spatial and temporal autocorrelation. Then, we introduce a coherent-aware attentive route representation learning module to project arbitrary-length routes into fixed-length representation vectors, with explicit modeling of route coherence from historical routes. Moreover, we develop a hierarchical multi-task learning module to differentiate route representations for different transport modes, guided by the final recommendation feedback as well as multiple auxiliary tasks placed in different network layers. Extensive experimental results on two large-scale real-world datasets demonstrate that the proposed system outperforms eight baselines.
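The idea of projecting arbitrary-length routes into fixed-length vectors can be sketched with simple attention pooling. This is a generic illustration, not the authors' coherence-aware module. It assumes each route segment already has an embedding and that a query vector, which a real model would learn, scores the segments. The function name `attention_pool` and all values are hypothetical.

```python
import math


def attention_pool(segments, query):
    """Weighted average of segment vectors; weights are a softmax of dot products."""
    scores = [sum(s * q for s, q in zip(seg, query)) for seg in segments]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    weights = [e / sum(exps) for e in exps]
    dim = len(query)
    return [sum(w * seg[i] for w, seg in zip(weights, segments)) for i in range(dim)]


# A fixed query vector stands in for a learned parameter (made-up values).
query = [1.0, 0.0, 0.5]
short_route = [[0.2, 0.1, 0.0], [0.4, 0.3, 0.9]]
long_route = [[0.2, 0.1, 0.0], [0.4, 0.3, 0.9], [0.7, 0.5, 0.1], [0.0, 0.6, 0.3]]

short_vec = attention_pool(short_route, query)
long_vec = attention_pool(long_route, query)
```

Both routes map to vectors of the same dimension regardless of how many segments they contain, which is what allows a downstream recommender to compare them directly.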