scholarly journals Link Prediction via Ranking Metric Dual-Level Attention Network Learning

Author(s):  
Zhou Zhao ◽  
Ben Gao ◽  
Vincent W. Zheng ◽  
Deng Cai ◽  
Xiaofei He ◽  
...  

Link prediction is a challenging problem for complex network analysis, arising in many disciplines such as social networks and telecommunication networks. Currently, many existing approaches estimate the proximity of the link endpoints for link prediction from their feature or the local neighborhood around them, which suffer from the localized view of network connections and insufficiency of discriminative feature representation. In this paper, we consider the problem of link prediction from the viewpoint of learning discriminative path-based proximity ranking metric embedding. We propose a novel ranking metric network learning framework by jointly exploiting both node-level and path-level attentional proximity of the endpoints for link prediction. We then develop the path-based dual-level reasoning attentional learning method with recurrent neural network for proximity ranking metric embedding. The extensive experiments on two large-scale datasets show that our method achieves better performance than other state-of-the-art solutions to the problem.

Author(s):  
Zhou Zhao ◽  
Lingtao Meng ◽  
Jun Xiao ◽  
Min Yang ◽  
Fei Wu ◽  
...  

Retweet prediction is a challenging problem in social media sites (SMS). In this paper, we study the problem of image retweet prediction in social media, which predicts the image sharing behavior that the user reposts the image tweets from their followees. Unlike previous studies, we learn user preference ranking model from their past retweeted image tweets in SMS. We first propose heterogeneous image retweet modeling network (IRM) that exploits users' past retweeted image tweets with associated contexts, their following relations in SMS and preference of their followees. We then develop a novel attentional multi-faceted ranking network learning framework with multi-modal neural networks for the proposed heterogenous IRM network to learn the joint image tweet representations and user preference representations for prediction task. The extensive experiments on a large-scale dataset from Twitter site shows that our method achieves better performance than other state-of-the-art solutions to the problem.


Author(s):  
Pingyang Dai ◽  
Rongrong Ji ◽  
Haibin Wang ◽  
Qiong Wu ◽  
Yuyu Huang

Person re-identification (Re-ID) is an important task in video surveillance which automatically searches and identifies people across different cameras. Despite the extensive Re-ID progress in RGB cameras, few works have studied the Re-ID between infrared and RGB images, which is essentially a cross-modality problem and widely encountered in real-world scenarios. The key challenge lies in two folds, i.e., the lack of discriminative information to re-identify the same person between RGB and infrared modalities, and the difficulty to learn a robust metric towards such a large-scale cross-modality retrieval. In this paper, we tackle the above two challenges by proposing a novel cross-modality generative adversarial network (termed cmGAN). To handle the issue of insufficient discriminative information, we leverage the cutting-edge generative adversarial training to design our own discriminator to learn discriminative feature representation from different modalities. To handle the issue of large-scale cross-modality metric learning, we integrates both identification loss and cross-modality triplet loss, which minimize inter-class ambiguity while maximizing cross-modality similarity among instances. The entire cmGAN can be trained in an end-to-end manner by using standard deep neural network framework. We have quantized the performance of our work in the newly-released SYSU RGB-IR Re-ID benchmark, and have reported superior performance, i.e., Cumulative Match Characteristic curve (CMC) and Mean Average Precision (MAP), over the state-of-the-art works [Wu et al., 2017], respectively.


2020 ◽  
Vol 2020 ◽  
pp. 1-7
Author(s):  
Tuozhong Yao ◽  
Wenfeng Wang ◽  
Yuhong Gu

Multiview active learning (MAL) is a technique which can achieve a large decrease in the size of the version space than traditional active learning and has great potential applications in large-scale data analysis. In this paper, we present a new deep multiview active learning (DMAL) framework which is the first to combine multiview active learning and deep learning for annotation effort reduction. In this framework, our approach advances the existing active learning methods in two aspects. First, we incorporate two different deep convolutional neural networks into active learning which uses multiview complementary information to improve the feature learnings. Second, through the properly designed framework, the feature representation and the classifier can be simultaneously updated with progressively annotated informative samples. The experiments with two challenging image datasets demonstrate that our proposed DMAL algorithm can achieve promising results than several state-of-the-art active learning algorithms.


Author(s):  
Xiaoxiao Sun ◽  
Liyi Chen ◽  
Jufeng Yang

Fine-grained classification is absorbed in recognizing the subordinate categories of one field, which need a large number of labeled images, while it is expensive to label these images. Utilizing web data has been an attractive option to meet the demands of training data for convolutional neural networks (CNNs), especially when the well-labeled data is not enough. However, directly training on such easily obtained images often leads to unsatisfactory performance due to factors such as noisy labels. This has been conventionally addressed by reducing the noise level of web data. In this paper, we take a fundamentally different view and propose an adversarial discriminative loss to advocate representation coherence between standard and web data. This is further encapsulated in a simple, scalable and end-to-end trainable multi-task learning framework. We experiment on three public datasets using large-scale web data to evaluate the effectiveness and generalizability of the proposed approach. Extensive experiments demonstrate that our approach performs favorably against the state-of-the-art methods.


2021 ◽  
Author(s):  
Xiangchun Li ◽  
Xilin Shen

Integration of the evolving large-scale single-cell transcriptomes requires scalable batch-correction approaches. Here we propose a simple batch-correction method that is scalable for integrating super large-scale single-cell transcriptomes from diverse sources. The core idea of the method is encoding batch information of each cell as a trainable parameter and added to its expression profile; subsequently, a contrastive learning approach is used to learn feature representation of the additive expression profile. We demonstrate the scalability of the proposed method by integrating 18 million cells obtained from the Human Cell Atlas. Our benchmark comparisons with current state-of-the-art single-cell integration methods demonstrated that our method could achieve comparable data alignment and cluster preservation. Our study would facilitate the integration of super large-scale single-cell transcriptomes. The source code is available at https://github.com/xilinshen/Fugue.


2020 ◽  
Vol 34 (01) ◽  
pp. 354-361 ◽  
Author(s):  
Chidubem Arachie ◽  
Manas Gaur ◽  
Sam Anzaroot ◽  
William Groves ◽  
Ke Zhang ◽  
...  

Social media plays a major role during and after major natural disasters (e.g., hurricanes, large-scale fires, etc.), as people “on the ground” post useful information on what is actually happening. Given the large amounts of posts, a major challenge is identifying the information that is useful and actionable. Emergency responders are largely interested in finding out what events are taking place so they can properly plan and deploy resources. In this paper we address the problem of automatically identifying important sub-events (within a large-scale emergency “event”, such as a hurricane). In particular, we present a novel, unsupervised learning framework to detect sub-events in Tweets for retrospective crisis analysis. We first extract noun-verb pairs and phrases from raw tweets as sub-event candidates. Then, we learn a semantic embedding of extracted noun-verb pairs and phrases, and rank them against a crisis-specific ontology. We filter out noisy and irrelevant information then cluster the noun-verb pairs and phrases so that the top-ranked ones describe the most important sub-events. Through quantitative experiments on two large crisis data sets (Hurricane Harvey and the 2015 Nepal Earthquake), we demonstrate the effectiveness of our approach over the state-of-the-art. Our qualitative evaluation shows better performance compared to our baseline.


2020 ◽  
Author(s):  
Saeed Khaki ◽  
Hieu Pham ◽  
Lizhi Wang

AbstractLarge scale crop yield estimation is, in part, made possible due to the availability of remote sensing data allowing for the continuous monitoring of crops throughout its growth state. Having this information allows stakeholders the ability to make real-time decisions to maximize yield potential. Although various models exist that predict yield from remote sensing data, there currently does not exist an approach that can estimate yield for multiple crops simultaneously, and thus leads to more accurate predictions. A model that predicts yield of multiple crops and concurrently considers the interaction between multiple crop’s yield. We propose a new model called YieldNet which utilizes a novel deep learning framework that uses transfer learning between corn and soybean yield predictions by sharing the weights of the backbone feature extractor. Additionally, to consider the multi-target response variable, we propose a new loss function. Numerical results demonstrate that our proposed method accurately predicts yield from one to four months before the harvest, and is competitive to other state-of-the-art approaches.


2021 ◽  
Vol 12 (6) ◽  
pp. 1-24
Author(s):  
Haoyi Zhou ◽  
Hao Peng ◽  
Jieqi Peng ◽  
Shuai Zhang ◽  
Jianxin Li

The spatial-temporal modeling on long sequences is of great importance in many real-world applications. Recent studies have shown the potential of applying the self-attention mechanism to improve capturing the complex spatial-temporal dependencies. However, the lack of underlying structure information weakens its general performance on long sequence spatial-temporal problem. To overcome this limitation, we proposed a novel method, named the Proximity-aware Long Sequence Learning framework, and apply it to the spatial-temporal forecasting task. The model substitutes the canonical self-attention by leveraging the proximity-aware attention, which enhances local structure clues in building long-range dependencies with a linear approximation of attention scores. The relief adjacency matrix technique can utilize the historical global graph information for consistent proximity learning. Meanwhile, the reduced decoder allows for fast inference in a non-autoregressive manner. Extensive experiments are conducted on five large-scale datasets, which demonstrate that our method achieves state-of-the-art performance and validates the effectiveness brought by local structure information.


2020 ◽  
Vol 34 (07) ◽  
pp. 11410-11417
Author(s):  
Wenjing Li ◽  
Zhongcheng Wu

This paper considers a novel problem, named One-View Learning (OVL), in human retrieval a.k.a. person re-identification (re-ID). Unlike fully-supervised learning, OVL only requires pretty cheap annotation cost: labeled training images are only provided from one camera view (source view/domain), while the annotations of training images from other camera views (target views/domains) are not available. OVL is a problem of multi-target open set domain adaptation that is difficult for existing domain adaptation methods to handle. This is because 1) unlabeled samples are drawn from multiple target views in different distributions, and 2) the target views may contain samples of “unknown identity” that are not shared by the source view. To address this problem, this work introduces a novel one-view learning framework for person re-ID. This is achieved by adversarial multi-view learning (AMVL) and adversarial unknown rejection learning (AURL). The former learns a multi-view discriminator by adversarial learning to align the feature distributions between all views. The later is designed to reject unknown samples from target views through adversarial learning with two unknown identity classifiers. Extensive experiments on three large-scale datasets demonstrate the advantage of the proposed method over state-of-the-art domain adaptation and semi-supervised methods.


2019 ◽  
Vol 11 (14) ◽  
pp. 1702 ◽  
Author(s):  
Rui Ba ◽  
Chen Chen ◽  
Jing Yuan ◽  
Weiguo Song ◽  
Siuming Lo

A variety of environmental analysis applications have been advanced by the use of satellite remote sensing. Smoke detection based on satellite imagery is imperative for wildfire detection and monitoring. However, the commonly used smoke detection methods mainly focus on smoke discrimination from a few specific classes, which reduces their applicability in different regions of various classes. To this end, in this paper, we present a new large-scale satellite imagery smoke detection benchmark based on Moderate Resolution Imaging Spectroradiometer (MODIS) data, namely USTC_SmokeRS, consisting of 6225 satellite images from six classes (i.e., cloud, dust, haze, land, seaside, and smoke) and covering various areas/regions over the world. To build a baseline for smoke detection in satellite imagery, we evaluate several state-of-the-art deep learning-based image classification models. Moreover, we propose a new convolution neural network (CNN) model, SmokeNet, which incorporates spatial and channel-wise attention in CNN to enhance feature representation for scene classification. The experimental results of our method using different proportions (16%, 32%, 48%, and 64%) of training images reveal that our model outperforms other approaches with higher accuracy and Kappa coefficient. Specifically, the proposed SmokeNet model trained with 64% training images achieves the best accuracy of 92.75% and Kappa coefficient of 0.9130. The model trained with 16% training images can also improve the classification accuracy and Kappa coefficient by at least 4.99% and 0.06, respectively, over the state-of-the-art models.


Sign in / Sign up

Export Citation Format

Share Document