transductive learning
Recently Published Documents


TOTAL DOCUMENTS

143
(FIVE YEARS 33)

H-INDEX

13
(FIVE YEARS 2)

Symmetry ◽  
2021 ◽  
Vol 14 (1) ◽  
pp. 30
Author(s):  
Qinglang Guo ◽  
Haiyong Xie ◽  
Yangyang Li ◽  
Wen Ma ◽  
Chao Zhang

The online social media ecosystem is increasingly polluted by fake information, much of it fabricated by malicious users and social bots, and this has caused real harm. Most social robot detection methods use supervised classification based on hand-crafted feature extraction. However, these methods raise user-privacy concerns and ignore hidden feature information such as graph structure, while semi-supervised algorithms make only limited use of unlabeled data. In this work, we symmetrically combine BERT and a Graph Convolutional Network (GCN) and propose BGSRD, a novel model that combines large-scale pretraining and transductive learning for social robot detection. BGSRD constructs a heterogeneous graph over the dataset and represents tweets as nodes using BERT representations. The text graph convolutional network learns over a single corpus-level text graph, built mainly from word co-occurrence and document-word relationships. The BERT and GCN modules can be trained jointly in BGSRD to get the best of both: label influence can spread from training data to unlabeled test data through graph convolution, combining large-scale pretraining on massive raw data with transductive learning of a jointly learned representation. Experiments show that BGSRD achieves strong performance on a wide range of social robot detection datasets.
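A minimal sketch of the graph-convolution side of such a model, using NumPy only and random vectors as stand-ins for BERT embeddings (the corpus, dimensions, and weights below are invented for illustration):

```python
import numpy as np

# Toy corpus: each "tweet" is a list of tokens.
corpus = [["free", "crypto", "win"], ["hello", "friend"], ["win", "crypto", "now"]]
vocab = sorted({w for doc in corpus for w in doc})
n_doc, n_word = len(corpus), len(vocab)
n = n_doc + n_word  # document nodes first, then word nodes

# Heterogeneous doc-word graph: an edge wherever a word occurs in a
# document, plus self-loops on every node.
A = np.eye(n)
for d, doc in enumerate(corpus):
    for w in doc:
        j = n_doc + vocab.index(w)
        A[d, j] = A[j, d] = 1.0

# Symmetric normalisation D^{-1/2} A D^{-1/2}, as in a standard GCN layer.
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
A_hat = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Random features stand in for BERT representations of the nodes.
rng = np.random.default_rng(0)
X = rng.standard_normal((n, 8))
W = rng.standard_normal((8, 2))  # two classes: human vs. bot

# One GCN layer with ReLU: label influence can spread from labelled document
# nodes through shared word nodes to unlabelled documents (transduction).
H = np.maximum(A_hat @ X @ W, 0.0)
print(H.shape)  # (9, 2): class scores for 3 document and 6 word nodes
```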


Author(s):  
Weijian Ni ◽  
Tong Liu ◽  
Qingtian Zeng ◽  
Nengfu Xie

Domain terminologies are a basic resource for various natural language processing tasks. To automatically discover terminologies for a domain of interest, most traditional approaches rely on a domain-specific corpus given in advance; their performance can thus only be guaranteed by collecting a high-quality domain-specific corpus, which requires extensive human involvement and domain expertise. In this article, we propose a novel approach that automatically mines domain terminologies from a search engine's query log, a type of domain-independent corpus with higher availability, coverage, and timeliness than a manually collected domain-specific corpus. In particular, we represent the query log as a heterogeneous network and formulate domain terminology mining as transductive learning on that network. The manifold structure of domain specificity inherent in the query log is captured by a novel network embedding algorithm and exploited to reduce the manual annotation effort required for domain terminology classification. We select Agriculture and Healthcare as the target domains and experiment on a real query log from a commercial search engine. Experimental results show that the proposed approach outperforms several state-of-the-art approaches.
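The transductive flavor of this setup can be illustrated with a simple label-propagation sketch over a toy query-term network (NumPy only; the queries, seed labels, and iteration count are invented for the example, not taken from the paper):

```python
import numpy as np

# Toy query log: each query links to the terms it contains,
# giving a bipartite heterogeneous network of queries and terms.
queries = [["wheat", "rust", "treatment"], ["tractor", "price"],
           ["flu", "symptoms"], ["wheat", "price"]]
terms = sorted({t for q in queries for t in q})
n_q = len(queries)
n = n_q + len(terms)

A = np.zeros((n, n))
for i, q in enumerate(queries):
    for t in q:
        j = n_q + terms.index(t)
        A[i, j] = A[j, i] = 1.0

# Row-normalised transition matrix for propagation.
P = A / A.sum(axis=1, keepdims=True)

# Seed labels: +1 = agricultural term, -1 = out-of-domain.
f = np.zeros(n)
f[n_q + terms.index("wheat")] = 1.0
f[n_q + terms.index("flu")] = -1.0
seed_values = f.copy()
seeds = f != 0

# Transductive propagation: unlabeled nodes absorb their
# neighbours' scores while seed labels stay clamped.
for _ in range(50):
    f = P @ f
    f[seeds] = seed_values[seeds]

print(f[n_q + terms.index("tractor")])  # positive: "tractor" leans agricultural
```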


2021 ◽  
Author(s):  
Sohel Rana ◽  
Md Alamin Hossan ◽  
Abidullha Adel

Abstract In cloud security, detecting attack software is an essential task. Among attack types, zero-day attacks are the most problematic because antivirus software cannot remove them. Existing attack-detection models rely on stored attack signatures and therefore fail against zero-day attacks, in which a known attack is altered so that antivirus systems no longer recognize it. To detect and prevent zero-day attacks, this paper proposes a model called Hidden Markov Model Transductive Deep Learning (HMM_TDL), which generates hyper-alerts when an attack occurs. HMM_TDL also assigns labels to network data and periodically updates the database (DB). First, the HMM detects attacks and records hyper-alerts in the database. Next, transductive deep learning with k-medoids clustering groups the attacks and assigns labels. Finally, a trust value is computed for incoming data and stored in the database; based on this value, the network classifies attacks and normal data. HMM_TDL is trained on two datasets, NSL-KDD and CIDD. Comparative analysis shows that HMM_TDL achieves a higher accuracy, 95%, than existing attack-classification techniques.
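The HMM stage can be sketched with a tiny discrete hidden Markov model whose sequence likelihood under benign-dominated dynamics triggers a hyper-alert when it falls below a threshold. All parameters, symbols, and the threshold below are invented for illustration; the paper fits its model on NSL-KDD/CIDD traffic:

```python
import numpy as np

# Hidden states: benign, malicious. Observed symbols:
# 0 = normal operation, 1 = scan, 2 = exploit attempt.
start = np.array([0.9, 0.1])
trans = np.array([[0.95, 0.05],
                  [0.10, 0.90]])
emit = np.array([[0.80, 0.15, 0.05],   # benign emits mostly normal ops
                 [0.10, 0.40, 0.50]])  # malicious emits scans/exploits

def log_likelihood(obs):
    """Scaled forward algorithm: log P(obs) under the HMM."""
    alpha = start * emit[:, obs[0]]
    log_p = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]
        s = alpha.sum()
        log_p += np.log(s)
        alpha /= s
    return log_p

benign_seq = [0, 0, 1, 0, 0, 0]
attack_seq = [1, 2, 2, 1, 2, 2]

# Hyper-alert when a sequence is too unlikely under normal behaviour.
threshold = -5.0  # illustrative value, not from the paper
for seq in (benign_seq, attack_seq):
    if log_likelihood(seq) < threshold:
        print("hyper-alert:", seq)
```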


2021 ◽  
Vol 11 (21) ◽  
pp. 9832
Author(s):  
Junhui Chen ◽  
Feihu Huang ◽  
Jian Peng

Heterogeneous graph embedding has become a hot topic in network embedding in recent years and is widely used in practical scenarios. However, most existing heterogeneous graph embedding methods cannot make full use of all the auxiliary information. We therefore propose a new method called Multi-Subgraph based Graph Convolution Network (MSGCN), which uses topology information, semantic information, and node feature information to learn node embedding vectors. In MSGCN, the graph is first decomposed into multiple subgraphs according to edge type. A convolution operation is then applied to each subgraph to obtain per-subgraph node representations. Finally, the node representations are obtained by aggregating each node's representation vectors across subgraphs. Furthermore, we discuss the application of MSGCN to transductive and inductive learning tasks, respectively, and propose a node sampling method that obtains representations of new nodes in the inductive setting. This sampling method uses an attention mechanism to find important nodes and assigns different weights to nodes during aggregation. Experiments on three datasets indicate that MSGCN outperforms state-of-the-art methods on multi-class node classification tasks.
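The decompose-convolve-aggregate pipeline can be sketched in a few lines of NumPy; the random graphs, dimensions, and mean-pooling aggregation below are illustrative stand-ins rather than the paper's exact design:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d_in, d_out = 6, 4, 3
X = rng.standard_normal((n, d_in))  # node feature matrix

def normalize(A):
    """Add self-loops and apply symmetric GCN normalisation."""
    A = A + np.eye(len(A))
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Step 1: decompose the heterogeneous graph into one subgraph per edge type
# (two random undirected edge types here, e.g. "cites" and "co-authored").
subgraphs = []
for _ in range(2):
    A = (rng.random((n, n)) < 0.3).astype(float)
    subgraphs.append(normalize(np.maximum(A, A.T)))

# Step 2: a graph convolution on each subgraph with its own weights.
weights = [rng.standard_normal((d_in, d_out)) for _ in subgraphs]
per_subgraph = [np.maximum(A_hat @ X @ W, 0.0)
                for A_hat, W in zip(subgraphs, weights)]

# Step 3: aggregate each node's representations across subgraphs
# (mean pooling here; attention-weighted aggregation is another option).
Z = np.mean(per_subgraph, axis=0)
print(Z.shape)  # (6, 3): one embedding per node
```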


Author(s):  
Axel Kowald ◽  
Israel Barrantes ◽  
Steffen Möller ◽  
Daniel Palmer ◽  
Hugo Murua Escobar ◽  
...  

Accurate transfer learning of clinical outcomes, e.g., of the effects and side effects of drugs or other interventions, from one cellular context to another (in-vitro versus ex-vivo versus in-vivo, or across tissues), between cell types, developmental stages, omics modalities, or species, would be tremendously useful. Ultimately, it may prevent much of drug development from failing in translation despite large investments in the preclinical stages, which include animal experiments requiring careful justification. When transferring a prediction task from a source (model) domain to a target domain, what counts is the quality of the predictions in the target domain; this requires molecular states or processes common to both source and target that the predictor can learn, reflected in latent variables. These latent variables may form a compendium of knowledge learned in the source that enables predictions in the target; usually, there are few, if any, labeled target training samples to learn from. Transductive learning then refers to learning the predictor in the source domain and transferring its outcome-label calculations to the target domain for the same task. Inductive learning covers cases where the target predictor performs a different yet related task compared to the source predictor, making some labeled target data necessary. Often, the variables in the input/feature spaces (e.g., gene names to orthologs) and/or in the output/outcome spaces (e.g., by matching labels) must first be mapped. Transfer across omics modalities also requires that the molecular information flow connecting these modalities be sufficiently conserved. Only one of the transfer learning methods we reviewed offers an assessment of the input data, indicating when transfer learning may be unreliable.
Moreover, source domains have their own particularities, and transfer learning should account for these, e.g., differences in pharmacokinetics, drug clearance, or the microenvironment. In light of these general considerations, we discuss and juxtapose various recent transfer learning approaches specifically designed (or at least adaptable) to predict clinical (human in-vivo) outcomes from molecular data, towards finding the right tool for a given task and paving the way for a comprehensive and systematic comparison of the suitability and accuracy of transfer learning of clinical outcomes.


Author(s):  
Chen Li ◽  
Xutan Peng ◽  
Hao Peng ◽  
Jianxin Li ◽  
Lihong Wang

Compared with traditional sequential learning models, graph-based neural networks exhibit excellent properties when encoding text, such as the capacity to capture global and local information simultaneously. In the semi-supervised scenario especially, propagating information along edges can effectively alleviate the sparsity of labeled data. In this paper, going beyond the existing architecture of heterogeneous word-document graphs, we investigate for the first time how to construct lightweight non-heterogeneous graphs from different linguistic information to better serve free-text representation learning. We then propose Text-oriented Graph-based Transductive Learning (TextGTL), a novel semi-supervised framework for text classification that refines graph topology under theoretical guidance and shares information across different text graphs. TextGTL also performs attribute-space interpolation based on dense substructures in graphs to predict low-entropy labels with high-quality feature nodes for data augmentation. To verify the effectiveness of TextGTL, we conduct extensive experiments on various benchmark datasets, observing significant performance gains over conventional heterogeneous graphs. We also design ablation studies to examine the validity of each component in TextGTL.
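The attribute-space interpolation step can be illustrated with a mixup-style sketch: synthetic nodes are interpolated between high-confidence same-class feature vectors and inherit that shared, low-entropy label. The feature values and the simple within-class pairing rule here are invented simplifications; the paper selects pairs via dense substructures in the graph:

```python
import numpy as np

rng = np.random.default_rng(2)

# High-confidence node features for two classes (hypothetical values).
X_conf = np.array([[1.0, 0.1], [0.9, 0.2],   # class 0
                   [0.1, 1.0], [0.2, 0.8]])  # class 1
y_conf = np.array([0, 0, 1, 1])

# Interpolate within each class: the synthetic node inherits the
# pair's shared label, which is low-entropy by construction.
aug_X, aug_y = [], []
for c in np.unique(y_conf):
    members = X_conf[y_conf == c]
    for i in range(len(members)):
        for j in range(i + 1, len(members)):
            lam = rng.uniform(0.3, 0.7)  # interpolation coefficient
            aug_X.append(lam * members[i] + (1 - lam) * members[j])
            aug_y.append(int(c))

aug_X = np.array(aug_X)
print(aug_X.shape, aug_y)  # (2, 2) [0, 1]: one synthetic node per class
```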


2021 ◽  
Vol 15 (6) ◽  
pp. 1-25
Author(s):  
Man Wu ◽  
Shirui Pan ◽  
Lan Du ◽  
Xingquan Zhu

Graph neural networks (GNNs) are important tools for transductive learning tasks, such as node classification in graphs, due to their expressive power in capturing complex interdependencies between nodes. To enable GNN learning, existing works typically assume that labeled nodes from two or more classes are provided, so that a discriminative classifier can be learned from the labeled data. In reality, this assumption may be too restrictive, as users may only provide labels of interest in a single class, and only for a small number of nodes. In addition, most GNN models only aggregate information from short distances (e.g., 1-hop neighbors) in each round and fail to capture long-distance relationships in graphs. In this article, we propose a novel GNN framework, long-short distance aggregation networks, to overcome these limitations. By generating multiple graphs at different distance levels based on the adjacency matrix, we develop a long-short distance attention model over these graphs: direct neighbors are captured via a short-distance attention mechanism, and distant neighbors via a long-distance attention mechanism. Two novel risk estimators are further employed to aggregate the long-short-distance networks for positive-unlabeled (PU) learning, and the loss is back-propagated for model learning. Experimental results on real-world datasets demonstrate the effectiveness of our algorithm.
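A rough sketch of the multi-distance idea: build one view per distance level from powers of the adjacency matrix and blend them with a softmax gate. The random graph, the 3-hop choice, and the fixed attention scores are invented for illustration; the paper learns its attention weights:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 6, 4
X = rng.standard_normal((n, d))  # node features

# A random undirected graph without self-loops.
A = (rng.random((n, n)) < 0.3).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 0.0)

def reach(A, k):
    """Row-normalised reachability at walk length k: A^k has a nonzero
    entry wherever a length-k walk connects two nodes."""
    R = (np.linalg.matrix_power(A, k) > 0).astype(float) + np.eye(len(A))
    return R / R.sum(axis=1, keepdims=True)

short_view = reach(A, 1) @ X  # 1-hop neighbours (short distance)
long_view = reach(A, 3) @ X   # 3-hop walks (long distance)

# Attention over the two views: a scalar softmax gate per view,
# a heavy simplification of long-short distance attention.
scores = np.array([1.0, 0.5])  # hypothetical learned attention scores
w = np.exp(scores) / np.exp(scores).sum()
H = w[0] * short_view + w[1] * long_view
print(H.shape)  # (6, 4): one aggregated representation per node
```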


2021 ◽  
pp. 107967
Author(s):  
Baoxiang Huang ◽  
Linyao Ge ◽  
Ge Chen ◽  
Milena Radenkovic ◽  
Xiaopeng Wang ◽  
...  
