Partial Multi-Label Learning via Credible Label Elicitation

Author(s):  
Jun-Peng Fang ◽  
Min-Ling Zhang

In partial multi-label learning (PML), each training example is associated with multiple candidate labels, only some of which are valid. The task of PML naturally arises in learning scenarios with inaccurate supervision, where the goal is to induce a multi-label predictor that can assign a set of proper labels to unseen instances. When learning from PML training examples, the training procedure is prone to being misled by the false positive labels concealed in the candidate label set. In light of this major difficulty, a novel two-stage PML approach is proposed that works by eliciting credible labels from the candidate label set for model induction. In this way, most false positive labels are expected to be excluded from the training procedure. Specifically, in the first stage, the labeling confidence of each candidate label for every PML training example is estimated via iterative label propagation. In the second stage, by utilizing credible labels with high labeling confidence, a multi-label predictor is induced via pairwise label ranking with virtual label splitting or maximum a posteriori (MAP) reasoning. Extensive experiments on synthetic as well as real-world data sets clearly validate the effectiveness of credible label elicitation in learning from PML examples.
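To make the first stage concrete, below is a minimal sketch of iterative label propagation for estimating candidate-label confidence over a k-nearest-neighbor graph. It illustrates the general technique rather than the authors' exact algorithm; the function name and all parameter choices (k, alpha, the Gaussian affinity) are assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def propagate_confidence(X, Y_cand, k=10, alpha=0.95, n_iter=50):
    """Iteratively propagate labeling confidence over a kNN graph.

    X      : (n, d) feature matrix
    Y_cand : (n, q) binary candidate-label matrix (1 = candidate)
    Returns an (n, q) confidence matrix; each row sums to 1 over
    that example's candidate labels.
    """
    n, q = Y_cand.shape
    # Initial confidence: uniform over each candidate set.
    F = Y_cand / Y_cand.sum(axis=1, keepdims=True)
    F0 = F.copy()

    # Build a row-stochastic kNN affinity matrix W.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, idx = nn.kneighbors(X)
    W = np.zeros((n, n))
    for i in range(n):
        for j, d in zip(idx[i, 1:], dist[i, 1:]):  # skip self-neighbor
            W[i, j] = np.exp(-d ** 2)
    W /= W.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        F = alpha * W @ F + (1 - alpha) * F0   # propagate, then anchor
        F *= Y_cand                            # confidence only on candidates
        F /= np.maximum(F.sum(axis=1, keepdims=True), 1e-12)
    return F
```

Candidate labels whose final confidence exceeds a threshold would then be treated as credible and passed to the second-stage ranking model.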

Author(s):  
Qian-Wei Wang ◽  
Yu-Feng Li ◽  
Zhi-Hua Zhou

Partial label learning deals with training examples each associated with a set of candidate labels, among which only one label is valid. Previous studies typically assume that candidate label sets are provided for all training examples. In many real-world applications such as video character classification, however, it is generally difficult to label a large number of instances, and much of the data remains unlabeled. We call this kind of problem semi-supervised partial label learning. In this paper, we propose the SSPL method to address this problem. Specifically, an iterative label propagation procedure between partial label examples and unlabeled instances is employed to disambiguate the candidate label sets of partial label examples as well as to assign valid labels to unlabeled instances. The importance of unlabeled instances increases adaptively as the number of iterations increases, since they come to carry richer labeling information. Finally, unseen instances are classified based on the minimum reconstruction error over both partial label and unlabeled instances. Experiments on real-world data sets clearly validate the effectiveness of the proposed SSPL method.
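The following is a rough sketch of a propagation step that mixes partial-label and unlabeled instances, with the weight of unlabeled rows growing across iterations as the abstract describes. The similarity matrix W, the schedule mu0 * t, and all names are illustrative assumptions, not the published SSPL formulation.

```python
import numpy as np

def sspl_propagation(W, Y_cand, unlabeled_mask, n_iter=30, mu0=0.1):
    """Label propagation mixing partial-label and unlabeled instances.

    W              : (n, n) row-stochastic similarity matrix
    Y_cand         : (n, q) candidate sets for partial-label rows,
                     all-ones rows for unlabeled instances
    unlabeled_mask : (n,) boolean, True for unlabeled rows
    mu0            : how fast unlabeled influence grows per iteration
    """
    F = Y_cand / Y_cand.sum(axis=1, keepdims=True)
    for t in range(1, n_iter + 1):
        w_u = min(1.0, mu0 * t)                      # adaptive weight
        scale = np.where(unlabeled_mask, w_u, 1.0)[:, None]
        F = W @ (scale * F)                          # propagate
        F *= Y_cand                                  # stay inside candidates
        F /= np.maximum(F.sum(axis=1, keepdims=True), 1e-12)
    return F
```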


2020 ◽  
Vol 12 (5) ◽  
pp. 1918
Author(s):  
Hussein Slim ◽  
Sylvie Nadeau

Understanding the systemic functioning and predicting the behavior of today's sociotechnical systems is a major challenge facing researchers, owing to the nonlinearity, dynamicity, and uncertainty of such systems. Many variables can only be expressed in qualitative terms due to their vague and uncertain nature. In the first stage of our project, we proposed applying the Functional Resonance Analysis Method (FRAM), a recently emerging technique, to evaluate aircraft deicing operations from a systemic perspective. In the second stage, we proposed integrating fuzzy logic into FRAM to construct a predictive assessment model capable of providing quantified outcomes and thus more intersubjective and comprehensible results. The integration of fuzzy logic was laborious, owing to the high number of input variables and the consequently large number of rules. In this paper, we aim to further improve the prototype from the second stage by integrating rough sets as a data-mining tool to generate and reduce the rule base and to classify outcomes. Rough sets provide a mathematical framework suitable for deriving rules and decisions from uncertain and incomplete data. The mixed rough sets/fuzzy logic model was again applied to aircraft deicing operations, keeping the same settings as in the second stage so that the results of both stages can be compared. The results obtained were identical to those of the second stage despite the significant reduction in the size of the rule base. However, the model presented here is a simulated one, constructed with ideal data sets accounting for all possible combinations of input variables, which resulted in maximum accuracy. It should be further optimized and examined with real-world data to validate the results.
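As a hint of how rough sets can shrink a rule base, here is a toy reduct search on a small decision table: it looks for the smallest subset of condition attributes under which equal conditions always imply equal decisions. The attributes and values are invented for illustration and do not come from the deicing study.

```python
from itertools import combinations

def is_consistent(rows, attrs):
    """A decision table is consistent w.r.t. a set of condition
    attributes if equal condition values always imply equal decisions."""
    seen = {}
    for cond, dec in rows:
        key = tuple(cond[a] for a in attrs)
        if seen.setdefault(key, dec) != dec:
            return False
    return True

def find_reduct(rows, attrs):
    """Brute-force search for a minimal attribute subset (a reduct)
    preserving consistency; illustrative, not the paper's exact method."""
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            if is_consistent(rows, subset):
                return subset
    return tuple(attrs)

# Toy decision table: condition dicts and a decision value.
rows = [
    ({"temp": "low",  "wind": "high", "precip": "snow"}, "deice"),
    ({"temp": "low",  "wind": "low",  "precip": "snow"}, "deice"),
    ({"temp": "high", "wind": "high", "precip": "none"}, "no-deice"),
    ({"temp": "high", "wind": "low",  "precip": "none"}, "no-deice"),
]
print(find_reduct(rows, ["temp", "wind", "precip"]))  # ('temp',)
```

Here "temp" alone already determines the decision, so the other two condition attributes (and every rule distinguishing them) can be dropped without changing any outcome.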


Author(s):  
K Sobha Rani

Collaborative filtering suffers from the problems of data sparsity and cold start, which dramatically degrade recommendation performance. To help resolve these issues, we propose TrustSVD, a trust-based matrix factorization technique. By analyzing social trust data from four real-world data sets, we conclude that not only the explicit but also the implicit influence of both ratings and trust should be taken into consideration in a recommendation model. Hence, we build on top of SVD++, a state-of-the-art recommendation algorithm that inherently involves the explicit and implicit influence of rated items, by further incorporating both the explicit and implicit influence of trusted users on the prediction of items for an active user. To our knowledge, this work is the first to extend SVD++ with social trust information. Experimental results on the four data sets demonstrate that TrustSVD achieves better accuracy than ten other counterparts and can better handle the aforementioned issues.
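The prediction rule, as we read it from the abstract, augments the SVD++ score with the implicit influence of trusted users. A sketch follows; the function name, argument layout, and normalization details are assumptions.

```python
import numpy as np

def trustsvd_predict(mu, b_u, b_j, p_u, q_j, Y_rated, W_trusted):
    """TrustSVD-style rating prediction sketch: SVD++ implicit feedback
    from rated items plus implicit influence of trusted users.

    mu, b_u, b_j : global mean, user bias, item bias
    p_u, q_j     : (k,) user and item latent factors
    Y_rated      : (|I_u|, k) implicit factors of items rated by user u
    W_trusted    : (|T_u|, k) implicit factors of users trusted by u
    """
    imp_items = Y_rated.sum(axis=0) / np.sqrt(max(len(Y_rated), 1))
    imp_trust = W_trusted.sum(axis=0) / np.sqrt(max(len(W_trusted), 1))
    return mu + b_u + b_j + q_j @ (p_u + imp_items + imp_trust)
```

Setting W_trusted to an empty array recovers plain SVD++ behavior, which is the sense in which the model "builds on top of" it.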


1995 ◽  
Vol 31 (2) ◽  
pp. 193-204 ◽  
Author(s):  
Koen Grijspeerdt ◽  
Peter Vanrolleghem ◽  
Willy Verstraete

A comparative study of several recently proposed one-dimensional sedimentation models has been made. This was achieved by fitting the models to steady-state and dynamic concentration profiles obtained in a down-scaled secondary decanter. The models were evaluated with several a posteriori model selection criteria. Since the purpose of the modelling task is to perform on-line simulations, calculation time was used as one of the selection criteria. Finally, the practical identifiability of the models for the available data sets was also investigated. It could be concluded that the model of Takács et al. (1991) gave the most reliable results.
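For reference, the Takács et al. (1991) model rests on a double-exponential settling velocity function; a sketch follows, with commonly cited benchmark parameter values rather than the values fitted in this study.

```python
import numpy as np

def takacs_settling_velocity(X, v0=474.0, v0_max=250.0,
                             rh=5.76e-4, rp=2.86e-3, X_min=0.0):
    """Double-exponential settling velocity of Takács et al. (1991).

    X      : solids concentration (g/m^3)
    v0     : maximum theoretical settling velocity (m/d)
    v0_max : maximum practical settling velocity (m/d)
    rh, rp : hindered / flocculant settling parameters (m^3/g)
    Defaults are common benchmark values, not those fitted above.
    """
    Xs = np.maximum(X - X_min, 0.0)
    v = v0 * (np.exp(-rh * Xs) - np.exp(-rp * Xs))
    return np.clip(v, 0.0, v0_max)

# Settling velocity across a range of sludge concentrations.
print(takacs_settling_velocity(np.array([500.0, 2000.0, 8000.0])))
```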


Entropy ◽  
2021 ◽  
Vol 23 (5) ◽  
pp. 507
Author(s):  
Piotr Białczak ◽  
Wojciech Mazurczyk

Malicious software uses the HTTP protocol for communication, creating network traffic that is hard to identify because it blends into traffic generated by benign applications. To this end, fingerprinting tools have been developed to help track and identify such traffic by providing a short representation of malicious HTTP requests. However, existing tools either do not analyze all the information included in the HTTP message or analyze it insufficiently. To address these issues, we propose Hfinger, a novel malware HTTP request fingerprinting tool. It extracts information from parts of the request such as the URI, protocol information, headers, and payload, providing a concise request representation that preserves the extracted information in a form interpretable by a human analyst. We performed an extensive experimental evaluation of the developed solution using real-world data sets and compared Hfinger with the most related and popular existing tools, such as FATT, Mercury, and p0f. The effectiveness analysis reveals that on average only 1.85% of requests fingerprinted by Hfinger collide between malware families, which is 8–34 times lower than for existing tools. Moreover, unlike these tools, Hfinger in default mode introduces no collisions between malware and benign applications, while increasing the number of fingerprints by at most a factor of three. As a result, Hfinger can track and hunt malware more effectively by providing more unique fingerprints than other standard tools.
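To illustrate the general idea of structural request fingerprinting (not Hfinger's actual format), here is a toy fingerprint built from the method, URI shape, header order, and payload entropy; every field choice below is an assumption for illustration.

```python
import hashlib
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte."""
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def toy_fingerprint(method, uri, headers, payload=b""):
    """Toy structural fingerprint of an HTTP request.  It keeps the
    request's shape (URI depth/length, header order) rather than its
    exact content, so requests from one malware family tend to map to
    the same string."""
    uri_shape = f"{uri.count('/')}/{len(uri)}"
    header_order = ",".join(h.lower()[:2] for h, _ in headers)
    order_hash = hashlib.sha1(header_order.encode()).hexdigest()[:8]
    return f"{method}|{uri_shape}|{order_hash}|{byte_entropy(payload):.1f}"

print(toy_fingerprint("GET", "/gate.php?id=1",
                      [("Host", "example.com"), ("User-Agent", "Mozilla")]))
```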


2021 ◽  
pp. 1-13
Author(s):  
Qingtian Zeng ◽  
Xishi Zhao ◽  
Xiaohui Hu ◽  
Hua Duan ◽  
Zhongying Zhao ◽  
...  

Word embeddings have been successfully applied in many natural language processing tasks due to their effectiveness. However, the state-of-the-art algorithms for learning word representations from large amounts of text ignore emotional information, which is a significant research problem that must be addressed. To solve this problem, we propose an emotional word embedding (EWE) model for sentiment analysis. The method first applies pre-trained word vectors to represent document features using two different linear weighting methods. The resulting document vectors are then input to a classification model and used to train a neural-network-based text sentiment classifier. In this way, the emotional polarity of the text is propagated into the word vectors. Experimental results on three kinds of real-world data sets demonstrate that the proposed EWE model outperforms other state-of-the-art models on text sentiment prediction, text similarity calculation, and word emotional expression tasks.
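As an illustration of the first step, a document vector can be formed as a linear weighting of pre-trained word vectors. The sketch below uses plain averaging and IDF weighting as two stand-in weighting schemes; the paper's exact schemes are not specified in the abstract.

```python
import numpy as np

def doc_vector(tokens, emb, idf=None):
    """Linearly weighted document vector from pre-trained embeddings.
    With idf=None, plain averaging; otherwise IDF weighting."""
    vecs, weights = [], []
    for t in tokens:
        if t in emb:
            vecs.append(emb[t])
            weights.append(1.0 if idf is None else idf.get(t, 1.0))
    if not vecs:  # no known tokens: return the zero vector
        return np.zeros(next(iter(emb.values())).shape)
    return np.average(vecs, axis=0, weights=weights)

# Toy usage with a 3-dimensional embedding table.
emb = {"good": np.array([0.9, 0.1, 0.0]),
       "bad":  np.array([-0.8, 0.2, 0.1]),
       "film": np.array([0.0, 0.5, 0.5])}
print(doc_vector(["good", "film"], emb))
```

Training a sentiment classifier on such vectors and backpropagating into the embedding table is what pushes emotional polarity into the word vectors themselves.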


Entropy ◽  
2021 ◽  
Vol 23 (2) ◽  
pp. 240
Author(s):  
Muhammad Umar Farooq ◽  
Alexandre Graell i Amat ◽  
Michael Lentmaier

In this paper, we perform a belief propagation (BP) decoding threshold analysis of spatially coupled (SC) turbo-like codes (SC-TCs) on the additive white Gaussian noise (AWGN) channel. We review Monte Carlo density evolution (MC-DE) and efficient prediction methods that determine the BP thresholds of SC-TCs over the AWGN channel. We demonstrate that, instead of performing time-consuming MC-DE computations, the BP threshold of SC-TCs over the AWGN channel can be predicted very efficiently from their binary erasure channel (BEC) thresholds. From the threshold results, we conjecture that the similarity between MC-DE and predicted thresholds is related to the threshold saturation capability as well as the capacity-approaching maximum a posteriori (MAP) performance of an SC-TC ensemble.
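Density evolution on the BEC is simple enough to sketch. The example below computes the BP threshold of a regular (3,6) LDPC ensemble by bisection; it illustrates the threshold-analysis idea in its simplest setting, not the MC-DE machinery needed for SC-TCs on the AWGN channel.

```python
def bec_de_converges(eps, dv=3, dc=6, n_iter=2000, tol=1e-10):
    """Density evolution for a regular (dv, dc) LDPC ensemble on the
    BEC: erasure update x -> eps * (1 - (1-x)^(dc-1))^(dv-1).
    Returns True if the erasure probability vanishes under BP."""
    x = eps
    for _ in range(n_iter):
        x = eps * (1.0 - (1.0 - x) ** (dc - 1)) ** (dv - 1)
        if x < tol:
            return True
    return False

def bec_threshold(dv=3, dc=6, lo=0.0, hi=1.0, iters=40):
    """Bisection for the BP threshold: the largest channel erasure
    probability at which density evolution still converges."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if bec_de_converges(mid, dv, dc) else (lo, mid)
    return lo

print(f"(3,6) BEC BP threshold ~ {bec_threshold():.4f}")  # about 0.4294
```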


2021 ◽  
Vol 13 (3) ◽  
pp. 1522
Author(s):  
Raja Majid Ali Ujjan ◽  
Zeeshan Pervez ◽  
Keshav Dahal ◽  
Wajahat Ali Khan ◽  
Asad Masood Khattak ◽  
...  

In modern network infrastructure, Distributed Denial of Service (DDoS) attacks are considered severe network security threats. For conventional network security tools, it is extremely difficult to distinguish between the high traffic volume of a DDoS attack and a large number of legitimate users accessing a targeted network service or resource. Although these attacks have been widely studied, few works collect and analyse truly representative characteristics of DDoS traffic. Current research mostly focuses on DDoS detection and mitigation with predefined DDoS data sets, which are often hard to generalise to various network services and legitimate users' traffic patterns. In order to deal with considerably large DDoS traffic flows in Software Defined Networking (SDN), in this work we propose fast and effective entropy-based DDoS detection. We deploy a generalised entropy calculation combining Shannon and Rényi entropy to identify the distributed features of DDoS traffic; this also helps the SDN controller deal effectively with heavy malicious traffic. To lower the network traffic overhead, we collect data-plane traffic with signature-based Snort detection. We then analyse the collected traffic for entropy-based features to improve the detection accuracy of two deep learning models: a Stacked Auto Encoder (SAE) and a Convolutional Neural Network (CNN). This work also investigates the trade-off between the SAE and CNN classifiers in terms of accuracy and false positives. Quantitative results demonstrate that the SAE achieved a relatively higher detection accuracy of 94% with only 6% false-positive alerts, whereas the CNN classifier achieved an average accuracy of 93%.
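The entropy features at the heart of such detection are easy to illustrate: a flood concentrates traffic on few destinations, collapsing the entropy of the destination distribution. A minimal sketch with invented counts:

```python
import math
from collections import Counter

def shannon_entropy(counts):
    """Shannon entropy (bits) of an empirical count distribution."""
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def renyi_entropy(counts, alpha=2.0):
    """Renyi entropy of order alpha (alpha != 1); alpha > 1 weights
    the dominant flows more heavily than Shannon entropy does."""
    n = sum(counts)
    return math.log2(sum((c / n) ** alpha for c in counts)) / (1 - alpha)

# Toy windows of destination-IP counts: normal traffic is spread out,
# attack traffic is concentrated on one target, so entropy drops.
normal = Counter({f"10.0.0.{i}": 10 for i in range(20)})
attack = Counter({"10.0.0.1": 500, "10.0.0.2": 5})
for name, c in [("normal", normal), ("attack", attack)]:
    vals = list(c.values())
    print(name, round(shannon_entropy(vals), 2),
          round(renyi_entropy(vals), 2))
```

A detector would flag a window whose entropy falls below a calibrated baseline; combining the Shannon and Rényi values gives the generalised feature the abstract describes.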

