Penalty-Enhanced Utility-Based Multi-Criteria Recommendations

Recommender systems have been successfully applied to assist decision making in multiple domains and applications. Multi-criteria recommender systems try to take the user preferences on multiple criteria into consideration, in order to further improve the quality of the recommendations. Most recently, the utility-based multi-criteria recommendation approach has been proposed as an effective and promising solution. However, the issue of over-/under-expectations was ignored in the approach, which may bring risks to the recommendation model. In this paper, we propose a penalty-enhanced model to alleviate this issue. Our experimental results based on multiple real-world data sets can demonstrate the effectiveness of the proposed solutions. In addition, the outcomes of the proposed solution can also help explain the characteristics of the applications by observing the treatment on the issue of over-/under-expectations.

Download Full-text

Something’s Missing? A Procedure for Extending Item Content Data Sets in the Context of Recommender Systems

Information Systems Frontiers ◽

10.1007/s10796-020-10071-y ◽

2020 ◽

Author(s):

Bernd Heinrich ◽

Marcus Hopf ◽

Daniel Lohninger ◽

Alexander Schiller ◽

Michael Szubartowicz

Keyword(s):

Recommender Systems ◽

Additional Data ◽

Rapid Development ◽

Data Sets ◽

Web Portals ◽

Real World Data ◽

Data Set ◽

Item Content ◽

Choice Literature

Abstract The rapid development of e-commerce has led to a swiftly increasing number of competing providers in electronic markets, which maintain their own, individual data describing the offered items. Recommender systems are popular and powerful tools relying on this data to guide users to their individually best item choice. Literature suggests that data quality of item content data has substantial influence on recommendation quality. Thereby, the dimension completeness is expected to be particularly important. Herein resides a considerable chance to improve recommendation quality by increasing completeness via extending an item content data set with an additional data set of the same domain. This paper therefore proposes a procedure for such a systematic data extension and analyzes effects of the procedure regarding items, content and users based on real-world data sets from four leading web portals. The evaluation results suggest that the proposed procedure is indeed effective in enabling improved recommendation quality.

Download Full-text

Contributions of Real-World Evidence and Real-World Data to Decision-Making in the Management of Soft Tissue Sarcomas

Oncology ◽

10.1159/000515266 ◽

2021 ◽

Vol 99 (Suppl. 1) ◽

pp. 3-7

Author(s):

George D. Demetri ◽

Silvia Stacchiotti

Keyword(s):

Decision Making ◽

Soft Tissue ◽

Real World ◽

Soft Tissue Sarcomas ◽

Clinical Settings ◽

Disease Entity ◽

Real World Data ◽

World Data ◽

Real World Evidence

Real-world data are defined as data relating to any aspect of a patient’s health status collected in the context of routine health surveillance and medical care delivery. Sources range from insurance billing claims through to electronic surveillance data (e.g., activity trackers). Real-world data derive from large populations in diverse clinical settings and thus can be extrapolated more readily than clinical trial data to patients in different clinical settings or with a variety of comorbidities. Real-world data are used to generate real-world evidence, which might be regarded as a “meta-analysis” of accumulated real-world data. Increasingly, regulatory authorities are recognizing the value of real-world data and real-world evidence, especially for rare diseases where it may be practically unfeasible to conduct randomized controlled trials. However, the quality of real-world evidence depends on the quality of the data collected which, in turn, depends on a correct pathological diagnosis and the homogeneous behaviour of a reliably defined and consistent disease entity. As each of the more than 80 varieties of soft tissue sarcoma (STS) types represents a distinct disease entity, the situation is exceedingly complicated. Discordant diagnoses, which affect data quality, present a major challenge for use of real-world data. As real-world data are difficult to collect, collaboration across sarcoma reference institutions and sophisticated information technology solutions are required before the potential of real-world evidence to inform decision-making in the management of STS can be fully exploited.

Download Full-text

Effective Refinement of Distinctive Analysis of the Facial Matrices for Automatic Face Annotation

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.34869 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 1259-1267

Author(s):

Kavitha G L

Keyword(s):

Real World ◽

Nearest Neighbor ◽

Low Rank ◽

Data Sets ◽

Real World Data ◽

Mahalanobis Distances ◽

Rank Matrix ◽

The Face ◽

Low Rank Matrix

We deal with real world images which contains numerous faces captioned with equivalent names, it may be wrongly annotated. The face naming technique that we propose, exploits the weakly labeled image dataset, and aims at labeling a face in the image accurately. We propose this efficient face naming technique which is self regulated and aims at correctly labeling a face in an image. This is a challenging task because of the very large appearance variation in the images, as well as the potential mismatch between images and their captions. This paper introduces a method called Refined Low-Rank Regularization (RLRR) which productively employs the weakly named image information to determine a low-rank matrix which is obtained by examining many subspace structures of the recreated data. From the recreation method used a discriminatory matrix is deduced. Also, Large Margin Nearest Neighbor (LMNN) method is used to label an image, which further leads to another kernel matrix, based on the Mahalanobis distances of the data and the two consistent facial matrices can be fused to enhance the quality of each other and it is used as a new reiterative method to infer the names of each facial image. Experimental results on synthetic and real world data sets validate the effectiveness of the proposed method.

Download Full-text

Movie Recommendation through Multiple Bias Analysis

Applied Sciences ◽

10.3390/app11062817 ◽

2021 ◽

Vol 11 (6) ◽

pp. 2817

Author(s):

Tae-Gyu Hwang ◽

Sung Kwon Kim

Keyword(s):

Decision Making ◽

Recommender System ◽

Real World ◽

Matrix Factorization ◽

User Preferences ◽

Improve Performance ◽

Bias Analysis ◽

Useful Knowledge ◽

Existing Problems ◽

Movie Recommendation

A recommender system (RS) refers to an agent that recommends items that are suitable for users, and it is implemented through collaborative filtering (CF). CF has a limitation in improving the accuracy of recommendations based on matrix factorization (MF). Therefore, a new method is required for analyzing preference patterns, which could not be derived by existing studies. This study aimed at solving the existing problems through bias analysis. By analyzing users’ and items’ biases of user preferences, the bias-based predictor (BBP) was developed and shown to outperform memory-based CF. In this paper, in order to enhance BBP, multiple bias analysis (MBA) was proposed to efficiently reflect the decision-making in real world. The experimental results using movie data revealed that MBA enhanced BBP accuracy, and that the hybrid models outperformed MF and SVD++. Based on this result, MBA is expected to improve performance when used as a system in related studies and provide useful knowledge in any areas that need features that can represent users.

Download Full-text

Hfinger: Malware HTTP Request Fingerprinting

Entropy ◽

10.3390/e23050507 ◽

2021 ◽

Vol 23 (5) ◽

pp. 507

Author(s):

Piotr Białczak ◽

Wojciech Mazurczyk

Keyword(s):

Real World ◽

Network Traffic ◽

Experimental Evaluation ◽

Data Sets ◽

Real World Data ◽

Malicious Software ◽

Default Mode ◽

World Data ◽

Effectiveness Analysis ◽

Http Protocol

Malicious software utilizes HTTP protocol for communication purposes, creating network traffic that is hard to identify as it blends into the traffic generated by benign applications. To this aim, fingerprinting tools have been developed to help track and identify such traffic by providing a short representation of malicious HTTP requests. However, currently existing tools do not analyze all information included in the HTTP message or analyze it insufficiently. To address these issues, we propose Hfinger, a novel malware HTTP request fingerprinting tool. It extracts information from the parts of the request such as URI, protocol information, headers, and payload, providing a concise request representation that preserves the extracted information in a form interpretable by a human analyst. For the developed solution, we have performed an extensive experimental evaluation using real-world data sets and we also compared Hfinger with the most related and popular existing tools such as FATT, Mercury, and p0f. The conducted effectiveness analysis reveals that on average only 1.85% of requests fingerprinted by Hfinger collide between malware families, what is 8–34 times lower than existing tools. Moreover, unlike these tools, in default mode, Hfinger does not introduce collisions between malware and benign applications and achieves it by increasing the number of fingerprints by at most 3 times. As a result, Hfinger can effectively track and hunt malware by providing more unique fingerprints than other standard tools.

Download Full-text

Pain is not the major determinant of quality of life in fibromyalgia: results from a retrospective “real world” data analysis of fibromyalgia patients

Rheumatology International ◽

10.1007/s00296-020-04702-5 ◽

2021 ◽

Author(s):

Martin Offenbaecher ◽

Niko Kohls ◽

Thomas Ewert ◽

Claudia Sigl ◽

Robin Hieblinger ◽

...

Keyword(s):

Quality Of Life ◽

Data Analysis ◽

Real World ◽

Major Determinant ◽

Real World Data ◽

World Data

Download Full-text

Review Summary Generation in Online Systems: Frameworks for Supervised and Unsupervised Scenarios

ACM Transactions on the Web ◽

10.1145/3448015 ◽

2021 ◽

Vol 15 (3) ◽

pp. 1-33

Author(s):

Wenjun Jiang ◽

Jing Chen ◽

Xiaofei Ding ◽

Jie Wu ◽

Jiawei He ◽

...

Keyword(s):

Decision Making ◽

Real World ◽

Text Summarization ◽

Experimental Results ◽

Product Review ◽

Comprehensive Review ◽

Online Systems ◽

Real World Datasets ◽

Different Characteristics

In online systems, including e-commerce platforms, many users resort to the reviews or comments generated by previous consumers for decision making, while their time is limited to deal with many reviews. Therefore, a review summary, which contains all important features in user-generated reviews, is expected. In this article, we study “how to generate a comprehensive review summary from a large number of user-generated reviews.” This can be implemented by text summarization, which mainly has two types of extractive and abstractive approaches. Both of these approaches can deal with both supervised and unsupervised scenarios, but the former may generate redundant and incoherent summaries, while the latter can avoid redundancy but usually can only deal with short sequences. Moreover, both approaches may neglect the sentiment information. To address the above issues, we propose comprehensive Review Summary Generation frameworks to deal with the supervised and unsupervised scenarios. We design two different preprocess models of re-ranking and selecting to identify the important sentences while keeping users’ sentiment in the original reviews. These sentences can be further used to generate review summaries with text summarization methods. Experimental results in seven real-world datasets (Idebate, Rotten Tomatoes Amazon, Yelp, and three unlabelled product review datasets in Amazon) demonstrate that our work performs well in review summary generation. Moreover, the re-ranking and selecting models show different characteristics.

Download Full-text

Different algorithms, different models

Quality & Quantity ◽

10.1007/s11135-021-01193-9 ◽

2021 ◽

Author(s):

Martyna Daria Swiatczak

Keyword(s):

Comparative Analysis ◽

Real World ◽

Qualitative Comparative Analysis ◽

Comparative Methods ◽

Data Sets ◽

Simulation Studies ◽

Threshold Values ◽

Real World Data ◽

Software Packages ◽

Methodological Approaches

AbstractThis study assesses the extent to which the two main Configurational Comparative Methods (CCMs), i.e. Qualitative Comparative Analysis (QCA) and Coincidence Analysis (CNA), produce different models. It further explains how this non-identity is due to the different algorithms upon which both methods are based, namely QCA’s Quine–McCluskey algorithm and the CNA algorithm. I offer an overview of the fundamental differences between QCA and CNA and demonstrate both underlying algorithms on three data sets of ascending proximity to real-world data. Subsequent simulation studies in scenarios of varying sample sizes and degrees of noise in the data show high overall ratios of non-identity between the QCA parsimonious solution and the CNA atomic solution for varying analytical choices, i.e. different consistency and coverage threshold values and ways to derive QCA’s parsimonious solution. Clarity on the contrasts between the two methods is supposed to enable scholars to make more informed decisions on their methodological approaches, enhance their understanding of what is happening behind the results generated by the software packages, and better navigate the interpretation of results. Clarity on the non-identity between the underlying algorithms and their consequences for the results is supposed to provide a basis for a methodological discussion about which method and which variants thereof are more successful in deriving which search target.

Download Full-text

Improving recommender systems’ performance on cold-start users and controversial items by a new similarity model

International Journal of Web Information Systems ◽

10.1108/ijwis-07-2015-0024 ◽

2016 ◽

Vol 12 (2) ◽

pp. 126-149 ◽

Cited By ~ 4

Author(s):

Masoud Mansoury ◽

Mehdi Shajari

Keyword(s):

Real World ◽

Design Methodology ◽

Cold Start ◽

Selection Function ◽

Data Sets ◽

Real World Data ◽

Content Type ◽

User Similarity ◽

Active User ◽

Similarity Model

Purpose This paper aims to improve the recommendations performance for cold-start users and controversial items. Collaborative filtering (CF) generates recommendations on the basis of similarity between users. It uses the opinions of similar users to generate the recommendation for an active user. As a similarity model or a neighbor selection function is the key element for effectiveness of CF, many variations of CF are proposed. However, these methods are not very effective, especially for users who provide few ratings (i.e. cold-start users). Design/methodology/approach A new user similarity model is proposed that focuses on improving recommendations performance for cold-start users and controversial items. To show the validity of the authors’ similarity model, they conducted some experiments and showed the effectiveness of this model in calculating similarity values between users even when only few ratings are available. In addition, the authors applied their user similarity model to a recommender system and analyzed its results. Findings Experiments on two real-world data sets are implemented and compared with some other CF techniques. The results show that the authors’ approach outperforms previous CF techniques in coverage metric while preserves accuracy for cold-start users and controversial items. Originality/value In the proposed approach, the conditions in which CF is unable to generate accurate recommendations are addressed. These conditions affect CF performance adversely, especially in the cold-start users’ condition. The authors show that their similarity model overcomes CF weaknesses effectively and improve its performance even in the cold users’ condition.

Download Full-text