A Novel Feature Correlation Approach for Brand Spam Detection

Data Preprocessing, Active Learning, and Cost Perceptive Approaches for Resolving Data Imbalance - Advances in Data Mining and Database Management ◽

10.4018/978-1-7998-7371-6.ch008 ◽

2021 ◽

pp. 149-161

Author(s):

Bharat Tidke ◽

Swati Tidke

Keyword(s):

State Of The Art ◽

The Internet ◽

Spam Detection ◽

Feature Engineering ◽

Novel Approach ◽

Feature Correlation ◽

The Given ◽

Improve State

In this age of the internet, no person wants to make his decision on his own. Be it for purchasing a product, watching a movie, reading a book, a person looks out for reviews. People are unaware of the fact that these reviews may not always be true. It is the age of paid reviews, where the reviews are not just written to promote one's product but also to demote a competitor's product. But the ones which are turning out to be the most critical are given on brand of a certain product. This chapter proposed a novel approach for brand spam detection using feature correlation to improve state-of-the-art approaches. Correlation-based feature engineering is considered as one of the finest methods for determining the relations among the features. Several features attached with reviews are important, keeping in focus customer and company needs in making strong decisions, user for purchasing, and company for improving sales and services. Due to severe spamming these days, it has become nearly impossible to judge whether the given review is a trusted or a fake review.

Download Full-text

Partial Label Learning by Semantic Difference Maximization

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/318 ◽

2019 ◽

Cited By ~ 7

Author(s):

Lei Feng ◽

Bo An

Keyword(s):

State Of The Art ◽

Ground Truth ◽

Positive Answer ◽

Learning Framework ◽

Novel Approach ◽

Semantic Difference ◽

Weakly Supervised ◽

Partial Label Learning ◽

The Given ◽

First Time

Partial label learning is a weakly supervised learning framework, in which each instance is provided with multiple candidate labels while only one of them is correct. Most of the existing approaches focus on leveraging the instance relationships to disambiguate the given noisy label space, while it is still unclear whether we can exploit potentially useful information in label space to alleviate the label ambiguities. This paper gives a positive answer to this question for the first time. Specifically, if two instances do not share any common candidate labels, they cannot have the same ground-truth label. By exploiting such dissimilarity relationships from label space, we propose a novel approach that aims to maximize the latent semantic differences of the two instances whose ground-truth labels are definitely different, while training the desired model simultaneously, thereby continually enlarging the gap of label confidences between two instances of different classes. Extensive experiments on artificial and real-world partial label datasets show that our approach significantly outperforms state-of-the-art counterparts.

Download Full-text

Children in Cyberspace: Challenges for Librarians

Bibliotekovedenie [Library and Information Science (Russia)] ◽

10.25281/0869-608x-2009-0-5-39-45 ◽

2009 ◽

pp. 39-45 ◽

Cited By ~ 1

Author(s):

Vera P. Chudinova

Keyword(s):

Legal Aspects ◽

The Internet ◽

Direct Relevance ◽

Russian State ◽

Younger Generation ◽

The Given

The results of the research the interaction problems between children, teenagers and the Internet, and problems of protection and realization of their rights by librarians are considered in the article. The results of foreign researches on the given theme are also presented. A number of its legal aspects is analysed, the rights of children of direct relevance to the theme of “Children and the Information” on the basis of the UN Convention are placed in strong relief. The main features, possibilities and dangers of the Internet to development of the person are shown.The surveys of schoolchildren, teachers and librarians, which were hold by researchers of the Russian State Children Library, have allowed the author to find out various aspects of the younger generation interaction with the Internet, to set out the actual problems facing library experts today.

Download Full-text

Deep Learning for Transient Image Reconstruction from ToF Data

Sensors ◽

10.3390/s21061962 ◽

2021 ◽

Vol 21 (6) ◽

pp. 1962

Author(s):

Enrico Buratto ◽

Adriano Simonetto ◽

Gianluca Agresti ◽

Henrik Schäfer ◽

Pietro Zanuttigh

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Light Response ◽

Real Data ◽

Depth Image ◽

Learning Approach ◽

Multiple Reflections ◽

Noisy Input ◽

Novel Approach ◽

Incoming Light

In this work, we propose a novel approach for correcting multi-path interference (MPI) in Time-of-Flight (ToF) cameras by estimating the direct and global components of the incoming light. MPI is an error source linked to the multiple reflections of light inside a scene; each sensor pixel receives information coming from different light paths which generally leads to an overestimation of the depth. We introduce a novel deep learning approach, which estimates the structure of the time-dependent scene impulse response and from it recovers a depth image with a reduced amount of MPI. The model consists of two main blocks: a predictive model that learns a compact encoded representation of the backscattering vector from the noisy input data and a fixed backscattering model which translates the encoded representation into the high dimensional light response. Experimental results on real data show the effectiveness of the proposed approach, which reaches state-of-the-art performances.

Download Full-text

A novel optimal multi-pattern matching method with wildcards for DNA sequence

Technology and Health Care ◽

10.3233/thc-218012 ◽

2021 ◽

Vol 29 ◽

pp. 115-124

Author(s):

Xinlu Wang ◽

Ahmed A.F. Saif ◽

Dayou Liu ◽

Yungang Zhu ◽

Jon Atli Benediktsson

Keyword(s):

Dna Sequence ◽

Pattern Matching ◽

Health Informatics ◽

State Of The Art ◽

Machine Language ◽

Data Sets ◽

Fundamental Issue ◽

Matching Method ◽

Dna Sequence Alignment ◽

The Given

BACKGROUND: DNA sequence alignment is one of the most fundamental and important operation to identify which gene family may contain this sequence, pattern matching for DNA sequence has been a fundamental issue in biomedical engineering, biotechnology and health informatics. OBJECTIVE: To solve this problem, this study proposes an optimal multi pattern matching with wildcards for DNA sequence. METHODS: This proposed method packs the patterns and a sliding window of texts, and the window slides along the given packed text, matching against stored packed patterns. RESULTS: Three data sets are used to test the performance of the proposed algorithm, and the algorithm was seen to be more efficient than the competitors because its operation is close to machine language. CONCLUSIONS: Theoretical analysis and experimental results both demonstrate that the proposed method outperforms the state-of-the-art methods and is especially effective for the DNA sequence.

Download Full-text

A New Multi-Person Pose Estimation Method Using the Partitioned CenterPose Network

Applied Sciences ◽

10.3390/app11094241 ◽

2021 ◽

Vol 11 (9) ◽

pp. 4241

Author(s):

Jiahua Wu ◽

Hyo Jong Lee

Keyword(s):

Pose Estimation ◽

Human Body ◽

State Of The Art ◽

Estimation Method ◽

Bottom Up ◽

Center Point ◽

Novel Approach ◽

Body Joints

In bottom-up multi-person pose estimation, grouping joint candidates into the appropriately structured corresponding instance of a person is challenging. In this paper, a new bottom-up method, the Partitioned CenterPose (PCP) Network, is proposed to better cluster the detected joints. To achieve this goal, we propose a novel approach called Partition Pose Representation (PPR) which integrates the instance of a person and its body joints based on joint offset. PPR leverages information about the center of the human body and the offsets between that center point and the positions of the body’s joints to encode human poses accurately. To enhance the relationships between body joints, we divide the human body into five parts, and then, we generate a sub-PPR for each part. Based on this PPR, the PCP Network can detect people and their body joints simultaneously, then group all body joints according to joint offset. Moreover, an improved l1 loss is designed to more accurately measure joint offset. Using the COCO keypoints and CrowdPose datasets for testing, it was found that the performance of the proposed method is on par with that of existing state-of-the-art bottom-up methods in terms of accuracy and speed.

Download Full-text

CNFE-SE: a novel approach combining complex network-based feature engineering and stacked ensemble to predict the success of intrauterine insemination and ranking the features

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-020-01362-0 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Sima Ranjbari ◽

Toktam Khatibi ◽

Ahmad Vosough Dizaji ◽

Hesamoddin Sajadi ◽

Mehdi Totonchi ◽

...

Keyword(s):

Complex Networks ◽

Complex Network ◽

Semen Quality ◽

Intrauterine Insemination ◽

Machine Learning Algorithms ◽

Semen Parameters ◽

Feature Engineering ◽

Operating Characteristics ◽

Scoring Method ◽

Novel Approach

Abstract Background Intrauterine Insemination (IUI) outcome prediction is a challenging issue which the assisted reproductive technology (ART) practitioners are dealing with. Predicting the success or failure of IUI based on the couples' features can assist the physicians to make the appropriate decision for suggesting IUI to the couples or not and/or continuing the treatment or not for them. Many previous studies have been focused on predicting the in vitro fertilization (IVF) and intracytoplasmic sperm injection (ICSI) outcome using machine learning algorithms. But, to the best of our knowledge, a few studies have been focused on predicting the outcome of IUI. The main aim of this study is to propose an automatic classification and feature scoring method to predict intrauterine insemination (IUI) outcome and ranking the most significant features. Methods For this purpose, a novel approach combining complex network-based feature engineering and stacked ensemble (CNFE-SE) is proposed. Three complex networks are extracted considering the patients' data similarities. The feature engineering step is performed on the complex networks. The original feature set and/or the features engineered are fed to the proposed stacked ensemble to classify and predict IUI outcome for couples per IUI treatment cycle. Our study is a retrospective study of a 5-year couples' data undergoing IUI. Data is collected from Reproductive Biomedicine Research Center, Royan Institute describing 11,255 IUI treatment cycles for 8,360 couples. Our dataset includes the couples' demographic characteristics, historical data about the patients' diseases, the clinical diagnosis, the treatment plans and the prescribed drugs during the cycles, semen quality, laboratory tests and the clinical pregnancy outcome. Results Experimental results show that the proposed method outperforms the compared methods with Area under receiver operating characteristics curve (AUC) of 0.84 ± 0.01, sensitivity of 0.79 ± 0.01, specificity of 0.91 ± 0.01, and accuracy of 0.85 ± 0.01 for the prediction of IUI outcome. Conclusions The most important predictors for predicting IUI outcome are semen parameters (sperm motility and concentration) as well as female body mass index (BMI).

Download Full-text

Density Guarantee on Finding Multiple Subgraphs and Subtensors

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3446668 ◽

2021 ◽

Vol 15 (5) ◽

pp. 1-32

Author(s):

Quang-huy Duong ◽

Heri Ramampiaro ◽

Kjetil Nørvåg ◽

Thu-lan Dam

Keyword(s):

Lower Bound ◽

State Of The Art ◽

The State ◽

The Other ◽

Exact Methods ◽

Practical Solution ◽

Novel Approach ◽

Wide Range ◽

Real World Datasets ◽

Tensor Data

Dense subregion (subgraph & subtensor) detection is a well-studied area, with a wide range of applications, and numerous efficient approaches and algorithms have been proposed. Approximation approaches are commonly used for detecting dense subregions due to the complexity of the exact methods. Existing algorithms are generally efficient for dense subtensor and subgraph detection, and can perform well in many applications. However, most of the existing works utilize the state-or-the-art greedy 2-approximation algorithm to capably provide solutions with a loose theoretical density guarantee. The main drawback of most of these algorithms is that they can estimate only one subtensor, or subgraph, at a time, with a low guarantee on its density. While some methods can, on the other hand, estimate multiple subtensors, they can give a guarantee on the density with respect to the input tensor for the first estimated subsensor only. We address these drawbacks by providing both theoretical and practical solution for estimating multiple dense subtensors in tensor data and giving a higher lower bound of the density. In particular, we guarantee and prove a higher bound of the lower-bound density of the estimated subgraph and subtensors. We also propose a novel approach to show that there are multiple dense subtensors with a guarantee on its density that is greater than the lower bound used in the state-of-the-art algorithms. We evaluate our approach with extensive experiments on several real-world datasets, which demonstrates its efficiency and feasibility.

Download Full-text

New Multi-View Classification Method with Uncertain Data

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3458282 ◽

2021 ◽

Vol 16 (1) ◽

pp. 1-23

Author(s):

Bo Liu ◽

Haowen Zhong ◽

Yanshan Xiao

Keyword(s):

Learning Strategy ◽

State Of The Art ◽

Uncertain Data ◽

Real Life ◽

Support Vector ◽

Classification Methods ◽

Complementary Information ◽

Novel Approach ◽

Svm Model ◽

Iterative Framework

Multi-view classification aims at designing a multi-view learning strategy to train a classifier from multi-view data, which are easily collected in practice. Most of the existing works focus on multi-view classification by assuming the multi-view data are collected with precise information. However, we always collect the uncertain multi-view data due to the collection process is corrupted with noise in real-life application. In this case, this article proposes a novel approach, called uncertain multi-view learning with support vector machine (UMV-SVM) to cope with the problem of multi-view learning with uncertain data. The method first enforces the agreement among all the views to seek complementary information of multi-view data and takes the uncertainty of the multi-view data into consideration by modeling reachability area of the noise. Then it proposes an iterative framework to solve the proposed UMV-SVM model such that we can obtain the multi-view classifier for prediction. Extensive experiments on real-life datasets have shown that the proposed UMV-SVM can achieve a better performance for uncertain multi-view classification in comparison to the state-of-the-art multi-view classification methods.

Download Full-text

Question-aware memory network for multi-hop question answering in human–robot interaction

Complex & Intelligent Systems ◽

10.1007/s40747-021-00448-0 ◽

2021 ◽

Author(s):

Xinmeng Li ◽

Mamoun Alazab ◽

Qian Li ◽

Keping Yu ◽

Quanjun Yin

Keyword(s):

Question Answering ◽

State Of The Art ◽

Human Robot Interaction ◽

Knowledge Graph ◽

Robot Interaction ◽

Natural Language Question ◽

Memory Network ◽

The Given ◽

Fine Tune ◽

Language Question

AbstractKnowledge graph question answering is an important technology in intelligent human–robot interaction, which aims at automatically giving answer to human natural language question with the given knowledge graph. For the multi-relation question with higher variety and complexity, the tokens of the question have different priority for the triples selection in the reasoning steps. Most existing models take the question as a whole and ignore the priority information in it. To solve this problem, we propose question-aware memory network for multi-hop question answering, named QA2MN, to update the attention on question timely in the reasoning process. In addition, we incorporate graph context information into knowledge graph embedding model to increase the ability to represent entities and relations. We use it to initialize the QA2MN model and fine-tune it in the training process. We evaluate QA2MN on PathQuestion and WorldCup2014, two representative datasets for complex multi-hop question answering. The result demonstrates that QA2MN achieves state-of-the-art Hits@1 accuracy on the two datasets, which validates the effectiveness of our model.

Download Full-text

Accurate Instance-Based Segmentation for Boundary Detection in Robot Grasping Application

Applied Sciences ◽

10.3390/app11094248 ◽

2021 ◽

Vol 11 (9) ◽

pp. 4248

Author(s):

Hong Hai Hoang ◽

Bao Long Tran

Keyword(s):

Object Segmentation ◽

State Of The Art ◽

Rapid Development ◽

Spatial Relationship ◽

Learning Technologies ◽

Average Precision ◽

Novel Approach ◽

3D Camera ◽

Robot Grasping ◽

Instance Segmentation

With the rapid development of cameras and deep learning technologies, computer vision tasks such as object detection, object segmentation and object tracking are being widely applied in many fields of life. For robot grasping tasks, object segmentation aims to classify and localize objects, which helps robots to be able to pick objects accurately. The state-of-the-art instance segmentation network framework, Mask Region-Convolution Neural Network (Mask R-CNN), does not always perform an excellent accurate segmentation at the edge or border of objects. The approach using 3D camera, however, is able to extract the entire (foreground) objects easily but can be difficult or require a large amount of computation effort to classify it. We propose a novel approach, in which we combine Mask R-CNN with 3D algorithms by adding a 3D process branch for instance segmentation. Both outcomes of two branches are contemporaneously used to classify the pixels at the edge objects by dealing with the spatial relationship between edge region and mask region. We analyze the effectiveness of the method by testing with harsh cases of object positions, for example, objects are closed, overlapped or obscured by each other to focus on edge and border segmentation. Our proposed method is about 4 to 7% higher and more stable in IoU (intersection of union). This leads to a reach of 46% of mAP (mean Average Precision), which is a higher accuracy than its counterpart. The feasibility experiment shows that our method could be a remarkable promoting for the research of the grasping robot.

Download Full-text