Intention based Clustering of Relevant Reviews using Content Similarity

The proposed work deals with finding related reviews posted on various online forums. Conventional methods for matching related documents compute content similarity over the entire review instead of partitioning it into segments that reveal different intentions. In this work, intention-based similarity clustering is introduced to find the relatedness of two documents. This method forms the document clusters based on the similarity of the segments with similar intentions. The segmentation points are identified using a number of text features that indicate where a segment boundary should be placed. Finally, the document clusters are formed by grouping the segments with similar intentions in the same cluster and then computing the similarities among the segments with the same intention. The proposed model is trained on the TripAdvisor and Yelp Open Review datasets to evaluate its performance, and the evaluation results show that the model produces more precise results in mining documents related to the user’s interest.
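
The idea of comparing only segments that share an intention can be illustrated with a minimal sketch. It assumes the segmentation and intention labelling have already been done; the intention labels, the bag-of-words representation, the cosine measure, and all function names are illustrative assumptions, not the method described in the abstract.

```python
import math
from collections import Counter, defaultdict


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def group_by_intention(segments):
    """Merge the tokens of all segments that carry the same intention label."""
    bags = defaultdict(Counter)
    for intention, text in segments:
        bags[intention].update(text.lower().split())
    return bags


def intention_similarity(doc_a, doc_b):
    """Average cosine similarity over the intentions that both documents share."""
    bags_a, bags_b = group_by_intention(doc_a), group_by_intention(doc_b)
    shared = bags_a.keys() & bags_b.keys()
    if not shared:
        return 0.0
    return sum(cosine(bags_a[i], bags_b[i]) for i in shared) / len(shared)


# Hypothetical pre-segmented reviews: (intention label, segment text).
review_1 = [("praise", "great room great view"), ("complaint", "slow check in")]
review_2 = [("praise", "great view"), ("question", "is parking free")]
score = intention_similarity(review_1, review_2)
```

Here only the "praise" segments are compared, so the complaint in the first review and the question in the second do not dilute the score, which is the contrast with whole-document similarity that the abstract draws.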

A Densely Connected GRU Neural Network Based on Coattention Mechanism for Chinese Rice-Related Question Similarity Matching

Agronomy ◽

10.3390/agronomy11071307 ◽

2021 ◽

Vol 11 (7) ◽

pp. 1307

Author(s):

Haoriqin Wang ◽

Huaji Zhu ◽

Huarui Wu ◽

Xiaomin Wang ◽

Xiao Han ◽

...

Recommending Mobile Microblog Users via a Tensor Factorization Based on User Cluster Approach

Wireless Communications and Mobile Computing ◽

10.1155/2018/9434239 ◽

2018 ◽

Vol 2018 ◽

pp. 1-11 ◽

Cited By ~ 1

Author(s):

Xiangwen Liao ◽

Lingying Zhang ◽

Jingjing Wei ◽

Dingda Yang ◽

Guolong Chen

Keyword(s):

Network Centrality ◽

Network Clustering ◽

Tensor Factorization ◽

Latent Factors ◽

Cp Decomposition ◽

Temporal Features ◽

User Influence ◽

Proposed Model ◽

The Mean ◽

Content Similarity

User influence is a very important factor for microblog user recommendation in mobile social networks. However, most existing work on user influence analysis ignores users’ temporal features and fails to filter out marketing users with low influence, which limits the performance of recommendation methods. In this paper, a Tensor Factorization based User Cluster (TFUC) model is proposed. We first identify latent influential users by neural network clustering. Then, we construct a feature tensor from the latent influential users’ opinion, activity, and network centrality information. Next, user influence is predicted from the latent factors produced by the temporally restrained CP decomposition. Finally, we recommend microblog users considering both user influence and content similarity. Our experimental results show that the proposed model significantly improves recommendation performance, and the mean average precision of TFUC exceeds that of the baselines by at least 3.4%.
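
As a rough illustration of how latent factors from a CP decomposition yield tensor entries, the sketch below rebuilds a three-way tensor from three factor matrices. The users x features x time layout and the toy values are assumptions, and the temporal restraint mentioned in the abstract is not modelled here.

```python
def cp_reconstruct(A, B, C):
    """Rebuild a 3-way tensor from rank-R CP factor matrices.

    A is I x R, B is J x R, C is K x R; entry (i, j, k) equals
    the sum over r of A[i][r] * B[j][r] * C[k][r].
    """
    rank = len(A[0])
    return [
        [
            [sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank)) for k in range(len(C))]
            for j in range(len(B))
        ]
        for i in range(len(A))
    ]


# Hypothetical rank-1 factors: 2 users x 2 features x 2 time steps.
users = [[1.0], [2.0]]
features = [[1.0], [0.5]]
time_steps = [[1.0], [3.0]]
tensor = cp_reconstruct(users, features, time_steps)
```

In an actual factorization the factor matrices would be fitted to the observed feature tensor; this sketch only shows the reconstruction step that turns fitted factors into predicted entries.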

Unconventional reconciliation path for quantum mechanics and general relativity

10.21203/rs.3.rs-63303/v8 ◽

2021 ◽

Author(s):

Samuel Yuguru

Keyword(s):

General Relativity ◽

Quantum Mechanics ◽

Wave Diffraction ◽

Extra Dimensions ◽

Electron Wave ◽

Present Stage ◽

Gedanken Experiment ◽

Proposed Model ◽

Conventional Methods ◽

Reconciliation Process

Physics in general is successfully governed by quantum mechanics at the microscale and principles of relativity at the macroscale. Attempts to unify them using conventional methods have remained elusive for nearly a century. In this study, a classical gedanken experiment of electron-wave diffraction through a single slit is intuitively examined for its quantized states. A unidirectional monopole field as quanta of the electric field is pictorially conceptualized into 4D space-time. Its application towards quantum mechanics and general relativity in accordance with existing knowledge in physics paves an alternative path towards their reconciliation process. This assumes a multiverse at a hierarchy of scales with gravity localized to a body into space. Principles of special relativity are then sustained along inertial frames of extra dimensions within the proposed model. Such descriptions provide an approximate intuitive tool to examine physics in general from alternative perspectives using conventional methods, and this warrants further investigation.

Content Noise Detection Model Using Deep Learning in Web Forums

Sustainability ◽

10.3390/su12125074 ◽

2020 ◽

Vol 12 (12) ◽

pp. 5074

Author(s):

Jiyoung Woo ◽

Jaeseok Yun

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Learning Model ◽

Detection Model ◽

Proposed Model ◽

Web Forum ◽

Web Forums ◽

Conventional Machine ◽

Text Features ◽

Deep Learning Model

Spam posts in web forum discussions cause user inconvenience and lower the value of the web forum as an open source of user opinion. In this regard, as the importance of a web post is evaluated in terms of the number of involved authors, noise distorts the analysis results by adding unnecessary data to the opinion analysis. In this work, an automatic detection model for spam posts in web forums using both conventional machine learning and deep learning is proposed. To automatically differentiate between normal posts and spam, evaluators were asked to recognize spam posts in advance. To construct the machine learning-based model, text features were extracted from posted content using text mining techniques from a linguistic perspective, and supervised learning was performed to distinguish content noise from normal posts. For the deep learning model, raw text including and excluding special characters was utilized. A comparative analysis of two recurrent neural network (RNN) models, the simple RNN and the long short-term memory (LSTM) network, was also performed. Furthermore, the proposed model was applied to two web forums. The experimental results indicate that the deep learning model affords significant improvements in accuracy over conventional machine learning based on text features. The accuracy of the proposed model using LSTM reaches 98.56%, and the precision and recall of the noise class reach 99% and 99.53%, respectively.
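
The reported figures are accuracy over all posts plus precision and recall for the noise class. A minimal sketch of how these three metrics are computed from labels is given below; the label strings and the toy data are assumptions and do not reproduce the reported results.

```python
def noise_class_metrics(y_true, y_pred, positive="spam"):
    """Accuracy over all posts, plus precision and recall for the noise class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # spam caught
    fp = sum(t != positive and p == positive for t, p in pairs)  # normal flagged as spam
    fn = sum(t == positive and p != positive for t, p in pairs)  # spam missed
    correct = sum(t == p for t, p in pairs)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical evaluator labels and model predictions for five posts.
labels = ["spam", "spam", "normal", "normal", "spam"]
predictions = ["spam", "normal", "normal", "spam", "spam"]
metrics = noise_class_metrics(labels, predictions)
```

Reporting precision and recall for the noise class alongside accuracy matters when spam is rare, since a model that labels everything as normal can still score high accuracy.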

Webshell Detection Based on Executable Data Characteristics of PHP Code

Wireless Communications and Mobile Computing ◽

10.1155/2021/5533963 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Zulie Pan ◽

Yuanchao Chen ◽

Yu Chen ◽

Yi Shen ◽

Xuanzhen Guo

Keyword(s):

State Of The Art ◽

Recall Rate ◽

Remote Access ◽

Detection Accuracy ◽

Controlled Experiments ◽

Data Set ◽

Detection Model ◽

Proposed Model ◽

Text Features ◽

And Control

A webshell is a malicious backdoor that allows remote access to and control of a web server by executing arbitrary commands. The wide use of obfuscation and encryption technologies has greatly increased the difficulty of webshell detection. To this end, we propose a novel webshell detection model leveraging the grammatical features extracted from the PHP code. The key idea is to combine the executable data characteristics of the PHP code with static text features for webshell classification. To verify the proposed model, we construct a cleaned data set of webshells consisting of 2,917 samples from 17 webshell collection projects and conduct extensive experiments. We designed three sets of controlled experiments. The results show that the accuracy of all three algorithms exceeds 99.40%, with the highest reaching 99.66%; the recall rate increases by at least 1.8% and by up to 6.75%; and the F1 value increases by 2.02% on average. This not only confirms the efficiency of the grammatical features in webshell detection but also shows that our system significantly outperforms several state-of-the-art rivals in terms of detection accuracy and recall rate.

Blog Recommendation and Management Implications in an Emergency Context: An Information Entropy Perspective

Asia Pacific Journal of Operational Research ◽

10.1142/s0217595917400073 ◽

2017 ◽

Vol 34 (01) ◽

pp. 1740007 ◽

Cited By ~ 3

Author(s):

Siqing Shan ◽

Jihong Shi ◽

Qi Yan

Keyword(s):

Decision Making ◽

Information Entropy ◽

Rapid Development ◽

Mobile Internet ◽

User Generated Content ◽

Management Implications ◽

Modeling Methodology ◽

Proposed Model ◽

Content Similarity ◽

Primary Contribution

A modeling methodology for blog recommendation and forecasting based on information entropy is presented. With the increasing popularity of smartphones and the rapid development of the mobile Internet, the amount of user-generated content such as blogs is increasing daily. Valuable information, such as bloggers’ opinions, feelings, and attitudes, is often part of this content. Particularly in the context of an emergency, this information should also be used to facilitate decision making. The current blog recommendation model examines primarily users’ interests or content similarity, whereas in this paper, the value of the blog is considered. The primary contribution of this paper is the proposal of an information-entropy-based blog recommendation model for finding valuable blogs to facilitate decision-making in an emergency context. A series of indicators for evaluating a blog in an emergency context are proposed. Using the method of information entropy, a blog recommendation model is developed. The model can also be used to forecast the value of emergency blogs in the future. The model has been tested and validated using crawled data from the Sina Blog, and the results have demonstrated that the proposed model can effectively determine the value of emergency-related blogs.
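
The abstract does not give the formula used to turn information entropy into a blog score. One common approach is the entropy weight method, sketched below as an assumption: indicators whose values vary more across blogs carry more information and receive larger weights. The indicator names and values are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method over an items x indicators matrix.

    Assumes non-negative values and at least two rows.
    """
    n, m = len(matrix), len(matrix[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        # Share of each item in this indicator; uniform if the column is all zero.
        probs = [x / total for x in column] if total else [1.0 / n] * n
        # Normalised Shannon entropy in [0, 1]; terms with p == 0 contribute 0.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    spread = sum(divergences)
    if spread <= 1e-12:  # every indicator is uniform, so fall back to equal weights
        return [1.0 / m] * m
    return [d / spread for d in divergences]


# Hypothetical indicators per blog: [follower count in thousands, repost count].
blogs = [[10, 5], [10, 1], [10, 0]]
weights = entropy_weights(blogs)
```

In this toy data every blog has the same follower count, so that indicator cannot separate the blogs and its weight is effectively zero, while the repost count receives nearly all of the weight.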

Unconventional reconciliation path for quantum mechanics and general relativity

10.21203/rs.3.rs-63303/v7 ◽

2021 ◽

Author(s):

Samuel Yuguru

Keyword(s):

General Relativity ◽

Quantum Mechanics ◽

Wave Diffraction ◽

Extra Dimensions ◽

Electron Wave ◽

Present Stage ◽

Gedanken Experiment ◽

Proposed Model ◽

Conventional Methods ◽

Reconciliation Process

Idea plagiarism detection with recurrent neural networks and vector space model

International Journal of Intelligent Computing and Cybernetics ◽

10.1108/ijicc-11-2020-0178 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Azra Nazir ◽

Roohie Naaz Mir ◽

Shaima Qureshi

Keyword(s):

Similarity Index ◽

Similarity Score ◽

Academic Community ◽

Plagiarism Detection ◽

Natural Languages ◽

Content Type ◽

Proposed Model ◽

Text Features ◽

Single Idea

Purpose: Natural languages have a fundamental quality of suppleness that makes it possible to present a single idea in many different ways. This feature is often exploited in the academic world, leading to the theft of work referred to as plagiarism. Many approaches have been put forward to detect such cases based on various text features and grammatical structures of languages. However, there is considerable scope for improvement in detecting intelligent plagiarism. Design/methodology/approach: To realize this, the paper introduces a hybrid model to detect intelligent plagiarism by breaking the entire process into three stages: (1) clustering; (2) vector formulation in each cluster based on semantic roles, normalization and similarity index calculation; and (3) summary generation using an encoder-decoder. An effective weighting scheme has been introduced to select terms used to build vectors based on K-means, which is calculated on the synonym set for the said term. The next semantic argument is analyzed only if the value calculated in the last stage lies above a predefined threshold. When the similarity score for two documents exceeds the threshold, a short summary for plagiarized documents is created. Findings: Experimental results show that this method is able to detect connotation and concealment used in idea plagiarism besides detecting literal plagiarism. Originality/value: The proposed model can help academics stay updated by providing summaries of relevant articles. It would eliminate the practice of plagiarism infesting the academic community at an unprecedented pace. The model will also accelerate the process of reviewing academic documents, aiding in the speedy publishing of research articles.

Cross-Modality Paired-Images Generation for RGB-Infrared Person Re-Identification

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6894 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12144-12151

Author(s):

Guan-An Wang ◽

Tianzhu Zhang ◽

Yang Yang ◽

Jian Cheng ◽

Jianlong Chang ◽

...

Keyword(s):

State Of The Art ◽

Experimental Results ◽

Fine Grained ◽

Invariant Features ◽

Proposed Model ◽

Art Methods ◽

Conventional Methods

RGB-Infrared (IR) person re-identification is very challenging due to the large cross-modality variations between RGB and IR images. The key solution is to learn aligned features to bridge the RGB and IR modalities. However, due to the lack of correspondence labels between every pair of RGB and IR images, most methods try to alleviate the variations with set-level alignment by reducing the distance between the entire RGB and IR sets. This set-level alignment may lead to misalignment of some instances, which limits the performance of RGB-IR Re-ID. Different from existing methods, in this paper, we propose to generate cross-modality paired-images and perform both global set-level and fine-grained instance-level alignments. Our proposed method enjoys several merits. First, our method can perform set-level alignment by disentangling modality-specific and modality-invariant features. Compared with conventional methods, ours can explicitly remove the modality-specific features, so the modality variation can be better reduced. Second, given cross-modality unpaired-images of a person, our method can generate cross-modality paired images from exchanged images. With them, we can directly perform instance-level alignment by minimizing the distances of every pair of images. Extensive experimental results on two standard benchmarks demonstrate that the proposed model performs favourably against state-of-the-art methods. In particular, on the SYSU-MM01 dataset, our model achieves gains of 9.2% and 7.7% in terms of Rank-1 and mAP, respectively. Code is available at https://github.com/wangguanan/JSIA-ReID.
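
Instance-level alignment by minimizing the distance of every image pair can be pictured as a loss over paired feature vectors. The sketch below uses the mean Euclidean distance as an assumed distance measure; the abstract does not say which distance is used, and the feature values are made up.

```python
import math


def instance_alignment_loss(rgb_features, ir_features):
    """Mean Euclidean distance between each RGB feature and its paired IR feature."""
    if len(rgb_features) != len(ir_features):
        raise ValueError("features must come in RGB/IR pairs")
    distances = [math.dist(rgb, ir) for rgb, ir in zip(rgb_features, ir_features)]
    return sum(distances) / len(distances)


# Hypothetical 2-D features for two paired images of the same persons.
rgb = [[0.0, 0.0], [1.0, 1.0]]
ir = [[3.0, 4.0], [1.0, 1.0]]
loss = instance_alignment_loss(rgb, ir)
```

The second pair is already aligned and contributes nothing, so the loss is driven entirely by the first pair. A set-level objective would instead compare the two feature sets as wholes and could miss such an individual mismatch.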

Collaborative Graph Learning with Auxiliary Text for Temporal Event Prediction in Healthcare

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/486 ◽

2021 ◽

Author(s):

Chang Lu ◽

Chandan K Reddy ◽

Prithwish Chakraborty ◽

Samantha Kleinberg ◽

Yue Ning

Keyword(s):

Domain Knowledge ◽

Healthcare Providers ◽

Structural Features ◽

Text Data ◽

Graph Learning ◽

Care Plans ◽

Event Prediction ◽

Proposed Model ◽

Health Event ◽

Text Features

Accurate and explainable health event predictions are becoming crucial for healthcare providers to develop care plans for patients. The availability of electronic health records (EHR) has enabled machine learning advances in providing these predictions. However, many deep-learning-based methods are not satisfactory in solving several key challenges: 1) effectively utilizing disease domain knowledge; 2) collaboratively learning representations of patients and diseases; and 3) incorporating unstructured features. To address these issues, we propose a collaborative graph learning model to explore patient-disease interactions and medical domain knowledge. Our solution is able to capture structural features of both patients and diseases. The proposed model also utilizes unstructured text data by employing an attention manipulating strategy and then integrates attentive text features into a sequential learning process. We conduct extensive experiments on two important healthcare problems to show the competitive prediction performance of the proposed method compared with various state-of-the-art models. We also confirm the effectiveness of learned representations and model interpretability by a set of ablation and case studies.
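
The abstract does not detail its attention manipulating strategy. As a generic illustration of how attentive text features can be formed, the sketch below applies dot-product attention pooling: each word vector is scored against a query vector, the scores pass through a softmax, and the word vectors are averaged with those weights. The query, the word vectors, and the function name are assumptions.

```python
import math


def attention_pool(query, word_vectors):
    """Softmax dot-product attention over word vectors; returns (weights, pooled vector)."""
    scores = [sum(q * w for q, w in zip(query, vec)) for vec in word_vectors]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(word_vectors[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, word_vectors)) for d in range(dim)]
    return weights, pooled


# Hypothetical query (e.g. a patient state) and two word embeddings from a clinical note.
query = [1.0, 0.0]
words = [[2.0, 0.0], [0.0, 2.0]]
weights, pooled = attention_pool(query, words)
```

The first word aligns with the query and receives the larger weight, so the pooled vector is dominated by it. The weights themselves can be inspected, which is one way attention supports the kind of interpretability the abstract mentions.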
