experimental results
Recently Published Documents


Total documents: 30104 (five years: 5631)
H-index: 151 (five years: 19)

2022, Vol 24 (3), pp. 1-18
Author(s): Neeru Dubey, Amit Arjun Verma, Simran Setia, S. R. S. Iyengar

The size of Wikipedia grows exponentially every year, which leaves users facing information overload. We propose a remedy to this problem by developing a recommendation system for Wikipedia articles. The proposed technique automatically generates a personalized synopsis of the article that a user aims to read next. We develop a tool, called PerSummRe, which learns the reading preferences of a user through a vision-based analysis of their past reads. We use an ensemble non-invasive eye gaze tracking technique to analyze the user’s reading pattern. The tool performs user profiling and generates a recommended personalized summary of a yet-unread Wikipedia article for the user. Experimental results showcase the efficiency of the recommendation technique.


Author(s): Haitong Yang, Guangyou Zhou, Tingting He

This article considers the task of text style transfer: transforming a sentence from one style into another while preserving its style-independent content. A dominant approach to text style transfer is to learn a good content factor of text, define a fixed vector for every style, and recombine them to generate text in the required style. In fact, a large number of different words can convey the same style from different aspects. Thus, using a fixed vector to represent one style is very inefficient: it weakens the representation power of the style vector and limits the diversity of text in the same style. To address this problem, we propose a novel neural generative model called Adversarial Separation Network (ASN), which learns the content and style vectors jointly, so that the learnt vectors have strong representation power and good interpretability. In our method, adversarial learning is implemented to enhance our model’s capability of disentangling the two factors. To evaluate our method, we conduct experiments on two benchmark datasets. Experimental results show that our method performs style transfer better than strong comparison systems. We also demonstrate the strong interpretability of the learnt latent vectors.


Author(s): Guirong Bai, Shizhu He, Kang Liu, Jun Zhao

Active learning is an effective method to substantially alleviate the problem of expensive annotation cost for data-driven models. Recently, pre-trained language models have been demonstrated to be powerful for learning language representations. In this article, we demonstrate that a pre-trained language model can also use its learned textual characteristics to enrich the criteria of active learning. Specifically, we use the pre-trained language model to provide extra textual criteria for measuring instances, including noise, coverage, and diversity. With these extra textual criteria, we can select more efficient instances for annotation and obtain better results. We conduct experiments on both English and Chinese sentence matching datasets. The experimental results show that the proposed active learning approach is enhanced by the pre-trained language model and obtains better performance.


2022, Vol 40 (2), pp. 1-38
Author(s): Shangsong Liang, Yupeng Luo, Zaiqiao Meng

In this article, we study the task of user profiling in question answering communities (QACs). Previous user profiling algorithms suffer from a number of defects: they regard users and words as atomic units, leading to a mismatch between them; they are designed for other applications rather than for QACs; and some semantic profiling algorithms do not co-embed users and words, which makes it difficult to measure the affinity between them. To improve the profiling performance, we propose a neural Flow-based Constrained Co-embedding Model, abbreviated as FCCM. FCCM jointly co-embeds the vector representations of both users and words in QACs such that the affinities between them can be semantically measured. Specifically, FCCM extends the standard variational auto-encoder model to enforce a voting constraint on the inferred embeddings of users and words: given a question and the users who answer it in the community, the representations of users whose answers receive more votes are closer to the representations of the words associated with these answers than the representations of users whose answers receive fewer votes. In addition, FCCM integrates normalizing flow into the variational auto-encoder framework to avoid the assumption that the distributions of the embeddings are Gaussian, so that the inferred embeddings fit the real distributions of the data better. Experimental results on a Chinese Zhihu question answering dataset demonstrate the effectiveness of our proposed FCCM model for the task of user profiling in QACs.
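
To make the voting constraint concrete, here is a minimal sketch of one way such an ordering could be scored. The abstract does not say how FCCM enforces the constraint, so the hinge penalty, the Euclidean distance, the `margin` value, the tuple layout, and the function names are all assumptions made for illustration.

```python
import math
from itertools import combinations


def distance(u, w):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, w)))


def voting_penalty(answers, margin=0.1):
    """Sum of hinge penalties over pairs of answers to one question.

    answers: list of (votes, user_vec, words_vec) tuples (hypothetical layout).
    A pair is penalised when the higher-voted answer's user embedding is not
    closer to its answer words, by at least `margin`, than the lower-voted one.
    """
    total = 0.0
    for a, b in combinations(answers, 2):
        if a[0] == b[0]:
            continue  # equal vote counts impose no ordering
        hi, lo = (a, b) if a[0] > b[0] else (b, a)
        d_hi = distance(hi[1], hi[2])
        d_lo = distance(lo[1], lo[2])
        total += max(0.0, margin + d_hi - d_lo)
    return total


# Higher-voted answer is already the closer one: no penalty.
consistent = [(10, [0.0, 0.0], [0.0, 0.1]), (2, [0.0, 0.0], [1.0, 1.0])]
# Higher-voted answer is the farther one: positive penalty.
violating = [(10, [0.0, 0.0], [1.0, 1.0]), (2, [0.0, 0.0], [0.0, 0.1])]
```

In a trained model a term like this would be added to the auto-encoder objective; here it only shows which orderings the constraint favours.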


Author(s): Ahmad Alzu'bi, Maysarah Barham

Breast cancer is one of the most common diseases diagnosed in women worldwide. The balanced iterative reducing and clustering using hierarchies (BIRCH) algorithm has been widely used in many applications. However, clustering patient records and selecting an optimal threshold for the hierarchical clusters is still a challenging task. In addition, the existing BIRCH is sensitive to the order of data records and is influenced by many numerical and functional parameters. Therefore, this paper proposes a unique BIRCH-based algorithm for breast cancer clustering. We aim to transform the medical records, using the breast screening features, into sub-clusters that group the subject cases into malignant or benign clusters. The basic BIRCH clustering is first fed a set of normalized features; we then automate the threshold initialization to enhance the tree-based sub-clustering procedure. Additionally, we present a thorough analysis of the performance impact of tuning BIRCH with various relevant linkage functions and similarity measures. Two datasets from the standard Breast Cancer Wisconsin (BCW) benchmarking collection are used to evaluate our algorithm. The experimental results show a clustering accuracy of 97.7% in only 0.0004 seconds, thereby confirming the efficiency of the proposed method in clustering patient records and making timely decisions.
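
The role of the threshold in BIRCH sub-clustering can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it keeps a flat list of clustering features in place of the CF-tree, uses a fixed threshold instead of the automated initialization the abstract describes, and the names `ClusteringFeature` and `birch_leaf_pass` are hypothetical.

```python
import math


class ClusteringFeature:
    """BIRCH clustering feature: count N, linear sum LS, sum of squared norms SS."""

    def __init__(self, dim):
        self.n = 0
        self.ls = [0.0] * dim
        self.ss = 0.0

    def add(self, point):
        """Absorb a point by updating the three summary statistics."""
        self.n += 1
        self.ls = [s + x for s, x in zip(self.ls, point)]
        self.ss += sum(x * x for x in point)

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius_with(self, point):
        """Radius the sub-cluster would have if `point` were absorbed."""
        n = self.n + 1
        ls = [s + x for s, x in zip(self.ls, point)]
        ss = self.ss + sum(x * x for x in point)
        sq = ss / n - sum((s / n) ** 2 for s in ls)
        return math.sqrt(max(sq, 0.0))  # guard against tiny negative round-off


def birch_leaf_pass(points, threshold):
    """One pass over the data: absorb each point into the nearest sub-cluster
    if the resulting radius stays within `threshold`, else open a new one."""
    features = []
    for p in points:
        best = None
        if features:
            best = min(features, key=lambda cf: math.dist(cf.centroid(), p))
        if best is not None and best.radius_with(p) <= threshold:
            best.add(p)
        else:
            cf = ClusteringFeature(len(p))
            cf.add(p)
            features.append(cf)
    return features


records = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
subclusters = birch_leaf_pass(records, threshold=0.5)
```

Because each point is assigned on arrival, the result depends on the order of `records` and on `threshold`, which are the two sensitivities the abstract points out.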


Author(s): Ahmad AL Smadi, Atif Mehmood, Ahed Abugabah, Eiad Almekhlafi, Ahmad Mohammad Al-smadi

In computer vision, image classification is one of the potential image processing tasks. Nowadays, fish classification is a widely considered issue within the areas of machine learning and image segmentation. Moreover, it has been extended to a variety of domains, such as marketing strategies. This paper presents an effective fish classification method based on convolutional neural networks (CNNs). The experiments were conducted on a new dataset of Bangladesh’s indigenous fish species with three kinds of splitting: 80-20%, 75-25%, and 70-30%. We provide a comprehensive comparison of several popular optimizers for CNNs. In total, we perform a comparative analysis of five state-of-the-art gradient descent-based optimizers: adaptive delta (AdaDelta), stochastic gradient descent (SGD), adaptive moment estimation (Adam), Adamax (the infinity-norm variant of Adam), and root mean square propagation (RMSprop). Overall, the experimental results show that RMSprop, Adam, and Adamax performed well compared to the other optimization techniques, while AdaDelta and SGD performed the worst. Furthermore, the Adam optimizer attained the best performance measures in the 70-30% and 80-20% splitting experiments, while the RMSprop optimizer attained the best performance measures in the 75-25% splitting experiments. Finally, the proposed model is compared with state-of-the-art deep CNN models; it attained the best accuracy among them, 98.46%, enhancing the classification ability of the CNN.
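
For readers unfamiliar with how these optimizers differ, the following sketch applies the standard SGD, RMSprop, and Adam update rules to a toy one-dimensional objective. It is only an illustration of the update equations: the objective f(x) = x², the hyperparameter values, and the function names are assumptions, and the example does not reproduce the paper's CNN experiments or its ranking of optimizers.

```python
import math


def grad(x):
    """Gradient of the toy objective f(x) = x**2."""
    return 2.0 * x


def run_sgd(x, lr=0.1, steps=500):
    """Plain gradient descent: step against the raw gradient."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def run_rmsprop(x, lr=0.05, rho=0.9, eps=1e-8, steps=500):
    """RMSprop: scale the gradient by a running RMS of past gradients."""
    v = 0.0
    for _ in range(steps):
        g = grad(x)
        v = rho * v + (1.0 - rho) * g * g
        x -= lr * g / (math.sqrt(v) + eps)
    return x


def run_adam(x, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Adam: bias-corrected running mean (m) and second moment (v) of gradients."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Final position of each optimizer after starting from x = 5.0.
final = {
    "sgd": run_sgd(5.0),
    "rmsprop": run_rmsprop(5.0),
    "adam": run_adam(5.0),
}
```

All three end up near the minimum at x = 0 on this convex toy problem; differences between optimizers only become meaningful on non-convex, high-dimensional losses such as those of a CNN.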


2022, Vol 34 (4), pp. 1-17
Author(s): Yunhong Xu, Guangyu Wu, Yu Chen

Online medical communities have revolutionized the way patients obtain medical information and services. Investigating which factors influence patients’ satisfaction with doctors, and predicting that satisfaction, can help patients narrow down their choices and increase their loyalty towards online medical communities. Considering the imbalanced nature of the dataset collected from Good Doctor, we integrated the XGBoost and SMOTE algorithms to examine which factors matter and how these factors can be used to predict patient satisfaction. The SMOTE algorithm addresses the imbalance by oversampling the minority class of an imbalanced classification dataset. The XGBoost algorithm is an ensemble of decision trees in which new trees fix the errors of existing trees. The experimental results demonstrate that combining SMOTE and XGBoost achieves better performance. We further analyzed the role that features play in satisfaction prediction at two levels: the individual feature level and the feature combination level.
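
The oversampling step can be illustrated with a simplified sketch of the SMOTE idea: a synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours. This version picks base samples at random instead of iterating over every minority sample, and the function name, parameters, and toy data are assumptions; it is not the implementation used in the paper.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic samples by interpolating between minority
    samples and their k nearest minority neighbours (needs >= 2 samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation position in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority_class, n_new=6)
```

The synthetic samples would then be appended to the training set before fitting the classifier, so that both classes are represented more evenly.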


Author(s): Hu Zhang, Bangze Pan, Ru Li

Legal judgment elements extraction (LJEE) aims to automatically identify the different judgment features from the fact description in legal documents, which helps to improve the accuracy and interpretability of the judgment results. In real court rulings, judges usually need to scan both the fact descriptions and the law articles repeatedly to find the relevant information, and it is hard to acquire the key judgment features quickly, so legal judgment elements extraction is a crucial and challenging task for legal judgment prediction. However, most existing methods follow the text classification framework, which fails to model the attentive relations between the law articles and the legal judgment elements. To address this issue, we simulate the working process of human judges and propose a legal judgment elements extraction method with a law article-aware mechanism, which captures the complex semantic correlations between the law articles and the legal judgment elements. Experimental results show that our proposed method achieves significant improvements over other state-of-the-art baselines on the element recognition task dataset. Compared with the BERT-CNN model, the proposed “All labels Law Articles Embedding Model (ALEM)” improves the accuracy, recall, and F1 value by 0.5, 1.4, and 1.0, respectively.


2022, Vol 22 (1), pp. 1-28
Author(s): Sajib Mistry, Lie Qu, Athman Bouguettaya

We propose a novel generic reputation bootstrapping framework for composite services. Multiple reputation-related indicators are considered in a layer-based framework to implicitly reflect the reputation of the component services. The importance of an indicator for the future performance of a component service is learned using a modified Random Forest algorithm. We propose a topology-aware Forest Deep Neural Network (fDNN) to find the correlations between the reputation of a composite service and the reputation indicators of its component services. The trained fDNN model predicts the reputation of a new composite service together with a confidence value. Experimental results on a real-world dataset demonstrate the efficiency of the proposed approach.

