An Efficient Tongue Segmentation Model Based on U-Net Framework

Author(s):  
Qunsheng Ruan ◽  
Qingfeng Wu ◽  
Junfeng Yao ◽  
Yingdong Wang ◽  
Hsien-Wei Tseng ◽  
...  

In the intelligent processing of tongue images, one of the most important tasks is to accurately segment the tongue body from a whole tongue image, and high-quality processing of the tongue body edge is of great significance for the subsequent tongue feature extraction. To improve the performance of tongue image segmentation, we propose an efficient tongue segmentation model based on U-Net. We make three contributions: optimizing the model’s main network, designing a new network dedicated to tongue edge segmentation, and proposing a weighted binary cross-entropy loss function. The main segmentation network is optimized so that the model distinguishes the foreground and background features of the tongue image as well as possible. A novel tongue edge segmentation network focuses on the tongue edge, because the edge contains much important information. Furthermore, the proposed loss function is adopted to strengthen pixel-level supervision on tongue images. Moreover, owing to the lack of tongue image resources in Traditional Chinese Medicine (TCM), special measures are adopted to augment the training samples. Comparative experiments on two datasets were conducted to verify the performance of the segmentation model. The experimental results indicate that the loss of our model converges faster than that of the other models, and that our model segments tongue images captured in poor environments with better stability and robustness. The results also indicate that our model outperforms state-of-the-art models on the two most important tongue image segmentation metrics, IoU and Dice. Moreover, experimental results on the augmented samples demonstrate that our model performs better.
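The two metrics and the loss named in this abstract can be illustrated with a minimal sketch. It assumes flat binary masks and a single foreground weight `pos_weight`; the abstract does not give the authors' weighting scheme, so that parameter and its default value are hypothetical.

```python
import math

def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two flat binary masks of equal length."""
    inter = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    pred_sum = sum(pred)
    target_sum = sum(target)
    union = pred_sum + target_sum - inter
    # Two empty masks are treated as a perfect match.
    iou = inter / union if union else 1.0
    dice = 2 * inter / (pred_sum + target_sum) if (pred_sum + target_sum) else 1.0
    return iou, dice

def weighted_bce(probs, target, pos_weight=2.0, eps=1e-7):
    """Mean binary cross-entropy; foreground terms are scaled by pos_weight.

    pos_weight is a hypothetical illustration value, not taken from the paper.
    """
    total = 0.0
    for p, t in zip(probs, target):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(target)

pred = [1, 1, 0, 0]
target = [1, 0, 1, 0]
iou, dice = iou_and_dice(pred, target)
print(iou, dice)
print(weighted_bce([0.5, 0.5], [1, 0]))
```

Dice is never smaller than IoU on the same pair of masks, which is why both are usually reported together.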

2019 ◽  
Vol 9 (13) ◽  
pp. 2684 ◽  
Author(s):  
Hongyang Li ◽  
Lizhuang Liu ◽  
Zhenqi Han ◽  
Dan Zhao

Peeling fibre is an indispensable process in the production of preserved Szechuan pickle, and its accuracy can significantly influence the quality of the products. We therefore study a contour-based fibre detection method as the core algorithm of an automatic peeling device. The fibre contour is a kind of non-salient contour, characterized by large intra-class differences and small inter-class differences, meaning that the features of the contour are not discriminative. A method called dilated-holistically-nested edge detection (Dilated-HED), built on the HED network and dilated convolution, is proposed to detect the fibre contour. The experimental results for our dataset show that the Pixel Accuracy (PA) is 99.52% and the Mean Intersection over Union (MIoU) is 49.99%, achieving state-of-the-art performance.
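Dilated convolution, one of the two building blocks named above, can be sketched in one dimension. This is a generic illustration of the operation and not the Dilated-HED architecture; the function name and test signal are hypothetical.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D cross-correlation with `dilation` steps between kernel taps."""
    # Receptive field covered by the kernel once gaps are inserted.
    span = (len(kernel) - 1) * dilation + 1
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(k * signal[start + i * dilation]
                       for i, k in enumerate(kernel)))
    return out

signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]
print(dilated_conv1d(signal, kernel, dilation=1))  # receptive field of 3 samples
print(dilated_conv1d(signal, kernel, dilation=2))  # receptive field of 5 samples
```

With the same three kernel weights, a dilation of 2 widens the receptive field from 3 to 5 samples without adding parameters.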


2015 ◽  
Vol 2015 ◽  
pp. 1-11 ◽  
Author(s):  
Tao Xiang ◽  
Tao Li ◽  
Mao Ye ◽  
Zijian Liu

Pedestrian detection with large intraclass variations is still a challenging task in computer vision. In this paper, we propose a novel pedestrian detection method based on Random Forest. First, we generate a few local templates with different sizes and locations in the positive exemplars. Then a Random Forest is built whose splitting functions are optimized by maximizing the class purity obtained when matching the local templates to the training samples. To improve the classification accuracy, we adopt a boosting-like algorithm to update the weights of the training samples in a layer-wise fashion. During detection, the trained Random Forest votes on the category of each input sliding window. Our contributions are the splitting functions based on local template matching with adaptive size and location, and the iterative weight-updating method. We evaluate the proposed method on two well-known challenging datasets: TUD pedestrians and INRIA pedestrians. The experimental results demonstrate that our method achieves state-of-the-art or competitive performance.
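The abstract does not spell out its boosting-like weight update, so the sketch below substitutes the standard AdaBoost rule purely as an illustration of what reweighting training samples means. The function name and the example inputs are hypothetical.

```python
import math

def reweight(weights, correct):
    """AdaBoost-style update: raise weights of misclassified samples, then normalize.

    This is the textbook AdaBoost rule, used here as a stand-in; it is not
    claimed to be the update used in the paper.
    """
    err = sum(w for w, c in zip(weights, correct) if not c) / sum(weights)
    err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
    alpha = 0.5 * math.log((1 - err) / err)
    updated = [w * math.exp(-alpha if c else alpha)
               for w, c in zip(weights, correct)]
    total = sum(updated)
    return [w / total for w in updated], alpha

weights = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]  # the last sample was misclassified
new_weights, alpha = reweight(weights, correct)
print(new_weights, alpha)
```

After one update the single misclassified sample carries half of the total weight, so the next layer concentrates on it.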


2020 ◽  
Vol 34 (05) ◽  
pp. 9749-9756
Author(s):  
Junnan Zhu ◽  
Yu Zhou ◽  
Jiajun Zhang ◽  
Haoran Li ◽  
Chengqing Zong ◽  
...  

Multimodal summarization with multimodal output (MSMO) aims to generate a multimodal summary for a multimodal news report, which has been proven to effectively improve users' satisfaction. Existing MSMO methods are trained with a text-modality target only, leading to a modality-bias problem in which the quality of the model-selected image is ignored during training. To alleviate this problem, we propose a multimodal objective function, guided by a multimodal reference, that combines the losses from summary generation and image selection. Due to the lack of multimodal reference data, we present two strategies, i.e., ROUGE-ranking and Order-ranking, to construct the multimodal reference by extending the text reference. Meanwhile, to better evaluate multimodal outputs, we propose a novel evaluation metric based on joint multimodal representation, projecting the model output and multimodal reference into a joint semantic space during evaluation. Experimental results show that our proposed model achieves a new state of the art on both automatic and manual evaluation metrics. Besides, our proposed evaluation method can effectively improve the correlation with human judgments.


Author(s):  
Ziming Li ◽  
Julia Kiseleva ◽  
Maarten De Rijke

The performance of adversarial dialogue generation models relies on the quality of the reward signal produced by the discriminator. The reward signal from a poor discriminator can be very sparse and unstable, which may lead the generator to fall into a local optimum or to produce nonsense replies. To alleviate the first problem, we first extend a recently proposed adversarial dialogue generation method to an adversarial imitation learning solution. Then, in the framework of adversarial inverse reinforcement learning, we propose a new reward model for dialogue generation that can provide a more accurate and precise reward signal for generator training. We evaluate the performance of the resulting model with automatic metrics and human evaluations in two annotation settings. Our experimental results demonstrate that our model can generate more high-quality responses and achieve higher overall performance than the state-of-the-art.


2018 ◽  
Vol 2018 ◽  
pp. 1-16 ◽  
Author(s):  
Lei He ◽  
Yan Xing ◽  
Kangxiong Xia ◽  
Jieqing Tan

To address the drawback of most image inpainting algorithms, which fail to restore prominent texture, this paper proposes an adaptive inpainting algorithm based on continued fractions. To restore each damaged point, the known pixels around it are used to interpolate its intensity. The proposed method has two steps: first, Thiele’s rational interpolation, combined with the mask image, adaptively interpolates the intensities of the damaged points to produce an initial repaired image; then Newton-Thiele’s rational interpolation refines the initial repaired image to produce the final result. To demonstrate the superiority of the proposed algorithm, extensive experiments were conducted on damaged images. The repaired images were evaluated both subjectively and objectively, with the objective evaluation comparing Peak Signal-to-Noise Ratios (PSNRs). The experimental results showed that the proposed algorithm achieved a better visual effect and a higher PSNR than state-of-the-art methods.
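The two ingredients named above, Thiele's continued-fraction interpolation and PSNR, can be sketched in a univariate setting. This is a simplified illustration: the paper works on image neighbourhoods with a mask and a bivariate Newton-Thiele refinement, neither of which is reproduced here, and the function names are hypothetical.

```python
import math

def thiele_coefficients(xs, ys):
    """Inverse-difference coefficients b_0..b_n of Thiele's continued fraction.

    Assumes no inverse difference has a zero denominator (e.g. no two
    equal neighbouring values); a robust version would need to handle that.
    """
    phi = list(ys)  # phi[i] holds the current-order inverse difference at xs[i]
    coeffs = [phi[0]]
    for k in range(1, len(xs)):
        for i in range(len(xs) - 1, k - 1, -1):
            phi[i] = (xs[i] - xs[k - 1]) / (phi[i] - phi[k - 1])
        coeffs.append(phi[k])
    return coeffs

def thiele_eval(xs, coeffs, x):
    """Evaluate b_0 + (x - x_0) / (b_1 + (x - x_1) / (b_2 + ...)) bottom-up."""
    value = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        value = coeffs[k] + (x - xs[k]) / value
    return value

def psnr(reference, restored, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two flat intensity lists."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)

# Samples of f(x) = 1 / (1 + x), which a rational interpolant recovers exactly.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 0.5, 1.0 / 3.0]
coeffs = thiele_coefficients(xs, ys)
print(thiele_eval(xs, coeffs, 3.0))
print(psnr([10, 20, 30], [10, 22, 29]))
```

Because the interpolant is a rational function rather than a polynomial, it can follow sharp intensity changes that polynomial interpolation tends to smooth out.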


2019 ◽  
Vol 9 (24) ◽  
pp. 5427 ◽  
Author(s):  
Beomjun Kim ◽  
Sungwon Kang ◽  
Seonah Lee

For software maintenance, bug reports provide useful information to developers because they can be used for various tasks such as debugging and understanding previous changes. However, as they are typically written in the form of conversations among developers, bug reports tend to be unnecessarily long and verbose, with the consequence that developers often have difficulties reading or understanding bug reports. To mitigate this problem, methods that automatically generate a summary of bug reports have been proposed, and various related studies have been conducted. However, existing bug report summarization methods have not fully exploited the inherent characteristics of bug reports. In this paper, we propose a bug report summarization method that uses the weighted-PageRank algorithm and exploits the ‘duplicates’, ‘blocks’, and ‘depends-on’ relationships between bug reports. The experimental results show that our method outperforms the state-of-the-art method in terms of both the quality of the summary and the number of applicable bug reports.
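A weighted PageRank can be sketched by power iteration over a weighted adjacency matrix. The abstract does not say how edge weights are derived from the ‘duplicates’, ‘blocks’, and ‘depends-on’ relationships, so the matrix below, the damping factor, and the iteration count are all hypothetical illustration values.

```python
def weighted_pagerank(weights, damping=0.85, iterations=50):
    """Power iteration where weights[i][j] is the edge weight from node i to node j."""
    n = len(weights)
    rank = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        new_rank = [(1.0 - damping) / n] * n
        for i in range(n):
            if out_sum[i] == 0:
                # Dangling node: spread its rank evenly over all nodes.
                for j in range(n):
                    new_rank[j] += damping * rank[i] / n
            else:
                # Distribute rank in proportion to outgoing edge weight.
                for j in range(n):
                    new_rank[j] += damping * rank[i] * weights[i][j] / out_sum[i]
        rank = new_rank
    return rank

# Hypothetical graph: nodes 0 and 1 both link to node 2 with a heavier weight.
graph = [
    [0, 1, 3],
    [1, 0, 3],
    [1, 1, 0],
]
ranks = weighted_pagerank(graph)
print(ranks)
```

In a summarizer, nodes would be sentences or reports and the highest-ranked nodes would be selected for the summary; that selection step is not shown here.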


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0255685
Author(s):  
Guangchao Yuan ◽  
Munindar P. Singh ◽  
Pradeep K. Murukannaiah

Geographical characteristics have been proven to be effective in improving the quality of point-of-interest (POI) recommendation. However, existing works on POI recommendation focus on cost (time or money) of travel for a user. An important geographical aspect that has not been studied adequately is the neighborhood effect, which captures a user’s POI visiting behavior based on the user’s preference not only to a POI, but also to the POI’s neighborhood. To provide an interpretable framework to fully study the neighborhood effect, first, we develop different sets of insightful features, representing different aspects of neighborhood effect. We employ a Yelp data set to evaluate how different aspects of the neighborhood effect affect a user’s POI visiting behavior. Second, we propose a deep learning–based recommendation framework that exploits the neighborhood effect. Experimental results show that our approach is more effective than two state-of-the-art matrix factorization–based POI recommendation techniques.


Author(s):  
Ke Wang ◽  
Xiaojun Wan

Generating texts of different sentiment labels is getting more and more attention in the area of natural language generation. Recently, Generative Adversarial Net (GAN) has shown promising results in text generation. However, the texts generated by GAN usually suffer from the problems of poor quality, lack of diversity and mode collapse. In this paper, we propose a novel framework - SentiGAN, which has multiple generators and one multi-class discriminator, to address the above problems. In our framework, multiple generators are trained simultaneously, aiming at generating texts of different sentiment labels without supervision. We propose a penalty based objective in the generators to force each of them to generate diversified examples of a specific sentiment label. Moreover, the use of multiple generators and one multi-class discriminator can make each generator focus on generating its own examples of a specific sentiment label accurately. Experimental results on four datasets demonstrate that our model consistently outperforms several state-of-the-art text generation methods in the sentiment accuracy and quality of generated texts.


Information ◽  
2022 ◽  
Vol 13 (1) ◽  
pp. 32
Author(s):  
Gang Sun ◽  
Hancheng Yu ◽  
Xiangtao Jiang ◽  
Mingkui Feng

Edge detection is one of the fundamental computer vision tasks. Recent methods for edge detection based on a convolutional neural network (CNN) typically employ the weighted cross-entropy loss. Their predicted edges are thick and need post-processing before the optimal dataset scale (ODS) F-measure can be calculated for evaluation. To achieve end-to-end training, we propose a non-maximum suppression (NMS) layer to obtain sharp boundaries without the need for post-processing. Because the ODS F-measure can be calculated directly from these sharp boundaries, we propose an ODS F-measure loss function to train the network. Besides, we propose an adaptive multi-level feature pyramid network (AFPN) to better fuse different levels of features. Furthermore, to enrich the multi-scale features learned by AFPN, we introduce a pyramid context module (PCM) that includes dilated convolution to extract multi-scale features. Experimental results indicate that the proposed AFPN achieves state-of-the-art performance on the BSDS500 dataset (ODS F-score of 0.837) and the NYUDv2 dataset (ODS F-score of 0.780).
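The idea behind the ODS F-measure, one threshold chosen for the whole dataset, can be sketched as follows. This is a deliberate simplification: it compares pixels exactly, whereas standard edge benchmarks match boundaries within a distance tolerance. The function name and the toy data are hypothetical.

```python
def ods_f_measure(prob_maps, gt_maps, thresholds):
    """Return (best F-measure, threshold) using one threshold for all images.

    prob_maps and gt_maps are lists of flat per-image lists; counts are
    pooled across images before precision and recall are computed.
    """
    best_f, best_t = 0.0, None
    for t in thresholds:
        tp = fp = fn = 0
        for probs, gt in zip(prob_maps, gt_maps):
            for p, g in zip(probs, gt):
                predicted = p >= t
                if predicted and g:
                    tp += 1
                elif predicted and not g:
                    fp += 1
                elif g:
                    fn += 1
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f = (2 * precision * recall / (precision + recall)
             if precision + recall else 0.0)
        if f > best_f:
            best_f, best_t = f, t
    return best_f, best_t

prob_maps = [[0.9, 0.2, 0.6], [0.8, 0.4, 0.1]]
gt_maps = [[1, 0, 1], [1, 0, 0]]
print(ods_f_measure(prob_maps, gt_maps, [0.3, 0.5, 0.7]))
```

Pooling counts before computing the score is what distinguishes a dataset-level threshold from picking the best threshold per image.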


Author(s):  
Marlene Goncalves ◽  
María Esther Vidal

Criteria that induce a Skyline naturally represent a user’s preference conditions, which are useful for discarding irrelevant data in large datasets. However, in the presence of high-dimensional Skyline spaces, the size of the Skyline can still be very large. To identify the best k points among the Skyline, the Top-k Skyline approach has been proposed. This chapter describes existing solutions and proposes to use the TKSI algorithm for the Top-k Skyline problem. TKSI reduces the search space by computing only the subset of the Skyline that is required to produce the top-k objects. In addition, the Skyline Frequency Metric is implemented to identify, among the Skyline objects, those that best meet the multidimensional criteria. The authors have empirically studied the quality of TKSI, and their experimental results show that TKSI may be able to speed up the computation of the Top-k Skyline by at least 50% compared with state-of-the-art solutions.
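The Top-k Skyline problem can be illustrated with a naive baseline. This is not the TKSI algorithm, which avoids computing the full Skyline; the sketch computes everything exhaustively and assumes that larger values are better and that skyline frequency counts the subspaces in which a point is undominated. All function names are hypothetical.

```python
from itertools import combinations

def dominates(a, b, dims):
    """True if a is at least as good as b on every dim and strictly better on one."""
    return (all(a[d] >= b[d] for d in dims)
            and any(a[d] > b[d] for d in dims))

def skyline(points, dims):
    """Indices of points not dominated by any other point on the given dims."""
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p, dims) for q in points)]

def top_k_by_skyline_frequency(points, k):
    """Rank full-space Skyline points by the number of subspaces whose Skyline contains them."""
    n_dims = len(points[0])
    freq = {i: 0 for i in skyline(points, range(n_dims))}
    for size in range(1, n_dims + 1):
        for dims in combinations(range(n_dims), size):
            for i in skyline(points, dims):
                if i in freq:
                    freq[i] += 1
    # Highest frequency first; ties broken by original index.
    ranked = sorted(freq, key=lambda i: (-freq[i], i))
    return ranked[:k]

points = [(5, 1), (3, 3), (1, 5), (2, 2)]
print(skyline(points, range(2)))
print(top_k_by_skyline_frequency(points, 2))
```

The exhaustive loop over subspaces grows exponentially with the number of dimensions, which is the cost that a search-space-reducing algorithm such as TKSI is meant to avoid.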

