A Feature Fusion Method with Guided Training for Classification Tasks

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Taohong Zhang ◽  
Suli Fan ◽  
Junnan Hu ◽  
Xuxu Guo ◽  
Qianqian Li ◽  
...  

In this paper, a feature fusion method with guided training (FGT-Net) is constructed to fuse image data and numerical data for specific recognition tasks that cannot be classified accurately from images alone. The proposed structure is divided into a shared-weight network part, a feature-fusion layer part, and a classification layer part. First, a guided training method is proposed to optimize the training process: representative images and training images are fed into the shared-weight network so that it learns to extract image features more effectively, and the loss is calculated from the outputs of both the shared-weight network and the classification layer. The image features and numerical features are then fused in the feature-fusion layer and passed to the classification layer for the classification task. Experiments are carried out to verify the effectiveness of the proposed model. The results show that the proposed FGT-Net achieves an accuracy of 87.8%, which is 15% higher than the ShuffleNetv2 CNN model (which can process image data only) and 9.8% higher than the DNN method (which processes structured data only).
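A minimal PyTorch sketch of the fused pipeline the abstract describes follows. The module sizes, the stand-in CNN encoder, and the auxiliary-loss weight of 0.5 are illustrative assumptions, not values from the paper; only the overall structure (shared-weight encoder, feature fusion, combined loss) follows the abstract.

```python
import torch
import torch.nn as nn

class FGTNetSketch(nn.Module):
    """Sketch: shared-weight image encoder plus a numerical branch,
    fused and classified. All dimensions are illustrative assumptions."""
    def __init__(self, num_classes=10, img_feat=128, num_in=16, num_feat=32):
        super().__init__()
        self.encoder = nn.Sequential(           # shared-weight network (stand-in CNN)
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, img_feat), nn.ReLU(),
        )
        self.num_branch = nn.Sequential(nn.Linear(num_in, num_feat), nn.ReLU())
        self.classifier = nn.Linear(img_feat + num_feat, num_classes)

    def forward(self, img, rep_img, num_data):
        f_img = self.encoder(img)               # training-image features
        f_rep = self.encoder(rep_img)           # representative-image features (same weights)
        fused = torch.cat([f_img, self.num_branch(num_data)], dim=1)
        return self.classifier(fused), f_img, f_rep

# Guided-training loss: classification loss plus a term pulling the
# training-image features toward the representative-image features.
# The 0.5 weight is an assumption for illustration.
model = FGTNetSketch()
logits, f_img, f_rep = model(torch.randn(4, 3, 64, 64),
                             torch.randn(4, 3, 64, 64),
                             torch.randn(4, 16))
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 10, (4,))) \
       + 0.5 * nn.MSELoss()(f_img, f_rep)
loss.backward()
```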

2020 ◽  
Vol 10 (19) ◽  
pp. 6823
Author(s):  
Hongwei Ding ◽  
Xiaohui Cui ◽  
Leiyang Chen ◽  
Kun Zhao

Fundus blood vessel image segmentation plays an important role in the diagnosis and treatment of diseases and is the basis of computer-aided diagnosis. Feature information in retinal blood vessel images is relatively complicated, and existing algorithms sometimes struggle to segment them effectively. Aiming at the low accuracy and low sensitivity of existing segmentation methods, an improved U-shaped neural network (MRU-NET) segmentation method for retinal vessels is proposed. Firstly, an image enhancement algorithm and a random segmentation method are used to address the low contrast and insufficient image data of the original images; the smaller image blocks produced by random segmentation also help reduce the complexity of the U-shaped neural network model. Secondly, residual learning is introduced into the encoder and decoder to improve the efficiency of feature use and reduce information loss, and a feature fusion module is introduced between the encoder and decoder to extract image features at different granularities. Finally, a feature balancing module is added to the skip connections to resolve the semantic gap between low-dimensional features in the encoder and high-dimensional features in the decoder. Experimental results show that our method achieves better accuracy and sensitivity than some state-of-the-art methods on the DRIVE and STARE datasets (DRIVE: accuracy (ACC) = 0.9611, sensitivity (SE) = 0.8613; STARE: ACC = 0.9662, SE = 0.7887).
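To make the skip-connection idea concrete, here is a hedged PyTorch sketch of one way a feature-balancing block on a U-Net skip connection could look: a 1x1 projection plus channel gating driven by the decoder features. This is a hypothetical design for illustration, not the paper's actual module.

```python
import torch
import torch.nn as nn

class FeatureBalance(nn.Module):
    """Hypothetical feature-balancing block for a U-Net skip connection:
    re-project the low-level encoder features and gate them by the
    decoder's higher-level semantics before concatenation."""
    def __init__(self, channels):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, enc_feat, dec_feat):
        enc = self.proj(enc_feat)                  # re-project encoder features
        enc = enc * self.gate(dec_feat)            # gate by decoder semantics
        return torch.cat([enc, dec_feat], dim=1)   # balanced skip connection

skip = FeatureBalance(64)
out = skip(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
print(out.shape)  # torch.Size([1, 128, 32, 32])
```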


Symmetry ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 2010
Author(s):  
Kang Zhang ◽  
Yushui Geng ◽  
Jing Zhao ◽  
Jianxin Liu ◽  
Wenxiao Li

In recent years, with the popularity of social media, users are increasingly keen to express their feelings and opinions in the form of pictures and text, which makes multimodal data combining text and pictures the fastest-growing content type. Most of the information posted by users on social media has obvious sentimental aspects, and multimodal sentiment analysis has become an important research field. Previous studies on multimodal sentiment analysis have primarily focused on extracting text and image features separately and then combining them for sentiment classification, often ignoring the interaction between text and images. Therefore, this paper proposes a new multimodal sentiment analysis model. The model first eliminates noise interference in the textual data and extracts the more important image features. Then, in an attention-based feature-fusion stage, the text and image modalities learn each other's internal features through symmetric attention, and the fused features are applied to the sentiment classification task. Experimental results on two common multimodal sentiment datasets demonstrate the effectiveness of the proposed model.
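A minimal sketch of the symmetric fusion idea, using standard multi-head cross-attention in PyTorch: text queries attend over image regions while image queries attend over text tokens in mirrored blocks. Feature dimensions, the pooling step, and the three-class head are assumptions for illustration.

```python
import torch
import torch.nn as nn

class SymmetricCrossAttention(nn.Module):
    """Sketch: text attends to image regions and image attends to text
    tokens with mirrored attention blocks; the pooled results are fused."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.txt2img = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.img2txt = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, 3)   # e.g. negative/neutral/positive

    def forward(self, text_feats, image_feats):
        # Symmetric interaction: each modality queries the other.
        t, _ = self.txt2img(text_feats, image_feats, image_feats)
        i, _ = self.img2txt(image_feats, text_feats, text_feats)
        fused = torch.cat([t.mean(dim=1), i.mean(dim=1)], dim=1)
        return self.classifier(fused)

model = SymmetricCrossAttention()
logits = model(torch.randn(2, 20, 256),   # 20 text tokens
               torch.randn(2, 49, 256))   # 49 image regions
```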


2021 ◽  
pp. 0734242X2098788
Author(s):  
Yifeng Li ◽  
Xunpeng Qin ◽  
Zhenyuan Zhang ◽  
Huanyu Dong

End-of-life vehicles (ELVs) provide a particularly potent source of supply for metals. Hence, recycling and sorting techniques for ferrous and nonferrous metal scraps from ELVs significantly increase metal resource utilization. However, different kinds of nonferrous metal scraps, such as aluminium (Al) and copper (Cu), are not further classified automatically due to the lack of proper techniques. The purpose of this study is to propose an identification method for different nonferrous metal scraps, facilitate their further separation, achieve better management of recycled metal resources, and increase sustainability. A convolutional neural network (CNN) and SEEDS (superpixels extracted via energy-driven sampling) were adopted in this study. To build the classifier, 80 training images of randomly chosen Al and Cu scraps were taken, and several practical methods were proposed, including training-patch generation with SEEDS, image data augmentation, and automatic labelling of the large volume of training data. To obtain more accurate results, SEEDS was also used to refine the coarse results obtained from the pretrained CNN model. Five indicators were adopted to evaluate the final identification results. Furthermore, 15 test samples covering different classification environments were tested with the proposed model, which performed well under all of the employed evaluation indexes, with an average precision of 0.98. The results demonstrate that the proposed model is robust for metal scrap identification and can be extended to complex industrial environments, presenting new possibilities for highly accurate automatic nonferrous metal scrap classification.
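A hedged sketch of SEEDS-based training-patch generation with OpenCV's ximgproc module (available in opencv-contrib-python): segment the image into superpixels, then crop a fixed-size patch around each superpixel centroid. The superpixel count, level count, and patch size are illustrative, not the paper's settings.

```python
import cv2
import numpy as np

def superpixel_patches(img_bgr, num_superpixels=200, patch_size=32):
    """Segment img_bgr with SEEDS and return one patch per superpixel,
    cropped around the superpixel centroid. Parameters are illustrative."""
    h, w = img_bgr.shape[:2]
    # Args: width, height, channels, desired superpixels, number of levels.
    seeds = cv2.ximgproc.createSuperpixelSEEDS(w, h, img_bgr.shape[2],
                                               num_superpixels, 4)
    seeds.iterate(img_bgr, 10)            # refine superpixel boundaries
    labels = seeds.getLabels()
    patches = []
    for sp in range(seeds.getNumberOfSuperpixels()):
        ys, xs = np.nonzero(labels == sp)
        cy, cx = int(ys.mean()), int(xs.mean())   # superpixel centroid
        half = patch_size // 2
        y0, x0 = max(cy - half, 0), max(cx - half, 0)
        patch = img_bgr[y0:y0 + patch_size, x0:x0 + patch_size]
        if patch.shape[:2] == (patch_size, patch_size):
            patches.append(patch)
    return patches  # each patch is labelled Al/Cu downstream
```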


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2203
Author(s):  
Antal Hiba ◽  
Attila Gáti ◽  
Augustin Manecy

Precise navigation is often performed by fusing data from different sensors. Among these sensors, optical sensors use image features to obtain the position and attitude of the camera. Runway-relative navigation during final approach is a special case where robust and continuous detection of the runway is required. This paper presents a robust threshold-marker detection method for monocular cameras and introduces an on-board real-time implementation with flight test results. Results with narrow and wide field-of-view optics are compared. The image processing approach is also evaluated on image data captured by a different on-board system. The purely optical approach of this paper increases sensor redundancy because, unlike most robust runway detectors, it does not require input from an inertial sensor.
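As a rough illustration of a purely optical detector for the bright threshold ("piano key") stripes, here is a minimal OpenCV sketch: Otsu binarization followed by filtering contours for elongated blobs. The thresholds and shape criteria are assumptions; the paper's method is considerably more robust than this.

```python
import cv2

def detect_threshold_stripes(gray, min_area=200, min_aspect=3.0):
    """Return rotated rectangles for bright, stripe-like blobs that are
    candidate runway threshold markings. Parameters are illustrative."""
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    stripes = []
    for c in contours:
        if cv2.contourArea(c) < min_area:
            continue
        (cx, cy), (w, h), angle = cv2.minAreaRect(c)
        long_side, short_side = max(w, h), max(min(w, h), 1e-6)
        if long_side / short_side >= min_aspect:   # stripe-like shape
            stripes.append(((cx, cy), (w, h), angle))
    return stripes
```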


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
LaiHang Yu ◽  
DongYan Zhang ◽  
NingZhong Liu ◽  
WenGang Zhou

Author(s):  
Daniel Overhoff ◽  
Peter Kohlmann ◽  
Alex Frydrychowicz ◽  
Sergios Gatidis ◽  
Christian Loewe ◽  
...  

Purpose: The DRG-ÖRG IRP (Deutsche Röntgengesellschaft-Österreichische Röntgengesellschaft international radiomics platform) is a web-/cloud-based radiomics platform based on a public-private partnership. It offers the possibility of data sharing, annotation, validation, and certification in the field of artificial intelligence, radiomics analysis, and integrated diagnostics. In a first proof-of-concept study, automated myocardial segmentation and automated myocardial late gadolinium enhancement (LGE) detection using radiomic image features are evaluated on myocarditis data sets.

Materials and Methods: The DRG-ÖRG IRP can be used to create quality-assured, structured image data in combination with clinical data and subsequent integrated data analysis. It is characterized by the following performance criteria: use of multicentric networked data, automatically calculated quality parameters, processing of annotation tasks, contour recognition using conventional and artificial intelligence methods, and targeted integration of algorithms. In a first study, a neural network pre-trained on cardiac CINE data sets was evaluated for segmentation of PSIR data sets. In a second step, radiomic features were applied for segmental LGE detection on the same data sets, which were provided multicenter via the IRP.

Results: First results show the advantages of this platform-based approach: data transparency, reliability, broad involvement of all members, continuous evolution, as well as validation and certification. In the proof-of-concept study, the neural network achieved a Dice coefficient of 0.813 compared with the expert's segmentation of the myocardium. In segment-based myocardial LGE detection, the AUC was 0.73, and 0.79 after exclusion of segments with uncertain annotation. The evaluation and provision of the data take place at the IRP in accordance with the FAT (fairness, accountability, transparency) and FAIR (findable, accessible, interoperable, reusable) criteria.

Conclusion: It could be shown that the DRG-ÖRG IRP can serve as a crystallization point for the generation of further individual and joint projects. The execution of quantitative analyses with artificial intelligence methods is greatly facilitated by the platform approach, since pre-trained neural networks can be integrated and scientific groups can be networked. In a first proof-of-concept study on automated segmentation of the myocardium and automated myocardial LGE detection, these advantages were successfully applied. Our study shows that with the DRG-ÖRG IRP, strategic goals can be implemented in an interdisciplinary way, concrete proof-of-concept examples can be demonstrated, and a large number of individual and joint projects can be realized in a participatory way involving all groups.
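For reference, the Dice coefficient reported above measures the overlap between a predicted and a reference binary mask, Dice = 2|A ∩ B| / (|A| + |B|). A short NumPy implementation with a toy example:

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) between two binary masks.
    A value of 0.813, as reported above, indicates strong but not
    perfect overlap with the expert segmentation."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    return 2.0 * intersection / denom if denom else 1.0

a = np.zeros((4, 4), dtype=int); a[1:3, 1:3] = 1   # 4-pixel mask
b = np.zeros((4, 4), dtype=int); b[1:3, 1:4] = 1   # 6-pixel mask
print(dice_coefficient(a, b))  # 2*4 / (4+6) = 0.8
```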


Author(s):  
Huimin Lu ◽  
Rui Yang ◽  
Zhenrong Deng ◽  
Yonglin Zhang ◽  
Guangwei Gao ◽  
...  

Chinese image description generation tasks usually face challenges such as single-feature extraction, a lack of global information, and a lack of detailed description of the image content. To address these limitations, we propose a fuzzy attention-based DenseNet-BiLSTM Chinese image captioning method in this article. In the proposed method, we first improve the densely connected network to extract features of the image at different scales and to enhance the model's ability to capture weak features. At the same time, a bidirectional LSTM is used as the decoder to enhance the use of context information. The introduction of an improved fuzzy attention mechanism effectively improves the correspondence between image features and contextual information. We conduct experiments on the AI Challenger dataset to evaluate the performance of the model. The results show that, compared with other models, our proposed model achieves higher scores on objective quantitative evaluation metrics, including BLEU, METEOR, ROUGE-L, and CIDEr. The generated description sentences accurately express the image content.
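To show where attention sits in such a captioning decoder, here is a plain additive (soft) attention sketch in PyTorch, used as a stand-in for the paper's fuzzy attention: it scores each image region against the current decoder state and returns a weighted context vector. The fuzzy-membership reweighting itself is not reproduced here, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

class SoftAttention(nn.Module):
    """Additive attention: align decoder hidden state with image regions
    and pool them into a context vector for the next word prediction."""
    def __init__(self, feat_dim=512, hidden_dim=512, attn_dim=256):
        super().__init__()
        self.w_feat = nn.Linear(feat_dim, attn_dim)
        self.w_hid = nn.Linear(hidden_dim, attn_dim)
        self.v = nn.Linear(attn_dim, 1)

    def forward(self, regions, hidden):
        # regions: (B, R, feat_dim); hidden: (B, hidden_dim)
        scores = self.v(torch.tanh(self.w_feat(regions)
                                   + self.w_hid(hidden).unsqueeze(1)))
        alpha = torch.softmax(scores, dim=1)        # weight per region
        context = (alpha * regions).sum(dim=1)      # attended image context
        return context, alpha.squeeze(-1)

attn = SoftAttention()
ctx, weights = attn(torch.randn(2, 49, 512),   # 49 image regions
                    torch.randn(2, 512))       # decoder hidden state
```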


2021 ◽  
Vol 11 (3) ◽  
pp. 1064
Author(s):  
Jenq-Haur Wang ◽  
Yen-Tsang Wu ◽  
Long Wang

In social networks, users can easily share information and express their opinions. Given the huge amount of data posted by many users, it is difficult to search for relevant information. In addition to individual posts, it would be useful to recommend groups of people with similar interests. Past studies on user preference learning focused on single-modal features such as review contents or demographic information of users. However, such information is usually not easy to obtain in most social media without explicit user feedback. In this paper, we propose a multimodal feature-fusion approach to implicit user preference prediction that combines text and image features from user posts for recommending similar users in social media. First, we use a convolutional neural network (CNN) and a TextCNN model to extract image and text features, respectively. Then, these features are combined using early and late fusion methods as a representation of user preferences. Lastly, a list of users with the most similar preferences is recommended. The experimental results on real-world Instagram data show that the best performance is achieved when we apply late fusion of the individual classification results for images and texts, with a best average top-k accuracy of 0.491. This validates the effectiveness of deep learning methods for fusing multimodal features to represent social user preferences. Further investigation is needed to verify the performance on different types of social media.
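The distinction between the two fusion strategies is compact enough to show directly: early fusion concatenates feature vectors before a single classifier, while late fusion combines per-modality classifier outputs. A minimal NumPy sketch; the equal weighting in late fusion is an assumption for illustration.

```python
import numpy as np

def early_fusion(img_feat, txt_feat):
    """Early fusion: concatenate modality features into one vector
    before a single classifier sees them."""
    return np.concatenate([img_feat, txt_feat], axis=-1)

def late_fusion(img_probs, txt_probs, w_img=0.5):
    """Late fusion: combine per-modality classifier outputs. The paper
    found late fusion of individual classification results worked best;
    the 0.5 weighting here is an illustrative assumption."""
    return w_img * img_probs + (1.0 - w_img) * txt_probs

img_p = np.array([0.7, 0.2, 0.1])   # image classifier class probabilities
txt_p = np.array([0.4, 0.5, 0.1])   # text classifier class probabilities
print(late_fusion(img_p, txt_p))    # [0.55 0.35 0.1 ]
```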

