Extracting Visual Knowledge from the Web with Multimodal Learning

Author(s):  
Dihong Gong ◽  
Daisy Zhe Wang

We consider the problem of automatically extracting visual objects from web images. Despite the extraordinary advancement in deep learning, visual object detection remains a challenging task. To overcome the deficiency of pure visual techniques, we propose to make use of meta text surrounding images on the Web for enhanced detection accuracy. In this paper we present a multimodal learning algorithm to integrate text information into visual knowledge extraction. To demonstrate the effectiveness of our approach, we developed a system that takes raw webpages as input, and automatically extracts visual knowledge (e.g. object bounding boxes) from tens of millions of images crawled from the Web. Experimental results based on 46 object categories show that the extraction precision is improved significantly from 73% (with state-of-the-art deep learning programs) to 81%, which is equivalent to a 31% reduction in error rates.

2019 ◽  
Vol 2019 ◽  
pp. 1-14 ◽  
Author(s):  
Balakrishnan Ramalingam ◽  
Vega-Heredia Manuel ◽  
Mohan Rajesh Elara ◽  
Ayyalusami Vengadesh ◽  
Anirudh Krishna Lakshmanan ◽  
...  

Aircraft surface inspection includes detecting surface defects caused by corrosion and cracks, as well as stains from oil spills, grease, dirt sediments, etc. In the conventional aircraft surface inspection process, human visual inspection is performed, which is time-consuming and inefficient, whereas robots with onboard vision systems can inspect the aircraft skin safely, quickly, and accurately. This work proposes an aircraft surface defect and stain detection model using a reconfigurable climbing robot and an enhanced deep learning algorithm. A reconfigurable, teleoperated robot named “Kiropter” is designed to capture aircraft surface images with an onboard RGB camera. An enhanced SSD MobileNet framework is proposed for stain and defect detection from these images. A self-filtering-based periodic pattern detection filter has been included in the SSD MobileNet deep learning framework to enhance the detection of stains and defects in the aircraft skin images. The model has been tested with real aircraft surface images acquired from a Boeing 737 and a compact aircraft using the teleoperated robot. The experimental results show that the enhanced SSD MobileNet framework achieves improved detection accuracy for aircraft surface defects and stains compared to conventional models.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Yiran Feng ◽  
Xueheng Tao ◽  
Eung-Joo Lee

In view of the current absence of any deep learning algorithm for shellfish identification in real contexts, an improved Faster R-CNN-based detection algorithm is proposed in this paper. It achieves multiobject recognition and localization through a second-order detection network and replaces the original feature extraction module with DenseNet, which can fuse multilevel feature information, increase network depth, and avoid vanishing network gradients. Meanwhile, the proposal merging strategy is improved with Soft-NMS, where an attenuation function is designed to replace the conventional NMS algorithm, thereby avoiding missed detection of adjacent or overlapping objects and enhancing the network detection accuracy when multiple objects are present. By constructing a real-context shellfish dataset and conducting experimental tests on a vision recognition seafood sorting robot production line, we were able to detect the features of shellfish in different scenarios, and the detection accuracy was improved by nearly 4% compared to the original detection model. This provides favorable technical support for future quality sorting of seafood using the improved Faster R-CNN-based approach.
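The Soft-NMS idea mentioned in this abstract can be sketched as follows. This is a minimal illustration only: the abstract does not specify the attenuation function the authors designed, so the Gaussian decay, the `sigma` and `score_threshold` parameters, and the function names `iou` and `soft_nms` are all assumptions made for the example.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (assumed decay; not necessarily the paper's function).

    Instead of deleting boxes that overlap the current best box, their scores
    are attenuated by exp(-iou^2 / sigma). Returns (index, score) pairs in
    the order the boxes were selected.
    """
    remaining = [(score, index) for index, score in enumerate(scores)]
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best = max(remaining)
        remaining.remove(best)
        best_score, best_index = best
        kept.append((best_index, best_score))
        # Decay the scores of the other boxes according to their overlap.
        decayed = []
        for score, index in remaining:
            overlap = iou(boxes[best_index], boxes[index])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_threshold:
                decayed.append((score, index))
        remaining = decayed
    return kept
```

With two heavily overlapping boxes, hard NMS at a typical IoU threshold of 0.5 would discard the second one outright, whereas this sketch keeps it with a reduced score, which is how adjacent or overlapping objects avoid being missed.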


Author(s):  
Jiajia Liao ◽  
Yujun Liu ◽  
Yingchao Piao ◽  
Jinhe Su ◽  
Guorong Cai ◽  
...  

Recent advances in camera-equipped drone applications have increased the demand for deep learning-based visual object detection algorithms for aerial images. A single deep learning model has several limitations in accuracy. Inspired by the observation that ensemble learning can significantly improve the generalization ability of a model in the machine learning field, we introduce a novel integration strategy to combine the inference results of two different methods without non-maximum suppression. In this paper, a global and local ensemble network (GLE-Net) is proposed to increase the quality of predictions by considering the global weights for different models and adjusting the local weights for bounding boxes. Specifically, the global module assigns different weights to models. In the local module, we group the bounding boxes that correspond to the same object into a cluster. Each cluster generates a final predicted box, and the highest score in the cluster is assigned as the score of that box. Experiments on the VisDrone2019 benchmark show promising performance of GLE-Net compared with the baseline network.
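The global and local steps described in this abstract can be illustrated with a small sketch. It is not the GLE-Net implementation: the greedy IoU clustering, the score-weighted box averaging, the `iou_threshold` value, and the names `iou` and `fuse_detections` are assumptions chosen for the example. Only the per-model weighting and the rule that a cluster takes its highest score come from the abstract.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_detections(detections, model_weights, iou_threshold=0.5):
    """Fuse detections from several models without non-maximum suppression.

    detections: list of (model_id, box, score) with positive scores.
    model_weights: dict mapping model_id to a global weight (hypothetical values).
    Returns a list of (fused_box, score), one entry per cluster.
    """
    # Global step: scale every score by the weight of the model that produced it.
    weighted = [(box, score * model_weights[model_id])
                for model_id, box, score in detections]
    weighted.sort(key=lambda item: item[1], reverse=True)

    # Local step: greedily group boxes that overlap a cluster's top box.
    clusters = []
    for box, score in weighted:
        for cluster in clusters:
            if iou(cluster[0][0], box) >= iou_threshold:
                cluster.append((box, score))
                break
        else:
            clusters.append([(box, score)])

    fused = []
    for cluster in clusters:
        total = sum(score for _, score in cluster)
        # Assumed fusion rule: score-weighted average of the coordinates.
        box = tuple(sum(b[k] * s for b, s in cluster) / total for k in range(4))
        # The cluster is sorted by score, so its first entry holds the maximum.
        fused.append((box, cluster[0][1]))
    return fused
```

A call such as `fuse_detections([("a", (0, 0, 10, 10), 0.9), ("b", (1, 1, 11, 11), 0.8)], {"a": 1.0, "b": 0.5})` would return a single fused box whose score is the highest weighted score in the cluster.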


CONVERTER ◽  
2021 ◽  
pp. 598-605
Author(s):  
Zhao Jianchao

Behind the rapid development of the Internet industry, Internet security has become a hidden danger. In recent years, the outstanding performance of deep learning in classification and behavior prediction based on massive data has prompted research into how deep learning technology can be applied. Therefore, this paper attempts to apply deep learning to intrusion detection to learn and classify network attacks. Targeting the NSL-KDD data set, this paper first applies traditional classification methods and several different deep learning algorithms to the classification task. It then analyzes in depth the correlation among data sets, algorithm characteristics, and experimental classification results, and identifies the deep learning algorithm that performs relatively well. Then, a normalized coding algorithm is proposed. The experimental results show that the algorithm can improve the detection accuracy and reduce the false alarm rate.


Scanning ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Lun Zhao ◽  
Yunlong Pan ◽  
Sen Wang ◽  
Liang Zhang ◽  
Md Shafiqul Islam

The scanning electron microscope (SEM) is widely used in the analysis and research of materials, including fracture analysis, microstructure morphology, and nanomaterial analysis. With the rapid development of materials science and computer vision technology, the level of detection technology is constantly improving. In this paper, a deep learning method is used to intelligently identify microcracks in the microscopic morphology of SEM images. A deep learning model based on image level is selected to reduce the interference of other complex microscopic topography, and a detection method with dense continuous bounding boxes suitable for SEM images is proposed. The dense and continuous bounding boxes were used to obtain the local features of the cracks, and the bounding boxes were rotated to reduce the feature differences between them. Finally, the bounding boxes with filled regression were used to highlight the microcrack detection effect. The results show that the detection accuracy of our approach reached 71.12%, and the highest mIOU reached 64.13%. Also, microcracks at different magnifications and against different backgrounds were detected successfully.
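For readers unfamiliar with the mIOU figure quoted above, the following sketch shows one common way to compute a mean intersection over union on label maps. The abstract does not say how the authors computed their metric, so the per-class averaging, the nested-list label format, and the name `mean_iou` are assumptions for illustration.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two label maps given as nested lists.

    For each class, IoU is the number of pixels labelled with that class in
    both maps divided by the number labelled with it in either map. Classes
    absent from both maps are skipped.
    """
    ious = []
    for cls in range(num_classes):
        inter = 0
        union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                in_pred = p == cls
                in_target = t == cls
                inter += int(in_pred and in_target)
                union += int(in_pred or in_target)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

For a two-class crack/background map, `mean_iou(pred, target, 2)` averages the crack IoU and the background IoU.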


Author(s):  
Zheng Wang ◽  
Ti Liang

The explosion of images on the Web has led to a number of efforts to organize images semantically and compile collections of visual knowledge. While there has been enormous progress on categorizing entire images or bounding boxes, only few studies have targeted fine-grained image understanding at the level of specific shape contours. For example, given an image of a cat, we would like a system to not merely recognize the existence of a cat, but also to distinguish between the cat’s legs, head, tail, and so on. In this paper, we present ShapeLearner, a system that acquires such visual knowledge about object shapes and their parts. ShapeLearner jointly learns this knowledge from sets of segmented images. The space of label and segmentation hypotheses is pruned and then evaluated using Integer Linear Programming. ShapeLearner places the resulting knowledge in a semantic taxonomy based on WordNet and is able to exploit this hierarchy in order to analyze new kinds of objects that it has not observed before. We conduct experiments using a variety of shape classes from several representative categories and demonstrate the accuracy and robustness of our method.


2021 ◽  
Vol 11 (17) ◽  
pp. 8226
Author(s):  
Shyang-Jye Chang ◽  
Chien-Yu Huang

The detection of coffee bean defects is the most crucial step prior to bean roasting. Existing defect detection methods used in the specialty coffee bean industry entail manual screening and sorting, require substantial human resources, and are not standardized. To solve these problems, this study developed a deep learning algorithm to detect defects in coffee beans. The results reveal that when the pooling layer was used to enhance features and reduce neural dimensionality, some of the coffee bean features were lost or misclassified. Therefore, a novel dimensionality reduction method was adopted to improve feature extraction. The developed model also overcame the drawbacks of padding blurring image boundaries and dead neurons impeding feature propagation. Images of eight types of coffee beans were used to train and test the proposed detection model. The proposed method was verified to reduce the bias when classifying defects in coffee beans. The detection accuracy rate of the proposed model was 95.2%. When the model was used only to detect the presence of defects, the accuracy rate increased to 100%. Thus, the proposed model is highly accurate both in coffee bean defect detection and in the classification of eight types of coffee beans.


2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Xiuying Meng

Cracks are an early manifestation of concrete pavement distress, and discovering and treating them early can play an important role in pavement maintenance. With ongoing advances in computer hardware and the continual optimization of deep learning algorithms, automated crack detection based on deep learning has become more accurate than standard digital image processing algorithms. Owing to its greater robustness, the deep learning-based study of concrete pavement crack images has become popular. In view of the poor performance and weak generalization ability of traditional image processing technology on image segmentation of concrete cracks, this paper studies an image segmentation algorithm for concrete cracks based on a convolutional neural network and designs an end-to-end segmentation model based on ResNet101. The model integrates more low-level features, which make the crack segmentation results more refined and closer to practical application scenarios. Compared with other methods, the algorithm in this paper achieves higher detection accuracy and generalization ability.


Author(s):  
Greta Pratuzaitė ◽  
Nijolė Maknickienė

Criminal financial behaviour is a problem for both banks and newly created fintech companies. Credit card fraud detection is a challenge for any such company. The aim of this paper is to compare the ability of four algorithmic methods to detect credit card fraud: generalized method of moments, K-nearest neighbour, naive Bayes classification, and deep learning. The deep learning algorithm has been tuned to select key parameters so that fraud detection accuracy is the best. Five recognition accuracy parameters and cost calculations showed that the deep learning algorithm is the best fraud detection method compared to the other classification algorithms. A financial company reduces losses and increases customer confidence by using fraud prevention technologies.

