CBIR Using Features Derived by Deep Learning

In a content-based image retrieval (CBIR) system, the task is to retrieve similar images from a large database given a query image. The usual procedure is to extract useful features from the query image and retrieve images that have a similar set of features. For this purpose, a suitable similarity measure is chosen, and images with high similarity scores are retrieved. Naturally, the choice of these features plays a very important role in the success of the system, and high-level features are required to reduce the “semantic gap.” In this article, we propose to use features derived from pre-trained models of a deep convolutional network trained for a large image classification problem. This approach appears to produce vastly superior results for a variety of databases, and it outperforms many contemporary CBIR systems. We analyse the retrieval time of the method and also propose a pre-clustering of the database based on the above-mentioned features, which yields comparable results in a much shorter time in most cases.
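The retrieval step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes cosine similarity as the similarity measure (the abstract does not name one), and the 4-dimensional feature vectors and image ids are hypothetical stand-ins for features taken from a pre-trained network.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def retrieve(query, database, top_k=3):
    """Return the top_k (image_id, score) pairs, most similar first."""
    scored = [(image_id, cosine_similarity(query, feats))
              for image_id, feats in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical features; real CNN features have hundreds or thousands of dimensions.
database = {
    "cat_01": [0.9, 0.1, 0.0, 0.2],
    "cat_02": [0.8, 0.2, 0.1, 0.1],
    "car_01": [0.0, 0.9, 0.8, 0.1],
    "car_02": [0.1, 0.8, 0.9, 0.0],
}
query = [1.0, 0.1, 0.0, 0.1]
results = retrieve(query, database, top_k=2)
print(results)
```

Every database image is scored against the query here, so retrieval time grows linearly with the size of the database.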

A Review on Content Based Image Retrieval System Features derived by Deep Learning Models

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.39172 ◽

2021 ◽

Vol 9 (12) ◽

pp. 42-57

Author(s):

Mr. Kommu Naveen

Keyword(s):

Deep Learning ◽

Image Retrieval ◽

Network Models ◽

Classification Problem ◽

Content Based Image Retrieval ◽

Query Image ◽

Usual Procedure ◽

Image Retrieval System ◽

High Level ◽

Trained Network

Abstract: In a content-based image retrieval (CBIR) system, the task is to retrieve similar images from a large database given a query image. The usual procedure is to extract useful features from the query image and retrieve images that have a similar set of features. For this purpose, a suitable similarity measure is chosen, and images with high similarity scores are retrieved. Naturally, the choice of these features plays a very important role in the success of the system, and high-level features are required to reduce the “semantic gap”. In this paper, we propose to use features derived from pre-trained models of a deep convolutional network trained for a large image classification problem. This approach appears to produce vastly superior results for a variety of databases, and it outperforms many contemporary CBIR systems. We analyse the retrieval time of the method and also propose a pre-clustering of the database based on the above-mentioned features, which yields comparable results in a much shorter time in most cases.

Keywords: Content Based Image Retrieval; Feature Selection; Deep Learning; Pre-trained Network Models; Pre-clustering
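The pre-clustering idea mentioned in this abstract can be illustrated with a small sketch. It assumes k-means with squared Euclidean distance, which the abstract does not specify; the 2-dimensional features, image ids, and seed centroids are hypothetical. The database is grouped once, and each query is compared only against the images in its nearest cluster.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(vectors, init_centroids, iterations=10):
    """Plain k-means starting from the given seed centroids."""
    centroids = [list(c) for c in init_centroids]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in vectors:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(v, centroids[i]))
            groups[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids

def build_index(database, centroids):
    """Map each cluster number to the ids of the images assigned to it."""
    index = {i: [] for i in range(len(centroids))}
    for image_id, feats in database.items():
        nearest = min(index, key=lambda i: sq_dist(feats, centroids[i]))
        index[nearest].append(image_id)
    return index

def search(query, database, centroids, index, top_k=2):
    """Rank only the images in the cluster closest to the query."""
    cluster = min(index, key=lambda i: sq_dist(query, centroids[i]))
    ranked = sorted(index[cluster],
                    key=lambda image_id: sq_dist(query, database[image_id]))
    return ranked[:top_k]

# Hypothetical 2-D features forming two obvious groups.
database = {
    "a1": [0.0, 0.1], "a2": [0.1, 0.0], "a3": [0.2, 0.2],
    "b1": [5.0, 5.1], "b2": [4.9, 5.0], "b3": [5.2, 4.8],
}
centroids = kmeans(list(database.values()),
                   init_centroids=[database["a1"], database["b1"]])
index = build_index(database, centroids)
matches = search([5.1, 5.1], database, centroids, index)
print(matches)
```

The trade-off is that a relevant image assigned to a different cluster is never examined, which is consistent with the abstract's claim of comparable (rather than identical) results in most cases.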

Deep Learning With TensorFlow: A Review

Journal of Educational and Behavioral Statistics ◽

10.3102/1076998619872761 ◽

2019 ◽

Vol 45 (2) ◽

pp. 227-248 ◽

Cited By ~ 4

Author(s):

Bo Pang ◽

Erik Nijkamp ◽

Ying Nian Wu

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Network Models ◽

Optimization Method ◽

Stochastic Gradient Descent ◽

Processing Unit ◽

Neural Network Models ◽

Core Concepts ◽

High Level

This review covers the core concepts and design decisions of TensorFlow. TensorFlow, originally created by researchers at Google, is the most popular of the many deep learning libraries. In the field of deep learning, neural networks have achieved tremendous success and gained wide popularity in various areas. Given its flexibility and scalability, this family of models also has tremendous potential to promote data analysis and modeling for various problems in the educational and behavioral sciences. We give the reader an overview of the basics of neural network models such as the multilayer perceptron and the convolutional neural network, and of stochastic gradient descent, the most commonly used optimization method for neural network models. However, the implementation of these models and optimization algorithms is time-consuming and error-prone. Fortunately, TensorFlow greatly eases and accelerates the research and application of neural network models. We review several core concepts of TensorFlow such as graph construction functions, graph execution tools, and TensorFlow’s visualization tool, TensorBoard. Then, we apply these concepts to build and train a convolutional neural network model to classify handwritten digits. The review concludes with a comparison of low- and high-level application programming interfaces and a discussion of graphics processing unit (GPU) support, distributed training, and probabilistic modeling with the TensorFlow Probability library.
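To make the stochastic gradient descent step concrete, the sketch below fits a one-variable linear model with per-sample updates. It deliberately uses plain Python rather than TensorFlow, so it illustrates only the optimization rule, not the library's API; the data, learning rate, and epoch count are hypothetical choices.

```python
import random

def sgd_linear(data, lr=0.1, epochs=200, seed=0):
    """Fit y ~ w * x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)          # visit samples in a new random order
        for x, y in samples:
            error = (w * x + b) - y   # gradient of 0.5 * error**2 w.r.t. the prediction
            w -= lr * error * x       # update from this single sample
            b -= lr * error
    return w, b

# Noise-free hypothetical data generated from y = 2x + 1.
data = [(i / 20, 2.0 * (i / 20) + 1.0) for i in range(20)]
w, b = sgd_linear(data)
print(round(w, 3), round(b, 3))
```

A library such as TensorFlow computes these gradients automatically for much larger models, which is the time-consuming and error-prone part the review refers to.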

SIR-DL: AN ARCHITECTURE OF SEMANTIC-BASED IMAGE RETRIEVAL USING DEEP LEARNING TECHNIQUE AND RDF TRIPLE LANGUAGE

Journal of Computer Science and Cybernetics ◽

10.15625/1813-9663/35/1/13097 ◽

2019 ◽

Vol 35 (1) ◽

pp. 39-56

Author(s):

Van The Thanh ◽

Do Quang Khoi ◽

Le Huu Ha ◽

Le Manh Thanh

Keyword(s):

Information System ◽

Deep Learning ◽

Image Retrieval ◽

Multimedia Systems ◽

Visual Feature ◽

Query Image ◽

Semantic Classification ◽

Visual Words ◽

Learning Technique ◽

Similar Images

The problem of finding and identifying the semantics of images arises in multimedia applications in many different fields, such as hospital information systems, geographic information systems, and digital library systems. In this paper, we propose a semantic-based image retrieval (SBIR) system based on deep learning. This system, called SIR-DL, generates visual semantics by classifying image contents. At the same time, we identify the semantics of similar images on an ontology that describes the semantics of the visual features of images. First, the color and spatial features of segmented images are extracted, and these visual feature vectors are used to train a deep neural network to obtain visual word vectors. Image retrieval then relies on the semantic classification of SIR-DL, which produces a visual word vector from the visual feature vector of the query image. This vector is then looked up in the ontology to provide the identities and semantics of similar images according to a similarity measure. To carry out SIR-DL, we propose the algorithms and a diagram of the image retrieval system and implement them on ImageCLEF@IAPR, which has 20,000 images. Based on the experimental results, the effectiveness of our method is evaluated by accuracy, precision, recall, and F-measure; these results are compared with some recently published works on the same image dataset. The results show that SIR-DL effectively solves the problem of semantic-based image retrieval and can be used to build multimedia systems in many different fields.
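The evaluation measures named in this abstract have standard definitions for a single query, sketched below. The image ids are hypothetical, and the F-measure is taken to be the balanced F1 score, which is an assumption since the abstract does not state the weighting.

```python
def precision_recall_f1(retrieved, relevant):
    """Precision, recall, and F1 for one query's retrieved set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant images that were returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical query: 4 images returned, 3 images are truly relevant.
retrieved = ["img1", "img2", "img3", "img4"]
relevant = ["img1", "img3", "img5"]
precision, recall, f1 = precision_recall_f1(retrieved, relevant)
print(precision, recall, f1)
```

Two of the four returned images are relevant (precision 1/2) and two of the three relevant images were found (recall 2/3), giving an F1 of 4/7.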

KTRICT A KAZE Feature Extraction

International Journal of Multimedia Data Engineering and Management ◽

10.4018/ijmdem.2020040104 ◽

2020 ◽

Vol 11 (2) ◽

pp. 49-65

Author(s):

Badal Soni ◽

Angana Borah ◽

Pidugu Naga Lakshmi Sowgandhi ◽

Pramod Sarma ◽

Ermyas Fekadu Shiferaw

Keyword(s):

Feature Extraction ◽

Search Space ◽

Random Projection ◽

Semantic Gap ◽

Query Image ◽

Retrieval Time ◽

Similar Images ◽

Two Phases ◽

Kaze Feature ◽

Memory Indexing

Improving retrieval accuracy in a CBIR system means reducing the semantic gap. Reducing this gap is necessary to build a better, more trusted system, since CBIR systems are applied in many fields that require utmost accuracy. Time is also a very important constraint, since a fast CBIR system lets dependent tasks finish quickly. The aim of the paper is to build a CBIR system that provides high accuracy with lower time complexity and works towards bridging the aforementioned semantic gap. CBIR systems retrieve images that are related to a query image (QI) from huge datasets. Traditional CBIR systems include two phases: feature extraction and similarity matching. Here, a technique called KTRICT, a KAZE-feature extraction, tree and random-projection indexing-based CBIR technique, is introduced, which incorporates indexing after feature extraction. This reduces the retrieval time to a great extent and also saves memory. Indexing divides the search space into subspaces that group similar images together, thereby decreasing the overall retrieval time.
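The way random-projection indexing divides a search space into subspaces can be sketched as follows. This is not KTRICT itself: it swaps in a simple sign-of-random-projection hash, omits the KAZE features and the tree structure, and uses hypothetical 3-dimensional feature vectors and image ids.

```python
import random

def make_projections(dim, n_bits, seed=0):
    """Draw n_bits random directions in feature space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def bucket_key(vector, projections):
    """One bit per direction: which side of the hyperplane the vector lies on."""
    return tuple(
        1 if sum(p * x for p, x in zip(proj, vector)) >= 0 else 0
        for proj in projections
    )

def build_index(database, projections):
    """Group image ids by bucket key."""
    index = {}
    for image_id, feats in database.items():
        index.setdefault(bucket_key(feats, projections), []).append(image_id)
    return index

def query_index(query, database, index, projections):
    """Rank only the images that share the query's bucket."""
    candidates = index.get(bucket_key(query, projections), [])
    return sorted(
        candidates,
        key=lambda i: sum((a - b) ** 2 for a, b in zip(query, database[i])),
    )

# Hypothetical feature vectors.
database = {
    "img_a": [0.9, 0.1, 0.3],
    "img_b": [0.8, 0.2, 0.4],
    "img_c": [-0.7, 0.9, -0.2],
    "img_d": [-0.6, 0.8, -0.1],
}
projections = make_projections(dim=3, n_bits=4)
index = build_index(database, projections)
hits = query_index(list(database["img_a"]), database, index, projections)
print(hits)
```

Only the query's bucket is scanned, so the number of distance computations depends on the bucket size rather than on the size of the whole database. More bits give smaller buckets but raise the chance that a near neighbour falls into a different one.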

Investigating the impact of pre-processing techniques and pre-trained word embeddings in detecting Arabic health information on social media

Journal Of Big Data ◽

10.1186/s40537-021-00488-w ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Yahya Albalawi ◽

Jim Buckley ◽

Nikola S. Nikolov

Keyword(s):

Social Media ◽

Deep Learning ◽

Comprehensive Evaluation ◽

Classification Problem ◽

Data Sets ◽

Word Embeddings ◽

Data Set ◽

Lower Accuracy ◽

Health Related ◽

The Impact

Abstract: This paper presents a comprehensive evaluation of data pre-processing and word embedding techniques in the context of Arabic document classification in the domain of health-related communication on social media. We evaluate 26 text pre-processings applied to Arabic tweets within the process of training a classifier to identify health-related tweets. For this task we use the (traditional) machine learning classifiers KNN, SVM, Multinomial NB and Logistic Regression. Furthermore, we report experimental results with the deep learning architectures BLSTM and CNN for the same text classification problem. Since word embeddings are more typically used as the input layer in deep networks, in the deep learning experiments we evaluate several state-of-the-art pre-trained word embeddings with the same text pre-processing applied. To achieve these goals, we use two data sets: one for both training and testing, and another only for testing the generality of our models. Our results point to the conclusion that only four out of the 26 pre-processings improve the classification accuracy significantly. For the first data set of Arabic tweets, we found that Mazajak CBOW pre-trained word embeddings as the input to a BLSTM deep network led to the most accurate classifier with F1 score of 89.7%. For the second data set, Mazajak Skip-Gram pre-trained word embeddings as the input to BLSTM led to the most accurate model with F1 score of 75.2% and accuracy of 90.7% compared to F1 score of 90.8% achieved by Mazajak CBOW for the same architecture but with lower accuracy of 70.89%. Our results also show that the performance of the best of the traditional classifiers we trained is comparable to the deep learning methods on the first data set, but significantly worse on the second data set.

Advancing Stress Detection Methodology with Deep Learning Techniques Targeting UX Evaluation in AAL Scenarios: Applying Embeddings for Categorical Variables

Electronics ◽

10.3390/electronics10131550 ◽

2021 ◽

Vol 10 (13) ◽

pp. 1550

Author(s):

Alexandros Liapis ◽

Evanthia Faliagka ◽

Christos P. Antonopoulos ◽

Georgios Keramidas ◽

Nikolaos Voros

Keyword(s):

Machine Learning ◽

Deep Learning ◽

User Experience ◽

Electrodermal Activity ◽

Binary Classification ◽

Research Question ◽

Classification Problem ◽

Categorical Variables ◽

Stress Detection ◽

Software Failures

Physiological measurements have been widely used by researchers and practitioners to address the stress detection challenge. So far, various datasets for stress detection have been recorded and are available to the research community for testing and benchmarking. The majority of the available stress-related datasets were recorded while users were exposed to intense stressors, such as songs, movie clips, major hardware/software failures, image datasets, and gaming scenarios. However, it remains an open research question whether such datasets can be used to create models that effectively detect stress in different contexts. This paper investigates the performance of the publicly available physiological dataset named WESAD (wearable stress and affect detection) in the context of user experience (UX) evaluation. More specifically, electrodermal activity (EDA) and skin temperature (ST) signals from WESAD were used to train three traditional machine learning classifiers and a simple feed-forward deep learning artificial neural network combining continuous variables and entity embeddings. Regarding the binary classification problem (stress vs. no stress), high accuracy (up to 97.4%) was achieved for both training approaches (deep learning and machine learning). Regarding the stress detection effectiveness of the created models in another context, such as user experience (UX) evaluation, the results were quite impressive. More specifically, the deep learning model achieved a rather high agreement when a user-annotated dataset was used for validation.
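Combining continuous variables with an entity embedding amounts to looking up a small vector for the categorical value and concatenating it with the numeric inputs before they enter the network. The sketch below shows only that input construction. The categorical variable (`task`), its values, the embedding size, and the signal values are all hypothetical, and the embedding table is randomly initialised here, whereas in a real model its entries would be learned during training.

```python
import random

def make_embedding_table(categories, dim, seed=0):
    """One small vector per category (random here, learned in a real model)."""
    rng = random.Random(seed)
    return {c: [rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for c in categories}

def build_input(continuous, category, table):
    """Concatenate continuous features with the category's embedding."""
    return list(continuous) + table[category]

# Hypothetical categorical variable and continuous signal values.
tasks = ["reading", "typing", "video"]
table = make_embedding_table(tasks, dim=2)
x = build_input([0.42, 33.1], "typing", table)
print(len(x))  # 2 continuous values + 2 embedding values = 4
```

Compared with one-hot encoding, the embedding keeps the input width fixed at the chosen dimension regardless of how many categories there are.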

Deep learning and complex network theory based analysis on socialized manufacturing resources utilisations and an application case study

Concurrent Engineering ◽

10.1177/1063293x211003194 ◽

2021 ◽

pp. 1063293X2110031

Author(s):

Maolin Yang ◽

Auwal H Abubakar ◽

Pingyu Jiang

Keyword(s):

Deep Learning ◽

Complex Network ◽

Network Models ◽

Neural Network Models ◽

Complex Network Theory ◽

New Type ◽

Manufacturing Resource ◽

Manufacturing Resources ◽

Resource Characteristics

Social manufacturing is characterized by its capability of utilizing socialized manufacturing resources to add value. Recently, a new type of social manufacturing pattern has emerged that shows potential for core factories to improve their limited manufacturing capabilities by utilizing resources from outside socialized manufacturing resource communities. However, the core factories need to analyze the resource characteristics of the socialized resource communities before making operation plans, and this is challenging due to the unaffiliated and self-driven characteristics of the resource providers in these communities. In this paper, an approach based on deep learning and complex networks is established to address this challenge, using a socialized designer community for demonstration. Firstly, convolutional neural network models are trained to identify the design resource characteristics of each socialized designer in the community according to the interaction texts the designer has posted on internet platforms. During the process, an iterative dataset labelling method is established to reduce the time cost of labelling the training set. Secondly, complex networks are used to model the design resource characteristics of the community according to the resource characteristics of all the socialized designers in the community. Two real communities from the RepRap 3D printer project are used as a case study.

Deep-Framework: A Distributed, Scalable, and Edge-Oriented Framework for Real-Time Analysis of Video Streams

Sensors ◽

10.3390/s21124045 ◽

2021 ◽

Vol 21 (12) ◽

pp. 4045

Author(s):

Alessandro Sassu ◽

Jose Francisco Saenz-Cogollo ◽

Maurizio Agelli

Keyword(s):

Deep Learning ◽

Real Time ◽

Video Data ◽

Video Analytics ◽

Web Based ◽

Real Time Analysis ◽

Open Source Framework ◽

Cluster Configuration ◽

Time Requirements ◽

High Level

Edge computing is the best approach for meeting the exponential demand and the real-time requirements of many video analytics applications. Since most of the recent advances in extracting information from images and video rely on computation-heavy deep learning algorithms, there is a growing need for solutions that allow the deployment and use of new models on scalable and flexible edge architectures. In this work, we present Deep-Framework, a novel open source framework for developing edge-oriented real-time video analytics applications based on deep learning. Deep-Framework has a scalable multi-stream architecture based on Docker and hides from the user the complexity of cluster configuration, service orchestration, and GPU resource allocation. It provides Python interfaces for integrating deep learning models developed with the most popular frameworks, and it also provides high-level APIs based on standard HTTP and WebRTC interfaces for consuming the extracted video data on clients running in browsers or on any other web-based platform.

Deep Learning for Laryngopharyngeal Reflux Diagnosis

Applied Sciences ◽

10.3390/app11114753 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4753

Author(s):

Gen Ye ◽

Chen Du ◽

Tong Lin ◽

Yan Yan ◽

Jack Jiang

Keyword(s):

Deep Learning ◽

Speech Processing ◽

Data Augmentation ◽

Laryngopharyngeal Reflux ◽

Ph Monitoring ◽

Binary Classification ◽

Classification Problem ◽

Learning Approaches ◽

Learning Techniques ◽

Auc Value

(1) Background: Deep learning has become ubiquitous due to its impressive performance in domains as varied as computer vision, natural language and speech processing, and game-playing. In this work, we investigated the performance of recent deep learning approaches on the laryngopharyngeal reflux (LPR) diagnosis task. (2) Methods: Our dataset is composed of 114 subjects with 37 pH-positive cases and 77 control cases. In contrast to prior work based on either reflux finding score (RFS) or pH monitoring, we directly take laryngoscope images as inputs to neural networks, as laryngoscopy is the most common and simple diagnostic method. The diagnosis task is formulated as a binary classification problem. We first tested a powerful backbone network that incorporates residual modules, an attention mechanism and data augmentation. Furthermore, recent methods in transfer learning and few-shot learning were investigated. (3) Results: On our dataset, the best test classification accuracy is 73.4%, while the best AUC value is 76.2%. (4) Conclusions: This study demonstrates that deep learning techniques can be applied to classify LPR images automatically. Although the number of pH-positive images used for training is limited, a deep network is still capable of learning discriminant features with the help of these techniques.
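Accuracy and AUC, the two figures reported in the results, measure different things: accuracy depends on a decision threshold, while AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen control. A minimal sketch of both follows, with hypothetical classifier scores and an assumed threshold of 0.5; none of the numbers come from the study.

```python
def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def accuracy(pos_scores, neg_scores, threshold=0.5):
    """Share of cases on the correct side of the decision threshold."""
    correct = sum(1 for p in pos_scores if p >= threshold)
    correct += sum(1 for n in neg_scores if n < threshold)
    return correct / (len(pos_scores) + len(neg_scores))

# Hypothetical scores for 3 positive cases and 4 controls.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2, 0.1]
auc_value = auc(pos, neg)
acc_value = accuracy(pos, neg)
print(auc_value, acc_value)
```

In this toy case 11 of the 12 positive-control pairs are ordered correctly, while 5 of the 7 cases fall on the correct side of the threshold. With imbalanced classes, such as the 37 positive and 77 control subjects described above, the two numbers can differ noticeably.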

Feature extraction-based image steganalysis using deep learning

WEENTECH Proceedings in Energy ◽

10.32438/wpe.182021 ◽

2021 ◽

pp. 188-198

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Secure Communication ◽

Information Technologies ◽

Network Models ◽

Error Rates ◽

Multimedia Data ◽

Secret Message ◽

Neural Network Models

Innovations in advanced information technologies have led to rapid delivery and sharing of multimedia data such as images and videos. Digital steganography offers the ability to secure communication and is imperative for the internet. Image steganography is essential to preserve confidential information in security applications. The secret image is embedded within pixels. The secret message is embedded using S-UNIWARD and WOW steganography. Hidden messages are revealed using steganalysis. Research interest has focused on both conventional and recent technological fields of steganalysis. This paper devises convolutional neural network models for steganalysis. The convolutional neural network (CNN) is one of the most frequently used deep learning techniques. The CNN is used to extract spatio-temporal information or features and to perform classification. We have compared the steganalysis outcome with AlexNet and SRNet on the same dataset. The steganalytic error rates are compared for different payloads.