A Convolutional Neural Networks-Based Approach for Texture Directionality Detection

Marcin Kociołek; Michał Kozłowski; Antonio Cardone

doi:10.3390/s22020562

Sensors ◽

10.3390/s22020562 ◽

2022 ◽

Vol 22 (2) ◽

pp. 562

Author(s):

Marcin Kociołek ◽

Michał Kozłowski ◽

Antonio Cardone

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Visual Analysis ◽

Real Life ◽

Image Data ◽

Training Data ◽

Training Dataset ◽

Gradient Orientation ◽

Image Characteristic ◽

Occurrence Matrix

The perceived texture directionality is an important, not fully explored image characteristic. In many applications texture directionality detection is of fundamental importance. Several approaches have been proposed, such as the fast Fourier-based method. We recently proposed a method based on the interpolated grey-level co-occurrence matrix (iGLCM), robust to image blur and noise but slower than the Fourier-based method. Here we test the applicability of convolutional neural networks (CNNs) to texture directionality detection. To obtain the large amount of training data required, we built a training dataset consisting of synthetic textures with known directionality and varying perturbation levels. Subsequently, we defined and tested shallow and deep CNN architectures. We present the test results focusing on the CNN architectures and their robustness with respect to image perturbations. We identify the best performing CNN architecture, and compare it with the iGLCM, the Fourier and the local gradient orientation methods. We find that the accuracy of CNN is lower, yet comparable to the iGLCM, and it outperforms the other two methods. As expected, the CNN method shows the highest computing speed. Finally, we demonstrate the best performing CNN on real-life images. Visual analysis suggests that the learned patterns generalize to real-life image data. Hence, CNNs represent a promising approach for texture directionality detection, warranting further investigation.
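
As an illustration of synthetic textures with known directionality, the sketch below generates a sinusoidal grating at a chosen angle and recovers that angle with a simple structure-tensor (gradient orientation) estimate. It is not the authors' dataset generator, nor their CNN, iGLCM, or Fourier method. The function names `synthetic_texture` and `dominant_direction`, the grating model, and the `period` parameter are assumptions made for this example.

```python
import math

def synthetic_texture(size, angle_deg, period=16.0):
    """Return a size x size sinusoidal grating (hypothetical texture model).

    Stripes run along angle_deg, measured from the x axis in image coordinates.
    """
    theta = math.radians(angle_deg)
    # Intensity varies along the normal to the stripe direction.
    nx, ny = -math.sin(theta), math.cos(theta)
    return [[math.sin(2 * math.pi * (x * nx + y * ny) / period)
             for x in range(size)] for y in range(size)]

def dominant_direction(img):
    """Estimate the stripe direction in degrees, in [0, 180), from image gradients."""
    jxx = jyy = jxy = 0.0
    h, w = len(img), len(img[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central-difference gradients.
            gx = (img[y][x + 1] - img[y][x - 1]) / 2
            gy = (img[y + 1][x] - img[y - 1][x]) / 2
            jxx += gx * gx
            jyy += gy * gy
            jxy += gx * gy
    # Dominant gradient orientation from the structure tensor;
    # the stripes are perpendicular to it.
    grad_angle = 0.5 * math.atan2(2 * jxy, jxx - jyy)
    return (math.degrees(grad_angle) + 90.0) % 180.0

texture = synthetic_texture(32, 30.0)
estimate = dominant_direction(texture)
```

For a clean grating the estimate should fall within a degree or two of the generating angle. The small bias comes from the finite-difference gradients. Adding blur or noise to `texture` before estimation is one way to probe robustness of this kind of baseline.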

Automatic classification of ovarian cancer types from cytological images using deep convolutional neural networks

Bioscience Reports ◽

10.1042/bsr20180289 ◽

2018 ◽

Vol 38 (3) ◽

Cited By ~ 11

Author(s):

Miao Wu ◽

Chuanbo Yan ◽

Huiqiang Liu ◽

Qian Liu

Keyword(s):

Neural Networks ◽

Ovarian Cancer ◽

Convolutional Neural Networks ◽

Image Data ◽

Training Data ◽

Serous Carcinoma ◽

Deep Convolutional Neural Networks ◽

Ovarian Cancers ◽

Cancer Types

Ovarian cancer is one of the most common gynecologic malignancies. Accurate classification of ovarian cancer types (serous carcinoma, mucinous carcinoma, endometrioid carcinoma, clear cell carcinoma) is an essential part of the differential diagnosis. Computer-aided diagnosis (CADx) can provide useful advice to help pathologists reach the correct diagnosis. In our study, we employed a deep convolutional neural network (DCNN) based on AlexNet to automatically classify the different types of ovarian cancer from cytological images. The DCNN consists of five convolutional layers, three max pooling layers, and two fully connected layers. We then trained the model separately on two groups of input data: the original image data, and augmented image data produced by image enhancement and image rotation. Testing by 10-fold cross-validation showed that the accuracy of the classification models improved from 72.76% to 78.20% when augmented images were used as training data. The developed scheme was useful for classifying ovarian cancers from cytological images.
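
The 10-fold cross-validation named in this abstract can be sketched as an index split. This is a minimal illustration, not the authors' evaluation code. The helper name `k_fold_indices` is hypothetical, and a real pipeline would normally shuffle (and often stratify) the samples before splitting.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

folds = list(k_fold_indices(25, k=10))
```

Each sample appears in exactly one test fold, so averaging the per-fold accuracy gives one estimate that uses every sample for testing once.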

Towards Storytelling with Animated Pictorial Map Objects – An Experiment with Convolutional Neural Networks

Abstracts of the ICA ◽

10.5194/ica-abs-1-324-2019 ◽

2019 ◽

Vol 1 ◽

pp. 1-2

Author(s):

Raimund Schnürer ◽

Cengiz Öztireli ◽

René Sieber ◽

Lorenz Hurni

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Positive Emotions ◽

Personal Narratives ◽

Training Data ◽

Training Dataset ◽

Body Parts ◽

Human Pose ◽

Map Objects ◽

Mixed Approach

Storytelling is a popular technique applied in many fields including cartography. On the one hand, stories can be told intrinsically by map elements per se. An often quoted example in this regard is Minard’s map of Napoleon’s Russian Campaign (e.g. Denil 2017) which depicts the loss of troops in a spatio-temporally aligned Sankey diagram. On the other hand, stories can be conveyed extrinsically by multimedia elements alongside the map. For instance, the travel route of a soldier during the First World War can be shown on a temporally navigable map and accompanied by photos, videos, diary entries, and military forms (Cartwright & Field 2015). In this experiment, we follow a mixed approach where human figures on the map will be animated and address the map reader via speech bubbles. As source data, we consider pictorial maps from digital map libraries (e.g. the David Rumsey Map Collection) and social media websites (e.g. Pinterest). These maps contain realistically drawn representations which are in our opinion very suitable for communicating personal narratives.

We present a workflow with convolutional neural networks (CNNs), a type of artificial neural network primarily used for image recognition, to detect human figures in pictorial maps. In particular, we use Mask R-CNN (He et al. 2017) for identifying bounding boxes and silhouettes of figures. For the segmentation of body parts (i.e. head, torso, arms, hands, legs, feet) and the detection of joints (i.e. nose, thorax, shoulders, elbows, wrists, hip, knees, ankles), we combine the U-Net architecture (Ronneberger et al. 2015) with a ResNet (He et al. 2015). In a final step, we implement a simple 2D animation of waving and walking characters and add speech bubbles near head positions. As a first training dataset, we created parametric SVG character models with different postures originating from the MPII Human Pose Dataset. The second training dataset contains real image human body parts from the PASCAL-Part Dataset. Humans from both datasets are placed randomly on pictorial maps without any other figures. Preliminary results show that the validation accuracy is the highest when synthetic and real training datasets are combined. We implemented the CNNs with TensorFlow’s Keras API, whereas training data and animations are generated with the web browser.

Our approach enables giving storytellers a physical presence and anchoring them spatially within the map. By animating characters, we can gain the map reader’s attention and guide them to special and possibly hidden places (e.g. in touristic maps). By telling personal stories, we may raise the interest of people to explore the maps (e.g. in museums) and give a better understanding of the often abstractly encoded information in maps (e.g. in atlases). When a certain aesthetic value has been reached, pictorial objects may also generate positive emotions so that anxieties about the complexity of data may become secondary (e.g. in education). Overall, the goal of our work is to engage map readers, give them valuable support while studying a map, and create long-lasting memories of the map content.

Batch Similarity Based Triplet Loss Assembled into Light-Weighted Convolutional Neural Networks for Medical Image Classification

Sensors ◽

10.3390/s21030764 ◽

2021 ◽

Vol 21 (3) ◽

pp. 764

Author(s):

Zhiwen Huang ◽

Quan Zhou ◽

Xingxing Zhu ◽

Xuming Zhang

Keyword(s):

Neural Networks ◽

Image Classification ◽

Convolutional Neural Networks ◽

Medical Image ◽

Image Data ◽

Classification Performance ◽

Training Data ◽

Deep Convolutional Neural Networks ◽

Triplet Loss ◽

Medical Image Classification

In many medical image classification tasks, there is insufficient image data for deep convolutional neural networks (CNNs) to overcome the over-fitting problem. Light-weighted CNNs are easy to train, but they usually have relatively poor classification performance. To improve the classification ability of light-weighted CNN models, we propose a novel batch similarity-based triplet loss to guide the CNNs in learning their weights. The proposed loss uses the similarity among multiple samples in the input batches to evaluate the distribution of the training data. Reducing the proposed loss increases the similarity among images of the same category and reduces the similarity among images of different categories. In addition, it can be easily assembled into regular CNNs. To evaluate the performance of the proposed loss, we ran experiments on chest X-ray images and skin rash images, comparing it with several other losses on popular light-weighted CNN models: EfficientNet, MobileNet, ShuffleNet and PeleeNet. The results demonstrate the applicability and effectiveness of our method in terms of classification accuracy, sensitivity and specificity.
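
For orientation, the sketch below shows the standard triplet margin loss on which triplet-based methods build. It is not the batch similarity-based variant proposed in this paper, whose exact formulation the abstract does not give. The function names, the Euclidean distance, and the `margin` value are assumptions chosen for illustration.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)

# Same-class sample close to the anchor, other-class sample far away: no penalty.
well_separated = triplet_loss((0.0, 0.0), (0.0, 1.0), (3.0, 4.0))
# Same-class sample far away, other-class sample close: penalised.
badly_separated = triplet_loss((0.0, 0.0), (3.0, 4.0), (0.0, 1.0))
```

Minimising this quantity pulls same-category embeddings together and pushes different-category embeddings apart, which matches the effect the abstract attributes to its loss.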

Interpretation of Swedish Sign Language Using Convolutional Neural Networks and Transfer Learning

SN Computer Science ◽

10.1007/s42979-021-00612-w ◽

2021 ◽

Vol 2 (3) ◽

Author(s):

Gustaf Halvardsson ◽

Johanna Peterson ◽

César Soto-Valero ◽

Benoit Baudry

Keyword(s):

Neural Networks ◽

Sign Language ◽

Transfer Learning ◽

Convolutional Neural Networks ◽

Web Application ◽

Training Dataset ◽

Motion Processing ◽

Image Perception ◽

Sign Languages ◽

High Level

The automatic interpretation of sign languages is a challenging task, as it requires the usage of high-level vision and high-level motion processing systems for providing accurate image perception. In this paper, we use Convolutional Neural Networks (CNNs) and transfer learning to enable computers to interpret signs of the Swedish Sign Language (SSL) hand alphabet. Our model consists of the implementation of a pre-trained InceptionV3 network, and the usage of the mini-batch gradient descent optimization algorithm. We rely on transfer learning during the pre-training of the model and its data. The final accuracy of the model, based on 8 study subjects and 9400 images, is 85%. Our results indicate that the usage of CNNs is a promising approach to interpret sign languages, and transfer learning can be used to achieve high testing accuracy despite using a small training dataset. Furthermore, we describe the implementation details of our model to interpret signs as a user-friendly web application.

Data Augmentation Methods Applying Grayscale Images for Convolutional Neural Networks in Machine Vision

Applied Sciences ◽

10.3390/app11156721 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6721

Author(s):

Jinyeong Wang ◽

Sanghwan Lee

Keyword(s):

Neural Networks ◽

Machine Vision ◽

Object Detection ◽

Image Classification ◽

Convolutional Neural Networks ◽

Data Augmentation ◽

Image Data ◽

Manufacturing Productivity ◽

Smart Factories ◽

Grayscale Images

As smart factories turn to automated surface inspection to increase manufacturing productivity, the demand for machine vision is rising. Recently, convolutional neural networks (CNNs) have demonstrated outstanding performance and solved many problems in the field of computer vision. As a result, many machine vision systems adopt CNNs for surface defect inspection. In this study, we developed an effective data augmentation method for grayscale images in CNN-based machine vision with mono cameras. Our method can be applied to grayscale industrial images, and we demonstrated outstanding performance in image classification and object detection tasks. The main contributions of this study are as follows: (1) We propose a data augmentation method that can be performed when training CNNs with industrial images taken with mono cameras. (2) We demonstrate that image classification or object detection performance is better when training with the industrial image data augmented by the proposed method. Through the proposed method, many machine-vision-related problems using mono cameras can be effectively solved by using CNNs.

Generation and Annotation of Simulation-Real Ship Images for Convolutional Neural Networks Training and Testing

Applied Sciences ◽

10.3390/app11135931 ◽

2021 ◽

Vol 11 (13) ◽

pp. 5931

Author(s):

Ji’an You ◽

Zhaozheng Hu ◽

Chao Peng ◽

Zhiqiang Wang

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Image Annotation ◽

Three Dimensional ◽

Image Data ◽

Detection Algorithm ◽

Simulation Software ◽

High Quality ◽

Annotation Method ◽

Detection Of Objects

Large amounts of high-quality image data are the basis and premise of high-accuracy object detection with convolutional neural networks (CNNs). It is challenging to collect varied, high-quality ship image data in the marine environment. To address this, a novel CNN-based method is proposed to generate a large number of high-quality ship images. We obtained ship images with different perspectives and different sizes by adjusting the ships’ postures and sizes in three-dimensional (3D) simulation software; the 3D ship data were then transformed into 2D ship images according to the principle of pinhole imaging. We selected specific experimental scenes as background images, and the target ships of the 2D ship images were superimposed onto the background images to generate “Simulation–Real” ship images (named SRS images hereafter). Additionally, an image annotation method based on SRS images was designed. Finally, a CNN-based target detection algorithm was trained and tested on the generated SRS images. The proposed method is suitable for quickly generating a large number of high-quality ship image samples, together with the corresponding annotation data, to significantly improve the accuracy of ship detection. For labeling the SRS images, the proposed annotation method is superior to annotation with the image annotation software Label-me and Label-img.
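
The pinhole imaging principle used to turn 3D ship data into 2D images can be written as u = f·X/Z + c_x, v = f·Y/Z + c_y. The sketch below applies that textbook formula. It is not the authors' simulation pipeline, and the function name `project_pinhole`, the focal length, and the principal point are hypothetical values chosen for the example.

```python
def project_pinhole(point, focal_px, cx, cy):
    """Project a 3D point in camera coordinates onto the image plane.

    point    -- (X, Y, Z) with Z the distance along the optical axis
    focal_px -- focal length expressed in pixels (assumed value)
    cx, cy   -- principal point in pixels (assumed value)
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_px * x / z + cx, focal_px * y / z + cy)

# A point 4 units in front of a camera with an 800 px focal length.
pixel = project_pinhole((1.0, 2.0, 4.0), focal_px=800.0, cx=320.0, cy=240.0)
# pixel == (520.0, 640.0)
```

Moving the same point farther from the camera (larger Z) shrinks its offset from the principal point, which is how changing a ship's distance in the simulation changes its apparent size in the image.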

Identifying Habitat Elements from Bird Images Using Deep Convolutional Neural Networks

Animals ◽

10.3390/ani11051263 ◽

2021 ◽

Vol 11 (5) ◽

pp. 1263

Author(s):

Zhaojun Wang ◽

Jiangning Wang ◽

Congtian Lin ◽

Yan Han ◽

Zhaosheng Wang ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Digital Technology ◽

Rapid Development ◽

Image Data ◽

Image Database ◽

Test Accuracy ◽

Practical Application ◽

Accuracy Rate ◽

Deep Convolutional Neural Networks

With the rapid development of digital technology, bird images have become an important part of ornithology research data. However, due to the rapid growth of bird image data, it has become a major challenge to effectively process such a large amount of data. In recent years, deep convolutional neural networks (DCNNs) have shown great potential and effectiveness in a variety of tasks regarding the automatic processing of bird images. However, no research has been conducted on the recognition of habitat elements in bird images, which is of great help when extracting habitat information from bird images. Here, we demonstrate the recognition of habitat elements using four DCNN models trained end-to-end directly based on images. To carry out this research, an image database called Habitat Elements of Bird Images (HEOBs-10) and composed of 10 categories of habitat elements was built, making future benchmarks and evaluations possible. Experiments showed that good results can be obtained by all the tested models. ResNet-152-based models yielded the best test accuracy rate (95.52%); the AlexNet-based model yielded the lowest test accuracy rate (89.48%). We conclude that DCNNs could be efficient and useful for automatically identifying habitat elements from bird images, and we believe that the practical application of this technology will be helpful for studying the relationships between birds and habitat elements.

Hybrid Mamdani Fuzzy Rules and Convolutional Neural Networks for Analysis and Identification of Animal Images

Computation ◽

10.3390/computation9030035 ◽

2021 ◽

Vol 9 (3) ◽

pp. 35

Author(s):

Hind R. Mohammed ◽

Zahir M. Hussain

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

High Speed ◽

Moving Objects ◽

Real Life ◽

Fuzzy Rules ◽

High Rate ◽

Moving Images ◽

Animal Images

Accurate, fast, and automatic detection and classification of animal images is challenging, but it is much needed for many real-life applications. This paper presents a hybrid model of Mamdani Type-2 fuzzy rules and convolutional neural networks (CNNs) applied to identify and distinguish various animals using different datasets consisting of about 27,307 images. The proposed system uses fuzzy rules to detect the image and then applies the CNN model to predict the object’s category. The CNN model was trained and tested on more than 21,846 pictures of animals. In the experiments, the proposed method offered high speed and efficiency, which could be a prominent aspect in designing image-processing systems based on Type-2 fuzzy rules characterization for identifying fixed and moving images. The proposed fuzzy method obtained an accuracy rate of 98% for identifying and recognizing moving objects and a mean square error of 0.1183464, lower than in other studies. It also achieved a very high rate of correctly predicting malicious objects, with a recall of 0.98121 and a precision of 1. The test’s accuracy was evaluated using the F1 score, which reached 0.99052.
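
The reported F1 score can be checked against the reported precision and recall, since F1 is their harmonic mean: F1 = 2PR / (P + R). The helper below is a minimal sketch with a hypothetical name, `f1_score`, and uses only the figures quoted in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Figures quoted in the abstract: precision = 1, recall = 0.98121.
f1 = f1_score(1.0, 0.98121)
print(round(f1, 5))  # prints 0.99052
```

The result agrees with the 0.99052 stated in the abstract, so the three reported figures are mutually consistent.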

Image Classification for the Automatic Feature Extraction in Human Worn Fashion Data

Mathematics ◽

10.3390/math9060624 ◽

2021 ◽

Vol 9 (6) ◽

pp. 624

Author(s):

Stefan Rohrmanstorfer ◽

Mikhail Komarov ◽

Felix Mödritscher

Keyword(s):

Neural Networks ◽

Feature Extraction ◽

Image Classification ◽

Convolutional Neural Networks ◽

Data Augmentation ◽

State Of The Art ◽

Image Data ◽

Classification Model ◽

Upper Body ◽

Automatic Feature Extraction

With the ever-increasing amount of image data, it has become necessary to automatically look for and process information in these images. As fashion is captured in images, the fashion sector provides the perfect foundation to be supported by the integration of a service or application that is built on an image classification model. In this article, the state of the art for image classification is analyzed and discussed. Based on this analysis, four different approaches are implemented to extract features from fashion data. For this purpose, a human-worn fashion dataset with 2567 images was created and then significantly enlarged by image operations. The results show that convolutional neural networks are the undisputed standard for classifying images, and that TensorFlow is the best library to build them. Moreover, through the introduction of dropout layers, data augmentation and transfer learning, model overfitting was successfully prevented, and it was possible to incrementally improve the validation accuracy on the created dataset from an initial 69% to a final validation accuracy of 84%. More distinct apparel like trousers, shoes and hats were better classified than other upper body clothes.

Text Localization in Scientific Figures using Fully Convolutional Neural Networks on Limited Training Data

Proceedings of the ACM Symposium on Document Engineering 2019 - DocEng '19 ◽

10.1145/3342558.3345396 ◽

2019 ◽

Cited By ~ 1

Author(s):

Morten Jessen ◽

Falk Böschen ◽

Ansgar Scherp

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Training Data ◽

Text Localization ◽

Fully Convolutional Neural Networks
