Large-Scale Whale-Call Classification by Transfer Learning on Multi-Scale Waveforms and Time-Frequency Features

2019 ◽  
Vol 9 (5) ◽  
pp. 1020 ◽  
Author(s):  
Lilun Zhang ◽  
Dezhi Wang ◽  
Changchun Bao ◽  
Yongxian Wang ◽  
Kele Xu

Whale vocal calls carry information and characteristics that are important for classifying whale sub-populations and for related biological research. In this study, a data-driven approach based on pre-trained Convolutional Neural Networks (CNNs), using multi-scale waveforms and time-frequency feature representations, is developed to classify whale calls from a large open-source dataset recorded by sensors carried by whales. Specifically, the classification is carried out through transfer learning, using state-of-the-art CNN models pre-trained for computer vision. 1D raw waveforms and 2D log-mel features of the whale-call data are each used as CNN input. For raw waveform input, windows are applied to capture multiple sketches of a whale-call clip at different time scales, and the features from the different sketches are stacked for classification. When using the log-mel features, the delta and delta-delta features are also calculated to produce a 3-channel feature representation. During training, 4-fold cross-validation is employed to reduce overfitting, and the Mix-up technique is applied for data augmentation to further improve system performance. The results show that the proposed method improves accuracy by more than 20 percentage points for classification into 16 whale pods, compared with a baseline method that uses groups of 2D shape descriptors of spectrograms and Fisher discriminant scores on the same dataset. Moreover, classifications based on log-mel features achieve higher accuracies than those based directly on raw waveforms. A phylogeny graph is also produced to illustrate the relationships among the whale sub-populations.
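The two feature-side steps named in this abstract, the 3-channel log-mel representation and Mix-up augmentation, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names `stack_deltas` and `mixup` are hypothetical, the deltas are approximated with plain finite differences rather than the regression-based deltas common in audio toolkits, and the Beta parameter `alpha=0.2` is an assumed default.

```python
import numpy as np


def stack_deltas(log_mel):
    """Stack a (n_mels, n_frames) log-mel matrix with its first and second
    time differences into a (3, n_mels, n_frames) array."""
    # Assumption: finite differences along the time axis stand in for the
    # delta and delta-delta features mentioned in the abstract.
    delta = np.gradient(log_mel, axis=1)
    delta2 = np.gradient(delta, axis=1)
    return np.stack([log_mel, delta, delta2], axis=0)


def mixup(x, y, alpha=0.2, rng=None):
    """Blend every example (and its one-hot label) with a randomly chosen
    partner from the same batch, using one Beta-distributed weight."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)      # mixing weight in [0, 1]
    perm = rng.permutation(len(x))    # partner index for each example
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]
    return x_mix, y_mix, lam
```

A batch built from `stack_deltas` outputs has the same layout as a 3-channel image batch, which is what lets a CNN pre-trained for computer vision consume it without changing its input layer.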

2021 ◽  
Vol 11 (9) ◽  
pp. 3974
Author(s):  
Laila Bashmal ◽  
Yakoub Bazi ◽  
Mohamad Mahmoud Al Rahhal ◽  
Haikel Alhichri ◽  
Naif Al Ajlan

In this paper, we present an approach for the multi-label classification of remote sensing images based on data-efficient transformers. During the training phase, we generated a second view of each training image using data augmentation. Both the image and its augmented version were then reshaped into a sequence of flattened patches and fed to the transformer encoder. The encoder extracts a compact feature representation from each image with the help of a self-attention mechanism, which can handle the global dependencies between different regions of the high-resolution aerial image. On top of the encoder, we mounted two classifiers, a token classifier and a distiller classifier. During training, we minimized a global loss consisting of two terms, each corresponding to one of the two classifiers. In the test phase, we took the average of the two classifiers' outputs as the final class labels. Experiments on two datasets acquired over the cities of Trento and Civezzano, with a ground resolution of two centimeters, demonstrated the effectiveness of the proposed model.
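Two steps in this abstract lend themselves to a short sketch: turning an image into a sequence of flattened patches, and averaging the two classifier heads at test time. The code below is a minimal illustration, not the paper's model. The names `to_patch_sequence` and `predict_labels` are hypothetical, and two details are assumptions the abstract does not specify: the heads are averaged as sigmoid probabilities, and a label is assigned when the averaged probability reaches 0.5.

```python
import numpy as np


def to_patch_sequence(image, patch):
    """Reshape an (H, W, C) image into a (num_patches, patch*patch*C) array,
    with patches ordered row by row."""
    h, w, c = image.shape
    if h % patch or w % patch:
        raise ValueError("image sides must be divisible by the patch size")
    grid = image.reshape(h // patch, patch, w // patch, patch, c)
    grid = grid.transpose(0, 2, 1, 3, 4)  # (rows, cols, patch, patch, C)
    return grid.reshape(-1, patch * patch * c)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def predict_labels(token_logits, distill_logits, threshold=0.5):
    """Average the two heads' per-class probabilities and threshold them.
    Assumption: probability averaging and a 0.5 threshold."""
    probs = 0.5 * (sigmoid(token_logits) + sigmoid(distill_logits))
    return (probs >= threshold).astype(int)
```

Because the task is multi-label, each class is thresholded independently instead of taking a single arg-max over classes.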


2021 ◽  
pp. 1-10
Author(s):  
Gayatri Pattnaik ◽  
Vimal K. Shrivastava ◽  
K. Parvathi

Pests are a major threat to the economic growth of a country. Applying pesticide is the easiest way to control pest infection. However, excessive use of pesticide is hazardous to the environment. Recent advances in deep learning have paved the way for early detection and improved classification of pests in tomato plants, which will benefit farmers. This paper presents a comprehensive analysis of 11 state-of-the-art deep convolutional neural network (CNN) models in three configurations: transfer learning, fine-tuning, and scratch learning. Training in transfer learning and fine-tuning starts from pre-trained weights, whereas random weights are used in scratch learning. In addition, data augmentation has been explored to improve performance. Our dataset consists of 859 tomato pest images from 10 categories. The results demonstrate that the highest classification accuracy, 94.87%, was achieved in the transfer learning approach by the DenseNet201 model with data augmentation.


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Gong-Xu Luo ◽  
Ya-Ting Yang ◽  
Rui Dong ◽  
Yan-Hong Chen ◽  
Wen-Bo Zhang

Neural machine translation (NMT) for low-resource languages has drawn great attention in recent years. In this paper, we propose a joint back-translation and transfer learning method for low-resource languages. It is widely recognized that data augmentation and transfer learning are both straightforward and effective approaches to low-resource problems. However, existing methods, which use only one of these approaches, limit the capacity of NMT models for low-resource problems. To make full use of the advantages of existing methods and further improve translation performance for low-resource languages, we propose a new method that integrates back-translation with mainstream transfer learning architectures. The method not only initializes the NMT model by transferring parameters from pretrained models, but also generates synthetic parallel data by translating large-scale target-side monolingual data, which boosts the fluency of translations. We conduct experiments to explore the effectiveness of the joint method by incorporating back-translation into the parent-child and hierarchical transfer learning architectures. In addition, different preprocessing and training methods are explored to obtain better performance. Experimental results on Uygur-Chinese and Turkish-English translation demonstrate the superiority of the proposed method over baselines that use a single method.
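The two ingredients combined in this abstract can be sketched in a few lines. This is a hedged illustration only: `back_translate` and `init_child_from_parent` are hypothetical names, the reverse (target-to-source) translator is passed in as an arbitrary callable instead of a real NMT model, and parameters are represented as a plain dictionary of NumPy arrays. Copying only entries whose name and shape match is an assumption made here for simplicity, not a detail taken from the paper.

```python
import numpy as np


def back_translate(target_sentences, target_to_source):
    """Pair each monolingual target sentence with a synthetic source
    sentence produced by a reverse (target-to-source) translator."""
    return [(target_to_source(t), t) for t in target_sentences]


def init_child_from_parent(child_params, parent_params):
    """Initialise the child model from the parent model's parameters.
    Assumption: a parameter is transferred only when its name and shape
    match; everything else keeps the child's own initial value."""
    merged = {}
    for name, value in child_params.items():
        parent = parent_params.get(name)
        if parent is not None and parent.shape == value.shape:
            merged[name] = parent.copy()
        else:
            merged[name] = value
    return merged
```

In a joint setup, the synthetic pairs would be added to the genuine parallel data before training the child model that `init_child_from_parent` has initialised.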


Author(s):  
Na Wu ◽  
Fei Liu ◽  
Fanjia Meng ◽  
Mu Li ◽  
Chu Zhang ◽  
...  

Rapid variety classification of crop seeds is important for breeders, who screen out seeds with specific traits, and for market regulators, who assess seed purity. However, collecting high-quality, large-scale samples is costly in some cases, making it difficult to build an accurate classification model. This study aimed to explore a rapid and accurate method for variety classification of different crop seeds under sample-limited conditions, based on hyperspectral imaging (HSI) and deep transfer learning. Three deep neural networks with typical structures were designed based on a sample-rich Pea dataset. Having obtained the highest accuracy of 99.57%, VGG-MODEL was transferred to classify four target datasets (rice, oat, wheat, and cotton) with limited samples. The deep transferred model achieved accuracies of 95, 99, 80.8, and 83.86% on the four datasets, respectively. Across training sets of different sizes, the deep transferred model always outperformed the traditional methods. Visualization of the deep features and classification results confirmed the portability of the shared features of seed spectra, providing an interpretable method for rapid and accurate variety classification of crop seeds. The overall results showed the clear advantage of HSI combined with deep transfer learning for seed detection under sample-limited conditions. This study provides a new approach to facilitating crop germplasm screening when samples are scarce, and to detecting other qualities of crop seeds under sample-limited conditions based on HSI.


Author(s):  
J. Yan ◽  
E. Guilbert ◽  
E. Saux

On nautical charts, undersea features are portrayed by sets of soundings (depth points) and isobaths (depth contours) from which map readers can interpret landforms. Different techniques have been developed for automatic sounding selection and isobath generalisation from a sounding set. These methods are mainly used to generate a new chart from the bathymetric database or from a large-scale chart through selection and simplification. However, part of the process consists of selecting and emphasising undersea features on the chart according to their relevance to navigation. Automating it requires classifying the features from the set of isobaths and soundings, and generalising them through the selection and application of a set of operators according not only to geometrical constraints but also to semantic constraints.

The objective of this paper is to define an ontology formalising undersea feature representation and the generalisation process that achieves this representation on a nautical chart. The ontology is built in two parts, addressing on the one hand the definition of the features and on the other hand their generalisation. The central concept is the undersea feature, around which other concepts are organised. The generalisation process is driven by the features, the objective being to select or emphasise information according to its meaning for a specific purpose. The ontologies were developed in Protégé, and a bathymetric database server integrating the ontology was implemented. A generalisation platform was also developed, and examples of representations obtained with the platform are presented. Finally, current results and ongoing research are discussed.


Author(s):  
Qaiser Abbas ◽  
Farheen Ramzan ◽  
Muhammad Usman Ghani

Acral melanoma (AM) is a rare and lethal type of skin cancer. It can be diagnosed by expert dermatologists using dermoscopic imaging. Diagnosing melanoma is challenging for dermatologists because of the very minor differences between melanoma and non-melanoma cancers. Most research on skin cancer diagnosis concerns the binary classification of lesions into melanoma and non-melanoma. However, to date, limited research has been conducted on the classification of melanoma subtypes. The current study investigated the effectiveness of dermoscopy and deep learning in classifying melanoma subtypes such as AM. In this study, we present a novel deep learning model developed to classify skin cancer. We utilized a dermoscopic image dataset from the Yonsei University Health System, South Korea, for the classification of skin lesions. Various image processing and data augmentation techniques were applied to develop a robust automated system for AM detection. Our custom-built model is a seven-layered deep convolutional network that was trained from scratch. Additionally, transfer learning was used for comparison with our model: AlexNet and ResNet-18 were modified, fine-tuned, and trained on the same dataset. Our proposed model achieved improved results, with an accuracy of more than 90% for both AM and benign nevus. Additionally, using the transfer learning approach, we achieved an average accuracy of nearly 97%, which is comparable to that of state-of-the-art methods. From our analysis and results, we found that our model performed well and was able to classify skin cancer effectively. Our results show that the proposed system can be used by dermatologists in the clinical decision-making process for the early diagnosis of AM.


2020 ◽  
Author(s):  
Than Le

In this paper, we focus on a simple data-driven approach to deep learning, based on an implementation of the Mask R-CNN module and a deeper analysis of dataset manipulation. We first apply affine transformations and projective representations for data augmentation, in order to enlarge the dataset manually, building on the state of the art in computer vision. We then evaluate our method concretely by visualizing our datasets and testing several methods, in order to understand intelligent data analysis in object detection and segmentation, using more than 5000 images containing many similar objects. The results illustrate the efficiency of the approach for small applications such as food recognition, grasping, and manipulation in robotics.
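For detection and segmentation data, an affine augmentation has to move the annotations (box corners, mask polygon vertices) together with the pixels. The sketch below shows only that geometric part, under stated assumptions: `affine_matrix` and `transform_points` are hypothetical helpers, the transform is restricted to rotation, uniform scale and translation, and image resampling is left out. Working in homogeneous coordinates means the same `transform_points` call also accepts a general 3x3 projective matrix.

```python
import numpy as np


def affine_matrix(angle_deg=0.0, scale=1.0, tx=0.0, ty=0.0):
    """Build a 3x3 homogeneous matrix: rotate, scale uniformly, then translate."""
    theta = np.deg2rad(angle_deg)
    c, s = scale * np.cos(theta), scale * np.sin(theta)
    return np.array([[c, -s, tx],
                     [s, c, ty],
                     [0.0, 0.0, 1.0]])


def transform_points(points, matrix):
    """Apply a 3x3 homogeneous transform to an (N, 2) array of x, y points."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # append w = 1
    mapped = homog @ matrix.T
    # Dividing by w is a no-op for affine matrices and required for projective ones.
    return mapped[:, :2] / mapped[:, 2:3]
```

For example, `transform_points(polygon, affine_matrix(angle_deg=15, tx=10))` would give the vertices of a mask polygon after a 15-degree rotation about the origin and a 10-pixel shift in x.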


2021 ◽  
Vol 11 (1) ◽  
pp. 23
Author(s):  
Ozgun Akcay ◽  
Ahmet Cumhur Kinaci ◽  
Emin Ozgur Avsar ◽  
Umut Aydar

In geospatial applications such as urban planning and land use management, automatic detection and classification of earth objects are essential and primary subjects. Among the major semantic segmentation algorithms, DeepLabV3+ stands out as a state-of-the-art CNN. Although the DeepLabV3+ model is capable of extracting multi-scale contextual information, there is still a need for multi-stream architectures and different training approaches that can leverage multi-modal geographic datasets. In this study, a new end-to-end dual-stream architecture for geospatial imagery was developed based on the DeepLabV3+ architecture. The results show that spectral datasets other than RGB increased semantic segmentation accuracy when used as additional channels alongside height information. Furthermore, both data augmentation and the Tversky loss function, which is sensitive to imbalanced data, yielded better overall accuracies. The new dual-stream architecture produced overall semantic segmentation accuracies of 88.87% and 87.39% on the Potsdam and Vaihingen datasets, respectively. Finally, the study shows that enhancing established semantic segmentation networks has great potential to provide higher model performance, and it explicitly demonstrates the contribution to segmentation of geospatial data used as a second stream alongside RGB.
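The Tversky loss mentioned in this abstract is commonly written as one minus the Tversky index, TP / (TP + alpha * FP + beta * FN). The sketch below is a minimal binary, soft-count version in NumPy, not the study's training code. The function name `tversky_loss`, the default weights and the smoothing term `eps` are assumptions for illustration.

```python
import numpy as np


def tversky_loss(probs, targets, alpha=0.5, beta=0.5, eps=1e-7):
    """Return 1 - Tversky index for predicted probabilities and binary targets.
    alpha weights false positives and beta weights false negatives;
    alpha = beta = 0.5 reduces to the Dice loss."""
    probs = np.asarray(probs, dtype=float).ravel()
    targets = np.asarray(targets, dtype=float).ravel()
    tp = np.sum(probs * targets)          # soft true positives
    fp = np.sum(probs * (1.0 - targets))  # soft false positives
    fn = np.sum((1.0 - probs) * targets)  # soft false negatives
    index = (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return 1.0 - index
```

Choosing `beta` larger than `alpha` penalises missed pixels of a class more than false alarms, which is one way such a loss can be tuned for imbalanced classes.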

