scholarly journals Improving a neural network model for semantic segmentation of images of monitored objects in aerial photographs

2021 ◽  
Vol 6 (2 (114)) ◽  
pp. 86-95
Author(s):  
Vadym Slyusar ◽  
Mykhailo Protsenko ◽  
Anton Chernukha ◽  
Vasyl Melkin ◽  
Olena Petrova ◽  
...  

This paper considers a model of the neural network for semantically segmenting the images of monitored objects on aerial photographs. Unmanned aerial vehicles monitor objects by analyzing (processing) aerial photographs and video streams. The results of aerial photography are processed by the operator in a manual mode; however, there are objective difficulties associated with the operator's handling a large number of aerial photographs, which is why it is advisable to automate this process. Analysis of the models showed that to perform the task of semantic segmentation of images of monitored objects on aerial photographs, the U-Net model (Germany), which is a convolutional neural network, is most suitable as a basic model. This model has been improved by using a wavelet layer and the optimal values of the model training parameters: speed (step) ‒ 0.001, the number of epochs ‒ 60, the optimization algorithm ‒ Adam. The training was conducted by a set of segmented images acquired from aerial photographs (with a resolution of 6,000×4,000 pixels) by the Image Labeler software in the mathematical programming environment MATLAB R2020b (USA). As a result, a new model for semantically segmenting the images of monitored objects on aerial photographs with the proposed name U-NetWavelet was built. The effectiveness of the improved model was investigated using an example of processing 80 aerial photographs. The accuracy, sensitivity, and segmentation error were selected as the main indicators of the model's efficiency. The use of a modified wavelet layer has made it possible to adapt the size of an aerial photograph to the parameters of the input layer of the neural network, to improve the efficiency of image segmentation in aerial photographs; the application of a convolutional neural network has allowed this process to be automatic.

2020 ◽  
Vol 1682 ◽  
pp. 012077
Author(s):  
Tingting Li ◽  
Chunshan Jiang ◽  
Zhenqi Bian ◽  
Mingchang Wang ◽  
Xuefeng Niu

2021 ◽  
Vol 3 (1) ◽  
pp. 8-14
Author(s):  
D. V. Fedasyuk ◽  
◽  
T. V. Demianets ◽  

A melanoma is the deadliest skin cancer, so early diagnosis can provide a positive prognosis for treatment. Modern methods for early detecting melanoma on the image of the tumor are considered, and their advantages and disadvantages are analyzed. The article demonstrates a prototype of a mobile application for the detection of melanoma on the image of a mole based on a convolutional neural network, which is developed for the Android operating system. The mobile application contains melanoma detection functions, history of the previous examinations and a gallery with images of the previous examinations grouped by the location of the lesion. The HAM10000-based training dataset has been supplemented with the images of melanoma from the archive of The International Skin Imaging Collaboration to eliminate class imbalances and improve network accuracy. The search for existing neural networks that provide high accuracy was conducted, and VGG16, MobileNet, and NASNetMobile neural networks have been selected for research. Transfer learning and fine-tuning has been applied to the given neural networks to adapt the networks for the task of skin lesion classification. It is established that the use of these techniques allows to obtain high accuracy of the neural network for this task. The process of converting a convolutional neural network to an optimized Flatbuffer format using TensorFlow Lite for placement and use on a mobile device is described. The performance characteristics of the selected neural networks on the mobile device are evaluated according to the classification time on the CPU and GPU and the amount of memory occupied by the file of a single network is compared. The neural network file size was compared before and after conversion. It has been shown that the use of the TensorFlow Lite converter significantly reduces the file size of the neural network without affecting its accuracy by using an optimized format. The results of the study indicate a high speed of application and compactness of networks on the device, and the use of graphical acceleration can significantly decrease the image classification time of the tumor. According to the analyzed parameters, NASNetMobile was selected as the optimal neural network to be used in the mobile application of melanoma detection.


2021 ◽  
Vol 7 (10) ◽  
pp. 850
Author(s):  
Veena Mayya ◽  
Sowmya Kamath Shevgoor ◽  
Uma Kulkarni ◽  
Manali Hazarika ◽  
Prabal Datta Barua ◽  
...  

Microbial keratitis is an infection of the cornea of the eye that is commonly caused by prolonged contact lens wear, corneal trauma, pre-existing systemic disorders and other ocular surface disorders. It can result in severe visual impairment if improperly managed. According to the latest World Vision Report, at least 4.2 million people worldwide suffer from corneal opacities caused by infectious agents such as fungi, bacteria, protozoa and viruses. In patients with fungal keratitis (FK), often overt symptoms are not evident, until an advanced stage. Furthermore, it has been reported that clear discrimination between bacterial keratitis and FK is a challenging process even for trained corneal experts and is often misdiagnosed in more than 30% of the cases. However, if diagnosed early, vision impairment can be prevented through early cost-effective interventions. In this work, we propose a multi-scale convolutional neural network (MS-CNN) for accurate segmentation of the corneal region to enable early FK diagnosis. The proposed approach consists of a deep neural pipeline for corneal region segmentation followed by a ResNeXt model to differentiate between FK and non-FK classes. The model trained on the segmented images in the region of interest, achieved a diagnostic accuracy of 88.96%. The features learnt by the model emphasize that it can correctly identify dominant corneal lesions for detecting FK.


2021 ◽  
Vol 5 (2) ◽  
pp. 312-318
Author(s):  
Rima Dias Ramadhani ◽  
Afandi Nur Aziz Thohari ◽  
Condro Kartiko ◽  
Apri Junaidi ◽  
Tri Ginanjar Laksana ◽  
...  

Waste is goods / materials that have no value in the scope of production, where in some cases the waste is disposed of carelessly and can damage the environment. The Indonesian government in 2019 recorded waste reaching 66-67 million tons, which is higher than the previous year, which was 64 million tons. Waste is differentiated based on its type, namely organic and anorganic waste. In the field of computer science, the process of sensing the type waste can be done using a camera and the Convolutional Neural Networks (CNN) method, which is a type of neural network that works by receiving input in the form of images. The input will be trained using CNN architecture so that it will produce output that can recognize the object being inputted. This study optimizes the use of the CNN method to obtain accurate results in identifying types of waste. Optimization is done by adding several hyperparameters to the CNN architecture. By adding hyperparameters, the accuracy value is 91.2%. Meanwhile, if the hyperparameter is not used, the accuracy value is only 67.6%. There are three hyperparameters used to increase the accuracy value of the model. They are dropout, padding, and stride. 20% increase in dropout to increase training overfit. Whereas padding and stride are used to speed up the model training process.


2020 ◽  
Vol 36 (3) ◽  
pp. 1166-1187 ◽  
Author(s):  
Shohei Naito ◽  
Hiromitsu Tomozawa ◽  
Yuji Mori ◽  
Takeshi Nagata ◽  
Naokazu Monma ◽  
...  

This article presents a method for detecting damaged buildings in the event of an earthquake using machine learning models and aerial photographs. We initially created training data for machine learning models using aerial photographs captured around the town of Mashiki immediately after the main shock of the 2016 Kumamoto earthquake. All buildings are classified into one of the four damage levels by visual interpretation. Subsequently, two damage discrimination models are developed: a bag-of-visual-words model and a model based on a convolutional neural network. Results are compared and validated in terms of accuracy, revealing that the latter model is preferable. Moreover, for the convolutional neural network model, the target areas are expanded and the recalls of damage classification at the four levels range approximately from 66% to 81%.


Sign in / Sign up

Export Citation Format

Share Document