Semantic Image Segmentation in Duckietown

2021 ◽  
Vol 19 (3) ◽  
pp. 26-39
Author(s):  
D. E. Shabalina ◽  
K. S. Lanchukovskaya ◽  
T. V. Liakh ◽  
K. V. Chaika

This article evaluates the applicability of existing semantic segmentation algorithms to the “Duckietown” simulator. It explores classical semantic segmentation algorithms as well as ones based on neural networks. We also examined machine learning frameworks, taking into account the limitations of the “Duckietown” simulator. Based on the results, we selected neural network algorithms based on the U-Net, SegNet, DeepLab-v3, FC-DenseNet, and PSPNet networks to solve the segmentation problem in the “Duckietown” project. U-Net and SegNet have been tested on the “Duckietown” simulator.

2020 ◽  
Vol 19 (01) ◽  
pp. 147-165 ◽  
Author(s):  
Fan Jia ◽  
Jun Liu ◽  
Xue-Cheng Tai

Convolutional neural networks (CNNs) have achieved prominent performance in a series of image processing problems and have become the first choice for dense classification problems such as semantic segmentation. However, because CNNs predict the class of each pixel independently in semantic segmentation tasks, the spatial regularity of the segmented objects remains a problem for these methods. Especially when given little training data, CNNs may not perform well on details, and isolated, scattered small regions often appear in all kinds of CNN segmentation results. In this paper, we propose a method to add spatial regularization to the segmented objects. In our method, spatial regularization such as total variation (TV) can be easily integrated into a CNN, and it produces smooth edges and eliminates isolated points. We apply our proposed method to U-Net and SegNet, which are well-established CNNs for image segmentation, and test them on the WBC and CamVid datasets, respectively. The results show that the details of the predictions are improved by the regularized networks.
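
To illustrate why a total variation term discourages isolated points, here is a minimal sketch that computes the anisotropic TV of a small probability map. It is not the paper's formulation, which integrates the regularizer into the network itself; the function name `total_variation` and the 4x4 example maps are assumptions made for illustration only.

```python
from typing import List

Grid = List[List[float]]


def total_variation(p: Grid) -> float:
    """Anisotropic total variation: the sum of absolute differences
    between horizontally and vertically adjacent pixels."""
    rows, cols = len(p), len(p[0])
    tv = 0.0
    for i in range(rows):
        for j in range(cols):
            if i + 1 < rows:
                tv += abs(p[i + 1][j] - p[i][j])
            if j + 1 < cols:
                tv += abs(p[i][j + 1] - p[i][j])
    return tv


# Hypothetical 4x4 foreground-probability maps (illustrative values only).
smooth = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
speckled = [row[:] for row in smooth]
speckled[1][0] = 1.0  # one isolated foreground pixel in the background

tv_smooth = total_variation(smooth)      # only the straight object edge contributes
tv_speckled = total_variation(speckled)  # the isolated pixel adds extra variation
```

A loss that adds a multiple of this quantity to the usual per-pixel classification loss would penalize the speckled map more than the smooth one, which is the intuition behind the smooth edges and removed isolated points reported in the abstract.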


Author(s):  
Tomasz Rymarczyk ◽  
Barbara Stefaniak ◽  
Przemysław Adamkiewicz

The solution presents the architecture of a system for collecting and analyzing data. An attempt was made to develop algorithms for image segmentation. These algorithms are needed to identify an arbitrary number of phases in the segmentation problem. With algorithms such as the level set method, neural networks, and deep learning methods, it is possible to obtain a quicker diagnosis and to automatically mark regions of interest in medical images.


2018 ◽  
Vol 2018 (3) ◽  
pp. 123-142 ◽  
Author(s):  
Ehsan Hesamifard ◽  
Hassan Takabi ◽  
Mehdi Ghasemi ◽  
Rebecca N. Wright

Machine learning algorithms based on deep Neural Networks (NN) have achieved remarkable results and are being extensively used in different domains. At the same time, with the growth of cloud services, several Machine Learning as a Service (MLaaS) offerings have appeared in which machine learning models are trained and deployed on cloud providers’ infrastructure. However, machine learning algorithms require access to the raw data, which is often privacy sensitive and can create potential security and privacy risks. To address this issue, we present CryptoDL, a framework that develops new techniques for applying deep neural network algorithms to encrypted data. In this paper, we provide the theoretical foundation for implementing deep neural network algorithms in the encrypted domain and develop techniques to adapt neural networks to the practical limitations of current homomorphic encryption schemes. We show that it is feasible and practical to train neural networks using encrypted data, to make encrypted predictions, and to return the predictions in encrypted form. We demonstrate the applicability of the proposed CryptoDL using a large number of datasets and evaluate its performance. The empirical results show that it provides accurate privacy-preserving training and classification.


2020 ◽  
pp. paper71-1-paper71-12
Author(s):  
Aleksandr Markelov ◽  
Ivan Krivorotov ◽  
Vadim Gorbachev

Semantic segmentation is one of the important ways of extracting information about objects in images. State-of-the-art neural network algorithms make it possible to perform highly accurate semantic segmentation of images, including aerial photos. However, most works use high-quality, low-noise images. In this work, we study the ability of neural networks to correctly segment images with intensive uncorrelated Gaussian noise. The study leads to three main conclusions. Firstly, it demonstrates that neural network algorithms are capable of working with extreme image distortions without additional filtration or image recovery techniques. Secondly, the experiments quantitatively show that distortion intensity can be negated by increasing the training set size. This process is similar to the improvement in a model’s quality and generalization that comes from enlarging the training dataset. Finally, we quantitatively demonstrate how image aggregation techniques affect training with noisy data.
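
As a rough illustration of the kind of distortion studied here, the sketch below adds independent (uncorrelated) zero-mean Gaussian noise to every pixel of a grayscale image. The function name `add_gaussian_noise`, the `[0, 1]` intensity range, the clipping step, and the value `sigma=0.3` are assumptions for the example, not details taken from the paper.

```python
import random
from typing import List

Image = List[List[float]]


def add_gaussian_noise(image: Image, sigma: float, seed: int = 0) -> Image:
    """Add independent zero-mean Gaussian noise to each pixel,
    then clip the result back to the assumed [0, 1] intensity range."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


# Hypothetical flat gray 8x8 image; sigma is an illustrative noise level.
clean = [[0.5] * 8 for _ in range(8)]
noisy = add_gaussian_noise(clean, sigma=0.3)
```

Because each pixel draws its own sample, the noise is uncorrelated across pixels. Training data for an experiment of this kind could be produced by applying such a function to clean images at several values of `sigma`.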


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Idris Kharroubi ◽  
Thomas Lim ◽  
Xavier Warin

We study the approximation of backward stochastic differential equations (BSDEs for short) with a constraint on the gains process. We first discretize the constraint by applying a so-called facelift operator at the times of a grid. We show that this discretely constrained BSDE converges to the continuously constrained one as the mesh of the grid converges to zero. We then focus on the approximation of the discretely constrained BSDE, for which we adopt a machine learning approach. We show that the facelift can be approximated by an optimization problem over a class of neural networks under constraints on the neural network and its derivative. We then derive an algorithm converging to the discretely constrained BSDE as the number of neurons goes to infinity. We conclude with numerical experiments.


2021 ◽  
Vol 26 (1) ◽  
pp. 200-215
Author(s):  
Muhammad Alam ◽  
Jian-Feng Wang ◽  
Cong Guangpei ◽  
LV Yunrong ◽  
Yuanfang Chen

In recent years, the success of deep learning in natural scene image processing has boosted its application in the analysis of remote sensing images. In this paper, we apply Convolutional Neural Networks (CNNs) to the semantic segmentation of remote sensing images. We improve the Encoder-Decoder CNN structure SegNet with index pooling and U-Net to make them suitable for multi-target semantic segmentation of remote sensing images. The results show that each of the two models has its own advantages and disadvantages in the segmentation of different objects. In addition, we propose an integrated algorithm that combines the two models. Experimental results show that the integrated algorithm can exploit the advantages of both models for multi-target segmentation and achieves better segmentation than either model alone.
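
The abstract does not say how the two models are integrated. One common way to combine two segmentation networks, shown here purely as an assumed stand-in, is to average their per-pixel class probabilities and take the most likely class. The function name `fuse_predictions`, the `weight_a` parameter, and the probability values are all hypothetical.

```python
from typing import List

Probs = List[List[List[float]]]  # indexed as [row][col][class]


def fuse_predictions(p_a: Probs, p_b: Probs, weight_a: float = 0.5) -> List[List[int]]:
    """Weighted average of two per-pixel class distributions, followed by argmax.
    This is an assumed fusion rule, not the one used in the paper."""
    labels = []
    for row_a, row_b in zip(p_a, p_b):
        out_row = []
        for pa, pb in zip(row_a, row_b):
            mixed = [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pa, pb)]
            out_row.append(max(range(len(mixed)), key=mixed.__getitem__))
        labels.append(out_row)
    return labels


# Hypothetical outputs of two models on a 1x2 image with 3 classes.
segnet_probs = [[[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]]
unet_probs = [[[0.5, 0.4, 0.1], [0.1, 0.2, 0.7]]]
fused = fuse_predictions(segnet_probs, unet_probs)
```

In the second pixel the two models disagree, and the averaged distribution follows the more confident one. A per-class weighting would be a natural refinement if one model is known to be better on particular object types, as the abstract suggests.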


2021 ◽  
Vol 40 (3) ◽  
pp. 1-13
Author(s):  
Lumin Yang ◽  
Jiajie Zhuang ◽  
Hongbo Fu ◽  
Xiangzhi Wei ◽  
Kun Zhou ◽  
...  

We introduce SketchGNN, a convolutional graph neural network for semantic segmentation and labeling of freehand vector sketches. We treat an input stroke-based sketch as a graph with nodes representing the sampled points along input strokes and edges encoding the stroke structure information. To predict the per-node labels, SketchGNN uses graph convolution and a static-dynamic branching network architecture to extract features at three levels: point-level, stroke-level, and sketch-level. SketchGNN significantly improves on the accuracy of state-of-the-art methods for semantic sketch segmentation (by 11.2% in the pixel-based metric and 18.2% in the component-based metric on the large-scale, challenging SPG dataset) and has orders of magnitude fewer parameters than both image-based and sequence-based methods.
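
The graph view of a sketch can be pictured with a small sketch of its own. The code below turns a list of strokes into nodes (sampled points) and edges that link consecutive points on the same stroke. Connecting consecutive points is an assumption about what "stroke structure information" means, and the function name `sketch_to_graph` and the sample coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Edge = Tuple[int, int]


def sketch_to_graph(strokes: List[List[Point]]) -> Tuple[List[Point], List[Edge]]:
    """Build a graph whose nodes are the sampled points of all strokes and whose
    edges join consecutive points within each stroke (assumed edge rule)."""
    nodes: List[Point] = []
    edges: List[Edge] = []
    for stroke in strokes:
        start = len(nodes)  # index of this stroke's first point in the node list
        nodes.extend(stroke)
        edges.extend((i, i + 1) for i in range(start, start + len(stroke) - 1))
    return nodes, edges


# Two hypothetical strokes: one with three sampled points, one with two.
strokes = [[(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], [(0.0, 1.0), (1.0, 1.0)]]
nodes, edges = sketch_to_graph(strokes)
```

No edge crosses from one stroke to the next, so the stroke boundaries remain visible in the graph. A per-node classifier applied to such a graph would then assign one semantic label to each sampled point.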


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Peter M. Maloca ◽  
Philipp L. Müller ◽  
Aaron Y. Lee ◽  
Adnan Tufail ◽  
Konstantinos Balaskas ◽  
...  

Machine learning has greatly facilitated the analysis of medical data, while its internal operations usually remain non-transparent. To better comprehend these opaque procedures, a convolutional neural network for optical coherence tomography image segmentation was enhanced with a Traceable Relevance Explainability (T-REX) technique. The proposed application was based on three components: ground truth generation by multiple graders, calculation of Hamming distances among graders and the machine learning algorithm, and a smart data visualization (‘neural recording’). An overall average variability of 1.75% between the human graders and the algorithm was found, slightly lower than the 2.02% among human graders. The ambiguity in ground truth had a noteworthy impact on machine learning results, which could be visualized. The convolutional neural network balanced between graders and allowed for modifiable predictions dependent on the compartment. Using the proposed T-REX setup, machine learning processes could be rendered more transparent and understandable, possibly leading to optimized applications.
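
A Hamming distance between two segmentations can be sketched as the share of pixels on which they disagree. The snippet below is a minimal illustration under that assumption; the normalization by pixel count, the function name `hamming_fraction`, and the tiny label maps are hypothetical and are not taken from the T-REX implementation.

```python
from typing import List

LabelMap = List[List[int]]


def hamming_fraction(a: LabelMap, b: LabelMap) -> float:
    """Fraction of pixels whose labels differ between two same-sized label maps."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("label maps must have the same shape")
    total = sum(len(row) for row in a)
    differing = sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return differing / total


# Hypothetical 2x4 label maps from two graders and a model (illustrative only).
grader_1 = [[0, 0, 1, 1], [0, 1, 1, 2]]
grader_2 = [[0, 0, 1, 1], [0, 0, 1, 2]]
model = [[0, 1, 1, 1], [0, 1, 1, 2]]

d_graders = hamming_fraction(grader_1, grader_2)  # disagreement between graders
d_model = hamming_fraction(grader_1, model)       # disagreement grader vs. model
```

Averaging such pairwise values over all grader pairs and over all grader-model pairs would give two variability figures that can be compared, in the spirit of the two percentages quoted in the abstract.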


2020 ◽  
Vol 39 (4) ◽  
pp. 5521-5534
Author(s):  
Ying Liu ◽  
Zhongqi Fan ◽  
Hongliang Qi

By establishing an evaluation system for the emergency management capability of coal mine enterprises, we can identify problems and shortcomings in coal mine emergency management and improve the capability to manage coal mine emergencies. In this paper, the authors analyze the dynamic statistical evaluation of safety emergency management in coal enterprises based on neural network algorithms. Neural networks can form any topological structure through neurons, so they can directly simulate fuzzy reasoning in their structure; that is, an equivalent structure of neural networks and fuzzy systems can be formed. This paper constructs the index system based on accident causes and verifies the scientific rationality of the system. On this basis, according to the specific situation of coal mine emergency management, we design the evaluation criteria for the coal mine emergency management capability evaluation index. Because coal mine accidents are complex, variable, and sudden, it is necessary to make adjustments and improvements dynamically at any time. The model combines qualitative and quantitative indicators and can make an overall evaluation of coal mine emergency management capability. It produces clear results and fits the simulation results closely.


Author(s):  
E. Yu. Shchetinin

The recognition of human emotions is one of the most relevant and dynamically developing areas of modern speech technologies, and the recognition of emotions in speech (RER) is the most in-demand part of it. In this paper, we propose a computer model of emotion recognition based on an ensemble of a bidirectional recurrent neural network with LSTM memory cells and the deep convolutional neural network ResNet18. Computer studies are carried out on the RAVDESS database of emotional human speech. RAVDESS is a data set containing 7356 files. Entries contain the following emotions: 0 – neutral, 1 – calm, 2 – happiness, 3 – sadness, 4 – anger, 5 – fear, 6 – disgust, 7 – surprise. In total, the database contains 16 classes (8 emotions divided into male and female) for a total of 1440 samples (speech only). To train machine learning algorithms and deep neural networks to recognize emotions, the audio recordings must be pre-processed to extract the main characteristic features of each emotion. This was done using Mel-frequency cepstral coefficients, chroma coefficients, and characteristics of the frequency spectrum of the audio recordings. Various neural network models for emotion recognition are studied on the data described above. In addition, machine learning algorithms were used for comparative analysis. The following models were trained during the experiments: logistic regression (LR), a support vector machine (SVM) classifier, decision tree (DT), random forest (RF), gradient boosting over trees (XGBoost), a convolutional neural network (CNN, ResNet18), a recurrent neural network (RNN), and an ensemble of convolutional and recurrent networks (Stacked CNN-RNN). The results show that the neural networks achieved much higher accuracy in recognizing and classifying emotions than the machine learning algorithms used. Of the three neural network models presented, the CNN + BLSTM ensemble showed the highest accuracy.

