SoC FPGA Accelerated Sub-Optimized Binary Fully Convolutional Neural Network for Robotic Floor Region Segmentation

Sensors ◽  
2020 ◽  
Vol 20 (21) ◽  
pp. 6133
Author(s):  
Chi-Chia Sun ◽  
Afaroj Ahamad ◽  
Pin-He Liu

This article proposes a new Binary Fully Convolutional Neural Network (B-FCN), sub-optimized with the Taguchi method, for robotic floor region segmentation; it can precisely distinguish floor regions in complex indoor environments. The methodology is well suited to robot vision on an embedded platform, and the segmentation accuracy is up to 84.80% on average. A total of 6000 training datasets were used to improve the accuracy and reach convergence. To reach real-time computation, a PYNQ FPGA platform with heterogeneous computing acceleration was used to accelerate the proposed B-FCN architecture. Overall, robots would benefit from better navigation and route planning with our approach. The FPGA synthesis of our binarization method indicates an efficient reduction in the BRAM size to 0.5–1%, and the GOPS/W is also sufficiently high. Notably, the proposed faster architecture is ideal for low-power embedded devices that need to solve the shortest path problem, path searching, and motion planning.
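To make the idea of weight binarization concrete, the following is a minimal sketch of the general technique, not the authors' B-FCN implementation. It assumes a plain sign-based binarization (zero mapped to +1) and a one-bit-per-weight packing; the function names `binarize` and `pack_bits` and the sample weights are hypothetical. It does not attempt to reproduce the BRAM figures reported in the abstract.

```python
import struct

def binarize(weights):
    """Map each real-valued weight to +1 or -1 by its sign (zero maps to +1)."""
    return [1 if w >= 0 else -1 for w in weights]

def pack_bits(signs):
    """Pack +1/-1 values into bytes, one bit per weight (+1 -> 1, -1 -> 0)."""
    packed = bytearray((len(signs) + 7) // 8)
    for i, s in enumerate(signs):
        if s == 1:
            packed[i // 8] |= 1 << (7 - i % 8)  # most significant bit first
    return bytes(packed)

# Hypothetical example weights, chosen only for illustration.
weights = [0.42, -0.17, 0.03, -0.88, 0.0, 0.51, -0.09, -0.33]
signs = binarize(weights)
packed = pack_bits(signs)

# Storage needed for the same weights as 32-bit floats versus packed bits.
float_bytes = len(struct.pack(f"{len(weights)}f", *weights))
print(signs, packed.hex(), float_bytes, len(packed))
```

Storing one bit per weight instead of a 32-bit float is what makes binarized networks attractive for on-chip memory; any scaling factors or activation binarization a real design might use are left out of this sketch.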

2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Hideaki Hirashima ◽  
Mitsuhiro Nakamura ◽  
Pascal Baillehache ◽  
Yusuke Fujimoto ◽  
Shota Nakagawa ◽  
...  

Background: This study aimed to (1) develop fully residual deep convolutional neural network (CNN)-based segmentation software for computed tomography image segmentation of the male pelvic region and (2) demonstrate its efficiency in the male pelvic region. Methods: A total of 470 prostate cancer patients who had undergone intensity-modulated radiotherapy or volumetric-modulated arc therapy were enrolled. Our model was based on FusionNet, a fully residual deep CNN developed to semantically segment biological images. To develop the CNN-based segmentation software, 450 patients were randomly selected and separated into the training, validation, and testing groups (270, 90, and 90 patients, respectively). In Experiment 1, to determine the optimal model, we first assessed the segmentation accuracy according to the size of the training dataset (90, 180, and 270 patients). In Experiment 2, the effect of varying the number of training labels on segmentation accuracy was evaluated. After determining the optimal model, in Experiment 3, the developed software was used on the remaining 20 datasets to assess the segmentation accuracy. The volumetric Dice similarity coefficient (DSC) and the 95th-percentile Hausdorff distance (95%HD) were calculated to evaluate the segmentation accuracy for each organ in Experiment 3. Results: In Experiment 1, the median DSCs for the prostate were 0.61 for dataset 1 (90 patients), 0.86 for dataset 2 (180 patients), and 0.86 for dataset 3 (270 patients). The median DSCs for all the organs increased significantly when the number of training cases increased from 90 to 180 but did not improve upon further increase from 180 to 270. The number of labels applied during training had little effect on the DSCs in Experiment 2. The optimal model was built with 270 patients and four organs. In Experiment 3, the median DSC and 95%HD values were 0.82 and 3.23 mm for the prostate; 0.71 and 3.82 mm for the seminal vesicles; 0.89 and 2.65 mm for the rectum; and 0.95 and 4.18 mm for the bladder, respectively. Conclusions: We have developed CNN-based segmentation software for the male pelvic region and demonstrated that it is efficient for this region.
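For readers unfamiliar with the DSC used above, it is defined for two binary masks A and B as 2|A ∩ B| / (|A| + |B|). The sketch below computes it for flattened voxel masks; the function name, the sample masks, and the choice to return 1.0 for two empty masks are assumptions made for illustration and are not taken from the study's software.

```python
def dice_coefficient(pred, truth):
    """Volumetric Dice similarity coefficient between two binary masks.

    Masks are flat sequences of 0/1 voxel labels of equal length.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * intersection / total

# Hypothetical predicted and reference masks (8 voxels each).
pred = [0, 1, 1, 1, 0, 0, 1, 0]
truth = [0, 1, 1, 0, 0, 1, 1, 0]
print(dice_coefficient(pred, truth))  # 3 shared voxels, 4 + 4 labeled -> 0.75
```

A DSC of 1 means the masks coincide exactly and 0 means they do not overlap at all, which is why the reported medians are read as "higher is better".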


2019 ◽  
Vol 2019 ◽  
pp. 1-10 ◽  
Author(s):  
Lin Teng ◽  
Hang Li ◽  
Shahid Karim

Medical image segmentation is one of the hot issues in image processing. Precise segmentation of medical images is a vital guarantee for follow-up treatment. At present, however, low gray contrast and blurred tissue boundaries are common in medical images, so segmentation accuracy is difficult to improve effectively. In particular, deep learning methods need more training samples, which makes the process time-consuming. Therefore, we propose a novel model for medical image segmentation based on a deep multiscale convolutional neural network (CNN) in this article. First, we extract the region of interest from the raw medical images. Then, data augmentation is applied to acquire more training data. Our proposed method contains three modules: encoder, U-net, and decoder. The encoder is mainly responsible for feature extraction from 2D image slices. The U-net cascades the features of each block of the encoder with those obtained by deconvolution in the decoder under different scales. The decoder is mainly responsible for upsampling the feature map after the feature extraction of each group. Simulation results show that the new method can boost the segmentation accuracy and that it is more robust than other segmentation methods.


2019 ◽  
Vol 13 (6) ◽  
pp. 796-802 ◽  
Author(s):  
Satoshi Yamane ◽  
Kouki Matsuo

Welding is an essential technology for joining metal plates. In general, gas metal arc welding (GMAW) generates a large amount of fumes in the welding of thick metal plates. In contrast, the butt joining of thick metal plates can be achieved using plasma arc welding (PAW) with a lower amount of fumes. Further, the improvement of the welding environment is critical in welding. In particular, if there are gaps between the base metals, the welding conditions are adjusted based on the gap. A visual sensor, such as a complementary metal-oxide-semiconductor (CMOS) camera, is useful for observing the welding situation. In this study, such a camera was attached to a plasma torch. During welding, we obtained weld pool images using the camera and detected the gaps by processing the images. As the arc light is very intense, it is difficult to obtain a clear image of the weld pool in PAW. In conventional welding, a constant current is used; however, pulsed welding current is used herein to obtain a clear image. The frequency of the current is 20 Hz, which indicates that the interval time is 50 ms. Moreover, the welding current was reduced to 30 A to minimize the effect of the intense arc light while the shutter of the CMOS camera was opened. The exposure time of the CMOS camera is 1 ms. Furthermore, gaps can be detected through image processing. It is necessary to identify the base metals with or without a gap. It was observed that the gap is darker than the solid area of the base metal. Moreover, a gap can be detected through the binarization method. The center area is not dark in the image of the weld pool without the gap. As the image of the weld pool is uneven without a gap, the binarization method can provide a detection result with some errors. Hence, it is challenging to identify whether there is a gap. A convolutional neural network (CNN) is useful for analyzing images. Thus, we applied a CNN to the weld pool image. 
If the gap is identified using the CNN, the binarization method is used to obtain the gap width. Hence, in PAW, welding conditions are adjusted based on the gap.
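The binarization step described above can be illustrated with a short sketch. It assumes that the gap appears as the longest run of dark pixels along one image row, and the threshold value of 100 and the sample intensities are hypothetical; the abstract does not state how the authors measure the width, so this is only one plausible reading.

```python
def gap_width_pixels(row, threshold):
    """Return the length of the longest run of pixels darker than threshold."""
    longest = current = 0
    for value in row:
        if value < threshold:
            current += 1          # still inside a dark run
            longest = max(longest, current)
        else:
            current = 0           # bright pixel ends the run
    return longest

# Hypothetical 8-bit intensities along one row crossing the joint.
row = [210, 205, 198, 60, 42, 38, 55, 201, 207, 30, 212]
width = gap_width_pixels(row, threshold=100)
print(width)  # the four consecutive dark pixels form the widest run
```

Taking the longest run rather than the total count of dark pixels ignores isolated dark spots, which is one simple way to cope with the uneven weld pool images the abstract mentions.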


Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2477 ◽  
Author(s):  
Kamal M. Othman ◽  
Ahmad B. Rad

In this paper, we propose a novel algorithm to detect a door and its orientation in indoor settings from the view of a social robot equipped with only a monocular camera. The challenge is to achieve this goal with only a 2D image from a monocular camera. The proposed system is designed through the integration of several modules, each of which serves a special purpose. The detection of the door is addressed by training a convolutional neural network (CNN) model on a new dataset for Social Robot Indoor Navigation (SRIN). The direction of the door (from the robot’s observation) is achieved by three other modules: Depth module, Pixel-Selection module, and Pixel2Angle module, respectively. We include simulation results and real-time experiments to demonstrate the performance of the algorithm. The outcome of this study could be beneficial in any robotic navigation system for indoor environments.


2017 ◽  
Vol 2017 ◽  
pp. 1-12 ◽  
Author(s):  
Jilin Zhang ◽  
Junfeng Xiao ◽  
Jian Wan ◽  
Jianhua Yang ◽  
Yongjian Ren ◽  
...  

With the development of mobile systems, we gain many benefits and conveniences from mobile devices; at the same time, the information gathered by smartphones, such as location and environment, is also valuable for businesses that want to provide more intelligent services to customers. More and more machine learning methods, especially convolutional neural networks, have been used in the field of mobile information systems to study user behavior and classify usage patterns. As the number of model training parameters and the scale of the data increase, the traditional single-machine training method cannot meet the time requirements of practical application scenarios. Current training frameworks often use simple data-parallel or model-parallel methods to speed up the training process, so heterogeneous computing resources are not fully utilized. To solve these problems, our paper proposes a delay synchronization convolutional neural network parallel strategy, which leverages the heterogeneous system. The strategy is based on both synchronous parallel and asynchronous parallel approaches; the model training process can reduce its dependence on the heterogeneous architecture while ensuring model convergence, so the convolutional neural network framework is more adaptive to different heterogeneous system environments. The experimental results show that the proposed delay synchronization strategy can achieve at least three times the speedup compared to traditional data parallelism.


2021 ◽  
Vol 7 (3) ◽  
pp. 323
Author(s):  
Patrick Nicholas Hadinata ◽  
Djoni Simanta ◽  
Liyanto Eddy ◽  
Kohei Nagai

Maintaining infrastructure is crucial to ensuring safety, and it relies on crack detection in concrete structures. However, crack detection is mostly carried out manually, which is unsafe, highly subjective, and time-consuming. Therefore, a more accurate and efficient system needs to be implemented using artificial intelligence. A convolutional neural network (CNN), a subset of artificial intelligence, is used to detect cracks on concrete surfaces through semantic image segmentation. The purpose of this research is to compare the effectiveness of two cutting-edge encoder-decoder architectures in detecting cracks on concrete surfaces: U-Net and DeepLabV3+, which have shown potential in biomedical and sparse multiscale image segmentation, respectively. The neural networks were trained using cloud computing with a high-performance NVIDIA Tesla V100 Graphics Processing Unit and 27.4 GB of RAM. This study used internal and external data. Internal data consisted of simple cracks and were used as the training and validation data. Meanwhile, external data consisted of more complex cracks, which were used for further testing. Both architectures were compared based on four evaluation metrics: accuracy, F1, precision, and recall. U-Net achieved segmentation accuracy = 96.57%, F1 = 87.55%, precision = 88.15%, and recall = 88.94%, while DeepLabV3+ achieved segmentation accuracy = 96.47%, F1 = 85.29%, precision = 92.07%, and recall = 81.84%. Experimental results (internal and external data) indicated that both architectures were accurate and effective in segmenting cracks. Additionally, U-Net and DeepLabV3+ exceeded the performance of a previously tested architecture, namely FCN.
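The four evaluation metrics named above have standard definitions in terms of pixel-level confusion counts. The sketch below shows those definitions; the function name and the example counts are hypothetical, and the study may aggregate the metrics differently (for instance, by averaging per image), so this does not attempt to reproduce the reported figures.

```python
def segmentation_metrics(tp, fp, fn, tn):
    """Pixel-wise accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts for a 1000-pixel image with a thin crack.
m = segmentation_metrics(tp=80, fp=20, fn=10, tn=890)
print(m)
```

Because crack pixels are usually a small fraction of the image, accuracy is dominated by the background class, which is one reason precision, recall, and F1 are reported alongside it.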


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Yanyan Pan ◽  
Huiping Zhang ◽  
Jinsuo Yang ◽  
Jing Guo ◽  
Zhiguo Yang ◽  
...  

This study aimed to explore the application value of multimodal magnetic resonance imaging (MRI) images based on the deep convolutional neural network (Conv.Net) in the diagnosis of strokes. Specifically, four automatic segmentation algorithms were proposed to segment multimodal MRI images of stroke patients. The segmentation effects were evaluated in terms of Dice, accuracy, sensitivity, and segmentation distance coefficient. It was found that although the two-dimensional (2D) fully convolutional neural network-based segmentation algorithm could locate and segment the lesion, its accuracy was low; the three-dimensional one exhibited higher accuracy, with various objective indicators improved, and the segmentation accuracy on the training set and the test set was 0.93 and 0.79, respectively, meeting the needs of automatic diagnosis. The asymmetric 3D residual U-Net network had good convergence and high segmentation accuracy, and the 3D deep residual network proposed on its basis had good segmentation coefficients, which can not only ensure segmentation accuracy but also avoid network degradation problems. In conclusion, the Conv.Net model can accurately segment the foci of patients with ischemic stroke and is recommended for clinical use.


2020 ◽  
Vol 64 (4) ◽  
pp. 40401-1-40401-9 ◽  
Author(s):  
Ga Young Kim ◽  
Sang Hyeok Lee ◽  
Sung Min Kim

This study proposed a novel intensity weighting approach using a convolutional neural network (CNN) for fast and accurate optic disc (OD) segmentation in a fundus image. The proposed method consisted of three main steps: CNN-based calculation of pixel importance, image reconstruction, and OD segmentation. In the first step, a CNN model composed of four convolution and pooling layers was designed and trained. Then, a heat map was generated by applying a gradient-weighted class activation map algorithm to the final convolution layer of the model. In the next step, each pixel in the image was assigned a weight based on the previously obtained heat map. In addition, the retinal vessels that may interfere with OD segmentation were detected and substituted based on the nearest neighbor pixels. Finally, the OD region was segmented using Otsu's method. As a result, the proposed method achieved a high segmentation accuracy of 98.61%, an improvement of about 4.61% over the result without the weight assignment.
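The final step above relies on Otsu's method, which picks the gray-level threshold that maximizes the between-class variance of the two resulting pixel groups. The following is a minimal pure-Python sketch of that standard algorithm, assuming 8-bit integer intensities in a flat list; the function name and the sample pixels are hypothetical, and the CNN-based weighting and vessel substitution that precede this step in the study are not modeled.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form one class; pixels with value > t form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class so far
    sum0 = 0  # sum of intensities in the lower class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break             # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Hypothetical intensities: a dark background and a bright disc-like region.
pixels = [10, 12, 11, 13, 200, 202, 198, 201]
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]  # True marks the brighter class
print(t, mask)
```

When several thresholds give the same variance, this sketch keeps the lowest one; library implementations may break such ties differently.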


2021 ◽  
Vol 3 ◽  
Author(s):  
James Ren Lee ◽  
Linda Wang ◽  
Alexander Wong

While recent advances in deep learning have led to significant improvements in facial expression classification (FEC), a major challenge that remains a bottleneck for the widespread deployment of such systems is their high architectural and computational complexity. This is especially challenging given the operational requirements of various FEC applications, such as safety, marketing, learning, and assistive living, where real-time performance on low-cost embedded devices is desired. Motivated by this need for a compact, low-latency, yet accurate system capable of performing FEC in real time on low-cost embedded devices, this study proposes EmotionNet Nano, an efficient deep convolutional neural network created through a human-machine collaborative design strategy, where human experience is combined with machine meticulousness and speed in order to craft a deep neural network design catered toward real-time embedded usage. To the best of the authors' knowledge, this is the very first deep neural network architecture for facial expression recognition that leverages machine-driven design exploration in its design process, and it exhibits unique architectural characteristics, such as high architectural heterogeneity and selective long-range connectivity, not seen in previous FEC network architectures. Two different variants of EmotionNet Nano are presented, each with a different trade-off between architectural and computational complexity and accuracy. Experimental results using the CK+ facial expression benchmark dataset demonstrate that the proposed EmotionNet Nano networks achieved accuracy comparable to state-of-the-art FEC networks, while requiring significantly fewer parameters. 
Furthermore, we demonstrate that the proposed EmotionNet Nano networks achieved real-time inference speeds (e.g., >25 FPS and >70 FPS at 15 and 30 W, respectively) and high energy efficiency (e.g., >1.7 images/sec/watt at 15 W) on an ARM embedded processor, thus further illustrating the efficacy of EmotionNet Nano for deployment on embedded devices.

