BE-FNet: 3D Bounding Box Estimation Feature Pyramid Network for Accurate and Efficient Maxillary Sinus Segmentation

2020 ◽  
Vol 2020 ◽  
pp. 1-16
Author(s):  
Zhuofu Deng ◽  
Binbin Wang ◽  
Zhiliang Zhu

Maxillary sinus segmentation plays an important role in the choice of therapeutic strategies for nasal disease and in treatment monitoring. Traditional approaches struggle with the extremely heterogeneous intensity caused by lesions, abnormal anatomical structures, and the blurred boundaries of the cavity. 2D and 3D deep convolutional neural networks have grown popular in medical image segmentation because large labeled datasets can be used to learn discriminative features. However, for 3D segmentation in medical images, 2D networks cannot extract the most significant spatial features, and 3D ones suffer from an unbearable computational burden, which poses great challenges for maxillary sinus segmentation. In this paper, we propose an end-to-end deep neural network for fully automatic 3D segmentation. First, our proposed model uses a symmetrical encoder-decoder architecture for the multitask of bounding box estimation and in-region 3D segmentation, which not only reduces excessive computation requirements but also eliminates false positives remarkably, making 3D segmentation with 3D convolutional neural networks practical. In addition, an overestimation strategy is presented to avoid the overfitting observed in conventional multitask networks. Meanwhile, we introduce residual dense blocks to increase the depth of the proposed network and an attention excitation mechanism to improve the performance of bounding box estimation, both of which add little computational cost. In particular, the multilevel feature fusion structure of the pyramid network strengthens the identification of global and local discriminative features in foreground and background, yielding more accurate segmentation results. Finally, to address the problems of blurred boundaries and class imbalance in medical images, a hybrid loss function is designed for the multiple tasks. To illustrate the strength of our proposed model, we evaluated it against state-of-the-art methods. Our model performed significantly better, with an average Dice of 0.947±0.031, VOE of 10.23±5.29, and ASD of 2.86±2.11, which denotes a promising technique with strong robustness in practice.
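To make the multitask hybrid-loss idea concrete, the following is a minimal PyTorch sketch combining a soft Dice term and a cross-entropy term for in-region 3D segmentation with a smooth-L1 term for bounding box regression. The weighting factors, tensor shapes, and the smooth-L1 choice are our own assumptions for illustration, not the paper's exact formulation.

```python
# Hypothetical sketch of a hybrid multitask loss: Dice + BCE for 3D
# segmentation, smooth-L1 for bounding box estimation. Weights assumed.
import torch
import torch.nn.functional as F

def soft_dice_loss(logits, targets, eps=1e-6):
    """Soft Dice loss for binary 3D segmentation, shapes (N, 1, D, H, W)."""
    probs = torch.sigmoid(logits)
    inter = (probs * targets).sum(dim=(1, 2, 3, 4))
    union = probs.sum(dim=(1, 2, 3, 4)) + targets.sum(dim=(1, 2, 3, 4))
    return (1.0 - (2.0 * inter + eps) / (union + eps)).mean()

def hybrid_loss(seg_logits, seg_targets, box_pred, box_targets,
                w_dice=1.0, w_bce=1.0, w_box=0.5):
    """Weighted sum of segmentation and box-regression losses."""
    l_dice = soft_dice_loss(seg_logits, seg_targets.float())
    l_bce = F.binary_cross_entropy_with_logits(seg_logits, seg_targets.float())
    l_box = F.smooth_l1_loss(box_pred, box_targets)  # 6 coords per 3D box
    return w_dice * l_dice + w_bce * l_bce + w_box * l_box
```

Combining an overlap-based term (Dice) with a pixel-wise term (BCE) is a common remedy for class imbalance and blurred boundaries, which matches the problems the abstract names.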

Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3874
Author(s):  
Nagesh Subbanna ◽  
Matthias Wilms ◽  
Anup Tuladhar ◽  
Nils D. Forkert

Recent research in computer vision has shown that original images used to train deep learning models can be reconstructed using so-called inversion attacks. However, the feasibility of this attack type has not been investigated for complex 3D medical images. Thus, the aim of this study was to examine the vulnerability of deep learning techniques used in medical imaging to model inversion attacks and to investigate multiple quantitative metrics for evaluating the quality of the reconstructed images. For the development and evaluation of model inversion attacks, the public LPBA40 database, consisting of 40 brain MRI scans with corresponding segmentations of the gyri and deep grey matter brain structures, was used to train two popular deep convolutional neural networks, namely a U-Net and a SegNet, and corresponding inversion decoders. The Matthews correlation coefficient, the structural similarity index measure (SSIM), and the magnitude of the deformation field resulting from non-linear registration of the original and reconstructed images were used to evaluate the reconstruction accuracy. A comparison of the similarity metrics revealed that the SSIM is best suited to evaluate the reconstruction accuracy, followed closely by the magnitude of the deformation field. The quantitative evaluation of the reconstructed images revealed SSIM scores of 0.73±0.12 and 0.61±0.12 for the U-Net and the SegNet, respectively. The qualitative evaluation showed that training images can be reconstructed with some degradation due to blurring but can be correctly matched to the original images in the majority of cases. In conclusion, the results of this study indicate that it is possible to reconstruct patient data used to train convolutional neural networks and that the SSIM is a good metric for assessing reconstruction accuracy.
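The SSIM comparison the study relies on can be computed directly on 3D volumes. Below is a minimal sketch using scikit-image; the use of that library, the stand-in volumes, and the noise model are our assumptions, not the authors' pipeline.

```python
# Minimal sketch: SSIM between an original training volume and its
# inversion-attack reconstruction. Random stand-in data for illustration.
import numpy as np
from skimage.metrics import structural_similarity

def reconstruction_ssim(original: np.ndarray, reconstructed: np.ndarray) -> float:
    """SSIM between an original and a reconstructed (possibly 3D) image."""
    data_range = float(original.max() - original.min())
    return structural_similarity(original, reconstructed, data_range=data_range)

rng = np.random.default_rng(0)
orig = rng.random((64, 64, 64))                       # stand-in MRI volume
recon = orig + 0.1 * rng.standard_normal(orig.shape)  # blurred/noisy copy
print(f"SSIM: {reconstruction_ssim(orig, recon):.3f}")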


2020 ◽  
Vol 28 (2) ◽  
pp. 113-120 ◽  
Author(s):  
Norio Hayashi ◽  
Tomoko Maruyama ◽  
Yusuke Sato ◽  
Haruyuki Watanabe ◽  
Toshihiro Ogura ◽  
...  

Author(s):  
Sangeun Kum ◽  
Juhan Nam

Singing melody extraction is the task of identifying the melody pitch contour of the singing voice in polyphonic music. Most traditional melody extraction algorithms are based on calculating salient pitch candidates or separating the melody source from the mixture. Recently, classification-based approaches built on deep learning have drawn much attention. In this paper, we present a classification-based singing melody extraction model using deep convolutional neural networks. The proposed model consists of a singing pitch extractor (SPE) and a singing voice activity detector (SVAD). The SPE is trained to predict a high-resolution pitch label of the singing voice from a short segment of spectrogram, which allows the model to predict highly continuous pitch curves. The melody contour is smoothed further by post-processing the output of the melody extractor. The SVAD is trained to determine whether a long segment of mel-spectrogram contains a singing voice; it often produces voice false-alarm errors around the boundaries of singing segments, which we reduce by exploiting the output of the SPE. Finally, we evaluate the proposed melody extraction model on several public datasets. The results show that the proposed model is comparable to state-of-the-art algorithms.
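As one hypothetical illustration of the post-processing step, frame-level pitch predictions can be smoothed and then gated by the voice activity output. The median-filter choice, kernel size, and the convention that 0 marks unvoiced frames are our assumptions; the paper's actual smoothing method may differ.

```python
# Hypothetical sketch: smooth frame-level pitch labels with a median
# filter, then zero out frames the SVAD marks as non-voice.
import numpy as np
from scipy.signal import medfilt

def smooth_melody(pitch_labels: np.ndarray, voicing: np.ndarray,
                  kernel_size: int = 5) -> np.ndarray:
    """Median-filter pitch labels and gate them by voice activity."""
    smoothed = medfilt(pitch_labels.astype(float), kernel_size=kernel_size)
    smoothed[voicing == 0] = 0.0  # assumed convention: 0 = unvoiced frame
    return smoothed
```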


Sensors ◽  
2019 ◽  
Vol 19 (6) ◽  
pp. 1343 ◽  
Author(s):  
Akmaljon Palvanov ◽  
Young Cho

Visibility is a complex phenomenon influenced by emissions and air pollutants, as well as by factors including sunlight, humidity, temperature, and time of day, all of which decrease the clarity of what is visible through the atmosphere. This paper provides a detailed overview of the state-of-the-art contributions to visibility estimation under various foggy weather conditions. We propose VisNet, a new approach based on deep integrated convolutional neural networks for estimating visibility distances from camera imagery. The implemented network uses three streams of deep integrated convolutional neural networks connected in parallel. In addition, we have collected the largest dataset for this study, with three million outdoor images and exact visibility values. To evaluate the model's performance fairly and objectively, the model is trained on three image datasets with different visibility ranges, each with a different number of classes. Moreover, VisNet is evaluated under dissimilar fog-density scenarios using a diverse set of images. Before being fed to the network, each input image is filtered in the frequency domain to remove low-level features, and a spectral filter is applied to each input to extract low-contrast regions. Compared to previous methods, our approach achieves the highest classification performance on all three datasets. Furthermore, VisNet considerably outperforms not only the classical methods but also state-of-the-art models of visibility estimation.
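A minimal sketch of the kind of frequency-domain pre-filtering described above is shown below: low frequencies are suppressed with a simple circular high-pass mask before the image reaches the network. The mask radius and the ideal-filter shape are illustrative choices, not the paper's exact filter.

```python
# Sketch: FFT-based high-pass filtering of a grayscale image, assumed
# to approximate the frequency-domain pre-filtering step.
import numpy as np

def highpass_filter(image: np.ndarray, radius: int = 16) -> np.ndarray:
    """Suppress low-frequency content of a 2D grayscale image via FFT."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    rows, cols = image.shape
    cy, cx = rows // 2, cols // 2
    yy, xx = np.ogrid[:rows, :cols]
    mask = (yy - cy) ** 2 + (xx - cx) ** 2 > radius ** 2  # keep high freqs
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return np.real(filtered)
```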


2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Marco Antonio Aceves-Fernández ◽  
Ricardo Domínguez-Guevara ◽  
Jesus Carlos Pedraza-Ortega ◽  
José Emilio Vargas-Soto

Particulate matter with a diameter of less than 10 micrometers (PM10) is today an important subject of study, mainly because of its increasing concentration and its impact on the environment and public health. This article summarizes the use of convolutional neural networks (CNNs) to forecast PM10 concentrations based on atmospheric variables. In this particular case study, the use of deep convolutional neural networks (both 1D and 2D) was explored to probe the feasibility of these techniques in prediction tasks. Furthermore, in this contribution, a bagging ensemble method (BEM) is used to improve the accuracy of the prediction model. Lastly, a well-known technique for PM10 forecasting, the multilayer perceptron (MLP), is used as a comparison to show the feasibility, accuracy, and robustness of the proposed model. In this contribution, it was found that the CNNs outperform the MLP, especially when executed as ensemble models.
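For orientation, here is a minimal PyTorch sketch of a 1D CNN regressor over windows of atmospheric variables, in the spirit of the models compared above. The layer sizes, window length, and number of input variables are assumptions, not the authors' architecture.

```python
# Hypothetical 1D CNN for PM10 forecasting from multivariate time windows.
import torch
import torch.nn as nn

class PM10Conv1D(nn.Module):
    def __init__(self, n_vars: int = 8, window: int = 24):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_vars, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(64, 1),  # next-step PM10 concentration
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_vars, window) -> (batch, 1)
        return self.net(x)

model = PM10Conv1D()
print(model(torch.randn(4, 8, 24)).shape)  # torch.Size([4, 1])
```

A bagging ensemble would train several such networks on bootstrap resamples of the training windows and average their forecasts.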


Author(s):  
Chayan Mondal ◽  
Md. Kamrul Hasan ◽  
Md. Tasnim Jawad ◽  
Aishwariya Dutta ◽  
Md. Rabiul Islam ◽  
...  

Although automated Acute Lymphoblastic Leukemia (ALL) detection is essential, it is challenging due to the morphological correlation between malignant and normal cells. The traditional ALL classification strategy is arduous, time-consuming, often suffers from inter-observer variation, and necessitates experienced pathologists. This article automates the ALL detection task, employing deep Convolutional Neural Networks (CNNs). We explore a weighted ensemble of deep CNNs to recommend a better ALL cell classifier. The weights are estimated from the ensemble candidates' corresponding metrics, such as accuracy, F1-score, AUC, and kappa values. Various data augmentation and pre-processing steps are incorporated to achieve better generalization of the network. We train and evaluate the proposed model on the publicly available C-NMC-2019 ALL dataset. Our proposed weighted ensemble model achieves a weighted F1-score of 88.6%, a balanced accuracy of 86.2%, and an AUC of 0.941 on the preliminary test set. The qualitative results, displayed as gradient class activation maps, confirm that the introduced model attends to a concentrated learned region, whereas the ensemble candidate models, such as Xception, VGG-16, DenseNet-121, MobileNet, and InceptionResNet-V2, individually produce coarse and scattered learned areas for most example cases. Since the proposed ensemble yields better results for the aimed task, it can be applied to other domains of medical diagnostic applications.
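The metric-weighted ensemble can be sketched in a few lines: each candidate CNN's class probabilities are weighted by a score derived from its validation metrics, then averaged. The normalization scheme and the metric values below are illustrative placeholders, not the paper's reported numbers.

```python
# Minimal sketch of a metric-weighted CNN ensemble (assumed weighting).
import numpy as np

def metric_weights(metrics: np.ndarray) -> np.ndarray:
    """Normalize per-model mean metric scores into ensemble weights."""
    scores = metrics.mean(axis=1)          # (n_models,)
    return scores / scores.sum()

def weighted_ensemble(probs: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """probs: (n_models, n_samples, n_classes) -> fused class predictions."""
    fused = np.tensordot(weights, probs, axes=1)  # (n_samples, n_classes)
    return fused.argmax(axis=1)

# Placeholder metrics: [accuracy, F1, AUC, kappa] per candidate model.
metrics = np.array([[0.86, 0.88, 0.94, 0.80],   # e.g. Xception
                    [0.84, 0.85, 0.92, 0.77]])  # e.g. VGG-16
probs = np.random.default_rng(0).random((2, 5, 2))  # stand-in softmax outputs
print(weighted_ensemble(probs, metric_weights(metrics)))
```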


2021 ◽  
Vol 5 (2 (113)) ◽  
pp. 6-21
Author(s):  
Vadym Slyusar ◽  
Mykhailo Protsenko ◽  
Anton Chernukha ◽  
Pavlo Kovalov ◽  
Pavlo Borodych ◽  
...  

Detection and recognition of objects in images is the main problem to be solved by computer vision systems. As part of solving this problem, the model of object recognition in aerial photographs taken from unmanned aerial vehicles has been improved. A study of object recognition in aerial photographs using deep convolutional neural networks was carried out. Analysis of possible implementations showed that the AlexNet 2012 model (Canada), trained on the ImageNet image set (China), is most suitable for solving this problem, and this model was used as the basis. The object recognition error of this model on the ImageNet test set amounted to 15 %. To improve the effectiveness of object recognition in aerial photographs for 10 classes of images, the final fully connected layer was reduced from 1,000 to 10 neurons and the resulting model was given additional two-stage training: at stage 1 with a set of images prepared from aerial photographs, and at stage 2 with the VisDrone 2021 (China) image set. Optimal training parameters were selected: learning rate (0.0001) and number of epochs (100). As a result, a new model, named AlexVisDrone, was obtained. Its effectiveness was checked with a test set of 100 images per class (10 classes in total), with accuracy and sensitivity chosen as the main performance indicators. An increase in recognition accuracy of 7 % (for images from aerial photographs) to 9 % (for the VisDrone 2021 set) was obtained, indicating that the choice of neural network architecture and training parameters was correct. The proposed model makes it possible to automate object recognition in aerial photographs. In the future, it is advisable to use this model at ground control stations of unmanned aerial vehicle complexes when processing aerial photographs taken from unmanned aerial vehicles, in robotic systems, in video surveillance complexes, and when designing unmanned vehicle systems.
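The described modification is straightforward to express with torchvision: load an ImageNet-pretrained AlexNet and replace its 1,000-way output layer with a 10-way one. This is a sketch under our own assumptions; the optimizer choice (plain SGD) is not stated in the abstract.

```python
# Sketch: AlexNet with the final FC layer reduced from 1,000 to 10 neurons
# for fine-tuning on the aerial photograph / VisDrone 2021 classes.
import torch.nn as nn
import torch.optim as optim
from torchvision import models

model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
model.classifier[6] = nn.Linear(model.classifier[6].in_features, 10)

optimizer = optim.SGD(model.parameters(), lr=0.0001)  # reported learning rate
# ...two-stage fine-tuning for the reported 100 epochs would follow here...
```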


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Shuo Wang ◽  
Jian Wang ◽  
Yafei Song ◽  
Song Li

The increasing volume and variety of malware pose a great threat to network security. Malware binary detection with deep convolutional neural networks (CNNs) has been proven an effective method. However, existing CNN-based malware classification methods remain unsatisfactory because of their poor feature extraction ability, insufficient classification accuracy, and high detection-time cost. To solve these problems, a novel approach, multiscale feature fusion convolutional neural networks (MFFCs), is proposed to achieve effective classification of malware based on malware visualization with deep learning, capable of defending against malware variants and obfuscated malware. The approach first converts malware code binaries into grayscale images, which are then normalized in size and passed to the MFFC model to identify malware families. Comparative experiments were carried out to verify the performance of the proposed method. The results indicate that the MFFC stands out among recent advanced methods with an accuracy of 98.72% and an average detection cost of 5.34 milliseconds on the Malimg dataset. Our method can effectively identify malware and detect variants of malware families, offering excellent feature extraction capability and higher accuracy with lower detection time.
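The visualization step the approach starts from is a well-known technique: the raw bytes of a binary are read as unsigned 8-bit values, reshaped into a 2D array, and resized to a fixed input size. The following sketch uses a fixed width and target size that are common conventions for Malimg-style pipelines, not values taken from the paper.

```python
# Minimal sketch: convert a malware binary into a normalized grayscale
# image. Width and target size are assumed conventions.
import numpy as np
from PIL import Image

def binary_to_grayscale(path: str, width: int = 256,
                        size: tuple = (224, 224)) -> np.ndarray:
    data = np.fromfile(path, dtype=np.uint8)     # raw bytes as 0..255
    rows = len(data) // width
    img = data[: rows * width].reshape(rows, width)  # drop trailing bytes
    return np.array(Image.fromarray(img).resize(size))
```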

