Speckle noise removal based on structural convolutional neural networks with feature fusion for medical image

Author(s):  
Dazi Li ◽  
Wenjie Yu ◽  
Kunfeng Wang ◽  
Daozhong Jiang ◽  
Qibing Jin
2018 ◽  
Vol 42 (11) ◽  
Author(s):  
Syed Muhammad Anwar ◽  
Muhammad Majid ◽  
Adnan Qayyum ◽  
Muhammad Awais ◽  
Majdi Alnowami ◽  
...  

2019 ◽  
Vol 37 (1) ◽  
pp. 125-135 ◽  
Author(s):  
Sizhe Huang ◽  
Huosheng Xu ◽  
Xuezhi Xia ◽  
Fan Yang ◽  
Fuhao Zou

2020 ◽  
Vol 2020 ◽  
pp. 1-16
Author(s):  
Zhuofu Deng ◽  
Binbin Wang ◽  
Zhiliang Zhu

Maxillary sinus segmentation plays an important role in choosing therapeutic strategies for nasal disease and in treatment monitoring. Traditional approaches struggle with the extremely heterogeneous intensities caused by lesions, abnormal anatomical structures, and the blurred boundaries of the cavity. 2D and 3D deep convolutional neural networks have grown popular in medical image segmentation because they can exploit large labeled datasets to learn discriminative features. However, for 3D segmentation of medical images, 2D networks cannot extract the most significant spatial features, while 3D networks carry an unbearable computational burden; both pose great challenges for maxillary sinus segmentation. In this paper, we propose a deep neural network trained in an end-to-end manner for fully automatic 3D segmentation. First, our proposed model uses a symmetrical encoder-decoder architecture for the multitask of bounding box estimation and in-region 3D segmentation, which not only reduces excessive computation requirements but also eliminates false positives remarkably, making 3D segmentation practical with 3D convolutional neural networks. In addition, an overestimation strategy is presented to avoid the overfitting phenomena of conventional multitask networks. Meanwhile, we introduce residual dense blocks to increase the depth of the proposed network and an attention excitation mechanism to improve the performance of bounding box estimation, both of which add little computational cost. In particular, the multilevel feature-fusion structure of the pyramid network strengthens the identification of global and local discriminative features in foreground and background, achieving more advanced segmentation results. Finally, to address the problems of blurred boundaries and class imbalance in medical images, a hybrid loss function is designed for the multiple tasks.
To illustrate the strength of our proposed model, we evaluated it against state-of-the-art methods. Our model performed significantly better, with an average Dice of 0.947±0.031, VOE of 10.23±5.29, and ASD of 2.86±2.11, respectively, denoting a promising technique with strong robustness in practice.
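The overlap metrics reported above can be computed directly from binary segmentation masks. A minimal sketch of Dice and VOE (masks represented as sets of voxel coordinates; the variable names and toy masks are hypothetical, and ASD is omitted because it requires extracting and matching surface voxels):

```python
def dice(pred, truth):
    """Dice coefficient: 2|A∩B| / (|A| + |B|) over voxel coordinate sets."""
    inter = len(pred & truth)
    return 2.0 * inter / (len(pred) + len(truth))

def voe(pred, truth):
    """Volumetric Overlap Error (%): 100 * (1 - |A∩B| / |A∪B|)."""
    inter = len(pred & truth)
    union = len(pred | truth)
    return 100.0 * (1.0 - inter / union)

# Tiny example: two overlapping 2-D "masks" as sets of pixel coordinates.
a = {(0, 0), (0, 1), (1, 0), (1, 1)}
b = {(0, 1), (1, 1), (1, 2), (0, 2)}
print(round(dice(a, b), 3))  # 0.5
print(round(voe(a, b), 2))   # 66.67
```

A perfect prediction gives Dice = 1 and VOE = 0; the paper's Dice of 0.947 thus corresponds to a near-complete overlap with the ground-truth cavity.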


Sensors ◽  
2021 ◽  
Vol 22 (1) ◽  
pp. 72
Author(s):  
Sanghun Jeon ◽  
Ahmed Elsharkawy ◽  
Mun Sang Kim

In visual speech recognition (VSR), speech is transcribed using only visual information to interpret tongue and teeth movements. Recently, deep learning has shown outstanding performance in VSR, with accuracy exceeding that of lipreaders on benchmark datasets. However, several problems still exist when using VSR systems. A major challenge is distinguishing words with similar pronunciations, called homophones, which lead to word ambiguity. Another technical limitation of traditional VSR systems is that visual information does not provide sufficient data for learning words such as “a”, “an”, “eight”, and “bin” because their lengths are shorter than 0.02 s. This report proposes a novel lipreading architecture that combines three different convolutional neural networks (CNNs; a 3D CNN, a densely connected 3D CNN, and a multi-layer feature fusion 3D CNN), followed by a two-layer bi-directional gated recurrent unit. The entire network was trained using connectionist temporal classification. The results of the standard automatic speech recognition evaluation metrics show that the proposed architecture reduced the character and word error rates of the baseline model by 5.681% and 11.282%, respectively, on the unseen-speaker dataset. Our proposed architecture exhibits improved performance even when visual ambiguity arises, thereby increasing VSR reliability for practical applications.
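The character and word error rates cited above are both ratios of Levenshtein edit distance to reference length. A minimal sketch of the standard computation (the GRID-corpus-style example sentences are hypothetical):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance via dynamic programming; substitutions,
    insertions, and deletions each cost 1."""
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n]

def wer(ref_sentence, hyp_sentence):
    """Word error rate: edit distance over word sequences / reference length."""
    ref = ref_sentence.split()
    return edit_distance(ref, hyp_sentence.split()) / len(ref)

def cer(ref_sentence, hyp_sentence):
    """Character error rate: the same computation on character sequences."""
    return edit_distance(ref_sentence, hyp_sentence) / len(ref_sentence)

# One substituted word out of six gives a WER of 1/6.
print(round(wer("set white with p one soon", "set white with q one soon"), 3))  # 0.167
```

Homophone confusions show up as substitutions in this computation, which is why reducing visual ambiguity lowers both CER and WER.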


2021 ◽  
Vol 11 (23) ◽  
pp. 11185
Author(s):  
Zhi-Peng Jiang ◽  
Yi-Yang Liu ◽  
Zhen-En Shao ◽  
Ko-Wei Huang

Image recognition has been applied to many fields, but relatively rarely to medical images. Recent significant deep learning progress in image recognition has raised strong research interest in medical image recognition. First, we examined the failed predictions of the VGG16 model on pneumonia X-ray images. This paper therefore proposes IVGG13 (Improved Visual Geometry Group-13), a modified VGG16 model for classifying pneumonia X-ray images. Open-source thoracic X-ray images acquired from the Kaggle platform were employed for pneumonia recognition, but only a few data were obtained, and the datasets were unbalanced after classification; either condition can result in extremely poor recognition by trained neural network models. Therefore, we applied augmentation pre-processing to compensate for the low data volume and poorly balanced datasets. The original datasets, without data augmentation, were used to train the proposed model and several well-known convolutional neural networks, such as LeNet, AlexNet, GoogLeNet, and VGG16. In the experimental results, the recognition rates and other evaluation criteria, such as precision, recall, and F1-measure, were evaluated for each model. This process was repeated for the augmented and balanced datasets, which greatly improved metrics such as precision, recall, and F1-measure. The proposed IVGG13 model produced a superior F1-measure compared with the current best-practice convolutional neural networks for medical image recognition, confirming that data augmentation effectively improves model accuracy.
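Precision, recall, and F1-measure, the criteria evaluated above, follow directly from per-class confusion-matrix counts. A minimal sketch (the counts below are hypothetical, purely to illustrate the formulas):

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 for one class from confusion-matrix counts:
    precision = TP/(TP+FP), recall = TP/(TP+FN),
    F1 = harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical counts for the "pneumonia" class on a test split.
p, r, f = precision_recall_f1(tp=380, fp=20, fn=40)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.95 0.905 0.927
```

On an unbalanced dataset, plain accuracy can look high while recall on the minority class collapses; this is why F1 is the headline metric for comparing IVGG13 against the baselines.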


2021 ◽  
pp. 2740-2747
Author(s):  
Ehsan Ali Al-Zubaidi ◽  
Maad M. Mijwil

The coronaviruses are a family of viruses that cause dangerous diseases which can lead to death. Two types of this virus were previously known: SARS-CoV, which causes a severe acute respiratory syndrome, and MERS-CoV, which causes the Middle East respiratory syndrome. The latest coronavirus, which originated in the Chinese city of Wuhan, is known as the cause of the COVID-19 pandemic. It is a new kind of coronavirus that can harm people and was first discovered in December 2019. According to statistics from the World Health Organization (WHO), the number of people infected with this serious disease has reached more than seven million worldwide. In Iraq, the number of infected people had reached more than twenty-two thousand by April 2020. In this article, we apply convolutional neural networks (ConvNets) to the detection of coronavirus in computed tomography (CT) images, assisting medical staff in hospitals in categorizing chest CT coronavirus images at an early stage. ConvNets are able to automatically learn and extract features from the medical image dataset. The objective of this study is to train the GoogLeNet ConvNet architecture, using the COVID-CT dataset, to classify 425 CT coronavirus images. The experimental results show that the validation accuracy of GoogLeNet on this dataset is 82.14%, with an elapsed training time of 74 minutes and 37 seconds.
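The feature extraction that ConvNets perform automatically is built from learned 2-D convolution kernels sliding over the image. A pure-Python sketch of that core operation (the tiny image and the vertical-edge kernel are hypothetical; real networks like GoogLeNet stack many such learned kernels with nonlinearities and pooling):

```python
def conv2d(image, kernel):
    """Valid-mode 2-D cross-correlation: the core operation of a
    convolutional layer, applied to nested-list 'arrays'."""
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            # Dot product of the kernel with the image patch at (i, j).
            out[i][j] = sum(image[i + u][j + v] * kernel[u][v]
                            for u in range(kh) for v in range(kw))
    return out

# A vertical-edge kernel responds where the image jumps from dark to bright.
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
k = [[-1, 1],
     [-1, 1]]
print(conv2d(img, k))  # [[0, 2, 0], [0, 2, 0]]
```

The strong response down the middle column marks the intensity edge; a trained ConvNet learns kernel weights like these from labeled CT images instead of hand-designing them.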

