Speckle noise removal based on structural convolutional neural networks with feature fusion for medical image

Author(s):  
Dazi Li ◽  
Wenjie Yu ◽  
Kunfeng Wang ◽  
Daozhong Jiang ◽  
Qibing Jin
2018 ◽  
Vol 42 (11) ◽  
Author(s):  
Syed Muhammad Anwar ◽  
Muhammad Majid ◽  
Adnan Qayyum ◽  
Muhammad Awais ◽  
Majdi Alnowami ◽  
...  

2019 ◽  
Vol 37 (1) ◽  
pp. 125-135 ◽  
Author(s):  
Sizhe Huang ◽  
Huosheng Xu ◽  
Xuezhi Xia ◽  
Fan Yang ◽  
Fuhao Zou

2020 ◽  
Vol 2020 ◽  
pp. 1-16
Author(s):  
Zhuofu Deng ◽  
Binbin Wang ◽  
Zhiliang Zhu

Maxillary sinus segmentation plays an important role in choosing therapeutic strategies for nasal disease and in treatment monitoring. Traditional approaches struggle with the extremely heterogeneous intensities caused by lesions, abnormal anatomical structures, and the blurred boundaries of the cavity. 2D and 3D deep convolutional neural networks have grown popular in medical image segmentation because they can exploit large labeled datasets to learn discriminative features. However, for 3D segmentation of medical images, 2D networks cannot extract the most significant spatial features, while 3D networks carry an unbearable computational burden; both pose great challenges for maxillary sinus segmentation. In this paper, we propose a deep neural network trained in an end-to-end manner for fully automatic 3D segmentation. First, our proposed model uses a symmetrical encoder-decoder architecture for the multitask of bounding box estimation and in-region 3D segmentation, which not only reduces excessive computation requirements but also eliminates false positives remarkably, making 3D segmentation practical with 3D convolutional neural networks. In addition, an overestimation strategy is presented to avoid the overfitting phenomena of conventional multitask networks. Meanwhile, we introduce residual dense blocks to increase the depth of the proposed network and an attention excitation mechanism to improve the performance of bounding box estimation, both of which add little computational cost. In particular, the multilevel feature-fusion structure of the pyramid network strengthens the identification of global and local discriminative features in foreground and background, achieving more advanced segmentation results. Finally, to address the problems of blurred boundaries and class imbalance in medical images, a hybrid loss function is designed for the multiple tasks.
To illustrate the strength of our proposed model, we evaluated it against state-of-the-art methods. Our model performed significantly better, with an average Dice of 0.947±0.031, VOE of 10.23±5.29, and ASD of 2.86±2.11, respectively, denoting a promising technique with strong robustness in practice.
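The overlap metrics reported above can be computed directly from binary segmentation masks. A minimal sketch of Dice and VOE (masks represented as sets of voxel coordinates; the variable names and toy masks are hypothetical, and ASD is omitted because it requires extracting and matching surface voxels):

```python
def dice(pred, truth):
    """Dice coefficient: 2|A∩B| / (|A| + |B|) over voxel coordinate sets."""
    inter = len(pred & truth)
    return 2.0 * inter / (len(pred) + len(truth))

def voe(pred, truth):
    """Volumetric Overlap Error (%): 100 * (1 - |A∩B| / |A∪B|)."""
    inter = len(pred & truth)
    union = len(pred | truth)
    return 100.0 * (1.0 - inter / union)

# Tiny example: two overlapping 2-D "masks" as sets of pixel coordinates.
a = {(0, 0), (0, 1), (1, 0), (1, 1)}
b = {(0, 1), (1, 1), (1, 2), (0, 2)}
print(round(dice(a, b), 3))  # 0.5
print(round(voe(a, b), 2))   # 66.67
```

A perfect prediction gives Dice = 1 and VOE = 0; the paper's Dice of 0.947 thus corresponds to a near-complete overlap with the ground-truth cavity.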


Sensors ◽  
2021 ◽  
Vol 22 (1) ◽  
pp. 72
Author(s):  
Sanghun Jeon ◽  
Ahmed Elsharkawy ◽  
Mun Sang Kim

In visual speech recognition (VSR), speech is transcribed using only visual information to interpret tongue and teeth movements. Recently, deep learning has shown outstanding performance in VSR, with accuracy exceeding that of lipreaders on benchmark datasets. However, several problems still exist when using VSR systems. A major challenge is distinguishing words with similar pronunciations, called homophones, which lead to word ambiguity. Another technical limitation of traditional VSR systems is that visual information does not provide sufficient data for learning words such as “a”, “an”, “eight”, and “bin” because their lengths are shorter than 0.02 s. This report proposes a novel lipreading architecture that combines three different convolutional neural networks (CNNs; a 3D CNN, a densely connected 3D CNN, and a multi-layer feature fusion 3D CNN), followed by a two-layer bi-directional gated recurrent unit. The entire network was trained using connectionist temporal classification. The results of the standard automatic speech recognition evaluation metrics show that the proposed architecture reduced the character and word error rates of the baseline model by 5.681% and 11.282%, respectively, on the unseen-speaker dataset. Our proposed architecture exhibits improved performance even when visual ambiguity arises, thereby increasing VSR reliability for practical applications.
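The character and word error rates cited above are both ratios of Levenshtein edit distance to reference length. A minimal sketch of the standard computation (the GRID-corpus-style example sentences are hypothetical):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance via dynamic programming; substitutions,
    insertions, and deletions each cost 1."""
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n]

def wer(ref_sentence, hyp_sentence):
    """Word error rate: edit distance over word sequences / reference length."""
    ref = ref_sentence.split()
    return edit_distance(ref, hyp_sentence.split()) / len(ref)

def cer(ref_sentence, hyp_sentence):
    """Character error rate: the same computation on character sequences."""
    return edit_distance(ref_sentence, hyp_sentence) / len(ref_sentence)

# One substituted word out of six gives a WER of 1/6.
print(round(wer("set white with p one soon", "set white with q one soon"), 3))  # 0.167
```

Homophone confusions show up as substitutions in this computation, which is why reducing visual ambiguity lowers both CER and WER.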


2021 ◽  
Vol 11 (23) ◽  
pp. 11185
Author(s):  
Zhi-Peng Jiang ◽  
Yi-Yang Liu ◽  
Zhen-En Shao ◽  
Ko-Wei Huang

Image recognition has been applied to many fields, but relatively rarely to medical images. Recent significant deep learning progress in image recognition has raised strong research interest in medical image recognition. First, we examined the failed predictions of the VGG16 model on pneumonia X-ray images. This paper therefore proposes IVGG13 (Improved Visual Geometry Group-13), a modified VGG16 model for classifying pneumonia X-ray images. Open-source thoracic X-ray images acquired from the Kaggle platform were employed for pneumonia recognition, but only a few data were obtained, and the datasets were unbalanced after classification; either condition can result in extremely poor recognition by trained neural network models. Therefore, we applied augmentation pre-processing to compensate for the low data volume and poorly balanced datasets. The original datasets, without data augmentation, were used to train the proposed model and several well-known convolutional neural networks, such as LeNet, AlexNet, GoogLeNet, and VGG16. In the experimental results, the recognition rates and other evaluation criteria, such as precision, recall, and F1-measure, were evaluated for each model. This process was repeated for the augmented and balanced datasets, which greatly improved metrics such as precision, recall, and F1-measure. The proposed IVGG13 model produced a superior F1-measure compared with the current best-practice convolutional neural networks for medical image recognition, confirming that data augmentation effectively improves model accuracy.
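Precision, recall, and F1-measure, the criteria evaluated above, follow directly from per-class confusion-matrix counts. A minimal sketch (the counts below are hypothetical, purely to illustrate the formulas):

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 for one class from confusion-matrix counts:
    precision = TP/(TP+FP), recall = TP/(TP+FN),
    F1 = harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical counts for the "pneumonia" class on a test split.
p, r, f = precision_recall_f1(tp=380, fp=20, fn=40)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.95 0.905 0.927
```

On an unbalanced dataset, plain accuracy can look high while recall on the minority class collapses; this is why F1 is the headline metric for comparing IVGG13 against the baselines.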


2021 ◽  
pp. 2740-2747
Author(s):  
Ehsan Ali Al-Zubaidi ◽  
Maad M. Mijwil

The coronaviruses are a family of viruses that cause dangerous diseases which can lead to death. Two types of this virus were previously known: SARS-CoV, which causes a severe acute respiratory syndrome, and MERS-CoV, which causes the Middle East respiratory syndrome. The latest coronavirus, which originated in the Chinese city of Wuhan, is known as the cause of the COVID-19 pandemic. It is a new kind of coronavirus that can harm people and was first discovered in December 2019. According to statistics from the World Health Organization (WHO), the number of people infected with this serious disease has reached more than seven million worldwide. In Iraq, the number of infected people had reached more than twenty-two thousand by April 2020. In this article, we apply convolutional neural networks (ConvNets) to the detection of coronavirus in computed tomography (CT) images, assisting medical staff in hospitals in categorizing chest CT coronavirus images at an early stage. ConvNets are able to automatically learn and extract features from the medical image dataset. The objective of this study is to train the GoogLeNet ConvNet architecture, using the COVID-CT dataset, to classify 425 CT coronavirus images. The experimental results show that the validation accuracy of GoogLeNet on this dataset is 82.14%, with an elapsed training time of 74 minutes and 37 seconds.
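The feature extraction that ConvNets perform automatically is built from learned 2-D convolution kernels sliding over the image. A pure-Python sketch of that core operation (the tiny image and the vertical-edge kernel are hypothetical; real networks like GoogLeNet stack many such learned kernels with nonlinearities and pooling):

```python
def conv2d(image, kernel):
    """Valid-mode 2-D cross-correlation: the core operation of a
    convolutional layer, applied to nested-list 'arrays'."""
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            # Dot product of the kernel with the image patch at (i, j).
            out[i][j] = sum(image[i + u][j + v] * kernel[u][v]
                            for u in range(kh) for v in range(kw))
    return out

# A vertical-edge kernel responds where the image jumps from dark to bright.
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
k = [[-1, 1],
     [-1, 1]]
print(conv2d(img, k))  # [[0, 2, 0], [0, 2, 0]]
```

The strong response down the middle column marks the intensity edge; a trained ConvNet learns kernel weights like these from labeled CT images instead of hand-designing them.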

