A Saliency-Based Patch Sampling Approach for Deep Artistic Media Recognition

Electronics ◽  
2021 ◽  
Vol 10 (9) ◽  
pp. 1053
Author(s):  
Heekyung Yang ◽  
Kyungha Min

We present a saliency-based patch sampling strategy for recognizing artistic media from artwork images using a deep media recognition model composed of several deep convolutional neural network-based recognition modules. The decisions of the individual modules are merged into the final decision of the model. Because media stroke patterns are key to media recognition, we devise a strategy that samples, without distortion, patches with a high probability of containing distinctive media stroke patterns as module input. We design this strategy by collecting human-selected ground truth patches and analyzing the distribution of their saliency values. From this analysis, we build the sampling strategy. We show that our strategy achieves the best performance among existing patch sampling strategies and that its recognition and confusion patterns are consistent with those of the existing strategies.


Author(s):  
Liang Kim Meng ◽  
Azira Khalil ◽  
Muhamad Hanif Ahmad Nizar ◽  
Maryam Kamarun Nisham ◽  
Belinda Pingguan-Murphy ◽  
...  

Background: Bone age assessment (BAA) is a clinical procedure that aims to identify a discrepancy between the biological and chronological age of an individual by assessing bone growth. Currently, the two main methods of performing BAA are the Greulich-Pyle and Tanner-Whitehouse techniques. Both involve a manual, qualitative assessment of hand and wrist radiographs, which is time-consuming and subject to intra- and inter-operator variability. Automatic segmentation can be applied to the radiographs, providing the physician with a more accurate delineation of the carpal bones and an accurate quantitative analysis. Methods: In this study, we propose an image feature extraction technique based on image segmentation with a fully convolutional neural network with an eight-pixel stride (FCN-8). A total of 290 radiographic images of female and male subjects with ages ranging from 0 to 18 were manually segmented and used to train the FCN-8. Results and Conclusion: Training reached an accuracy of 99.68% and a loss of 0.008619 after 50 epochs. The experiments compared 58 images against the gold-standard ground truth images. Our fully automated segmentation technique achieves 0.78 ± 0.06, 1.56 ± 0.30 mm, and 98.02% in terms of Dice coefficient, Hausdorff distance, and overall qualitative carpal recognition accuracy, respectively.
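To make the Dice coefficient reported above concrete, here is a minimal sketch of how it can be computed for two binary masks. The function name `dice_coefficient` and the toy masks are illustrative assumptions, not taken from the paper, and the authors' evaluation pipeline may differ.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks (lists of 0/1 rows)."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")
    # Count pixels that are foreground in both masks.
    intersection = sum(
        p & t
        for pred_row, truth_row in zip(pred, truth)
        for p, t in zip(pred_row, truth_row)
    )
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 2x3 masks: predicted segmentation vs. manual ground truth.
pred = [[0, 1, 1],
        [0, 1, 0]]
truth = [[0, 1, 0],
         [0, 1, 1]]
score = dice_coefficient(pred, truth)
print(f"Dice: {score:.3f}")
```

A value of 1 means the two masks overlap exactly and 0 means they share no foreground pixels.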



2021 ◽  
Vol 18 (1) ◽  
pp. 172988142199332
Author(s):  
Xintao Ding ◽  
Boquan Li ◽  
Jinbao Wang

Indoor object detection is a demanding and important task for robot applications. Object knowledge, such as two-dimensional (2D) shape and depth information, may be helpful for detection. In this article, we focus on region-based convolutional neural network (CNN) detectors and propose a geometric property-based Faster R-CNN method (GP-Faster) for indoor object detection. GP-Faster incorporates geometric properties into Faster R-CNN to improve detection performance. In detail, we first use mesh grids formed by the intersections of direct and inverse proportion functions to generate appropriate anchors for indoor objects. After the anchors are regressed to the regions of interest produced by a region proposal network (RPN-RoIs), we use 2D geometric constraints to refine the RPN-RoIs, where the 2D constraint of each class is a convex hull region enclosing the width and height coordinates of the ground-truth boxes in the training set. Comparison experiments are conducted on two indoor datasets, SUN2012 and NYUv2. Since depth information is available in NYUv2, we add depth constraints to GP-Faster and propose a 3D geometric property-based Faster R-CNN (DGP-Faster) on NYUv2. The experimental results show that both GP-Faster and DGP-Faster increase the mean average precision.
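The convex hull constraint described above can be illustrated with a small sketch: build the hull of the (width, height) pairs of ground-truth boxes for one class, then keep only candidate regions whose size falls inside it. This is a simplified reading of the abstract under stated assumptions. The helper names, the toy box sizes, and the idea of discarding regions outright are hypothetical, and the paper's actual refinement step may work differently.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(hull, p):
    """True if p is inside or on a counter-clockwise convex polygon (3+ vertices)."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


# Hypothetical (width, height) pairs of training boxes for one class.
train_sizes = [(20, 30), (60, 30), (60, 90), (20, 90), (40, 60)]
hull = convex_hull(train_sizes)

# Hypothetical candidate region sizes; keep those consistent with the class.
roi_sizes = [(30, 40), (100, 40)]
refined = [size for size in roi_sizes if inside_hull(hull, size)]
print(refined)
```

Filtering by hull membership is only one way to apply such a constraint; a softer variant could down-weight out-of-hull regions instead of dropping them.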



2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Peter M. Maloca ◽  
Philipp L. Müller ◽  
Aaron Y. Lee ◽  
Adnan Tufail ◽  
Konstantinos Balaskas ◽  
...  

Machine learning has greatly facilitated the analysis of medical data, but its internal operations usually remain opaque. To better understand these procedures, a convolutional neural network for optical coherence tomography image segmentation was enhanced with a Traceable Relevance Explainability (T-REX) technique. The proposed application was based on three components: ground truth generation by multiple graders, calculation of Hamming distances among the graders and the machine learning algorithm, and a smart data visualization ('neural recording'). An overall average variability of 1.75% was found between the human graders and the algorithm, slightly lower than the 2.02% found among the human graders. The ambiguity in the ground truth had a noteworthy impact on the machine learning results, which could be visualized. The convolutional neural network balanced between graders and allowed for modifiable predictions depending on the compartment. Using the proposed T-REX setup, machine learning processes could be rendered more transparent and understandable, possibly leading to optimized applications.
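As an illustration of the Hamming-distance comparison mentioned above, the sketch below measures variability as the fraction of pixels on which two label maps disagree, then averages it among graders and between the model and each grader. The normalization, the flattened toy label maps, and all names are assumptions made for illustration. The paper's exact computation is not given in the abstract.

```python
from itertools import combinations


def hamming_fraction(a, b):
    """Fraction of positions at which two equally long label sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("label maps must be non-empty and equally long")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise(label_maps):
    """Average Hamming fraction over all unordered pairs of label maps."""
    pairs = list(combinations(label_maps, 2))
    return sum(hamming_fraction(a, b) for a, b in pairs) / len(pairs)


# Hypothetical flattened segmentations (one compartment label per pixel).
graders = {
    "grader_1": [0, 0, 1, 1, 2, 2, 2, 0],
    "grader_2": [0, 0, 1, 1, 2, 2, 0, 0],
    "grader_3": [0, 1, 1, 1, 2, 2, 2, 0],
}
model = [0, 0, 1, 1, 2, 2, 2, 0]

inter_grader = mean_pairwise(list(graders.values()))
model_vs_graders = sum(
    hamming_fraction(model, labels) for labels in graders.values()
) / len(graders)
print(f"among graders: {inter_grader:.2%}, model vs graders: {model_vs_graders:.2%}")
```

Comparing the two averages shows whether the model disagrees with the graders more or less than the graders disagree with each other.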



2021 ◽  
Author(s):  
Malte Oeljeklaus

This thesis investigates methods for traffic scene perception with monocular cameras for a basic environment model in the context of automated vehicles. The developed approach is designed with special attention to the computational limitations present in practical systems. For this purpose, three different scene representations are investigated. These consist of the prevalent road topology as the global scene context, the drivable road area and the detection and spatial reconstruction of other road users. An approach is developed that allows for the simultaneous perception of all environment representations based on a multi-task convolutional neural network. The obtained results demonstrate the efficiency of the multi-task approach. In particular, the effects of shareable image features for the perception of the individual scene representations were found to improve the computational performance.





2020 ◽  
Vol 162 (12) ◽  
pp. 3067-3080
Author(s):  
Yizhou Wan ◽  
Roushanak Rahmat ◽  
Stephen J. Price

Background: Measurement of volumetric features is challenging in glioblastoma. We investigate whether volumetric features derived from preoperative MRI using convolutional neural network-assisted segmentation are correlated with survival. Methods: Preoperative MRI scans of 120 patients were scored using Visually Accessible Rembrandt Images (VASARI) features. We trained and tested a multilayer, multi-scale convolutional neural network on multimodal brain tumour segmentation challenge (BRATS) data, prior to testing on our dataset. The automated labels were manually edited to generate ground truth segmentations. Network performance for our data and BRATS data was compared. Multivariable Cox regression analysis, corrected for multiple testing using the false discovery rate, was performed to correlate clinical and imaging variables with overall survival. Results: Median Dice coefficients in our sample were (1) whole tumour 0.94 (IQR, 0.82–0.98) compared to 0.91 (IQR, 0.83–0.94; p = 0.012), (2) FLAIR region 0.84 (IQR, 0.63–0.95) compared to 0.81 (IQR, 0.69–0.8; p = 0.170), (3) contrast-enhancing region 0.91 (IQR, 0.74–0.98) compared to 0.83 (IQR, 0.78–0.89; p = 0.003) and (4) necrosis region 0.82 (IQR, 0.47–0.97) compared to 0.67 (IQR, 0.42–0.81; p = 0.005). Contrast-enhancing region/tumour core ratio (HR 4.73 [95% CI, 1.67–13.40], corrected p = 0.017) and necrotic core/tumour core ratio (HR 8.13 [95% CI, 2.06–32.12], corrected p = 0.011) were independently associated with overall survival. Conclusion: Semi-automated segmentation of glioblastoma using a convolutional neural network trained on independent data is robust when applied to routine clinical data. The segmented volumes have prognostic significance.



2021 ◽  
Vol 11 (1) ◽  
pp. 15-24
Author(s):  
Dequan Guo ◽  
Gexiang Zhang ◽  
Hui Peng ◽  
Jianying Yuan ◽  
Prithwineel Paul ◽  
...  

In recent years, cardiovascular and cerebrovascular diseases have attracted much attention because they are main causes of death in humans. To reduce mortality, many efforts focus on early diagnosis and prevention. The endovascular membrane of the carotid artery, observed in medical ultrasound images, provides an important reference index for cardiovascular diseases. This paper proposes a method that finds the region of interest (ROI) with a convolutional neural network and then segments and measures the intima-media membrane, mainly using a support vector machine (SVM). Essentially, detecting the membrane is a target detection problem. The paper adopts You Only Look Once (YOLO), a detection algorithm based on a convolutional neural network with end-to-end training. First, sufficient samples are extracted according to certain characteristics in the special region and used to train the SVM classification model. Second, the ROI is processed and all of its pixels are classified as boundary points or non-boundary points by the classification model. Third, the boundary points are selected to obtain an accurate boundary and to calculate the intima-media thickness (IMT). In the experiments, two hundred ultrasound images are tested, and the results of our algorithm are consistent with the ground truth (GT). The algorithm runs in real time and generalizes well. It computes the intima-media thickness in ultrasound images accurately and quickly, with 95% consistency with the ground truth.



Author(s):  
Carla Sendra-Balcells ◽  
Ricardo Salvador ◽  
Juan B. Pedro ◽  
M C Biagi ◽  
Charlène Aubinet ◽  
...  

The segmentation of structural MRI data is an essential step for deriving geometrical information about brain tissues. One important application is in transcranial electrical stimulation (e.g., tDCS), a non-invasive neuromodulatory technique where head modeling is required to determine the electric field (E-field) generated in the cortex in order to predict and optimize its effects. Here we propose a deep learning-based model (StarNEt) to automate white matter (WM) and gray matter (GM) segmentation and compare its performance with FreeSurfer, an established tool. Since a good definition of sulci and gyri in the cortical surface is an important requirement for E-field calculation, StarNEt is specifically designed to output masks at a higher resolution than that of the original input T1w-MRI. StarNEt uses a residual network (ResNet) as the encoder and a fully convolutional neural network with U-net skip connections as the decoder to segment an MRI slice by slice. The vertical location of each slice is provided as an extra input. The model was trained on scans from 425 patients in the open-access ADNI+IXI datasets, using FreeSurfer segmentation as ground truth. Model performance was evaluated using the Dice Coefficient (DC) in a separate subset (N=105) of ADNI+IXI and in two extra testing sets not involved in training. In addition, FreeSurfer and StarNEt were compared to manual segmentations of the MRBrainS18 dataset, also unseen by the model. To study performance in real use cases, we first created electrical head models derived from the FreeSurfer and StarNEt segmentations and used them for montage optimization with a common target region using a standard algorithm (Stimweaver). Second, we used StarNEt to successfully segment the brains of minimally conscious state (MCS) patients who had suffered brain trauma, a scenario where FreeSurfer typically fails. Our results indicate that StarNEt matches FreeSurfer performance on the trained tasks while reducing computation time from several hours to a few seconds, with the potential to evolve into an effective technique even when patients present large brain abnormalities.



Author(s):  
Attila Zoltán Jenei ◽  
Gábor Kiss

In the present study, we attempt to estimate the severity of depression using a Convolutional Neural Network (CNN). The method is distinctive in that the network input is a crafted auto- and cross-correlation structure rather than an actual image. This research matters because depression has become one of the leading mental disorders in the world. It can significantly reduce an individual's quality of life even at an early stage, and in severe cases it carries a risk of suicide. It is therefore important that the disorder be recognized as early as possible. It is also important to determine the severity of the disorder in the individual, so that a treatment order can be established. During the examination, speech acoustic features were obtained from recordings. Among the features, MFCC coefficients and formant frequencies were used, based on preliminary studies. A correlation structure was created from subsets of these features. We applied this quadratic structure to the input of a convolutional network. Two models were crafted: single and double input versions. Overall, the lowest RMSE value (10.797) was achieved using the two features, with a moderate correlation of 0.61 between the estimated and original values.
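The two figures reported above, RMSE and the correlation between estimated and original values, can be computed as in the following sketch. The use of the Pearson coefficient, the function names, and the toy severity scores are assumptions for illustration. The abstract does not state which correlation coefficient or scoring scale was used.

```python
import math


def rmse(estimated, original):
    """Root mean squared error between two equally long score lists."""
    if len(estimated) != len(original) or not estimated:
        raise ValueError("inputs must be non-empty and equally long")
    squared = [(e - o) ** 2 for e, o in zip(estimated, original)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(x, y):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical severity scores: reference values and model estimates.
original = [5.0, 12.0, 20.0, 31.0]
estimated = [8.0, 10.0, 25.0, 27.0]

error = rmse(estimated, original)
corr = pearson(estimated, original)
print(f"RMSE: {error:.3f}, correlation: {corr:.3f}")
```

RMSE is expressed in the units of the severity score, while the correlation is unitless, so the two figures answer different questions about the estimates.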


