Binarization of music score with complex background by deep convolutional neural networks

Author(s):  
Minh-Trieu Tran ◽  
Quang-Nhat Vo ◽  
Guee-Sang Lee

Binarization is an important step in most document analysis systems. For music score images with a complex background, background clutter with a variety of shapes and colors poses many challenges for binarization. This paper presents a model for binarizing complex-background music score images through a fusion of deep convolutional neural networks. Our model is trained directly on image regions, using pixel values as inputs and the binary ground truth as labels. By exploiting the generalization capability of a residual network backbone and the feature learning ability of a dense layer, the proposed network structures can differentiate foreground pixels from background clutter and minimize the risk of overfitting, and thus can handle the complex background noise that appears in music score images. Compared with traditional algorithms, binary images generated by our method have a cleaner background and better-preserved strokes. Experiments with captured and synthetic music score images show promising results compared to existing methods.
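A minimal sketch of the general idea, not the authors' exact architecture: a per-pixel binary classifier over image regions built from a small residual backbone, where a 1x1 convolution plays the role of a per-pixel dense layer. Layer widths, depths, and the loss setup are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        return self.relu(out + x)  # identity shortcut aids generalization

class BinarizationNet(nn.Module):
    """Maps an RGB score image to a per-pixel foreground probability map."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Conv2d(3, 32, 3, padding=1)
        self.body = nn.Sequential(ResidualBlock(32), ResidualBlock(32))
        self.head = nn.Conv2d(32, 1, 1)  # 1x1 conv: per-pixel dense layer

    def forward(self, x):
        return torch.sigmoid(self.head(self.body(self.stem(x))))

model = BinarizationNet()
loss_fn = nn.BCELoss()  # binary ground truth as labels, as described above
```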

Author(s):  
Ridha Ilyas Bendjillali ◽  
Mohammed Beladgham ◽  
Khaled Merit ◽  
Abdelmalik Taleb-Ahmed

Over the last decade, facial recognition techniques have become one of the most important fields of research in biometric technology. In this paper, we present a face recognition (FR) system divided into three steps: the Viola-Jones face detection algorithm, facial image enhancement using a Modified Contrast Limited Adaptive Histogram Equalization (M-CLAHE) algorithm, and feature learning for classification. For feature learning followed by classification, we used the VGG16, ResNet50, and Inception-v3 convolutional neural network (CNN) architectures. Our experiments were performed on the Extended Yale B and CMU PIE face databases. Comparison with other methods on both databases shows the robustness and effectiveness of the proposed approach; the Inception-v3 architecture achieved rates of 99.44% and 99.89%, respectively.
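An illustrative sketch of the first two pipeline stages using OpenCV. Standard CLAHE stands in for the authors' Modified CLAHE (M-CLAHE), whose specifics are not given in the abstract; the image path and CLAHE parameters are placeholders.

```python
import cv2

# 1) Viola-Jones face detection (OpenCV's bundled Haar cascade)
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
gray = cv2.imread("subject.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# 2) Contrast enhancement of each detected face crop
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
crops = [clahe.apply(gray[y:y + h, x:x + w]) for (x, y, w, h) in faces]

# 3) Each enhanced crop would then be resized (e.g., 224x224) and fed to a
#    pretrained CNN such as VGG16, ResNet50, or Inception-v3 for classification.
```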


Water ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 3412
Author(s):  
Joakim Bruslund Haurum ◽  
Chris H. Bahnsen ◽  
Malte Pedersen ◽  
Thomas B. Moeslund

Sewer pipe inspections are currently conducted by professionals who remotely control a robot from above ground. This expensive and slow approach is prone to human mistakes. Therefore, there is both an economic and scientific interest in automating the inspection process by creating systems able to recognize sewer defects. However, research into automatic water level estimation in sewers has been limited, despite it being a prerequisite for further analysis of the pipe, as only sections above the water level can be visually inspected. In this work, we utilize a dataset of still images obtained from over 5000 inspections carried out for three different Danish water utility companies. This dataset is used for training and testing decision tree methods and convolutional neural networks (CNNs) for automatic water level estimation. We pose the estimation problem as both a classification and a regression problem, and compare the results of the two approaches. Furthermore, we compare the effect of using different inspection standards for labeling the ground truth water level. By treating the problem as a classification task and using the 2015 Danish sewer inspection standard, where water levels are clustered based on visual appearance, we achieve an average F1 score of 79.29% using a fine-tuned ResNet-50 CNN. This shows the potential of using CNNs for water level estimation. We believe including temporal and contextual information will improve the results further.
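A sketch of the classification variant under stated assumptions: a ResNet-50 fine-tuned to predict a discrete water-level class. The number of classes follows the idea of the 2015 Danish standard's visually clustered levels, but the exact count here is an assumption.

```python
import torch.nn as nn
from torchvision import models

NUM_LEVELS = 5  # assumption: discrete water-level bins per the inspection standard

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_LEVELS)  # replace 1000-way head

# For the regression variant, the head would instead output a single scalar:
# model.fc = nn.Linear(model.fc.in_features, 1), trained with an L1/L2 loss.
```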


2021 ◽  
pp. 1-13
Author(s):  
Xiang-Min Liu ◽  
Jian Hu ◽  
Deborah Simon Mwakapesa ◽  
Y.A. Nanehkaran ◽  
Yi-Min Mao ◽  
...  

Deep convolutional neural networks (DCNNs), with their complex network structure and powerful feature learning and feature expression capabilities, have achieved remarkable success in many large-scale recognition tasks. However, under the constraints of memory overhead and response time, and with the increasing scale of data, DCNNs face three non-trivial challenges in a big data environment: excessive network parameters, slow convergence, and inefficient parallelism. To tackle these three problems, this paper develops a deep convolutional neural network optimization algorithm (PDCNNO) in the MapReduce framework. The proposed method first prunes the network to obtain a compressed network, effectively reducing redundant parameters. Next, a conjugate gradient method based on a modified secant equation (CGMSE) is developed in the Map phase to further accelerate the convergence of the network. Finally, a load balancing strategy based on a regulated load rate (LBRLA) is proposed in the Reduce phase to quickly achieve equal grouping of data and thus improve the parallel performance of the system. We compared the PDCNNO algorithm with other algorithms on three datasets: SVHN, EMNIST Digits, and ILSVRC2012. The experimental results show that our algorithm not only reduces the space and time overhead of network training but also achieves a good speed-up ratio in a big data environment.
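A minimal sketch of the pruning step only, not the paper's full pipeline: global magnitude pruning to zero out redundant weights, using PyTorch's built-in pruning utility as a stand-in for the authors' compression scheme. The CGMSE optimizer and the MapReduce load balancing are not reproduced here.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

def compress(model: nn.Module, amount: float = 0.5) -> nn.Module:
    """Zero out the smallest-|w| weights across all conv/linear layers."""
    params = [(m, "weight") for m in model.modules()
              if isinstance(m, (nn.Conv2d, nn.Linear))]
    prune.global_unstructured(params, pruning_method=prune.L1Unstructured,
                              amount=amount)
    for m, name in params:
        prune.remove(m, name)  # bake the sparsity into the weight tensors
    return model
```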


2021 ◽  
Vol 20 (1) ◽  
pp. 117-136
Author(s):  
Cheng Yang ◽  
Xiang Yu ◽  
Arun Kumar ◽  
G.G. Md. Nawaz Ali ◽  
Peter Han Joo Chong ◽  
...  

This paper introduces a method that uses deep convolutional neural networks (CNNs) to automatically replace an advertisement (AD) photo in social (or self-media) videos, and provides a suitable evaluation method for comparing different CNNs. An AD photo can replace a picture inside a video. However, if a person occludes the replaced picture in the original video, the newly pasted AD photo will block the occluded part of the person. A deep learning algorithm is used to segment the person from the video; the segmented pixels are then pasted back over the occluded area, so that the AD photo replacement looks natural in the video. This process requires the predicted occlusion edge to be close to the ground-truth occlusion edge, so that the AD photo is occluded naturally. Therefore, this research introduces a curve fitting method to measure the error of the predicted occlusion edge. Using this measure, three CNN methods are applied and compared for AD replacement: the mask region-based convolutional neural network (Mask R-CNN), a recurrent network for video object segmentation (ROVS), and DeepLabV3. The experimental results show the comparative segmentation accuracy of the different models, with DeepLabV3 performing best.
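A hedged sketch of one way such a curve-fitting edge-error measure could work: fit a polynomial to the ground-truth occlusion edge and score the predicted edge by its mean vertical distance to that curve. The polynomial degree and the distance metric are assumptions, not necessarily the paper's exact formulation.

```python
import numpy as np

def edge_error(gt_edge: np.ndarray, pred_edge: np.ndarray, degree: int = 3) -> float:
    """gt_edge, pred_edge: (N, 2) arrays of (x, y) edge pixel coordinates."""
    coeffs = np.polyfit(gt_edge[:, 0], gt_edge[:, 1], degree)   # fit y = f(x)
    expected_y = np.polyval(coeffs, pred_edge[:, 0])            # curve at predicted x
    return float(np.mean(np.abs(pred_edge[:, 1] - expected_y)))
```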


Author(s):  
Yuan Yuan ◽  
Zhitong Xiong ◽  
Qi Wang

RGB image classification has achieved significant performance improvements with the resurgence of deep convolutional neural networks. However, mono-modal deep models for RGB images still have several limitations when applied to RGB-D scene recognition. 1) Images for scene classification usually contain more than one typical object with flexible spatial distribution, so object-level local features should be considered in addition to the global scene representation. 2) Multi-modal features in RGB-D scene classification are still under-utilized; simply combining these modal-specific features suffers from the semantic gap between modalities. 3) Most existing methods neglect the complex relationships among multiple modality features. Considering these limitations, this paper proposes an adaptive cross-modal (ACM) feature learning framework based on graph convolutional neural networks for RGB-D scene recognition. To make better use of modal-specific cues, the approach mines intra-modality relationships among the selected local features from one modality. To leverage multi-modal knowledge more effectively, the proposed approach models inter-modality relationships between the two modalities through a cross-modal graph (CMG). We evaluate the proposed method on two public RGB-D scene classification datasets, SUN-RGBD and NYUD V2, where it achieves state-of-the-art performance.
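A conceptual sketch of cross-modal graph reasoning under stated assumptions: local features from the RGB and depth streams become nodes of a joint graph, a feature-similarity affinity serves as a soft adjacency, and one round of message passing mixes information across the cross-modal edges. Feature dimensions and the affinity choice are illustrative, not the paper's ACM/CMG formulation.

```python
import torch
import torch.nn as nn

class CrossModalGCNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, rgb_nodes, depth_nodes):
        # rgb_nodes, depth_nodes: (N, dim) local features from each modality
        x = torch.cat([rgb_nodes, depth_nodes], dim=0)  # joint node set
        adj = torch.softmax(x @ x.t(), dim=-1)          # similarity as soft adjacency
        return torch.relu(adj @ self.proj(x))           # one message-passing step

layer = CrossModalGCNLayer(dim=512)
out = layer(torch.randn(49, 512), torch.randn(49, 512))  # e.g., 7x7 feature grids
```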

