MLMT-CNN for object detection and segmentation in multi-layer and multi-spectral images

2021
Vol 33 (1)
Author(s):
Majedaldein Almahasneh
Adeline Paiement
Xianghua Xie
Jean Aboudarham

Abstract: Precisely localising solar Active Regions (AR) from multi-spectral images is a challenging but important task in understanding solar activity and its influence on space weather. A main challenge comes from each modality capturing a different location of the 3D objects, as opposed to typical multi-spectral imaging scenarios where all image bands observe the same scene. Thus, we refer to this special multi-spectral scenario as multi-layer. We present a multi-task deep learning framework that exploits the dependencies between image bands to produce 3D AR localisation (segmentation and detection), where different image bands (and physical locations) have their own set of results. Furthermore, to address the difficulty of producing dense AR annotations for training supervised machine learning (ML) algorithms, we adapt a training strategy based on weak labels (i.e. bounding boxes) in a recursive manner. We compare our detection and segmentation stages against baseline approaches for solar image analysis (multi-channel coronal hole detection, SPOCA for ARs) and state-of-the-art deep learning methods (Faster RCNN, U-Net). Additionally, both detection and segmentation stages are quantitatively validated on artificially created data of similar spatial configurations, made from annotated multi-modal magnetic resonance images. On the artificial dataset, our framework achieves an average IoU of 0.72 (segmentation) and an F1 score of 0.90 (detection) across all modalities, compared with scores of 0.53 and 0.58, respectively, for the best-performing baseline methods; in the AR detection task it achieves an F1 score of 0.84 against a baseline of 0.82. Our segmentation results are qualitatively validated by an expert on real ARs.
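
As a point of reference for the reported metrics, the sketch below shows how segmentation IoU and detection F1 score are typically computed; it assumes binary NumPy masks and already-matched detections, and the array values and counts are illustrative rather than results from the paper.

```python
import numpy as np

def iou(pred_mask, gt_mask):
    """Intersection-over-Union between two binary segmentation masks."""
    intersection = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    return intersection / union if union > 0 else 0.0

def f1_score(true_pos, false_pos, false_neg):
    """Detection F1 score from matched / unmatched bounding-box counts."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)

# Illustrative masks and counts only, not data from the paper.
pred = np.zeros((64, 64), dtype=bool); pred[10:30, 10:30] = True
gt = np.zeros((64, 64), dtype=bool);   gt[15:35, 12:32] = True
print(round(iou(pred, gt), 2), round(f1_score(18, 2, 4), 2))
```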

Agronomy
2020
Vol 10 (1)
pp. 143
Author(s):
Anand Koirala
Kerry B. Walsh
Zhenglin Wang
Nicholas Anderson

Automated assessment of the number of panicles by developmental stage can provide information on the time spread of flowering and thus inform farm management. A pixel-based segmentation method for the estimation of flowering level from tree images was confounded by developmental stage. Therefore, the use of single-stage and two-stage deep learning frameworks (YOLO and R2CNN) was considered, using either upright or rotated bounding boxes. For a validation image set and for total panicle count, the models MangoYOLO(-upright), MangoYOLO-rotated, YOLOv3-rotated, R2CNN(-rotated) and R2CNN-upright achieved weighted F1 scores of 76.5, 76.1, 74.9, 74.0 and 82.0, respectively. For a test set of images of another cultivar taken with a different camera, the R2 between machine-vision and human counts of panicles per tree was 0.86, 0.80, 0.83, 0.81 and 0.76 for the same models, respectively. Thus, there was no consistent benefit from the use of rotated over upright bounding boxes. The YOLOv3-rotated model was superior in terms of total panicle count, while the R2CNN-upright model was more accurate for panicle stage classification. To demonstrate practical application, panicle counts were made weekly for an orchard of 994 trees, with a peak detection routine applied to document multiple flowering events.
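
The final sentence mentions a peak detection routine over weekly panicle counts; a minimal sketch of that idea using SciPy's find_peaks is given below, where the weekly counts and the prominence threshold are hypothetical placeholders rather than values from the study.

```python
import numpy as np
from scipy.signal import find_peaks

# Hypothetical weekly panicle counts for one orchard across a season.
weeks = np.arange(1, 16)
panicle_counts = np.array(
    [120, 340, 910, 1680, 1500, 820, 430, 380, 760, 1400, 1250, 640, 300, 180, 90])

# Peaks in the weekly series are treated as distinct flowering events;
# the prominence threshold filters out small week-to-week fluctuations.
peaks, _ = find_peaks(panicle_counts, prominence=300)
for week in weeks[peaks]:
    print(f"Flowering event peaking in week {week}")
```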


Author(s):  
Anand Koirala
Kerry Walsh
Zhenglin Wang
Nicholas Anderson

A pixel-based segmentation method was demonstrated to be confounded by developmental stage in the estimation of mango flowering. Categorization of panicles into three developmental stages was undertaken with single-stage and two-stage deep learning frameworks (YOLO and R2CNN), using either upright or rotated bounding boxes. For a validation image set and for total panicle count, the models MangoYOLO(-upright), MangoYOLO-rotated, YOLOv3-rotated, R2CNN(-rotated) and R2CNN-upright achieved: (i) RMSEs of 25.6, 16.0, 15.4, 25.8 and 32.3 panicles per tree image, (ii) mean average precision (mAP) scores of 72.2, 69.1, 65.0, 62.5 and 70.9%, and (iii) weighted F1 scores of 76.5, 76.1, 74.9, 74.0 and 82.0, respectively. For a test set of images from a different orchard and cultivar, taken with a different camera, the R2 between machine-vision and human counts of panicles per tree was 0.86, 0.80, 0.83, 0.81 and 0.76 for the same models, respectively. Thus, the models generalised well, but with no consistent benefit from the use of rotated over upright bounding boxes. While the YOLOv3-rotated model was superior in terms of total panicle count, the R2CNN-upright model was more accurate for panicle stage classification. To demonstrate practical application, panicle counts were made weekly for an orchard of 994 trees, with a peak detection routine applied to document multiple flowering events.
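
For the error statistics quoted above, the following minimal sketch shows how RMSE per tree image and the R2 between machine-vision and human panicle counts can be computed; the per-tree counts are placeholders, not data from the paper.

```python
import numpy as np

def rmse(pred, truth):
    """Root-mean-square error between predicted and reference counts."""
    pred, truth = np.asarray(pred, float), np.asarray(truth, float)
    return np.sqrt(np.mean((pred - truth) ** 2))

def r_squared(pred, truth):
    """Coefficient of determination of machine counts against human counts."""
    pred, truth = np.asarray(pred, float), np.asarray(truth, float)
    residual = np.sum((truth - pred) ** 2)
    total = np.sum((truth - truth.mean()) ** 2)
    return 1.0 - residual / total

# Placeholder per-tree panicle counts (machine vision vs. human), illustration only.
machine = [32, 54, 41, 75, 60]
human = [30, 50, 45, 80, 58]
print(round(rmse(machine, human), 1), round(r_squared(machine, human), 2))
```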


Author(s):  
Deepika Sivasankaran
Sai Seena P
Rajesh R
Madheswari Kanmani

Sketch based image retrieval (SBIR) is a sub-domain of Content Based Image Retrieval (CBIR) in which the user provides a drawing as input in order to retrieve images relevant to that drawing. The main challenge in SBIR is the subjectivity of the drawings, as retrieval relies entirely on the user's ability to express information in hand-drawn form. Since most existing SBIR models take a single sketch as input and retrieve photos based on that single sketch, our project aims to enable the detection and extraction of multiple sketches given together as a single input sketch image. Features are extracted from the individual sketches using deep learning architectures such as VGG16, and each sketch is classified to its type with supervised machine learning using Support Vector Machines. Based on the class obtained, photos are retrieved from the database using CVLib, an OpenCV-based library that finds the objects present in a photo. A ranking function based on the number of matching objects found in each photo orders the retrieved photos, which are then displayed to the user from the highest rank to the lowest. The system, consisting of VGG16 and SVM, provides 89% accuracy.
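
A minimal sketch of the described feature-extraction and classification stage is given below; it assumes the individual sketches have already been cropped and resized to 224x224 RGB arrays, and the placeholder data, SVM settings and variable names are illustrative rather than the authors' implementation.

```python
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from sklearn.svm import SVC

# VGG16 without its classifier head; global average pooling yields a
# 512-dimensional descriptor per sketch image.
extractor = VGG16(weights="imagenet", include_top=False, pooling="avg")

def extract_features(images):
    """images: float array of shape (N, 224, 224, 3) with pixel values 0-255."""
    return extractor.predict(preprocess_input(np.asarray(images)), verbose=0)

# Placeholder training data: cropped sketch images and their class labels.
train_images = np.random.rand(20, 224, 224, 3) * 255
train_labels = np.random.randint(0, 4, size=20)

svm = SVC(kernel="rbf")        # supervised classifier over the CNN features
svm.fit(extract_features(train_images), train_labels)
print(svm.predict(extract_features(train_images[:1])))
```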


2020
Vol 71 (7)
pp. 868-880
Author(s):  
Nguyen Hong-Quan
Nguyen Thuy-Binh
Tran Duc-Long
Le Thi-Lan

Along with the strong development of camera networks, video analysis systems have become increasingly popular and have been applied in various practical applications. In this paper, we focus on the person re-identification (person ReID) task, a crucial step in video analysis systems. The purpose of person ReID is to associate multiple images of a given person as they move through a non-overlapping camera network. Many efforts have been devoted to person ReID. However, most studies of person ReID deal only with well-aligned bounding boxes, which are detected manually and considered perfect inputs for person ReID. In fact, when building a fully automated person ReID system, the quality of the two preceding steps, person detection and tracking, may strongly affect person ReID performance. The contributions of this paper are two-fold. First, a unified framework for person ReID based on deep learning models is proposed. In this framework, a deep neural network for person detection is coupled with a deep-learning-based tracking method. In addition, features extracted from an improved ResNet architecture are proposed for person representation to achieve higher ReID accuracy. Second, our self-built dataset is introduced and employed for the evaluation of all three steps in the fully automated person ReID framework.
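
To illustrate the representation and matching step of such a framework, the sketch below uses a plain ImageNet-pretrained ResNet-50 as the descriptor backbone and cosine similarity for gallery ranking; the crop size, the unmodified backbone and the random input tensors are assumptions, not the paper's improved ResNet architecture or its dataset.

```python
import torch
import torch.nn as nn
import torchvision

# ResNet-50 backbone as a person-descriptor extractor (classifier head removed).
backbone = torchvision.models.resnet50(weights="IMAGENET1K_V1")
backbone.fc = nn.Identity()                  # keep the 2048-d pooled feature
backbone.eval()

@torch.no_grad()
def embed(crops):
    """crops: tensor of shape (N, 3, 256, 128) holding person crops."""
    feats = backbone(crops)
    return nn.functional.normalize(feats, dim=1)   # L2-normalised descriptors

query = embed(torch.rand(1, 3, 256, 128))          # placeholder tensors
gallery = embed(torch.rand(10, 3, 256, 128))
scores = query @ gallery.T                         # cosine similarities
ranking = scores.argsort(dim=1, descending=True)   # best gallery matches first
print(ranking[0][:5])
```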


2020
Author(s):  
Raniyaharini R
Madhumitha K
Mishaa S
Virajaravi R

2020
Author(s):  
Jinseok Lee

BACKGROUND: The coronavirus disease (COVID-19) has spread explosively worldwide since the beginning of 2020. According to a multinational consensus statement from the Fleischner Society, computed tomography (CT) can be used as a relevant screening tool owing to its higher sensitivity for detecting early pneumonic changes. However, physicians are extremely busy fighting COVID-19 in this era of worldwide crisis. Thus, it is crucial to accelerate the development of an artificial intelligence (AI) diagnostic tool to support physicians.
OBJECTIVE: We aimed to rapidly develop an AI technique to diagnose COVID-19 pneumonia on CT and differentiate it from non-COVID pneumonia and non-pneumonia diseases.
METHODS: A simple 2D deep learning framework, named fast-track COVID-19 classification network (FCONet), was developed to diagnose COVID-19 pneumonia based on a single chest CT image. FCONet was developed by transfer learning, using one of four state-of-the-art pre-trained deep learning models (VGG16, ResNet50, InceptionV3, or Xception) as a backbone. For training and testing of FCONet, we collected 3,993 chest CT images of patients with COVID-19 pneumonia, other pneumonia, and non-pneumonia diseases from Wonkwang University Hospital, Chonnam National University Hospital, and the Italian Society of Medical and Interventional Radiology public database. These CT images were split into training and testing sets at a ratio of 8:2. On the test dataset, the diagnostic performance for COVID-19 pneumonia was compared among the four pre-trained FCONet models. In addition, we tested the FCONet models on an additional external testing dataset extracted from embedded low-quality chest CT images of COVID-19 pneumonia in recently published papers.
RESULTS: Of the four pre-trained FCONet models, the ResNet50-based model showed excellent diagnostic performance (sensitivity 99.58%, specificity 100%, and accuracy 99.87%) and outperformed the other three pre-trained models on the testing dataset. On the additional external test dataset of low-quality CT images, the detection accuracy of the ResNet50 model was the highest (96.97%), followed by Xception, InceptionV3, and VGG16 (90.71%, 89.38%, and 87.12%, respectively).
CONCLUSIONS: FCONet, a simple 2D deep learning framework based on a single chest CT image, provides excellent diagnostic performance in detecting COVID-19 pneumonia. Based on our testing dataset, the ResNet50-based FCONet might be the best model, as it outperformed the other FCONet models based on VGG16, Xception, and InceptionV3.
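
The transfer-learning setup described in the METHODS section can be sketched as follows; the input size, head layers and training call are assumptions for illustration, not the published FCONet configuration, and ResNet50 is shown simply because it was the best-performing backbone.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# ImageNet-pretrained ResNet50 backbone with a new 3-class head
# (COVID-19 pneumonia / other pneumonia / non-pneumonia).
backbone = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, input_shape=(256, 256, 3))
backbone.trainable = False               # freeze pretrained layers initially

model = models.Sequential([
    backbone,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.5),
    layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, validation_split=0.2, epochs=10)
```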


2020
Vol 7 (1)
Author(s):  
Manan Binth Taj Noor
Nusrat Zerin Zenia
M Shamim Kaiser
Shamim Al Mamun
Mufti Mahmud

Abstract: Neuroimaging, in particular magnetic resonance imaging (MRI), has played an important role in understanding brain function and its disorders over the last couple of decades. These cutting-edge MRI scans, supported by high-performance computational tools and novel machine learning (ML) techniques, have opened up unprecedented possibilities for identifying neurological disorders. However, similarities in disease phenotypes make it very difficult to detect such disorders accurately from the acquired neuroimaging data. This article critically examines and compares the performance of existing deep learning (DL)-based methods for detecting neurological disorders, focusing on Alzheimer's disease, Parkinson's disease and schizophrenia, from MRI data acquired using different modalities, including functional and structural MRI. The comparative performance analysis of various DL architectures across different disorders and imaging modalities suggests that convolutional neural networks outperform other methods in detecting neurological disorders. Towards the end, a number of current research challenges are indicated and some possible future research directions are provided.
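
As a concrete example of the kind of classifier the review compares, the sketch below defines a small 3D convolutional network over MRI volumes; the volume size, layer widths and two-class output (e.g. patient vs. control) are assumptions for illustration, not an architecture from the surveyed studies.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Minimal 3D CNN classifier for single-channel MRI volumes.
model = models.Sequential([
    layers.Input(shape=(96, 96, 96, 1)),
    layers.Conv3D(16, 3, activation="relu"),
    layers.MaxPooling3D(2),
    layers.Conv3D(32, 3, activation="relu"),
    layers.MaxPooling3D(2),
    layers.Conv3D(64, 3, activation="relu"),
    layers.GlobalAveragePooling3D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```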

