average precision
Recently Published Documents



Author(s):  
N. Shobha Rani ◽  
Manohar N. ◽  
Hariprasad M. ◽  
Pushpa B. R.

Automated reading of handwritten Kannada documents is highly challenging due to the presence of vowels, consonants and their modifiers. The variable nature of handwriting styles aggravates the complexity of machine-based reading of handwritten vowels and consonants. In this paper, we investigate the design of a deep convolutional network with capsule and routing layers to efficiently recognize Kannada handwritten characters. The capsule network architecture consists of an input layer, two convolution layers, a primary capsule layer and routing capsule layers, followed by a tri-level dense convolution layer and an output layer. For experimentation, data were collected from more than 100 users to create about 7769 training samples comprising 49 classes. Test samples for all 49 classes were collected separately from 3 to 5 users, creating a total of 245 samples of novel patterns. Performance evaluation shows a loss of 0.66% in the classification process; for 43 classes, a precision of 100% is achieved with an accuracy of 99%. For the remaining 6 classes, an average accuracy of 95% is achieved with an average precision of 89%.


2022 ◽  
Vol 9 (1) ◽  
Author(s):  
Rasha Alshehhi ◽  
Claus Gebhardt

Martian dust plays a crucial role in the meteorology and climate of the Martian atmosphere. It heats the atmosphere, enhances the atmospheric general circulation, and affects spacecraft instruments and operations. Accordingly, studying dust is also essential for future human exploration. In this work, we present a method for the deep-learning-based detection of the areal extent of dust storms in Mars satellite imagery. We use a mask regional convolutional neural network, consisting of a regional-proposal network and a mask network. We apply the detection method to Mars Daily Global Maps from the Mars Global Surveyor Mars Orbiter Camera. We use center coordinates of dust storms from the eight-year Mars Dust Activity Database as ground truth to train and validate the method. The performance of the regional network is evaluated by the average precision score at 50% overlap (mAP50), which is around 62.1%.
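The "50% overlap" criterion above is usually expressed as an intersection-over-union (IoU) threshold: a predicted region counts as a correct detection only if its IoU with a ground-truth region is at least 0.5. The following is a minimal sketch for axis-aligned boxes, not the authors' evaluation code; the function names `box_iou` and `is_true_positive` and the `(x1, y1, x2, y2)` box format are assumptions made for illustration.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes get zero intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, truth, threshold=0.5):
    """A prediction matches a ground-truth box if IoU >= threshold."""
    return box_iou(pred, truth) >= threshold
```

For example, a prediction covering exactly half of a ground-truth box (and nothing outside it) has an IoU of 0.5 and just passes the threshold.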


Author(s):  
Kaixuan Cui ◽  
Shuchai Su ◽  
Jiawei Cai ◽  
Fengjun Chen

To realize rapid and accurate ripeness detection for walnuts on mobile terminals such as mobile phones, we propose a method based on coupling information and a lightweight YOLOv4. First, we collected 50 walnuts at each ripeness level (Unripe, Mid-ripe, Ripe, Over-ripe) to determine the kernel oil content. Pearson correlation analysis and one-way analysis of variance (ANOVA) show that the division of walnut ripeness reflects the change in kernel oil content, so it is feasible to estimate the kernel oil content by detecting walnut ripeness. Next, we achieve ripeness detection based on the lightweight YOLOv4. We adopt MobileNetV3 as the backbone feature extractor and use depthwise separable convolution to replace the traditional convolution. We design a parallel convolution structure with depthwise convolution stacking (PCSDCS) to reduce parameters and improve feature extraction ability. To enhance the model’s detection ability for walnuts in densely growing areas, we design a Gaussian Soft DIoU non-maximum suppression (GSDIoU-NMS) algorithm. The dataset used for model optimization contains 3600 images, of which 2880 are in the training set, 320 in the validation set, and 400 in the test set. We adopt a multi-training strategy based on dynamic learning rate and transfer learning to obtain the training weights. The lightweight YOLOv4 model achieves 94.05%, 90.72%, 88.30%, 76.92 FPS, and 38.14 MB in mean average precision, precision, recall, average detection speed, and weight capacity, respectively. Compared with the Faster R-CNN, EfficientDet-D1, YOLOv3, and YOLOv4 models, the lightweight YOLOv4 model improves mean average precision by 8.77%, 4.84%, 5.43%, and 0.06%, and detection speed by 74.60 FPS, 55.60 FPS, 38.83 FPS, and 46.63 FPS, respectively. The lightweight YOLOv4 is also 84.4% smaller than the original YOLOv4 model in terms of weight capacity.
This paper provides a theoretical reference for rapid walnut ripeness detection and an exploration of lightweight model design.
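To illustrate why replacing a traditional convolution with a depthwise separable one reduces parameters, the sketch below counts weights for both layer types. It is a generic calculation that ignores bias terms, not a description of the authors' network; the kernel size and channel counts in the usage note are hypothetical.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def reduction_ratio(k, c_in, c_out):
    """Separable / standard weight ratio, which equals 1/c_out + 1/k**2."""
    return depthwise_separable_params(k, c_in, c_out) / standard_conv_params(k, c_in, c_out)
```

With a hypothetical 3 x 3 layer mapping 128 to 256 channels, the standard convolution has 294,912 weights and the separable version has 33,920, roughly 11.5% of the original.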


2022 ◽  
Vol 14 (2) ◽  
pp. 254
Author(s):  
Minjing Shi ◽  
Pengfei He ◽  
Yuli Shi

In this paper, we propose a deep learning-based model to detect extratropical cyclones (ETCs) of the northern hemisphere, while developing a novel workflow for processing images and generating labels for ETCs. We first labeled the cyclone center by adapting an approach from Bonfanti et al. (2017) and set up criteria for labeling ETCs in three categories: developing, mature, and declining stages. We then established a framework for labeling and preprocessing the images in our dataset. Once the images and labels were ready to serve as inputs, an object detection model was built with the Single Shot Detector (SSD) and adjusted to fit the format of the dataset. We trained and evaluated our model with our labeled dataset in two settings (binary and multiclass classification), while keeping a record of the results. We found that the model achieves relatively high performance in detecting mature-stage ETCs (mean Average Precision of 86.64%), and an acceptable result for detecting ETCs of all three categories (mean Average Precision of 79.34%). The single-shot detector model can succeed in detecting ETCs of different stages, and it has demonstrated great potential for future applications of ETC detection in other relevant settings.


Author(s):  
Gioele Ciaparrone ◽  
Leonardo Chiariglione ◽  
Roberto Tagliaferri

Face-based video retrieval (FBVR) is the task of retrieving videos that contain the same face shown in the query image. In this article, we present the first end-to-end FBVR pipeline that is able to operate on large datasets of unconstrained, multi-shot, multi-person videos. We adapt an existing audiovisual recognition dataset to the task of FBVR and use it to evaluate our proposed pipeline. We compare a number of deep learning models for shot detection, face detection, and face feature extraction as part of our pipeline on a validation dataset made of more than 4000 videos. We obtain 97.25% mean average precision on an independent test set composed of more than 1000 videos. The pipeline is able to extract features from videos at approximately 7 times real-time speed, and it is able to perform a query on thousands of videos in less than 0.5 s.


2022 ◽  
Vol 12 (1) ◽  
pp. 0-0

In this paper, the authors propose and adapt a new concept-based approach to query expansion in the context of Arabic information retrieval. The purpose is to represent the query by a set of weighted concepts in order to better identify the user's information need. First, concepts are extracted from the initially retrieved documents by the Pseudo-Relevance Feedback method; they are then integrated into a semantic weighted tree in order to detect more information contained in the related concepts connected by semantic relations to the primary concepts. The authors use the “Arabic WordNet” as a resource to extract and disambiguate concepts and to build the semantic tree. Experimental results demonstrate an improvement of about 10% in MAP (Mean Average Precision), using the open-source Lucene as the IR system on a collection formed from Arabic BBC news.
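For ranked retrieval, MAP is commonly computed by averaging, over queries, the precision measured at each rank where a relevant document appears. The sketch below shows that standard definition under the assumption that each query's result list is given as 0/1 relevance flags in rank order; the function names are illustrative and this is not the evaluation code used in the paper.

```python
def average_precision(ranked_relevance, num_relevant=None):
    """AP for one query from 0/1 relevance flags listed in rank order.

    num_relevant is the total number of relevant documents for the query;
    if omitted, only the relevant documents that were retrieved are counted.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    total = num_relevant if num_relevant is not None else hits
    return precision_sum / total if total else 0.0


def mean_average_precision(runs):
    """Mean of per-query AP values; runs is a list of relevance lists."""
    return sum(average_precision(r) for r in runs) / len(runs)
```

For a result list with relevant documents at ranks 1 and 3, the AP is (1/1 + 2/3) / 2, so moving relevant documents toward the top of the ranking raises the score.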


2022 ◽  
Vol 11 (01) ◽  
pp. 22-26
Author(s):  
Hui Xiang ◽  
Junyan Han ◽  
Hanqing Wang ◽  
Hao Li ◽  
Shangqing Li ◽  
...  

To address the low detection accuracy and poor recognition of small-scale targets in traditional vehicle and pedestrian detection methods, a vehicle and pedestrian detection method based on an improved YOLOv4-Tiny is proposed. On the basis of YOLOv4-Tiny, an 8-fold downsampling feature layer was added for feature fusion, the PANet structure was used to perform bidirectional fusion of the deep and shallow features from the output feature layers of the backbone network, and a detection head for small targets was added. The results show that the mean average precision of the improved method reaches 85.93%, and the detection performance is similar to that of YOLOv4. Compared with YOLOv4-Tiny, the mean average precision of the improved method is increased by 24.45%, and the detection speed reaches 67.83 FPS, which means that the detection performance is significantly improved and can meet real-time requirements.


Sensors ◽  
2021 ◽  
Vol 22 (1) ◽  
pp. 298
Author(s):  
César Melo ◽  
Sandra Dixe ◽  
Jaime C. Fonseca ◽  
António H. J. Moreira ◽  
João Borges

COVID-19 was responsible for devastating social, economic, and political effects all over the world. Although the restrictions imposed by health authorities provided relief and helped return society to normal life, it is imperative to monitor people’s behavior and risk factors to keep virus transmission levels as low as possible. This article focuses on the application of deep learning algorithms to detect the presence of masks on people in public spaces (using RGB cameras), as well as the detection of the caruncle in the human eye area to make an accurate measurement of body temperature (using thermal cameras). For this task, synthetic data generation techniques were used to create hybrid datasets from public ones to train state-of-the-art algorithms, such as the YOLOv5 object detector and a keypoint detector based on ResNet-50. For RGB mask detection, YOLOv5 achieved an average precision of 82.4%. For thermal masks, glasses, and caruncle detection, YOLOv5 and the keypoint detector achieved an average precision of 96.65% and 78.7%, respectively. Moreover, the RGB and thermal datasets were made publicly available.


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Chun-ke Zhang ◽  
Lu Liu ◽  
Wen-jie Wu ◽  
Yi-qin Wang ◽  
Hai-xia Yan ◽  
...  

Background. Cardiovascular diseases have always been the most common cause of morbidity and mortality worldwide. Health monitoring of high-risk and suspected patients is essential. Currently, invasive coronary angiography is still the most direct and accurate method of determining the severity of coronary artery lesions, but it may not be the optimal clinical choice for suspected patients who have clinical symptoms of coronary heart disease (CHD) such as chest pain but no coronary artery lesion. Modern medical research indicates that radial pulse waves contain substantial pathophysiologic information about the cardiovascular and circulatory systems; therefore, analysis of these waves could be a noninvasive technique for assessing cardiovascular disease. Objective. The objective of this study was to analyze the radial pulse wave to construct models for assessing the extent of coronary artery lesions based on pulse features and to investigate the latent value of noninvasive pulse-wave-based detection technology in the evaluation of cardiovascular disease, so as to promote the development of wearable devices and mobile medicine. Method. This study included 529 patients suspected of CHD who had undergone coronary angiography. Patients were sorted into a control group with no lesions, a 1 or 2 lesion group, and a multiple (3 or more) lesion group as determined by coronary angiography. The linear time-domain features and the nonlinear multiscale entropy features of their radial pulse wave signals were compared, and these features were used to construct models for identifying the range of coronary artery lesions using the k-nearest neighbor (KNN), decision tree (DT), and random forest (RF) machine learning algorithms. The average precision of these algorithms was then compared. Results.
(1) Compared with the control group, the group with 1 or 2 lesions had increases in the radial pulse wave time-domain features H2/H1, H3/H1, and W2 (P < 0.05), whereas the group with multiple lesions had decreases in MSE1, MSE2, MSE3, MSE4, and MSE5 (P < 0.05). (2) Compared with the 1 or 2 lesion group, the multiple lesion group had increases in T1/T (P < 0.05) and decreases in T and W1 (P < 0.05). (3) The RF model for identifying numbers of coronary artery lesions had a higher average precision than the models built with KNN or DT. Furthermore, the average precision of the model was highest (80.98%) when both time-domain features and multiscale entropy features of radial pulse signals were used to construct the model. Conclusion. Pulse wave signals can identify the range of coronary artery lesions with acceptable accuracy; this result is promising for assessing the severity of coronary artery lesions. The technique could be used to develop mobile medical treatments or remote home monitoring systems for patients suspected of, or at high risk of, coronary atherosclerotic heart disease.


2021 ◽  
pp. 1-15
Author(s):  
Gang Sha ◽  
Junsheng Wu ◽  
Bin Yu

Purpose: Reading spinal CT (Computed Tomography) images is very important in the diagnosis of spondylosis, but it is time-consuming and prone to bias. In this paper, we propose a framework based on Faster-RCNN to improve the detection performance for three spinal fracture lesions: cfracture (cervical fracture), tfracture (thoracic fracture) and lfracture (lumbar fracture). Methods: First, we use ResNet50 to replace VGG16 as the backbone network in Faster-RCNN to increase the depth of the training network. Second, we utilize soft-NMS (Non-Maximum Suppression) instead of NMS to avoid missed detection of overlapping lesions. Third, we simplify the RPN (Region Proposal Network) to accelerate training and reduce missed detections. Finally, we modify the classifier layer in Faster-RCNN and choose an appropriate length-width ratio by changing anchor sizes in the sliding window, then adopt a multi-scale strategy in training to improve efficiency and accuracy. Results: The experimental results show that the proposed scheme performs well: mAP (mean average precision) is 90.6%, IoU (Intersection over Union) is 88.5, and detection time is 0.053 seconds per CT image, which means our proposed method can accurately detect spinal fracture lesions. Conclusion: Our proposed method can provide assistance and scientific references for both doctors and patients in clinical practice.
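The reason soft-NMS helps with overlapping lesions is that it lowers the scores of boxes that overlap a higher-scoring box instead of deleting them outright. Below is a minimal sketch of the commonly used Gaussian variant; the abstract does not say which variant or parameters the authors used, so the decay form, `sigma`, `score_threshold`, and all function names here are assumptions.

```python
import math


def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms_gaussian(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Return (index, rescored confidence) pairs in selection order."""
    scores = list(scores)  # copy so the caller's list is not modified
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        # Select the highest-scoring box still under consideration.
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        keep.append((best, scores[best]))
        survivors = []
        for i in remaining:
            # Gaussian decay: heavier overlap means a stronger penalty.
            scores[i] *= math.exp(-(iou(boxes[best], boxes[i]) ** 2) / sigma)
            if scores[i] >= score_threshold:
                survivors.append(i)
        remaining = survivors
    return keep
```

With hard NMS, a box that fully overlaps a higher-scoring one would be discarded; here it is kept with a reduced score, so a genuinely separate but overlapping lesion can still be reported.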

