Walnut Ripeness Detection Based on Coupling Information and Lightweight YOLOv4

Author(s):  
Kaixuan Cui ◽  
Shuchai Su ◽  
Jiawei Cai ◽  
Fengjun Chen

To enable rapid and accurate walnut ripeness detection on mobile terminals such as mobile phones, we propose a method based on coupling information and a lightweight YOLOv4. First, we collected 50 walnuts at each ripeness stage (unripe, mid-ripe, ripe, over-ripe) to determine the kernel oil content. Pearson correlation analysis and one-way analysis of variance (ANOVA) show that the division into ripeness stages reflects the change in kernel oil content, so it is feasible to estimate the kernel oil content by detecting walnut ripeness. Next, we perform ripeness detection with a lightweight YOLOv4. We adopt MobileNetV3 as the backbone feature extractor and replace traditional convolutions with depthwise separable convolutions. We design a parallel convolution structure with depthwise convolution stacking (PCSDCS) to reduce parameters and improve feature extraction, and a Gaussian Soft DIoU non-maximum suppression (GSDIoU-NMS) algorithm to enhance detection of walnuts in growth-intensive areas. The dataset used for model optimization contains 3600 images: 2880 in the training set, 320 in the validation set, and 400 in the test set. We adopt a multi-stage training strategy based on a dynamic learning rate and transfer learning to obtain the training weights. The lightweight YOLOv4 model achieves 94.05% mean average precision, 90.72% precision, 88.30% recall, an average detection speed of 76.92 FPS, and a weight file size of 38.14 MB. Compared with the Faster R-CNN, EfficientDet-D1, YOLOv3, and YOLOv4 models, the lightweight YOLOv4 improves mean average precision by 8.77%, 4.84%, 5.43%, and 0.06%, and detection speed by 74.60, 55.60, 38.83, and 46.63 FPS, respectively; its weight file is 84.4% smaller than that of the original YOLOv4. This paper provides a theoretical reference for rapid walnut ripeness detection and an exploration of model lightweighting.
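As an illustrative aside (not the authors' code), the PyTorch sketch below shows a depthwise separable convolution block of the kind the abstract describes; the channel sizes and Hardswish activation are assumptions, chosen to show the parameter saving over a standard 3x3 convolution.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise convolution followed by a pointwise (1x1) convolution."""
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_channels)
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   stride=stride, padding=1,
                                   groups=in_channels, bias=False)
        # Pointwise: 1x1 convolution mixes information across channels
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1,
                                   bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.Hardswish()  # MobileNetV3-style activation (assumption)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Parameter comparison against a standard 3x3 convolution with the same channels
std = nn.Conv2d(256, 256, kernel_size=3, padding=1, bias=False)
sep = DepthwiseSeparableConv(256, 256)
print(sum(p.numel() for p in std.parameters()))   # 589,824
print(sum(p.numel() for p in sep.parameters()))   # 68,352
```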

Author(s):  
Anthony Anggrawan ◽  
Azhari

Information searching based on a user's query, intended to find the documents the user needs, is known as information retrieval. This research uses the Vector Space Model to determine the similarity percentage of each student's assignment, implemented with PHP and a MySQL database. The result is a ranking of documents by their similarity to the query, with a mean average precision of 0.874, which indicates how closely the application agrees with expert judgment; this value was obtained from an evaluation with 5 queries compared against 25 sample documents. The more assignments that must be compared, the longer the similarity computation takes; the run time depends on the number of submitted assignments.
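As an illustrative aside, the paper implements the Vector Space Model in PHP with MySQL; the Python sketch below shows the same ranking idea with TF-IDF vectors and cosine similarity. The corpus and query are made up for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical corpus of student assignments and a query document
assignments = [
    "sorting algorithms compare bubble sort and quicksort performance",
    "database normalization reduces redundancy in relational schemas",
    "quicksort partitions the array around a pivot element",
]
query = ["compare quicksort and bubble sort running time"]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(assignments)   # one TF-IDF vector per assignment
query_vector = vectorizer.transform(query)

# Cosine similarity between the query and every assignment, then rank descending
scores = cosine_similarity(query_vector, doc_vectors).ravel()
for rank, idx in enumerate(scores.argsort()[::-1], start=1):
    print(f"{rank}. document {idx}: similarity = {scores[idx]:.3f}")
```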


2018 ◽  
Vol 10 (1) ◽  
pp. 57-64 ◽  
Author(s):  
Rizqa Raaiqa Bintana ◽  
Chastine Fatichah ◽  
Diana Purwitasari

Community-based question answering (CQA) helps people find the information they need through a community. When users cannot find the information they need, they post a new question, which causes the CQA archive to grow with duplicated questions. It therefore becomes an important problem to find questions in the CQA archive that are semantically similar to a new question. In this study, we use a convolutional neural network for semantic sentence modeling to obtain representations of the content of the archived documents and the new question. For retrieving questions semantically similar to a new question (query) from the question-answer archive, the convolutional neural network method obtains a mean average precision of 0.422, whereas the vector space model, used as a baseline, obtains 0.282. Index Terms—community-based question answering, convolutional neural network, question retrieval
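Both this study and the previous one report mean average precision (MAP) over ranked retrieval results. As an illustrative aside (not from the paper), the sketch below computes MAP from per-query relevance lists.

```python
def average_precision(ranked_relevance):
    """Average precision for one query: ranked_relevance is a list of 0/1 flags
    marking whether the document at each rank is a true duplicate question."""
    hits, cum_precision = 0, 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            cum_precision += hits / rank   # precision at this relevant rank
    return cum_precision / hits if hits else 0.0

def mean_average_precision(all_queries):
    return sum(average_precision(q) for q in all_queries) / len(all_queries)

# Toy example: two queries, relevance of the top-5 retrieved questions
queries = [
    [1, 0, 1, 0, 0],   # relevant at ranks 1 and 3 -> AP = 0.833
    [0, 1, 0, 0, 1],   # relevant at ranks 2 and 5 -> AP = 0.450
]
print(mean_average_precision(queries))   # about 0.642
```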


Crop Science ◽  
1963 ◽  
Vol 3 (4) ◽  
pp. 354-355 ◽  
Author(s):  
D. E. Alexander ◽  
R. D. Seif

2021 ◽  
pp. 1-11
Author(s):  
Tingting Zhao ◽  
Xiaoli Yi ◽  
Zhiyong Zeng ◽  
Tao Feng

YTNR (Yunnan Tongbiguan Nature Reserve) is located in the westernmost part of China's tropical region and is the only area in China with the tropical biota of the Irrawaddy River system; the reserve has abundant tropical flora and fauna. To enable real-time detection of wild animals in this area, this paper proposes an improved YOLO (You Only Look Once) network. The original YOLO model achieves high detection accuracy, but its complex structure prevents fast detection on a CPU platform. The lightweight MobileNet is therefore introduced to replace the backbone feature extraction network in YOLO, enabling real-time detection on the CPU. Because wild animal image data are difficult to collect, the research team deployed 50 high-definition cameras in the study area and observed continuously for more than 1,000 hours. In the end, 1,410 wildlife images collected in the field and 1,577 wildlife images from the internet, combined with manual annotation by domain experts, were used to construct the research dataset. Transfer learning is also introduced to address the difficulty of fitting the network with insufficient training data. The experimental results show that our model, trained on a training set of 2,419 animal images, reaches a mean average precision of 93.6% and 3.8 FPS (frames per second) on the CPU. Compared with YOLO, the mean average precision is increased by 7.7% and the FPS by 3.
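As an illustrative aside (not the authors' code), the sketch below shows the backbone replacement and transfer-learning idea: a pretrained MobileNet feature extractor is frozen and only a small detection head is trained. It assumes torchvision 0.13 or later; the head, class count, and anchor count are hypothetical.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained MobileNetV2 and keep only its feature extractor
backbone = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT).features

# Freeze the backbone so the limited wildlife data only trains the new head
for p in backbone.parameters():
    p.requires_grad = False

# Hypothetical lightweight detection head on top of the 1280-channel feature map
num_classes = 5            # illustrative number of animal classes
num_anchors = 3
head = nn.Conv2d(1280, num_anchors * (5 + num_classes), kernel_size=1)

model = nn.Sequential(backbone, head)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

x = torch.randn(1, 3, 416, 416)   # one dummy input image
out = model(x)                     # raw per-cell predictions
print(out.shape)                   # torch.Size([1, 30, 13, 13])
```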


Electronics ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 197
Author(s):  
Meng-ting Fang ◽  
Zhong-ju Chen ◽  
Krzysztof Przystupa ◽  
Tao Li ◽  
Michal Majka ◽  
...  

Examinations are a way to select talent, and a sound invigilation strategy can improve their fairness. To realize automatic detection of abnormal behavior in the examination room, a method based on an improved YOLOv3 (the third version of the You Only Look Once algorithm) is proposed. The YOLOv3 algorithm is improved by using the K-Means algorithm, GIoU loss, focal loss, and Darknet32. In addition, a frame-alternate dual-thread method is used to optimize the detection process. The results show that the improved YOLOv3 algorithm improves both detection accuracy and detection speed, and the frame-alternate dual-thread method greatly increases detection speed. The mean Average Precision (mAP) of the improved YOLOv3 algorithm on the test set reaches 88.53%, and the detection speed reaches 42 Frames Per Second (FPS) with the frame-alternate dual-thread method. The results provide a reference for automated invigilation.
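As an illustrative aside (not the authors' code), the NumPy sketch below shows the K-Means anchor selection step using the 1 − IoU distance commonly used with YOLO; the box sizes and anchor count are made up.

```python
import numpy as np

def iou_wh(boxes, anchors):
    """IoU between (w, h) boxes and (w, h) anchors, both centered at the origin."""
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = boxes[:, 0:1] * boxes[:, 1:2] + (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster labeled box sizes with the 1 - IoU distance used by YOLO."""
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)    # nearest anchor by IoU
        for j in range(k):
            if np.any(assign == j):
                anchors[j] = boxes[assign == j].mean(axis=0)  # move anchor to cluster mean
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]

# Toy example: random (width, height) pairs standing in for labeled behavior boxes
boxes = np.abs(np.random.default_rng(1).normal(loc=[80, 160], scale=[20, 40], size=(500, 2)))
print(kmeans_anchors(boxes, k=6))
```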


Entropy ◽  
2021 ◽  
Vol 23 (11) ◽  
pp. 1507
Author(s):  
Feiyu Zhang ◽  
Luyang Zhang ◽  
Hongxiang Chen ◽  
Jiangjian Xie

Deep convolutional neural networks (DCNNs) have achieved breakthrough performance in bird species identification from spectrograms of bird vocalizations. To address the class imbalance of the bird vocalization dataset, a single-feature identification model (SFIM) with residual blocks and a modified, weighted cross-entropy loss was proposed. To further improve identification accuracy, two multi-channel fusion methods were built from three SFIMs: one fuses the outputs of the feature extraction parts of the three SFIMs (feature fusion mode), the other fuses the outputs of their classifiers (result fusion mode). The SFIMs were trained on three different kinds of spectrograms, calculated with the short-time Fourier transform, the mel-frequency cepstrum transform, and the chirplet transform, respectively. To compensate for the huge number of trainable model parameters, transfer learning was used in the multi-channel models. Using our own vocalization dataset as a sample set, the result fusion mode model outperforms the other proposed models, with a best mean average precision (MAP) of 0.914. Comparing three spectrogram durations, 100 ms, 300 ms, and 500 ms, the results show that 300 ms is best for our dataset; the duration should be chosen based on the duration distribution of bird syllables. With the BirdCLEF2019 training dataset, the highest classification mean average precision (cmAP) reached 0.135, which indicates that the proposed model has a certain generalization ability.
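As an illustrative aside (not the authors' code), the sketch below shows one common way to weight a cross-entropy loss against class imbalance; the per-class counts are hypothetical, and the paper's exact modification may differ.

```python
import torch
import torch.nn as nn

# Hypothetical per-class sample counts for an imbalanced bird vocalization set
class_counts = torch.tensor([1200., 300., 75., 40.])

# Weight each class inversely to its frequency so rare species contribute more
weights = class_counts.sum() / (len(class_counts) * class_counts)
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 4)            # batch of 8 spectrogram embeddings, 4 species
labels = torch.randint(0, 4, (8,))
loss = criterion(logits, labels)
print(weights, loss.item())
```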


2021 ◽  
Vol 13 (22) ◽  
pp. 4675
Author(s):  
William Yamada ◽  
Wei Zhao ◽  
Matthew Digman

An automatic method of obtaining geographic coordinates of bales using monovision un-crewed aerial vehicle imagery was developed utilizing a data set of 300 images with a 20-megapixel resolution containing a total of 783 labeled bales of corn stover and soybean stubble. The relative performance of image processing with Otsu’s segmentation, you only look once version three (YOLOv3), and region-based convolutional neural networks was assessed. As a result, the best option in terms of accuracy and speed was determined to be YOLOv3, with 80% precision, 99% recall, 89% F1 score, 97% mean average precision, and a 0.38 s inference time. Next, the impact of using lower-cost cameras was evaluated by reducing image quality to one megapixel. The lower-resolution images resulted in decreased performance, with 79% precision, 97% recall, 88% F1 score, 96% mean average precision, and 0.40 s inference time. Finally, the output of the YOLOv3 trained model, density-based spatial clustering, photogrammetry, and map projection were utilized to predict the geocoordinates of the bales with a root mean squared error of 2.41 m.
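As an illustrative aside (not the authors' code), the sketch below shows the final clustering and error step: detections of the same bale projected from overlapping images are grouped with DBSCAN, and the resulting coordinates are scored by root mean squared error. All coordinates are made up.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical bale positions (metres, local map projection) from overlapping images
detections = np.array([
    [12.1, 40.2], [12.4, 39.8], [11.9, 40.5],   # repeated sightings of bale A
    [85.0, 17.3], [85.4, 17.1],                  # repeated sightings of bale B
])

# Group detections of the same physical bale: points within 2 m of each other
clusters = DBSCAN(eps=2.0, min_samples=1).fit_predict(detections)
bale_coords = np.array([detections[clusters == c].mean(axis=0)
                        for c in np.unique(clusters)])

# RMSE against (hypothetical) surveyed ground-truth coordinates
truth = np.array([[12.0, 40.0], [85.0, 17.0]])
rmse = np.sqrt(np.mean(np.sum((bale_coords - truth) ** 2, axis=1)))
print(bale_coords, rmse)
```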


2021 ◽  
Author(s):  
Komuravelli Prashanth ◽  
Kalidas Yeturu

There are millions of scanned documents worldwide in around 4 thousand languages. Searching for information in a scanned document requires a text layer to be available and indexed. Preparing a text layer requires recognizing character and sub-region patterns and associating them with a human interpretation. Developing an optical character recognition (OCR) system for each and every language is very difficult, if not impossible. There is a strong need for systems that build on top of existing OCR technologies by learning from them and unifying a disparate multitude of systems. In this regard, we propose an algorithm that leverages the fact that we are dealing with scanned documents of handwritten text regions from diverse domains and language settings. We observe that the text regions have consistent bounding box sizes, and any large-font or tiny-font scenarios can be handled in preprocessing or postprocessing phases. The image subregions in scanned text documents are smaller than the subregions formed by common objects in general-purpose images. We propose and validate the hypothesis that a much simpler convolutional neural network (CNN) with very few layers and a small number of filters can be used to detect individual subregion classes. To detect several hundred classes, multiple such simple models can be pooled to operate simultaneously on a document. The advantage of pools of subregion-specific models is the ability to add hundreds of new classes incrementally over time without disturbing previously trained models in a continual learning scenario. Such an approach has a distinctive advantage over a single monolithic model, where subregion classes share and interfere through a bulky common neural network. We report an efficient algorithm for building subregion-specific lightweight CNN models. The training data for the proposed CNN require engineering synthetic data points that consider both the pattern of interest and non-patterns. We propose and validate the hypothesis that an image canvas containing an appropriate amount of pattern and non-pattern, together with a mean squared error loss function, can influence the filters learned from the data. The CNN thus trained can identify the character object in the presence of several other objects on a generalized test image of a scanned document. In this setting, a key observation is that learning a filter in a CNN depends not only on the abundance of patterns of interest but also on the presence of a non-pattern context. Our experiments led to the following observations: (i) a pattern cannot be over-expressed in isolation, (ii) a pattern cannot be under-expressed either, (iii) a non-pattern can be salt-and-pepper noise, and (iv) it is sufficient to provide a non-pattern context to a modest representation of a pattern to obtain strong individual sub-region class models. We have carried out studies and report mean average precision scores on various datasets: (1) MNIST digits (95.77), (2) EMNIST capital letters (81.26), (3) EMNIST small letters (73.32), (4) Kannada digits (95.77), (5) Kannada letters (90.34), (6) Devanagari letters (100), (7) Telugu words (93.20), and (8) Devanagari words (93.20); on medical prescriptions we also observed mean average precision above 90%.
The algorithm serves as a kernel for the automatic annotation of digital documents in diverse scenarios, such as annotating ancient manuscripts and handwritten health records.
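As an illustrative aside (not the authors' code), the sketch below shows the kind of few-layer, few-filter CNN the abstract argues is sufficient for a single subregion class; the patch size, filter counts, and binary pattern/non-pattern output are assumptions.

```python
import torch
import torch.nn as nn

class SubregionCNN(nn.Module):
    """Very small binary classifier: does this canvas patch contain the target glyph?"""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(16 * 8 * 8, 1)   # assumes 32x32 input patches

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))   # one logit: pattern vs non-pattern

# One lightweight model per character class; hundreds can be pooled over a page
model = SubregionCNN()
print(sum(p.numel() for p in model.parameters()))   # about 2,300 parameters
patch = torch.randn(4, 1, 32, 32)                   # batch of 32x32 grayscale patches
print(model(patch).shape)                            # torch.Size([4, 1])
```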


2019 ◽  
Vol 2019 ◽  
pp. 1-10
Author(s):  
Jintao Wang ◽  
Mingxia Shen ◽  
Longshen Liu ◽  
Yi Xu ◽  
Cedric Okinda

Digestive diseases are one of the common broiler diseases that significantly affect production and animal welfare in broiler breeding. Droppings examination and observation are the most precise techniques to detect the occurrence of digestive disease infections in birds. This study proposes an automated broiler digestive disease detector based on a deep Convolutional Neural Network model to classify fine-grained abnormal broiler droppings images as normal and abnormal (shape, color, water content, and shape&water). Droppings images were collected from 10,000 25-35-day-old Ross broiler birds reared in multilayer cages with automatic droppings conveyor belts. For comparative purposes, Faster R-CNN and YOLO-V3 deep Convolutional Neural Networks were developed. The performance of YOLO-V3 was improved by optimizing the anchor box. Faster R-CNN achieved 99.1% recall and 93.3% mean average precision, while YOLO-V3 achieved 88.7% recall and 84.3% mean average precision on the testing data set. The proposed detector can provide technical support for the detection of digestive diseases in broiler production by automatically and nonintrusively recognizing and classifying chicken droppings.
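As an illustrative aside (not from the paper), the sketch below shows how recall at an IoU threshold, one of the reported metrics, is computed by matching detections to labeled boxes; the boxes are made up.

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def recall_at_iou(ground_truth, detections, threshold=0.5):
    """Fraction of labeled droppings matched by at least one detection."""
    matched = sum(
        any(iou(gt, det) >= threshold for det in detections) for gt in ground_truth
    )
    return matched / len(ground_truth)

# Toy boxes standing in for labeled abnormal droppings and model detections
gt = [[10, 10, 50, 50], [100, 100, 160, 150]]
det = [[12, 8, 52, 48], [300, 300, 340, 340]]
print(recall_at_iou(gt, det))   # 0.5: one of two droppings recovered
```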

