Automatic accounting of Baikal diatomic algae: approaches and prospects

Author(s):  
Konstantin A. Elshin ◽  
Elena I. Molchanova ◽  
Marina V. Usoltseva ◽  
Yelena V. Likhoshway

Using the TensorFlow Object Detection API, an approach to identifying and registering the Baikal diatom species Synedra acus subsp. radians was tested. A set of images was assembled and training was conducted. After 15,000 training iterations, the total value of the loss function reached 0.04, while both the classification accuracy and the bounding-box localization accuracy were 95%.

2021 ◽  
Vol 13 (21) ◽  
pp. 4291
Author(s):  
Luyang Zhang ◽  
Haitao Wang ◽  
Lingfeng Wang ◽  
Chunhong Pan ◽  
Qiang Liu ◽  
...  

Rotated object detection is an extension of object detection that uses an oriented bounding box instead of a general horizontal bounding box to define the object position. It is widely used in remote sensing images, scene text, and license plate recognition. Existing rotated object detection methods usually add an angle prediction channel to the bounding box prediction branch and use smooth L1 loss as the regression loss function. However, we argue that smooth L1 loss causes sudden changes in loss and slow convergence due to the angle convention of OpenCV (the rotation angle is defined as the angle between the horizontal line and the first side of the bounding box in the counter-clockwise direction), and this problem exists in most existing regression loss functions. To solve these problems, we propose a decoupling modulation mechanism to overcome sudden changes in loss. On this basis, we also propose a constraint mechanism whose purpose is to accelerate the convergence of the network and ensure optimization toward the ideal direction. In addition, the proposed decoupling modulation mechanism and constraint mechanism can be integrated into popular regression loss functions individually or together, which further improves the performance of the model and makes it converge faster. The experimental results show that our method achieves 75.2% performance on the aerial image dataset DOTA (OBB task) while saving more than 30% of computing resources. The method also achieves state-of-the-art performance on HRSC2016 while saving more than 40% of computing resources, which confirms the applicability of the approach.
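The angle discontinuity described above can be illustrated with a minimal sketch: under OpenCV's minAreaRect convention, two nearly identical rotated boxes can have parameter vectors that are far apart, so a plain smooth L1 regression loss spikes. (Illustrative pure-Python sketch; the box values are hypothetical, not taken from the paper.)

```python
def smooth_l1(pred, target, beta=1.0):
    """Element-wise smooth L1 (Huber) loss, summed over box parameters."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total

# Under OpenCV's minAreaRect convention the rotation angle lies in a
# limited range: a box at -89.5 deg is physically almost identical to
# a box at -0.5 deg with width and height swapped, yet their
# (cx, cy, w, h, theta) parameter vectors are far apart.
box_a = (50.0, 50.0, 20.0, 10.0, -89.5)   # nearly vertical box
box_b = (50.0, 50.0, 10.0, 20.0, -0.5)    # same shape, swapped sides

print(smooth_l1(box_a, box_b))  # large loss for two nearly identical boxes
```

This is the "sudden change in loss" the abstract refers to: a tiny physical rotation across the convention boundary produces a large jump in the regression target.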


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 2939
Author(s):  
Yong Hong ◽  
Jin Liu ◽  
Zahid Jahangir ◽  
Sheng He ◽  
Qing Zhang

This paper provides an efficient way of addressing the problem of detecting or estimating the 6-Dimensional (6D) pose of objects from an RGB image. A quaternion is used to define an object's three-dimensional pose, but the pose represented by q and the pose represented by -q are equivalent, while the L2 loss between them is very large. Therefore, we define a new quaternion pose loss function to solve this problem. Based on this, we designed a new convolutional neural network named Q-Net to estimate an object's pose. Since the quaternion output is a unit vector, a normalization layer is added in Q-Net to keep the pose output on a four-dimensional unit sphere. We propose a new algorithm, called the Bounding Box Equation, to obtain the 3D translation quickly and effectively from 2D bounding boxes. The algorithm assesses the 3D rotation (R) and 3D translation (t) from only one RGB image in an entirely new way. This method can upgrade any traditional 2D-box prediction algorithm to a 3D prediction model. We evaluated our model on the LineMod dataset, and experiments show that our methodology is effective and efficient in terms of L2 loss and computational time.
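The q/-q ambiguity noted above can be sketched as follows: the plain L2 distance between a quaternion and its negation is maximal even though both encode the same rotation, so a sign-invariant loss is needed. (This is one common formulation, shown for illustration; the paper's exact loss function may differ.)

```python
import math

def l2(q1, q2):
    """Plain Euclidean distance between two 4-vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q1, q2)))

def quat_pose_loss(q_pred, q_true):
    """Sign-invariant quaternion distance: q and -q encode the same
    rotation, so take the smaller of the two L2 distances."""
    q_true_neg = tuple(-x for x in q_true)
    return min(l2(q_pred, q_true), l2(q_pred, q_true_neg))

q = (0.0, 0.0, 0.0, 1.0)        # unit quaternion (identity rotation)
q_neg = tuple(-x for x in q)    # same rotation, opposite sign

print(l2(q, q_neg))             # 2.0: plain L2 is maximal
print(quat_pose_loss(q, q_neg)) # 0.0: sign-invariant loss vanishes
```

A normalization layer, as described in the abstract, would keep the predicted 4-vector on the unit sphere before this loss is applied.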


Author(s):  
Donggeun Kim ◽  
San Kim ◽  
Siheon Jeong ◽  
Ji‐Wan Ham ◽  
Seho Son ◽  
...  

Author(s):  
Hui-Shen Yuan ◽  
Si-Bao Chen ◽  
Bin Luo ◽  
Hao Huang ◽  
Qiang Li

2021 ◽  
Vol 13 (22) ◽  
pp. 4517
Author(s):  
Falin Wu ◽  
Jiaqi He ◽  
Guopeng Zhou ◽  
Haolun Li ◽  
Yushuang Liu ◽  
...  

Object detection in remote sensing images plays an important role in both military and civilian remote sensing applications. Objects in remote sensing images differ from those in natural images: they exhibit scale diversity, arbitrary orientation, and dense arrangement, which makes detection difficult. For objects with a large aspect ratio that are oblique and densely arranged, using an oriented bounding box helps to avoid mistakenly discarding correct detection bounding boxes. The classic rotational region convolutional neural network (R2CNN) has advantages for text detection. However, R2CNN performs poorly on slender, arbitrarily oriented objects in remote sensing images, and its fault tolerance is low. To solve this problem, this paper proposes an improved R2CNN based on a double detection head structure and a three-point regression method, namely, TPR-R2CNN. The proposed network modifies the original R2CNN structure by applying a double fully connected (2-fc) detection head and classification fusion: one detection head is for classification and horizontal bounding box regression, and the other is for classification and oriented bounding box regression. The three-point regression method (TPR) is proposed for oriented bounding box regression; it determines the position of the oriented bounding box by regressing the coordinates of the center point and the first two vertices. The proposed network was validated on the DOTA-v1.5 and HRSC2016 datasets, where it achieved mean average precision (mAP) improvements of 3.90% and 15.27%, respectively, over feature pyramid network (FPN) baselines with a ResNet-50 backbone.
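The three-point idea can be sketched in a few lines: the center and the first two vertices fully determine an oriented rectangle, since the remaining corners follow by reflecting through the center. (Illustrative parameterization and values only; these are not the paper's exact regression targets.)

```python
import math

def obb_to_three_points(cx, cy, w, h, theta):
    """Represent an oriented box (center, width, height, angle in
    radians) by its center and first two vertices, in the spirit of
    the three-point regression idea."""
    c, s = math.cos(theta), math.sin(theta)
    v1 = (cx - w/2*c + h/2*s, cy - w/2*s - h/2*c)
    v2 = (cx + w/2*c + h/2*s, cy + w/2*s - h/2*c)
    return (cx, cy), v1, v2

def three_points_to_corners(center, v1, v2):
    """Recover all four corners: v3 and v4 are the reflections of
    v1 and v2 through the center."""
    cx, cy = center
    v3 = (2*cx - v1[0], 2*cy - v1[1])
    v4 = (2*cx - v2[0], 2*cy - v2[1])
    return v1, v2, v3, v4

# An axis-aligned 4x2 box centered at (10, 10):
center, v1, v2 = obb_to_three_points(10.0, 10.0, 4.0, 2.0, 0.0)
print(three_points_to_corners(center, v1, v2))
```

Regressing three points instead of (w, h, angle) sidesteps the angle-convention discontinuities discussed for rotated-box losses.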


Author(s):  
Yuqing Zhao ◽  
Jinlu Jia ◽  
Di Liu ◽  
Yurong Qian

Aerial image-based target detection suffers from low accuracy on multiscale targets, slow detection speed, and missed and false detections. To address these problems, this paper proposes a detection algorithm based on an improved You Only Look Once (YOLO)v3 network architecture, designed for model efficiency, and applies it to multiscale image-based target detection. First, the K-means clustering algorithm is used to cluster an aerial dataset and optimize the anchor box parameters of the network to improve the effectiveness of target detection. Second, the feature extraction method of the algorithm is improved, and a feature fusion method is used to establish a multiscale (large-, medium-, and small-scale) prediction layer, which mitigates the loss of small-target information in deep networks and improves the detection accuracy of the algorithm. Finally, label regularization is applied to the predicted values, the generalized intersection over union (GIoU) is used as the bounding box regression loss function, and the focal loss function is integrated into the bounding box confidence loss function, which not only improves the target detection accuracy but also effectively reduces the false detection and missed target rates of the algorithm. An experimental comparison on the RSOD and NWPU VHR-10 aerial datasets shows that the detection performance of high-efficiency YOLO (HE-YOLO) is significantly better than that of YOLOv3, with average detection accuracies increased by 8.92% and 7.79% on the two datasets, respectively. The algorithm not only shows better detection performance for multiscale targets but also reduces the missed target and false detection rates, and it has good robustness and generalizability.
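The GIoU regression loss mentioned above extends IoU with a penalty based on the smallest enclosing box, so even non-overlapping boxes receive a useful gradient signal. A minimal axis-aligned sketch (the standard GIoU definition, not code from the paper):

```python
def giou(box1, box2):
    """Generalized IoU for axis-aligned boxes given as (x1, y1, x2, y2).
    GIoU = IoU - |C minus (A union B)| / |C|, where C is the smallest
    box enclosing both; the regression loss is then 1 - GIoU."""
    ax1, ay1, ax2, ay2 = box1
    bx1, by1, bx2, by2 = box2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union
    # smallest enclosing box C
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c_area = cw * ch
    return iou - (c_area - union) / c_area

print(giou((0, 0, 2, 2), (0, 0, 2, 2)))  # 1.0: perfect overlap
print(giou((0, 0, 1, 1), (2, 2, 3, 3)))  # negative: disjoint boxes
```

Plain IoU is zero for all disjoint box pairs, giving no gradient; GIoU grows more negative as the boxes move apart, which is why it is preferred as a bounding box regression loss here.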

