An Automatic Method for Interest Point Detection

Author(s):  
I. G. Zubov

Introduction. Computer vision systems are finding widespread application in various domains of life. Monocular camera-based systems can be used to solve a wide range of problems. The availability of digital cameras and large sets of annotated data, as well as the power of modern computing technologies, make monocular image analysis a dynamically developing direction in the field of machine vision. In order for any computer vision system to describe objects and predict their actions in the physical space of a scene, the image under analysis should be interpreted from the standpoint of the underlying 3D scene. This can be achieved by analysing a rigid object as a set of mutually arranged parts, which provides a powerful framework for reasoning about physical interaction.

Objective. Development of an automatic method for detecting interest points of an object in an image.

Materials and methods. An automatic method for identifying interest points of vehicles, such as license plates, in an image is proposed. The method localizes interest points by analysing the inner layers of convolutional neural networks trained for image classification and object detection, and identifies them without incurring additional costs for data annotation and training.

Results. The conducted experiments confirmed the correctness of the proposed method in identifying interest points: the accuracy of identifying a point on a license plate reached 97%.

Conclusion. A new method for detecting interest points of an object by analysing the inner layers of convolutional neural networks is proposed. It provides accuracy similar to or exceeding that of other modern methods.
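A minimal sketch of the general idea of reading inner convolutional layers to localize a point of interest, not the author's exact method: an intermediate feature map of an ImageNet-trained classifier is aggregated over channels and its peak activation is mapped back to image coordinates. The file name, the choice of `layer3`, and the simple channel-mean aggregation are illustrative assumptions.

```python
# Sketch: candidate interest point from inner activations of a classification CNN.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

model = models.resnet50(pretrained=True).eval()  # newer torchvision prefers weights=...

features = {}
def hook(module, inp, out):
    features["map"] = out.detach()

# Hook an inner convolutional stage (layer3 is an arbitrary choice here).
model.layer3.register_forward_hook(hook)

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = Image.open("car.jpg").convert("RGB")  # hypothetical input image
with torch.no_grad():
    model(preprocess(img).unsqueeze(0))

# Aggregate channel activations and take the strongest spatial response
# as a candidate interest point, rescaled to original image coordinates.
fmap = features["map"][0].mean(dim=0)            # (H, W) saliency map
y, x = divmod(int(fmap.argmax()), fmap.shape[1])
scale_y, scale_x = img.height / fmap.shape[0], img.width / fmap.shape[1]
print("candidate interest point:", (int(x * scale_x), int(y * scale_y)))
```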

Computer vision is a scientific field that deals with how computers can acquire high-level understanding from digital images or videos. One of its keystones is object detection, which aims to identify relevant features in an image or video in order to detect objects. The backbone is the first stage of an object detection pipeline and plays a crucial role in its performance. Object detectors are usually built on backbone networks designed for image classification, and detection performance depends heavily on the features these backbones extract: simply replacing a backbone with a deeper or wider version typically yields a large gain in accuracy. The backbone's importance is also evident in its effect on the efficiency of real-time object detection. In this paper, we examine the crucial role of the deep learning era, and of convolutional neural networks in particular, in object detection tasks. We analyze and concentrate on a wide range of convolutional neural networks used as backbones of object detection models, thereby building a review of backbones that researchers and scientists can use as a guideline for their work.
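A hedged illustration of the backbone idea described above: a classification network's feature extractor is plugged into a detection head, following the torchvision custom-backbone pattern. Swapping in a larger classifier's features the same way is the "replace the backbone" step the text refers to; the class count and anchor settings below are placeholders.

```python
# Reuse an ImageNet classifier's feature extractor as a detection backbone.
import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator

backbone = torchvision.models.mobilenet_v2(pretrained=True).features
backbone.out_channels = 1280  # FasterRCNN needs the backbone's feature depth

anchor_generator = AnchorGenerator(
    sizes=((32, 64, 128, 256, 512),),
    aspect_ratios=((0.5, 1.0, 2.0),),
)
roi_pooler = torchvision.ops.MultiScaleRoIAlign(
    featmap_names=["0"], output_size=7, sampling_ratio=2
)

detector = FasterRCNN(
    backbone,
    num_classes=2,  # e.g. background + one object class
    rpn_anchor_generator=anchor_generator,
    box_roi_pool=roi_pooler,
)
```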


2021
Vol 6
pp. 93-101
Author(s):
Andrey Litvynchuk
Lesia Baranovska

Face recognition is one of the main tasks of computer vision; it is relevant because of its practical significance and the great interest it attracts from a wide range of scientists. It has many applications, which has led to a huge amount of research in this area. Although research in the field has been going on since the beginning of computer vision, good results have been achieved only with the help of convolutional neural networks. In this work, a comparative analysis of pre-convolutional-neural-network facial recognition methods was performed, and a metric learning approach, augmentations, and learning rate schedulers were considered. A series of experiments and a comparative analysis of the considered methods for improving convolutional neural networks were carried out, and as a result a universal algorithm for training a face recognition model was obtained. We used SE-ResNet50 as the only neural network in the experiments. Metric learning is a method by which good accuracy in face recognition can be achieved. Overfitting is a big problem for neural networks, in particular because they have too many parameters and usually not enough data to guarantee generalization of the model. Additional data labeling can be time-consuming and expensive, so augmentation is used instead: augmentations artificially enlarge the training dataset and, as expected, improved the results relative to the baseline in all experiments, with stronger and more aggressive forms of augmentation leading to better results. As expected, the best learning rate scheduler was the cosine scheduler with warm-ups and restarts; this schedule has few parameters, so it is also easy to use. Overall, by combining these approaches we obtained an accuracy of 93.5%, which is 22% better than the baseline experiment. In future studies, it is planned to improve not only the face recognition model but also the detection model, since the accuracy of face recognition directly depends on the quality of face detection.
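A minimal sketch of the training-loop ingredients highlighted in this abstract, namely augmentations and a cosine learning-rate schedule with restarts, written in PyTorch for illustration. The placeholder linear model stands in for an SE-ResNet50 embedder, and all hyperparameter values are assumptions, not the paper's settings.

```python
import torch
import torchvision.transforms as T
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

# Moderately aggressive augmentations artificially enlarge the training set.
train_transforms = T.Compose([
    T.RandomResizedCrop(112, scale=(0.7, 1.0)),
    T.RandomHorizontalFlip(),
    T.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3),
    T.ToTensor(),
])

model = torch.nn.Linear(512, 128)  # placeholder for an SE-ResNet50 embedding model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Cosine annealing with periodic restarts; T_0 is the length of the first
# cycle in epochs, T_mult stretches each subsequent cycle.
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2)

for epoch in range(70):
    # ... one epoch of metric-learning training would go here ...
    scheduler.step()
```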


2021
pp. 18-50
Author(s):
Ahmed A. Elngar et al.

Computer vision is one of the fields of computer science and one of the most powerful and persuasive types of artificial intelligence. It is similar to the human vision system in that it enables computers to recognize and process objects in pictures and videos in much the same way humans do. Computer vision technology has rapidly evolved in many fields and contributed to solving many problems. It has contributed to self-driving cars, enabling them to understand their surroundings: cameras record video from different angles around the car, and a computer vision system takes images from the video and processes them in real time to find road edges, detect other cars, and read traffic lights, pedestrians, and objects. Computer vision has also contributed to facial recognition, a technology that enables computers to match images of people's faces to their identities; these algorithms detect facial features in images and then compare them with databases. Computer vision also plays an important role in healthcare, where algorithms can help automate tasks such as detecting breast cancer, finding symptoms in X-rays, spotting cancerous moles in skin images, and analyzing MRI scans. It has likewise contributed to fields such as image classification, object detection, motion recognition, object tracking, and medicine. The rapid development of artificial intelligence is making machine learning more important in this field of research: machine learning uses algorithms to parse data and predict outcomes, and it has become an important key to unlocking the door to AI. If we look at the concept of deep learning, we find that deep learning is a subset of machine learning whose algorithms, inspired by the structure and function of the human brain and called artificial neural networks, learn from large amounts of data. A deep learning algorithm performs a task repeatedly, each time tweaking it a little to improve the outcome, so the development of computer vision owes much to deep learning. Turning to convolutional neural networks (abbreviated CNN or ConvNet), they are one of the most powerful supervised deep learning models. The name "convolutional" comes from the mathematical linear operation between matrices called convolution. The CNN structure can be used in a variety of real-world problems, including computer vision, image recognition, natural language processing (NLP), anomaly detection, video analysis, drug discovery, recommender systems, health risk assessment, and time-series forecasting. CNNs are similar to ordinary neural networks; the main difference is that CNNs are used chiefly for pattern recognition within images. This allows image features to be encoded into the network structure, making the network more suitable for image-focused tasks while reducing the parameters required to set up the model. One of the advantages of CNNs is their excellent performance on machine learning problems, so we use a CNN as a classifier for image classification. The objective of this paper is therefore to discuss image classification in detail in the following sections.
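A minimal CNN image classifier of the kind this passage describes, written in PyTorch for illustration; the passage itself names no framework, and the layer sizes and class count below are assumptions.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Convolutions encode local image structure with far fewer parameters
        # than a fully connected network over the same input size.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(start_dim=1))

logits = SmallCNN()(torch.rand(1, 3, 32, 32))  # e.g. a CIFAR-10-sized input
print(logits.shape)  # torch.Size([1, 10])
```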


2019
Vol 31 (8)
pp. 1320
Author(s):
Hanli Zhao
Junru Liu
Lei Jiang
Jianbing Shen
Mingxiao Hu

2018
Vol 7 (2.7)
pp. 614
Author(s):
M Manoj Krishna
M Neelima
M Harshali
M Venu Gopala Rao

Image classification is a classical problem in image processing, computer vision, and machine learning. In this paper we study image classification using deep learning, employing the AlexNet architecture with convolutional neural networks. Four test images were selected from the ImageNet database for classification. We cropped the images to various portions of their area and conducted experiments. The results show the effectiveness of deep-learning-based image classification using AlexNet.
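A hedged sketch of the kind of experiment described: classify an image, and a cropped portion of it, with an ImageNet-pretrained AlexNet from torchvision. The file name and crop box are placeholders; the paper's actual test images and crop regions are not reproduced here.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

model = models.alexnet(pretrained=True).eval()  # newer torchvision prefers weights=...

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def top1_class(img: Image.Image) -> int:
    with torch.no_grad():
        logits = model(preprocess(img).unsqueeze(0))
    return int(logits.argmax(dim=1))

img = Image.open("test.jpg").convert("RGB")  # placeholder image path
print("full image class id:", top1_class(img))

# Crop a central portion and classify again, as in the cropping experiments.
w, h = img.size
print("cropped class id:", top1_class(img.crop((w // 4, h // 4, 3 * w // 4, 3 * h // 4))))
```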


Vehicle classification and license plate detection are important tasks in intelligent security and transportation systems. However, existing methods of vehicle classification and detection are highly complex and provide coarse-grained outcomes because of underfitting or overfitting of the model. Owing to the advanced accomplishments of deep learning, it has been efficiently applied to image classification and object detection. This paper proposes a new approach that makes use of convolutional neural networks from deep learning. It consists of two steps: i) vehicle classification and ii) vehicle license plate recognition. Several classic tools were used to train and test the vehicle classification and license plate detection model, including CNNs (convolutional neural networks), TensorFlow, and Tesseract-OCR. The suggested technique can determine the vehicle type, number plate, and other associated data effectively. The model provides security and log details regarding vehicles through AI surveillance, guiding surveillance operators and assisting human resources. With the original dataset for training and an enriched dataset for testing, this customized model achieves strong results, with an accuracy of around 97.32% in the classification and detection of vehicles. As the quantity of training data is enlarged, the loss and error rate decline progressively, so the proposed deep learning model offers better performance and flexibility. Compared to leading techniques on standard image datasets, this deep learning model achieves highly competitive outcomes. Finally, the proposed system suggests directions for further improving the customized model and forecasts the progressive growth of deep learning performance in the exploration of artificial intelligence (AI) and machine learning (ML) techniques.
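A tiny sketch of the Tesseract-OCR step only, assuming the plate region has already been located by the CNN stage described above; the image path and bounding-box coordinates are stubs, not outputs of the paper's detector.

```python
import cv2
import pytesseract

image = cv2.imread("vehicle.jpg")                 # hypothetical input frame
x, y, w, h = 120, 340, 200, 60                    # detected plate box (placeholder)
plate = image[y:y + h, x:x + w]

# Basic preprocessing usually helps Tesseract on small plate crops.
gray = cv2.cvtColor(plate, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

text = pytesseract.image_to_string(binary, config="--psm 7")  # single text line
print("plate text:", text.strip())
```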


Aerospace
2020
Vol 7 (12)
pp. 171
Author(s):
Anil Doğru
Soufiane Bouarfa
Ridwan Arizar
Reyhan Aydoğan

Convolutional Neural Networks combined with autonomous drones are increasingly seen as enablers for partially automating the aircraft maintenance visual inspection process. Such an innovative concept can have a significant impact on aircraft operations: by supporting aircraft maintenance engineers in detecting and classifying a wide range of defects, the time spent on inspection can be significantly reduced. Examples of defects that can be automatically detected include aircraft dents, paint defects, cracks and holes, and lightning strike damage. Additionally, this concept could increase the accuracy of damage detection and reduce the number of aircraft inspection incidents related to human factors such as fatigue and time pressure. In our previous work, we applied a recent Convolutional Neural Network architecture known as Mask R-CNN to detect aircraft dents. Mask R-CNN was chosen because it enables the detection of multiple objects in an image while simultaneously generating a segmentation mask for each instance. The previously obtained F1 and F2 scores were 62.67% and 59.35%, respectively. This paper extends the previous work by applying different techniques to improve and experimentally evaluate prediction performance. The approaches used include (1) balancing the original dataset by adding images without dents; (2) increasing data homogeneity by focusing on wing images only; (3) exploring the potential of three augmentation techniques, namely flipping, rotating, and blurring, in improving model performance; and (4) using a pre-classifier in combination with Mask R-CNN. The results show that a hybrid approach combining Mask R-CNN and augmentation techniques leads to improved performance, with an F1 score of 67.50% and an F2 score of 66.37%.
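A hedged sketch of the three augmentations the abstract explores (flipping, rotating, blurring), expressed with torchvision transforms; the paper does not specify the library or parameter values, so those below are illustrative, as is the image path.

```python
import torchvision.transforms as T
from PIL import Image

# Each transform samples new random parameters on every call.
augment = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomRotation(degrees=15),
    T.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),
])

wing_image = Image.open("wing_section.jpg").convert("RGB")  # placeholder path
for i in range(4):
    augmented = augment(wing_image)
    augmented.save(f"wing_section_aug_{i}.jpg")
```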


Author(s):  
Matthew L. Dering
Conrad S. Tucker

The authors of this work present a computer vision approach that discovers and classifies objects in a video stream, working towards an automated system for managing End of Life (EOL) waste streams. Currently, the sorting stage of EOL waste management is an extremely manual and tedious process that increases the costs of EOL options and reduces their attractiveness as a profitable enterprise solution. A wide range of EOL methodologies have been proposed in the engineering design community that focus on determining the optimal EOL strategies of reuse, recycling, remanufacturing, and resynthesis. However, many of these methodologies assume a product/component disassembly cost based on human labor, which thereby increases the cost of EOL waste management. For example, recent EOL options such as resynthesis rely heavily on optimally sorting and combining components in novel ways to form new products. This process, however, requires considerable manual labor, which may make the option less attractive for products with highly complex interactions and components. To mitigate these challenges, the authors propose a computer vision system that takes live video streams of incoming EOL waste and i) automatically identifies and classifies products/components of interest and ii) predicts the EOL process that will be needed for a given classified product/component. A case study involving an EOL waste stream video is used to demonstrate the predictive accuracy of the proposed methodology in identifying and classifying EOL objects.
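A minimal sketch of the pipeline shape described, assuming an OpenCV video source: frames are pulled from an EOL waste-stream video and each is handed to a classifier that maps the detected object to an EOL process. The classifier here is a stub, not the authors' model, and the file name is hypothetical.

```python
import cv2

def classify_eol_object(frame) -> str:
    # Placeholder: a trained vision model would return e.g. "reuse",
    # "recycle", "remanufacture", or "resynthesize" for the detected object.
    return "recycle"

capture = cv2.VideoCapture("eol_waste_stream.mp4")  # hypothetical video file
while True:
    ok, frame = capture.read()
    if not ok:
        break
    decision = classify_eol_object(frame)
    # In a real system the decision would drive the downstream sorting step.
capture.release()
```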

