A Hybrid Approach for Video Indexing Using Computer Vision and Speech Recognition

2020 ◽  
pp. 213-225
Author(s):  
Saksham Jain ◽  
Akshit Pradhan ◽  
Vijay Kumar
Sensors ◽  
2019 ◽  
Vol 19 (19) ◽  
pp. 4254
Author(s):  
Tiago Davi Oliveira de Araújo ◽  
Carlos Gustavo Resque dos Santos ◽  
Rodrigo Santos do Amor Divino Lima ◽  
Bianchi Serique Meiguins

Adapting to different environments remains a challenge for Mobile Augmented Reality (MAR). If not done seamlessly, transitions between environments may cause discontinuities in navigation, disorienting users and undermining acceptance of the technology. These transitions are hard because no current localization technique works well everywhere: sensor-based applications can be harmed by obstacles that hamper sensor communication (e.g., GPS) and by infrastructure limitations (e.g., Wi-Fi), and image-based applications can be affected by lighting conditions that impair computer vision techniques. Hence, this paper presents an adaptive model to perform transitions between different types of environments for MAR applications. The model takes a hybrid approach, choosing the best combination of long-range sensors, short-range sensors, and computer vision techniques to perform fluid transitions between environments while mitigating problems in location, orientation, and registration. To assess the model, we developed a MAR application and conducted a navigation test with volunteers to validate transitions between outdoor and indoor environments, followed by a short interview. The results show that the transitions were successful: the application adapted itself to the studied environments, seamlessly changing sensors when needed.
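
The idea of switching localization sources as the environment changes can be illustrated with a minimal sketch. It is not the authors' model: the source names, the 0-to-1 quality scores, and the `min_quality` cutoff are all assumptions made for illustration.

```python
# Hypothetical sketch: pick the most reliable localization source.
# Source names, quality scores (0.0 to 1.0), and the cutoff are assumptions,
# not values taken from the paper.

def choose_source(readings, min_quality=0.5):
    """Return the source with the highest quality score.

    Sources scoring below min_quality are ignored. Returns None when
    no source is usable, so the caller can fall back to another strategy.
    """
    usable = {name: q for name, q in readings.items() if q >= min_quality}
    if not usable:
        return None
    return max(usable, key=usable.get)


outdoor = {"gps": 0.9, "wifi": 0.4, "vision": 0.6}
indoor = {"gps": 0.1, "wifi": 0.7, "vision": 0.6}
print(choose_source(outdoor))
print(choose_source(indoor))
```

Re-evaluating the choice on every update is one simple way to get the sensor switching the abstract describes. A real system would likely also smooth the scores over time to avoid flipping between sources.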


Author(s):  
Oksana Chulanova

The article discusses the capabilities of artificial intelligence technologies, including natural language processing, intelligent decision support, computer vision, speech recognition and synthesis, and other promising methods of artificial intelligence. It presents the results of the author's study and an analysis of these technologies and their potential for optimizing work with staff. Based on this study, the author develops a concept for integrating artificial intelligence technologies into work with personnel in the digital paradigm.


2020 ◽  
Vol 10 (18) ◽  
pp. 6460
Author(s):  
Junaid Younas ◽  
Shoaib Ahmed Siddiqui ◽  
Mohsin Munir ◽  
Muhammad Imran Malik ◽  
Faisal Shafait ◽  
...  

We propose a novel hybrid approach that fuses traditional computer vision techniques with deep learning models to detect figures and formulas in document images. The proposed approach first fuses different computer-vision-based image representations, i.e., color transform, connected component analysis, and distance transform, into what we term the Fi-Fo image representation. The Fi-Fo image representation is then fed to deep models for further refined representation learning to detect figures and formulas in document images. The proposed approach is evaluated on the publicly available ICDAR-2017 Page Object Detection (POD) dataset and its corrected version. It achieves state-of-the-art results for formula and figure detection in document images, with F1-scores of 0.954 and 0.922, respectively. Ablation study results reveal that the Fi-Fo image representation achieves superior performance compared with the raw image representation. Results also establish that the hybrid approach helps deep models learn more discriminative and refined features.
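
Stacking several classical representations into one multi-channel image can be sketched in plain Python. This is a toy illustration under stated assumptions, not the paper's Fi-Fo pipeline: the raw grey level stands in for the color transform, the binarization threshold of 128 is arbitrary, connected components use 4-connectivity, and the distance transform uses the city-block metric.

```python
from collections import deque

def binarize(gray, threshold=128):
    """Mark pixels darker than the (assumed) threshold as foreground (1)."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]

def distance_transform(mask):
    """Two-pass city-block distance to the nearest foreground pixel."""
    h, w = len(mask), len(mask[0])
    far = h + w  # larger than any real city-block distance in the image
    dist = [[0 if mask[y][x] else far for x in range(w)] for y in range(h)]
    for y in range(h):  # forward pass: propagate from top and left
        for x in range(w):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    for y in range(h - 1, -1, -1):  # backward pass: from bottom and right
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x < w - 1:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist

def label_components(mask):
    """Label 4-connected foreground regions 1, 2, ...; background stays 0."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not labels[y][x]:
                next_label += 1
                labels[y][x] = next_label
                queue = deque([(y, x)])
                while queue:  # breadth-first flood fill of one region
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels

def fuse(gray):
    """Stack (grey level, component label, distance) per pixel."""
    mask = binarize(gray)
    dist = distance_transform(mask)
    labels = label_components(mask)
    return [[(gray[y][x], labels[y][x], dist[y][x])
             for x in range(len(gray[0]))] for y in range(len(gray))]

fused = fuse([[0, 255, 255], [255, 255, 0]])
print(fused[0][1])  # (255, 0, 1)
```

Each pixel then carries three values that a downstream model could consume as channels, which is the general shape of the fusion the abstract describes.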


2020 ◽  
Vol 4 (2) ◽  
pp. 121
Author(s):  
Nova Resfita ◽  
Rahmadi Kurnia ◽  
Fitrilina Fitrilina

Computer vision has expanded widely, with a vast number of applications in daily life. One such application integrates an image processing technique into a prototype coffee machine controlled by a speech recognition system. This study aims to detect the coffee colour requested verbally by the user: black, middle, or light. The sensor used in this research is a digital PC camera, and the applied method is Multilevel Colour Thresholding. In all experiments conducted, the image processing technique worked as intended, with the camera able to identify the requested colour of the coffee solution. The system could be developed further by improving the multilevel colour thresholding technique and advancing the hardware design, in order to build a more robust coffee machine based on the requested colour.
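
Multilevel thresholding of this kind can be sketched as two cut points that split a grey scale into three classes. The threshold values below and the use of the mean grey level of the sampled region are assumptions for illustration; the abstract does not report the values or the colour space the authors used.

```python
from statistics import mean

# Hypothetical thresholds on a 0-255 grey scale (not from the paper).
DARK_MAX = 60
MIDDLE_MAX = 120

def classify_coffee_colour(pixels):
    """Map the mean grey level of a sampled region to a colour class."""
    level = mean(pixels)
    if level <= DARK_MAX:
        return "black"
    if level <= MIDDLE_MAX:
        return "middle"
    return "light"

print(classify_coffee_colour([90, 100, 110]))  # middle
```

In a full system, the class returned here would be compared with the colour recognized from the user's spoken request.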


Author(s):  
Anish Tatke ◽  
Madhura Patil ◽  
Anuj Khot ◽  
Parul Jadhav ◽  
Dr Vishwanath Karad

As waste segregation becomes an important issue in our lives, technologies such as deep neural networks and computer vision can make the process efficient and robust through image segmentation and classification. Such systems need accurate and efficient segmentation and recognition mechanisms, and this demand coincides with the growing computational capabilities of modern computer architectures and with more effective algorithms for image recognition. This paper briefly compares several approaches, including a simple CNN, ResNet50, and VGG16. The comparative analysis explains the performance of each approach and concludes that ResNet50 gives excellent performance. The VGG16 network also provides good performance that meets the needs of daily use.

