Advancements in Computer Vision Applications in Intelligent Systems and Multimedia Technologies - Advances in Computational Intelligence and Robotics
Latest Publications


Total documents: 14 (five years: 14)

H-index: 0 (five years: 0)

Published By IGI Global

9781799844440, 9781799844457

Author(s):  
Mohammed Erritali ◽  
Youssef Chouni ◽  
Youssef Ouadid

The main difficulty in developing a successful optical character recognition (OCR) system lies in the confusion between characters. In Amazigh writing (the Tifinagh alphabet), some characters resemble one another under rotation or scaling. Most researchers have tried to solve this problem by combining multiple descriptors and/or classifiers, which increases the recognition rate but at the expense of processing time, which becomes prohibitive. Reducing both character confusion and recognition time is therefore the major challenge for OCR systems. In this chapter, the authors present an off-line OCR system for Tifinagh characters.


Author(s):  
Tobias Lampprecht ◽  
David Salb ◽  
Marek Mauser ◽  
Huub van de Wetering ◽  
Michael Burch ◽  
...  

Formula One races provide a wealth of data worth investigating. Although the time-varying data has a clear structure, it is quite challenging to analyze for further properties. Here the focus is on a visual classification of events, drivers, and time periods. As a first step, the Formula One data is visually encoded using a line plot metaphor that reflects the dynamic lap times; then a classification of the races based on the visual outcomes of these line plots is presented. The visualization tool is web-based and provides several interactively linked views of the data, starting from a calendar-based overview representation. To illustrate the usefulness of the approach, Formula One data from several years, with races held in different locations, is visually explored. The chapter discusses algorithmic, visual, and perceptual limitations that might occur during the visual classification of time-series data such as Formula One races.


Author(s):  
Mohit Dua ◽  
Abhinav Mudgal ◽  
Mukesh Bhakar ◽  
Priyal Dhiman ◽  
Bhagoti Choudhary

In this chapter, a human detection system for thermal imagery is proposed, based on the unsupervised learning method K-means clustering followed by the deep learning approach You Only Look Once (YOLO). Generally, images in the visible spectrum are used for human detection, but these are not suitable at nighttime due to low visibility. Hence, long wave infrared (LWIR) images have been used to implement and evaluate the proposed work. The system follows a two-step approach: anchor boxes are generated using K-means clustering, and those anchor boxes are then used in a 252-layer single-shot detector (YOLO) to predict bounding boxes. The dataset is provided by FLIR and contains 6822 images for training and 757 images for validation. The proposed system can be used for real-time object detection, as YOLO can achieve a much higher processing rate on LWIR imagery than traditional methods such as the Haar cascade classifier.
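
The anchor-generation step described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes boxes are given as (width, height) pairs and that clustering uses IoU as the similarity measure, which is a common choice for YOLO anchors but is not stated in the abstract. The names `iou_wh` and `kmeans_anchors` and the sample boxes are hypothetical.

```python
import random


def iou_wh(box, centroid):
    """IoU of two (width, height) boxes aligned at the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors using IoU similarity.

    Assumption: each box is assigned to the centroid with the highest IoU,
    and each centroid is updated to the mean width/height of its members.
    """
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, centroids[i]))
            clusters[best].append(b)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                # Keep the old centroid if no box was assigned to it.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    # Sort by area so the output order is deterministic.
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Hypothetical ground-truth box sizes in pixels (not from the FLIR dataset).
boxes = [(10, 20), (12, 22), (11, 19), (50, 100), (48, 96), (52, 104)]
anchors = kmeans_anchors(boxes, k=2)
print(anchors)
```

In a real pipeline the resulting anchors would be written into the detector's configuration; the number of clusters `k` is a tuning choice.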


Author(s):  
Abderrahim Bajit

Region of interest (ROI) image and video compression techniques have been widely used in visual communication applications in an effort to deliver good quality images and videos at limited bandwidths. Foveated imaging exploits the fact that the spatial resolution of the human visual system (HVS) is highest around the point of fixation (foveation point) and decreases dramatically with increasing eccentricity. Exploiting this fact, the authors have developed a metric for the assessment of ROI-coded images, adapted to foveation image coding and based on psycho-visual quality optimization tools, which enables an objective assessment of visual quality with respect to the region of interest (ROI) of the human observer. The proposed metric yields a quality factor called foveation probability score (FPS) that correlates well with visual error perception and demonstrates very good perceptual quality evaluation.


Author(s):  
Sanjida Nasreen Tumpa ◽  
K. N. Pavan Kumar ◽  
Madeena Sultana ◽  
Gee-Sern Jison Hsu ◽  
Orly Yadid-Pecht ◽  
...  

Smart societies of the future will increasingly rely on harvesting rich information generated by the day-to-day activities and interactions of their inhabitants. Among the multitude of such interactions, web-based social networking activities have become an integral part of everyday human communication. Flickr, Facebook, Twitter, and LinkedIn are currently used by millions of users worldwide as a source of information, which is growing exponentially over time. In addition to idiosyncratic personal characteristics, web-based social data include person-to-person communication, online activity patterns, and temporal information, among others. However, social interaction-based data has only recently been studied from the perspective of person identification. In this chapter, the authors elaborate on the concept of using interaction-based features from online social networking platforms as part of the social behavioral biometrics research domain. They place this research in the context of smart societies and discuss novel social biometric features and their potential use in various applications.


Author(s):  
Abhishek Das ◽  
Mihir Narayan Mohanty

In this chapter, the authors give a detailed review of optical character recognition. Various methods are used in this field, with different accuracy levels. Recognizing handwritten characters remains difficult because writing styles differ between individuals, even within a single language. A comparative study is given to explain the different types of optical character recognition along with the methods used in each type. Neural networks in different forms are used in most of the works. Techniques such as OCR with CNN, RNN, and combinations of CNN and RNN are observed in recent research.


Author(s):  
Andrew Jong ◽  
Melody Moh ◽  
Teng-Sheng Moh

This chapter elaborates on using generative adversarial networks (GAN) for virtual try-on applications. It presents the first comprehensive survey on this topic. Virtual try-on represents a practical application of GANs and pixel translation, which improves on the techniques of virtual try-on prior to these new discoveries. This survey details the importance of virtual try-on systems and the history of virtual try-on; shows how GANs, pixel translation, and perceptual losses have influenced the field; and summarizes the latest research in creating virtual try-on systems. Additionally, the authors present future directions of research to improve virtual try-on systems by making them more usable, faster, and more effective. By walking through the steps of virtual try-on from start to finish, the chapter aims to expose readers to key concepts shared by many GAN applications and to give readers a solid foundation to pursue further topics in GANs.


Author(s):  
Amal Bouti ◽  
Mohamed Adnane Mahraz ◽  
Jamal Riffi ◽  
Hamid Tairi

In this chapter, the authors report a system for the detection and classification of road signs. The system consists of two parts. The first part detects road signs in real time. The second part classifies the German Traffic Sign Recognition Benchmark (GTSRB) dataset and, to test its effectiveness, makes predictions on the road signs detected in the first part. In the detection part, the authors used HOG and SVM to detect the road signs captured by the camera. In the classification part, they used a convolutional neural network based on a modified LeNet model. The system obtains an accuracy rate of 96.85% in the detection part and 96.23% in the classification part.


Author(s):  
Muhammad Sarfraz

Detecting corner points in digital images is based on determining significant geometrical locations. Corner points provide significant clues for shape analysis and representation. They capture significant features of an object, which can be used in different phases of processing. In shape analysis problems, for example, a shape can be efficiently reformulated in a compact way and with sufficient accuracy if the corners are properly located. This chapter selects seven well-referenced algorithms from the literature to review, compare, and analyze empirically. It provides an overview of these selected algorithms so that users can easily pick an appropriate one for their specific applications and requirements.


Author(s):  
Ichrak Khoulqi ◽  
Najlae Idrissi ◽  
Muhammad Sarfraz

Breast cancer is one of the significant issues in medical sciences today, and women are the most affected worldwide. Early diagnosis can help control the growth of the tumor. However, a highly precise diagnosis is needed for the right treatment. This chapter contributes toward a computer-aided diagnosis (CAD) system. It deals with mammographic images and enhances their quality. The enhanced images are then segmented for the pectoral muscle (PM) in the Medio-Lateral-Oblique (MLO) view of the mammographic images. The segmentation approach uses Gaussian Mixture Model-Expectation Maximization (GMM-EM). The standard Mini-MIAS database, with 322 images, has been used for the implementation and experimentation of the proposed technique. The structural similarity measure and the DICE coefficient have been used to verify the quality of segmentation against the ground truth. The proposed technique is quite robust and accurate; it outperforms various existing techniques when compared in the same context.
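
As an illustration of the DICE coefficient mentioned above, the sketch below computes Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks. It is a minimal example, not the chapter's evaluation code: the function name `dice_coefficient`, the nested-list mask representation, and the toy masks are assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as nested lists.

    Returns 1.0 when both masks are empty (an assumed convention).
    """
    flat_pred = [p for row in pred for p in row]
    flat_truth = [t for row in truth for t in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_truth) if p and t)
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_truth if t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks (hypothetical, not Mini-MIAS data).
predicted = [[1, 1, 0],
             [1, 0, 0]]
ground_truth = [[1, 1, 0],
                [0, 0, 0]]
score = dice_coefficient(predicted, ground_truth)
print(score)  # 2 * 2 / (3 + 2) = 0.8
```

A score of 1.0 means the segmentation matches the ground truth exactly, and 0.0 means no overlap.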

