Improving real-time CNN-based pupil detection through domain-specific data augmentation

Author(s):  
Shaharam Eivazi ◽  
Thiago Santini ◽  
Alireza Keshavarzi ◽  
Thomas Kübler ◽  
Andrea Mazzei
2019 ◽  
Vol 49 (6) ◽  
pp. 1676-1683 ◽  
Author(s):  
Michael Gadermayr ◽  
Kexin Li ◽  
Madlaine Müller ◽  
Daniel Truhn ◽  
Nils Krämer ◽  
...  

2019 ◽  
Vol 2019 ◽  
pp. 1-14 ◽  
Author(s):  
Yong He ◽  
Hong Zeng ◽  
Yangyang Fan ◽  
Shuaisheng Ji ◽  
Jianjian Wu

In this paper, we propose a deep learning approach to detecting oilseed rape pests that raises the mean average precision (mAP) to 77.14%, an increase of 9.7% over the original model. We deploy this model on a mobile platform so that any farmer can use the program, which diagnoses pests in real time and provides suggestions for pest control. We built an oilseed rape pest imaging database covering 12 typical oilseed rape pests and compared the performance of five models; SSD w/Inception was chosen as the optimal model. Moreover, to achieve a high mAP, we used data augmentation (DA) and added a dropout layer. The experiments were performed on the Android application we developed, and the results show that our approach clearly surpasses the original model and is helpful for integrated pest management. Compared with past work, the application offers improved environmental adaptability, response speed, and accuracy, along with low cost and simple operation, which makes it suitable for pest monitoring missions using drones and the Internet of Things (IoT).


2021 ◽  
Vol 11 (15) ◽  
pp. 7148
Author(s):  
Bedada Endale ◽  
Abera Tullu ◽  
Hayoung Shi ◽  
Beom-Soo Kang

Unmanned aerial vehicles (UAVs) are widely used for various missions in both the civilian and military sectors. Many of these missions require UAVs to acquire artificial intelligence about the environments in which they navigate. This perception can be realized by training a computing machine to classify objects in the environment. One well-known machine training approach is supervised deep learning, which enables a machine to classify objects. However, supervised deep learning comes at a large cost in time and computational resources. Collecting large amounts of input data, pre-training steps such as labeling the training data, and the need for a high-performance computer for training are some of the challenges that supervised deep learning poses. To address these setbacks, this study proposes mission-specific input data augmentation techniques and the design of a lightweight deep neural network architecture capable of real-time object classification. Semi-direct visual odometry (SVO) data of augmented images are used to train the network for object classification. Ten classes of 10,000 different images each were used as input data, of which 80% were used for training the network and the remaining 20% for network validation. To optimize the designed deep neural network, a sequential gradient descent algorithm was implemented. This algorithm has the advantage of handling redundancy in the data more efficiently than other algorithms.
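The 80/20 split described in the abstract above can be illustrated with a minimal sketch. The abstract does not say whether the split was made within each class or over the pooled data; the per-class (stratified) split below, the function name `split_per_class`, and the fixed seed are all assumptions made for illustration.

```python
import random


def split_per_class(num_classes=10, images_per_class=10_000,
                    train_fraction=0.8, seed=0):
    """Return (train, val) lists of (class_id, image_index) pairs.

    Assumption: the split is done independently inside every class,
    so each class contributes the same share to both subsets.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val = [], []
    for class_id in range(num_classes):
        indices = list(range(images_per_class))
        rng.shuffle(indices)
        cut = round(images_per_class * train_fraction)
        train += [(class_id, i) for i in indices[:cut]]
        val += [(class_id, i) for i in indices[cut:]]
    return train, val


train, val = split_per_class()
print(len(train), len(val))
```

With the figures quoted in the abstract (10 classes, 10,000 images each), this yields 80,000 training and 20,000 validation samples.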


Author(s):  
HyeonJung Park ◽  
Youngki Lee ◽  
JeongGil Ko

In this work we present SUGO, a depth video-based system for translating sign language to text using a smartphone's front camera. While exploiting depth-only videos offers benefits such as being less privacy-invasive than RGB videos, it introduces new challenges, including low video resolutions and the sensors' sensitivity to user motion. We overcome these challenges by using data augmentation to diversify our sign language video dataset so that it is robust to various usage scenarios, and by designing a set of schemes that emphasize human gestures in the input images for effective sign detection. The inference engine of SUGO is based on a 3-dimensional convolutional neural network (3DCNN) that classifies a sequence of video frames as a pre-trained word. Furthermore, the overall operations are designed to be lightweight so that sign language translation takes place in real time using only the resources available on a smartphone, with no help from cloud servers or external sensing components. Specifically, to train and test SUGO, we collect sign language data from 20 individuals for 50 Korean Sign Language words, amounting to a dataset of ~5,000 sign gestures, and collect additional in-the-wild data to evaluate the performance of SUGO in real-world usage scenarios with different lighting conditions and daily activities. Our extensive evaluations show that SUGO can classify sign words with an accuracy of up to 91% and suggest that the system is suitable (in terms of resource usage, latency, and environmental robustness) for a fully mobile sign language translation solution.


2021 ◽  
pp. 004051752110342
Author(s):  
Sifundvolesihle Dlamini ◽  
Chih-Yuan Kao ◽  
Shun-Lian Su ◽  
Chung-Feng Jeffrey Kuo

We introduce a real-time machine vision system that we developed to detect defects in functional textile fabrics with good precision at relatively fast detection speeds, to assist quality control in the textile industry. The system consists of image acquisition hardware and image processing software. The software uses data preprocessing techniques to break raw images down into smaller, suitable sizes. Filtering is employed to denoise the images and enhance some features. To generalize and multiply the data for robustness, we use data augmentation, followed by labeling, in which the defects in the images are labeled and tagged. Lastly, we use YOLOv4 for localization, training the system from the weights of a pretrained model. Our software is deployed with the hardware that we designed to implement the detection system. The designed system shows strong performance in defect detection with precision of [Formula: see text], and recall and [Formula: see text] scores of [Formula: see text] and [Formula: see text], respectively. The detection speed is relatively fast at [Formula: see text] fps with a prediction speed of [Formula: see text] ms. Our system can automatically locate functional textile fabric defects with high confidence in real time.
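For readers unfamiliar with the detection metrics named in the abstract above, the following sketch shows how precision and recall are computed from counts of true positives, false positives, and false negatives. The name of the third score was lost in extraction; the F1 score is included here only as a commonly paired metric, which is an assumption. The function name and the counts in the usage lines are hypothetical and are not results from the paper.

```python
def detection_scores(tp, fp, fn):
    """Compute precision, recall, and F1 from detection counts.

    tp: defects correctly detected
    fp: detections that do not correspond to a real defect
    fn: real defects that were missed
    Each ratio falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only.
precision, recall, f1 = detection_scores(tp=90, fp=10, fn=30)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```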


2020 ◽  
Author(s):  
Geoffrey Schau ◽  
Erik Burlingame ◽  
Young Hwan Chang

Deep learning systems have emerged as powerful mechanisms for learning domain translation models. However, in many cases, complete information in one domain is assumed to be necessary for sufficient cross-domain prediction. In this work, we motivate a formal justification for domain-specific information separation in a simple linear case and illustrate that a self-supervised approach enables domain translation between data domains while filtering out domain-specific data features. We introduce a novel approach to identify domain-specific information from sets of unpaired measurements in complementary data domains by considering a deep learning cross-domain autoencoder architecture designed to learn shared latent representations of data while enabling domain translation. We introduce an orthogonal gate block designed to enforce orthogonality of input feature sets by explicitly removing non-sharable information specific to each domain and illustrate separability of domain-specific information on a toy dataset.


2021 ◽  
Vol 13 (3) ◽  
pp. 809-820
Author(s):  
V. Sowmya ◽  
R. Radha

Vehicle detection and recognition in a real-time traffic surveillance system demand advanced computational intelligence and resources for effective traffic management under all possible contingencies. One focus area of deep intelligent systems is to facilitate vehicle detection and recognition techniques for robust traffic management of heavy vehicles. Such sophisticated mechanisms include the Support Vector Machine (SVM), Convolutional Neural Networks (CNN), Regional Convolutional Neural Networks (R-CNN), and the You Only Look Once (YOLO) model. Accordingly, it is pivotal to choose the right algorithm for vehicle detection and recognition, one that also addresses the real-time environment. In this study, deep learning algorithms such as Faster R-CNN, YOLOv2, YOLOv3, and YOLOv4 are compared across diverse aspects of their features. Two types of heavy transport vehicles, buses and trucks, are the detection and recognition targets in this work. Data augmentation and transfer learning are implemented in the model to build, execute, train, and test it for detection and recognition, to avoid over-fitting, and to improve speed and accuracy. Extensive empirical evaluation is conducted on two standard datasets, COCO and PASCAL VOC 2007. Finally, comparative results and analyses are presented based on real-time performance.
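The abstract above does not say which augmentations were applied. As one common example for object detection data, the sketch below flips an image horizontally and adjusts its bounding boxes to match. The function name `hflip_with_boxes`, the nested-list image representation, and the box convention (pixel coordinates with an exclusive `x_max`) are assumptions made for illustration.

```python
def hflip_with_boxes(image, boxes):
    """Horizontally flip an image and its bounding boxes.

    image: list of rows, each row a list of pixel values.
    boxes: list of (x_min, y_min, x_max, y_max) in pixel coordinates,
           where x_max and y_max are exclusive (assumed convention).
    """
    width = len(image[0])
    flipped_image = [row[::-1] for row in image]
    # A box covering columns [x_min, x_max) covers [width - x_max, width - x_min)
    # after the flip; the vertical extent is unchanged.
    flipped_boxes = [
        (width - x_max, y_min, width - x_min, y_max)
        for x_min, y_min, x_max, y_max in boxes
    ]
    return flipped_image, flipped_boxes


# Tiny 2x4 "image" with one box covering the leftmost column.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
boxes = [(0, 0, 1, 2)]
flipped_image, flipped_boxes = hflip_with_boxes(image, boxes)
print(flipped_image, flipped_boxes)
```

Flipping the labels together with the pixels is the point of the sketch: an augmented image whose boxes are left unchanged would teach the detector the wrong locations.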

