Egocentric-View Fingertip Detection for Air Writing Based on Convolutional Neural Networks

Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4382
Author(s):  
Yung-Han Chen ◽  
Chi-Hsuan Huang ◽  
Sin-Wun Syu ◽  
Tien-Ying Kuo ◽  
Po-Chyi Su

This research investigated real-time fingertip detection in frames captured from smart glasses, an increasingly popular wearable device. Egocentric-view fingertip detection and character recognition can be combined to create a novel way of inputting text. We first employed Unity3D to build a synthetic dataset of pointing gestures from the first-person perspective. The obvious benefits of using synthetic data are that it eliminates time-consuming and error-prone manual labeling and provides a large, high-quality dataset for a wide range of purposes. A modified Mask Regional Convolutional Neural Network (Mask R-CNN) is then proposed, consisting of a region-based CNN for finger detection and a three-layer CNN for fingertip localization. The process completes in 25 ms per frame for 640×480 RGB images, with an average error of 8.3 pixels. This speed is high enough to enable real-time “air-writing”, where users write characters in the air to input text or commands while wearing smart glasses. The characters are recognized from the fingertip trajectories by a ResNet-based CNN. Experimental results demonstrate the feasibility of this novel methodology.
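
To make the two-stage idea concrete, the snippet below is a minimal PyTorch sketch of the second stage only (the small CNN that regresses a fingertip position from a detected finger crop). The layer widths, crop size, and normalized-coordinate output are illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch (not the authors' code) of a three-layer fingertip regressor:
# it takes a finger crop from the detection stage and predicts a normalized
# (x, y) fingertip position inside that crop. Sizes are illustrative.
import torch
import torch.nn as nn

class FingertipRegressor(nn.Module):
    def __init__(self, crop_size: int = 64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        feat = crop_size // 8                       # spatial size after three poolings
        self.head = nn.Linear(64 * feat * feat, 2)  # two outputs: x and y

    def forward(self, crop: torch.Tensor) -> torch.Tensor:
        x = self.features(crop)
        return torch.sigmoid(self.head(x.flatten(1)))   # coordinates in [0, 1]

# Usage: a batch of 64x64 finger crops produced by the detection stage.
crops = torch.rand(8, 3, 64, 64)
xy = FingertipRegressor()(crops)   # shape (8, 2)
```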

Author(s):  
Dinh-Son Tran ◽  
Ngoc-Huynh Ho ◽  
Hyung-Jeong Yang ◽  
Soo-Hyung Kim ◽  
Guee Sang Lee

A real-time fingertip-gesture-based interface is still challenging for human–computer interaction, due to sensor noise, changing light levels, and the complexity of tracking a fingertip across a variety of subjects. Using fingertip tracking as a virtual mouse is a popular method of interacting with computers without a mouse device. In this work, we propose a novel virtual-mouse method using RGB-D images and fingertip detection. The hand region of interest and the center of the palm are first extracted using depth and skeleton-joint information images from a Microsoft Kinect Sensor version 2, and then converted into a binary image. The contours of the hands are then extracted and described by a border-tracing algorithm. The K-cosine algorithm is used to detect the fingertip location based on the hand-contour coordinates. Finally, the fingertip location is mapped to the RGB images to control the mouse cursor on a virtual screen. The system tracks fingertips in real time at 30 FPS on a desktop computer using a single CPU and a Kinect V2. The experimental results showed a high accuracy level; the system works well in real-world environments with a single CPU. This fingertip-gesture-based interface allows humans to interact with computers easily by hand.
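
The K-cosine step lends itself to a compact illustration. The NumPy sketch below is not the authors' implementation; the neighborhood size k, the cosine threshold, and the convexity test against the palm center are assumed values chosen only to show the idea.

```python
# K-cosine sketch: for each contour point p_i, measure the angle between the
# vectors to p_{i-k} and p_{i+k}; sharp, convex points far from the palm
# center are candidate fingertips. Parameters are assumptions, not the paper's.
import numpy as np

def k_cosine_fingertips(contour: np.ndarray, palm_center: np.ndarray,
                        k: int = 20, cos_thresh: float = 0.7) -> np.ndarray:
    """contour: (N, 2) ordered boundary points; returns candidate fingertip points."""
    n = len(contour)
    prev_pts = contour[(np.arange(n) - k) % n]
    next_pts = contour[(np.arange(n) + k) % n]
    a = prev_pts - contour
    b = next_pts - contour
    cos_theta = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-9)
    # Keep sharp corners (large cosine) that bulge outward, i.e. lie farther
    # from the palm center than the midpoint of their supporting chord.
    chord_mid = (prev_pts + next_pts) / 2.0
    dist = np.linalg.norm(contour - palm_center, axis=1)
    convex = dist > np.linalg.norm(chord_mid - palm_center, axis=1)
    return contour[(cos_theta > cos_thresh) & convex]

# Usage with a toy circular contour and its center (no fingertip expected):
theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
print(k_cosine_fingertips(circle, np.array([0.0, 0.0])).shape)   # (0, 2)
```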


2019 ◽  
Vol 13 (2) ◽  
pp. 136-141 ◽  
Author(s):  
Abhisek Sethy ◽  
Prashanta Kumar Patra ◽  
Deepak Ranjan Nayak

Background: In the past decades, handwritten character recognition has received considerable attention from researchers across the globe because of its wide range of applications in daily life. The literature shows that studies on handwritten Indian scripts remain limited, and Odia is one of them. We revised some of the patents relating to handwritten character recognition.
Methods: This paper deals with the development of an automatic recognition system for offline handwritten Odia characters. Prior to feature extraction, the character images are preprocessed. For feature extraction, the gray level co-occurrence matrix (GLCM) is first computed from all sub-bands of the two-dimensional discrete wavelet transform (2D DWT), and feature descriptors such as energy, entropy, correlation, homogeneity, and contrast are then calculated from the GLCMs to form the primary feature vector. To further reduce the feature space and generate more relevant features, principal component analysis (PCA) is employed. Because of their several salient features, random forest (RF) and K-nearest neighbor (K-NN) have become significant choices in pattern classification tasks; therefore, both RF and K-NN are applied separately in this study to classify the character images.
Results: All experiments were performed on a system running Windows 8 (64-bit) with an Intel(R) i7-4770 CPU @ 3.40 GHz. Simulations were conducted in MATLAB 2014a on a standard database, the NIT Rourkela Odia Database.
Conclusion: The proposed system has been validated on a standard database. Simulation results under a 10-fold cross-validation scenario demonstrate that the proposed system achieves better accuracy than existing methods while requiring the fewest features. The recognition rates using the RF and K-NN classifiers are 94.6% and 96.4%, respectively.
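
As a rough illustration of the feature path (2D DWT sub-bands, GLCM descriptors, PCA, then RF or K-NN), the sketch below uses PyWavelets, scikit-image (version ≥ 0.19 function names), and scikit-learn with placeholder data; the quantization to 8 gray levels, the single GLCM offset, and the PCA dimension are assumptions, not the paper's settings.

```python
# Condensed sketch of the described pipeline: DWT sub-bands -> GLCM per sub-band ->
# energy/entropy/correlation/homogeneity/contrast -> PCA -> RF or K-NN classifier.
import numpy as np
import pywt
from skimage.feature import graycomatrix, graycoprops
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

def glcm_dwt_features(img: np.ndarray) -> np.ndarray:
    """img: 2D uint8 character image; returns the primary feature vector."""
    cA, (cH, cV, cD) = pywt.dwt2(img.astype(float), "haar")
    feats = []
    for band in (cA, cH, cV, cD):
        # Quantize each sub-band to 8 gray levels before building its GLCM.
        q = np.digitize(band, np.linspace(band.min(), band.max() + 1e-9, 9)) - 1
        glcm = graycomatrix(q.astype(np.uint8), distances=[1], angles=[0],
                            levels=8, symmetric=True, normed=True)
        p = glcm[:, :, 0, 0]
        entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
        feats += [graycoprops(glcm, prop)[0, 0]
                  for prop in ("energy", "correlation", "homogeneity", "contrast")]
        feats.append(entropy)
    return np.array(feats)

# Usage on random stand-in "character images"; real use would load the dataset.
X = np.stack([glcm_dwt_features(np.random.randint(0, 256, (32, 32), np.uint8))
              for _ in range(40)])
y = np.random.randint(0, 4, 40)
Xp = PCA(n_components=10).fit_transform(X)
for clf in (RandomForestClassifier(n_estimators=100), KNeighborsClassifier(n_neighbors=3)):
    clf.fit(Xp, y)   # each classifier is trained separately on the PCA features
```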


2021 ◽  
Vol 40 (3) ◽  
pp. 1-12
Author(s):  
Hao Zhang ◽  
Yuxiao Zhou ◽  
Yifei Tian ◽  
Jun-Hai Yong ◽  
Feng Xu

Reconstructing hand-object interactions is a challenging task due to strong occlusions and complex motions. This article proposes a real-time system that uses a single depth stream to simultaneously reconstruct hand poses, object shape, and rigid/non-rigid motions. To achieve this, we first train a joint learning network to segment the hand and object in a depth image and to predict the 3D keypoints of the hand. With most layers shared by the two tasks, computation cost is reduced, supporting real-time performance. A hybrid dataset is constructed to train the network with real data (to learn real-world distributions) and synthetic data (to cover variations of objects, motions, and viewpoints). Next, the depths of the two targets and the keypoints are used in a unified optimization to reconstruct the interacting motions. Benefiting from a novel tangential contact constraint, the system not only resolves the remaining ambiguities but also maintains real-time performance. Experiments show that our system handles different hand and object shapes, various interactive motions, and moving cameras.
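
The shared-backbone idea can be sketched as follows in PyTorch. The encoder depth, head designs, and 21-joint output are assumptions made only to show one encoder feeding both a segmentation head and a 3D-keypoint head; this is not the paper's network.

```python
# One shared encoder over the depth image feeds (a) a hand/object segmentation
# head and (b) a 3D hand-keypoint regression head, so the encoder cost is paid once.
import torch
import torch.nn as nn

class JointSegKeypointNet(nn.Module):
    def __init__(self, num_classes: int = 3, num_keypoints: int = 21):
        super().__init__()
        self.num_keypoints = num_keypoints
        self.encoder = nn.Sequential(                      # layers shared by both tasks
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.seg_head = nn.Sequential(                     # background / hand / object
            nn.Conv2d(64, num_classes, 1),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
        )
        self.kp_head = nn.Sequential(                      # (x, y, z) per hand joint
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, num_keypoints * 3),
        )

    def forward(self, depth: torch.Tensor):
        f = self.encoder(depth)                            # encoder runs once per frame
        return self.seg_head(f), self.kp_head(f).view(-1, self.num_keypoints, 3)

seg, kps = JointSegKeypointNet()(torch.rand(2, 1, 128, 128))
print(seg.shape, kps.shape)   # torch.Size([2, 3, 128, 128]) torch.Size([2, 21, 3])
```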


2020 ◽  
pp. 1-13
Author(s):  
Yundong Li ◽  
Yi Liu ◽  
Han Dong ◽  
Wei Hu ◽  
Chen Lin

The intrusion detection of railway clearance is crucial for avoiding railway accidents caused by the invasion of abnormal objects, such as pedestrians, falling rocks, and animals. However, detecting intrusions with deep learning methods in infrared images captured at night remains a challenging task because of the lack of sufficient training samples. To address this issue, a transfer strategy that migrates daytime RGB images to the nighttime style of infrared images is proposed in this study. The proposed method consists of two stages. In the first stage, a data generation model is trained on the basis of generative adversarial networks using RGB images and a small number of infrared images, and synthetic samples are then generated with the trained model. In the second stage, a single shot multibox detector (SSD) model is trained using the synthetic data and utilized to detect abnormal objects in infrared images at nighttime. To validate the effectiveness of the proposed method, two groups of experiments, namely, railway and non-railway scenes, are conducted. Experimental results demonstrate the effectiveness of the proposed method, with an improvement of 17.8% for object detection at nighttime.
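
A condensed sketch of the second stage is given below, assuming the torchvision ≥ 0.13 API and standing in the trained GAN generator from stage one with a placeholder module; the class count and the single training step are illustrative, not the study's configuration.

```python
# Stage-two sketch: a previously trained day-to-infrared generator (placeholder here)
# converts daytime RGB frames into infrared-style samples, which then train a
# standard SSD detector used as a stand-in for the paper's SSD model.
import torch
import torch.nn as nn
from torchvision.models.detection import ssd300_vgg16

day_to_ir = nn.Identity()   # placeholder for the trained GAN generator

# Assumed class count: background + pedestrian + falling rock + animal.
detector = ssd300_vgg16(weights=None, weights_backbone=None, num_classes=4)
optimizer = torch.optim.SGD(detector.parameters(), lr=1e-3, momentum=0.9)

# One illustrative training step on a single synthetic sample.
rgb_day = [torch.rand(3, 300, 300)]
targets = [{"boxes": torch.tensor([[40.0, 60.0, 120.0, 200.0]]),
            "labels": torch.tensor([1])}]
synthetic_ir = [day_to_ir(img) for img in rgb_day]   # style-transferred training image
detector.train()
losses = detector(synthetic_ir, targets)             # dict of SSD training losses
sum(losses.values()).backward()
optimizer.step()
```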


Micromachines ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 284
Author(s):  
Yihsiang Chiu ◽  
Chen Wang ◽  
Dan Gong ◽  
Nan Li ◽  
Shenglin Ma ◽  
...  

This paper presents a high-accuracy complementary metal oxide semiconductor (CMOS) driven ultrasonic ranging system based on air-coupled aluminum nitride (AlN) piezoelectric micromachined ultrasonic transducers (PMUTs) using time of flight (TOF). The mode shape and the time-frequency characteristics of the PMUTs are simulated and analyzed. Two PMUTs, with frequencies of 97 kHz and 96 kHz, are used: one transmits and the other receives the ultrasonic waves. A time-to-digital converter (TDC) circuit, which correlates the clock frequency with the sound velocity, is utilized for range finding via the TOF calculated from system clock cycles. An application-specific integrated circuit (ASIC) chip is designed and fabricated in a 0.18 μm CMOS process to acquire data from the PMUT. Compared to the state of the art, the developed ranging system features a wide range and high accuracy, measuring distances of up to 50 cm with an average error of 0.63 mm. The AlN-based PMUT is a promising candidate for an integrated portable ranging system.
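
The TOF arithmetic behind the ranging result can be sketched in a few lines; the clock rate, air-temperature model, and pulse-echo option below are assumptions for illustration, not the ASIC's actual parameters.

```python
# Back-of-the-envelope TOF sketch: the TDC counts system clock cycles between
# transmit and receive, and distance = v_sound * t (one-way) or v_sound * t / 2
# (pulse-echo round trip). Clock and temperature values are assumed.
def tof_distance_mm(tdc_counts: int, clk_hz: float = 1e6,
                    temp_c: float = 25.0, pulse_echo: bool = False) -> float:
    v_sound = 331.3 + 0.606 * temp_c      # approximate speed of sound in air, m/s
    t = tdc_counts / clk_hz               # time of flight, s
    d = v_sound * t                       # one-way distance, m
    return (d / 2 if pulse_echo else d) * 1000.0

# e.g. 1443 counts at a 1 MHz clock and 25 degrees C is roughly 0.5 m one-way:
print(round(tof_distance_mm(1443), 1))    # 499.9 (mm)
```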


1995 ◽  
Vol 389 ◽  
Author(s):  
K. C. Saraswat ◽  
Y. Chen ◽  
L. Degertekin ◽  
B. T. Khuri-Yakub

A highly flexible Rapid Thermal Multiprocessing (RTM) reactor is described. This flexibility is the result of several innovations: a lamp system, an acoustic thermometer, and a real-time control system. The new lamp has been optimally designed through the use of a “virtual reactor” methodology to obtain the best possible wafer temperature uniformity. It consists of multiple concentric rings composed of light bulbs with horizontal filaments. Each ring is independently and dynamically controlled, providing better control over the spatial and temporal optical flux profile and resulting in excellent temperature uniformity over a wide range of process conditions. An acoustic thermometer non-invasively allows complete wafer temperature tomography under all process conditions, a critically important measurement never obtained before. For real-time equipment and process control, a model-based multivariable control system has been developed. Extensive integration of computers and related technology for specification, communication, execution, monitoring, control, and diagnosis demonstrates the programmability of the RTM.
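
As a toy stand-in for the idea of independently driving each lamp ring (not the paper's model-based multivariable controller), the sketch below runs one PI step per ring toward a common temperature setpoint; all gains and values are assumed.

```python
# Toy per-ring PI step: each concentric lamp ring gets its own power command
# driving its temperature-zone reading toward a shared setpoint.
def pi_ring_powers(zone_temps_c, setpoint_c, integ, kp=0.8, ki=0.05, dt=0.1):
    """One control step: returns (lamp power commands in %, updated integrator state)."""
    powers, new_integ = [], []
    for temp, acc in zip(zone_temps_c, integ):
        err = setpoint_c - temp
        acc = acc + err * dt
        powers.append(max(0.0, min(100.0, kp * err + ki * acc)))  # clamp to 0-100 %
        new_integ.append(acc)
    return powers, new_integ

# e.g. three rings, center zone running hot and edge zone running cold:
p, s = pi_ring_powers([1005.0, 998.0, 990.0], 1000.0, [0.0, 0.0, 0.0])
print(p)   # hot center ring is throttled to zero; coldest ring gets the most power
```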


2018 ◽  
Vol 25 (4) ◽  
pp. 1135-1143 ◽  
Author(s):  
Faisal Khan ◽  
Suresh Narayanan ◽  
Roger Sersted ◽  
Nicholas Schwarz ◽  
Alec Sandy

Multi-speckle X-ray photon correlation spectroscopy (XPCS) is a powerful technique for characterizing the dynamic nature of complex materials over a range of time scales, and it has been successfully applied to study a wide range of systems. Recent developments in higher-frame-rate detectors, while aiding the study of faster dynamical processes, create large amounts of data that require parallel computational techniques to process in near real time. Here, an implementation of the multi-tau and two-time autocorrelation algorithms using the Hadoop MapReduce framework for distributed computing is presented. The system scales well with increasing data size and has been serving the users of beamline 8-ID-I at the Advanced Photon Source with near-real-time autocorrelations for the past five years.
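
For reference, the quantity being parallelized is the normalized intensity autocorrelation g2. The NumPy sketch below computes it by brute force per lag for a small pixel set; the production system instead uses the multi-tau scheme on Hadoop MapReduce, so this is only a reference calculation with toy data.

```python
# Direct (non-multi-tau) estimate of g2(tau) = <I(t) I(t+tau)> / (<I(t)><I(t+tau)>),
# averaged over the pixels of one q-bin.
import numpy as np

def g2_direct(frames: np.ndarray, lags) -> np.ndarray:
    """frames: (T, Npix) intensities for pixels in one q-bin; returns g2 per lag."""
    g2 = []
    for tau in lags:
        num = np.mean(frames[:-tau] * frames[tau:], axis=0)        # <I(t) I(t+tau)>
        den = np.mean(frames[:-tau], axis=0) * np.mean(frames[tau:], axis=0)
        g2.append(np.mean(num / den))                              # average over pixels
    return np.array(g2)

# Usage: a slowly modulated toy intensity with per-pixel random phases plus noise.
rng = np.random.default_rng(0)
t = np.arange(2000)[:, None]
frames = (1.0 + 0.3 * np.sin(0.01 * t + rng.uniform(0, 6.28, 50))
          + 0.05 * rng.standard_normal((2000, 50)))
print(g2_direct(frames, lags=[1, 10, 100]))
```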


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 555
Author(s):  
Jui-Sheng Chou ◽  
Chia-Hsuan Liu

Sand theft and illegal mining in river dredging areas have been a problem in recent decades. Increasing the use of artificial intelligence in dredging areas, building automated monitoring systems, and reducing human involvement can therefore effectively deter crime and lighten the workload of security guards. In this investigation, a smart dredging construction site system was developed using automated techniques that can be adapted to various areas. The aim in the initial period of the smart dredging construction was to automate the audit work at the control point, which manages trucks in river dredging areas. Images of dump trucks entering the control point were captured using monitoring equipment in the construction area. The obtained images and the deep learning technique YOLOv3 were used to detect the positions of the vehicle license plates. The cropped license plate images were then used as input to an image classification model, C-CNN-L3, to identify the number of characters on the plate. Based on the classification results, each plate image was passed to the text recognition model, R-CNN-L3, corresponding to that number of characters. Finally, the models of each stage were integrated into a real-time truck license plate recognition (TLPR) system; the single-character recognition rate was 97.59%, the overall recognition rate was 93.73%, and the speed was 0.3271 s/image. The TLPR system reduces the labor and time spent identifying license plates, effectively reducing the probability of crime and increasing the transparency, automation, and efficiency of the frontline personnel’s work. The TLPR is the first step toward an automated operation to manage trucks at the control point, and the subsequent and ongoing development of system functions can advance dredging operations toward the goal of a smart construction site. By providing a vehicle LPR system intended to support intelligent and highly efficient management by dredging-related departments, this paper contributes an objective approach to the TLPR system.
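
The three-stage chaining can be summarized in a short skeleton; the stage names follow the abstract, but the callables below are placeholders, not the trained YOLOv3, C-CNN-L3, or R-CNN-L3 models.

```python
# Inference-time skeleton: plate detection -> character-count classification ->
# per-character recognition, with placeholder functions standing in for the models.
from typing import Callable, List, Tuple
import numpy as np

Box = Tuple[int, int, int, int]   # x, y, width, height in image pixels

def recognize_plates(frame: np.ndarray,
                     detect_plates: Callable[[np.ndarray], List[Box]],   # YOLOv3 stage
                     count_chars: Callable[[np.ndarray], int],           # C-CNN-L3 stage
                     read_chars: Callable[[np.ndarray, int], str]        # R-CNN-L3 stage
                     ) -> List[str]:
    plates = []
    for x, y, w, h in detect_plates(frame):
        crop = frame[y:y + h, x:x + w]
        n = count_chars(crop)             # route to the matching recognition model
        plates.append(read_chars(crop, n))
    return plates

# Toy usage with dummy stage functions standing in for the trained models:
frame = np.zeros((480, 640, 3), dtype=np.uint8)
print(recognize_plates(frame,
                       detect_plates=lambda f: [(100, 200, 120, 40)],
                       count_chars=lambda c: 7,
                       read_chars=lambda c, n: "?" * n))
```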

