Speech extraction with RGB-intensity gradient on rolling-shutter video

2021 ◽  
Vol 263 (5) ◽  
pp. 1095-1106
Author(s):  
Tsubasa Yoshizawa ◽  
Atsushi Yoshida ◽  
Kenta Iwai ◽  
Takanobu Nishiura

Recent studies have proposed extracting speech from video of objects vibrating in response to sound waves. Among these approaches, extraction from video captured by rolling-shutter cameras, which are widely used in consumer digital single-lens reflex cameras, has attracted attention because of its low equipment cost. The conventional rolling-shutter method processes a grayscale video based on phase images. However, a grayscale video has a smaller dynamic range than an RGB video, which degrades the speech extraction accuracy of the conventional method. This paper therefore proposes a speech extraction method based on RGB-intensity gradients in an RGB video to improve speech extraction accuracy. The proposed method extracts speech by calculating the similarity of the R, G, and B intensity gradients; using all three intensity gradients expands the dynamic range. Experimental results on the quality and intelligibility of the extracted speech show that the proposed method outperforms the conventional method.

Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1683
Author(s):  
Winai Jaikla ◽  
Fabian Khateb ◽  
Tomasz Kulej ◽  
Koson Pitaksuttayaprot

This paper presents simulated and experimental results for a universal filter using the voltage differencing differential difference amplifier (VDDDA). Unlike previous complementary metal oxide semiconductor (CMOS) structures of the VDDDA in the literature, the present one is compact and simple, owing to the use of the multiple-input metal oxide semiconductor (MOS) transistor technique. The presented filter employs two VDDDAs, one resistor and two grounded capacitors, and it offers low-pass (LP), band-pass (BP), band-reject (BR), high-pass (HP) and all-pass (AP) responses with a unity passband voltage gain. The proposed universal voltage-mode filter has high input impedances and a low output impedance. The natural frequency and bandwidth are orthogonally controlled by separate transconductances without affecting the passband voltage gain. For the BP filter, the root mean square (RMS) of the equivalent output noise is 46 µV, and the third-order intermodulation distortion (IMD3) is −49.5 dB for an input signal with a peak-to-peak amplitude of 600 mV, which results in a dynamic range (DR) of 73.2 dB. The filter was designed and simulated in the Cadence environment using a 0.18-µm CMOS process from Taiwan Semiconductor Manufacturing Company (TSMC). In addition, experimental results were obtained using the commercially available components LM13700 and AD830. The simulation results agree with the experimental ones, confirming the advantages of the filter.
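The dynamic-range figure can be roughly reproduced from the two quoted quantities. The sketch below assumes a sinusoidal input, so that the RMS value is the peak-to-peak value divided by 2√2, and relies on the unity passband gain so that the output level equals the input level. The abstract does not state how DR was computed, so the function name and the formula used here are illustrative assumptions.

```python
import math

def dynamic_range_db(v_peak_to_peak, noise_rms):
    """Ratio of signal RMS to noise RMS in dB, assuming a sine-wave signal."""
    # For a sine wave, RMS = peak / sqrt(2) = peak-to-peak / (2 * sqrt(2)).
    signal_rms = v_peak_to_peak / (2.0 * math.sqrt(2.0))
    return 20.0 * math.log10(signal_rms / noise_rms)

# Values quoted in the abstract: 600 mV peak-to-peak signal, 46 uV RMS noise.
dr = dynamic_range_db(600e-3, 46e-6)
print(f"DR = {dr:.1f} dB")
```

Under these assumptions the result lands within about 0.1 dB of the reported 73.2 dB.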


Author(s):  
Guanghua Wu ◽  
Yibo Ke ◽  
Lin Zhang ◽  
Meng Tao

Acoustic metamaterials have high potential in diverse applications, including acoustic cloaking, sound tunneling, wavefront reshaping, and sound insulation. In the present study, new metamaterials consisting of spatial coiled units are designed and fabricated to manipulate sound waves in the range 0-1600 Hz. The effective acoustic properties and band diagrams are studied. The simulation and experimental results demonstrate that the metamaterials provide an effective and feasible approach to designing acoustic devices such as sound cloaks and insulators.


Polymers ◽  
2020 ◽  
Vol 12 (1) ◽  
pp. 186 ◽  
Author(s):  
Mohamed Abbas ◽  
Mohammed Alqahtani ◽  
Ali Algahtani ◽  
Amir Kessentini ◽  
Hassen Loukil ◽  
...  

Intravenous delivery is the fastest conventional method of delivering drugs to their targets in seconds, whereas intramuscular and subcutaneous injections provide a slower continuous delivery of drugs. In recent years, nanoparticle-based drug-delivery systems have gained considerable attention. During the progression of nanoparticles into the blood, the sound waves generated by the particles create acoustic pressure that affects the movement of nanoparticles. To overcome this issue, the impact of sound pressure levels on the development of nanoparticles was studied herein. In addition, a composite nanostructure was developed using different types of nanoscale substances to overcome the effect of sound pressure levels in the drug-delivery process. The results demonstrate the efficacy of the proposed nanostructure based on a group of different nanoparticles. This study suggests five materials, namely, polyimide, acrylic plastic, Aluminum 3003-H18, Magnesium AZ31B, and polysilicon for the design of the proposed structure. The best results were obtained in the case of the movement of these molecules at lower frequencies. The performance of acrylic plastic is better than other materials; the sound pressure levels reached minimum values at frequencies of 1, 10, 20, and 60 nHz. Furthermore, an experimental setup was designed to validate the proposed idea using advanced biomedical imaging technologies. The experimental results demonstrate the possibilities of detecting, tracking, and evaluating the movement behaviors of nanoparticles. The experimental results also demonstrate that the lowest sound pressure levels were observed at lower frequency levels, thus proving the validity of the proposed computational model assumptions. 
The outcome of this study will pave the way to understanding the interaction behaviors of nanoparticles with the surrounding biological environments, including the sound pressure effect, which could lead to the use of such an effect in facilitating directional and tactic movements of the micro- and nano-motors.


Terminology ◽  
2000 ◽  
Vol 6 (2) ◽  
pp. 195-210 ◽  
Author(s):  
Hiroshi Nakagawa

The NTCIR1 TMREC group called for participation in the term recognition task, which was part of NTCIR1, held in 1999. As an activity of TMREC, the group provided a test collection for the term recognition task. The goal of this task is to automatically recognize and extract terms from a text corpus consisting of 1,870 abstracts gathered from the NACSIS Academic Conference Database. This article describes the term extraction method we have proposed to extract terms consisting of simple and compound nouns, and the experimental evaluation of the proposed method with this NTCIR TMREC test collection. The basic idea for scoring a simple noun N in our term extraction method is to count how many nouns are conjoined with N to make compound nouns. We then extend this score to compound nouns, because most technical terms are compound nouns. Our method has a parameter to tune the degree of preference for either longer or shorter compound nouns. As for term candidates, in addition to noun sequences, we may add variations such as patterns of "A no B", which roughly means "B of A" or "A's B", and/or "A na B", where "A na" is an adjective. Experimental results of our method are promising, namely a recall of 0.83, a precision of 0.46 and an F-value of 0.59 for exactly matched extracted terms when we take into account the top-scoring 16,000 extracted terms.
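The reported F-value is consistent with the quoted recall and precision if it is taken to be the balanced F-measure, that is, the harmonic mean of the two. The abstract does not say which F variant was used, so that reading is an assumption, as is the function name in this minimal check.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# Values quoted in the abstract.
f_value = f_measure(precision=0.46, recall=0.83)
print(round(f_value, 2))
```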


2019 ◽  
Vol 9 (18) ◽  
pp. 3935 ◽  
Author(s):  
Kazushige Okayasu ◽  
Kota Yoshida ◽  
Masataka Fuchida ◽  
Akio Nakamura

This study proposes a vision-based method to classify mosquito species. To investigate the efficiency of the method, we compared two different classification methods: the conventional method based on handcrafted features and the deep learning method based on convolutional neural networks. For the conventional method, 12 types of features were adopted for handcrafted feature extraction, while a support vector machine was adopted for classification. For the deep learning method, three types of architectures were adopted for classification. We built a mosquito image dataset, which included 14,400 images of three mosquito species. The dataset comprised 12,000 images for training, 1500 images for testing, and 900 images for validation. Experimental results revealed that the accuracy of the conventional method using the scale-invariant feature transform algorithm was 82.4% at maximum, whereas the accuracy of the deep learning method was 95.5% in a residual network using data augmentation. From the experimental results, deep learning can be considered effective for classifying the mosquito species of the proposed dataset. Furthermore, data augmentation improves the accuracy of mosquito species classification.


2011 ◽  
Vol 314-316 ◽  
pp. 1483-1486
Author(s):  
Qing Ju Tang ◽  
Jun Yan Liu ◽  
Yang Wang

The non-destructive pulsed phase thermography technique was used to inspect a metal specimen with flat-bottomed blind holes and a composite specimen with sticky areas. An experimental platform was built based on an analysis of the pulsed phase thermography testing principle. Experimental results show the different testing effects of the original thermography, amplitude and phase images.
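Pulsed phase thermography is commonly described as applying a discrete Fourier transform to each pixel's temperature-versus-time sequence and reading amplitude and phase images off a chosen frequency bin. The sketch below illustrates that general idea in plain Python. The function names, the nested-list frame layout, and the choice of bin are assumptions for illustration, not the authors' processing chain.

```python
import cmath
import math

def dft_bin(samples, k):
    """Return the k-th discrete Fourier coefficient of a real sequence."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(samples))

def amplitude_phase_images(frames, k=1):
    """Build amplitude and phase images from a stack of thermograms.

    frames: list of 2D lists (one per time step), all the same size.
    k: index of the frequency bin to evaluate.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    amplitude = [[0.0] * cols for _ in range(rows)]
    phase = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Temperature history of one pixel across all frames.
            series = [frame[r][c] for frame in frames]
            coeff = dft_bin(series, k)
            amplitude[r][c] = abs(coeff)
            phase[r][c] = cmath.phase(coeff)
    return amplitude, phase

# Synthetic single-pixel stack: a cosine with one cycle over 8 frames.
frames = [[[math.cos(2 * math.pi * i / 8)]] for i in range(8)]
amp_img, phase_img = amplitude_phase_images(frames, k=1)
```

In a real measurement the per-pixel sequences would come from the infrared camera after the heating pulse, and an FFT library would replace the direct sum for speed.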


2013 ◽  
Vol 427-429 ◽  
pp. 1874-1878
Author(s):  
Guo De Wang ◽  
Zhi Sheng Jing ◽  
Guo Wei Qin ◽  
Shan Chao Tu

Wear particle recognition is a key link in the process of ferrography analysis. Different kinds of wear particles vary greatly in texture, so texture is one of the most important features in wear particle recognition. Local Binary Pattern (LBP) is an efficient operator for texture description. The binary sequence of the traditional LBP operator is obtained by comparing the gray value of each neighborhood pixel with the gray value of the center pixel of the neighborhood; this comparison is so simple that it causes a loss of texture information. In this paper, an improved LBP operator is presented for texture feature extraction and is applied to the recognition of severe sliding particles, fatigue spall particles and laminar particles. The experimental results show that our method is an effective feature extraction method and obtains better recognition accuracy than other methods.
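For reference, the traditional 3×3 LBP comparison described above can be sketched as follows. This is the basic operator only; the improved operator proposed in the paper is not specified in the abstract and is not reproduced here. The bit ordering (clockwise from the top-left neighbor) and the function names are illustrative assumptions.

```python
# Clockwise neighbor offsets (row, col), starting from the top-left pixel.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                    (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, row, col):
    """Return the 8-bit basic LBP code of the interior pixel at (row, col)."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOR_OFFSETS):
        # A neighbor at least as bright as the center contributes a 1 bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels (a texture feature)."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

gray = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
code = lbp_code(gray, 1, 1)
hist = lbp_histogram(gray)
```

Because each neighbor is reduced to a single bit, the magnitude of the gray-level difference is discarded, which is the information loss the abstract refers to.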

