U2-VC: one-shot voice conversion using two-level nested U-structure

Author(s):  
Fangkun Liu ◽  
Hui Wang ◽  
Renhua Peng ◽  
Chengshi Zheng ◽  
Xiaodong Li

Abstract: Voice conversion transforms the voice of a source speaker into that of a target speaker while keeping the linguistic content unchanged. One-shot voice conversion has recently become a hot topic because of its potentially wide range of applications: it can convert the voice of any source speaker to that of any target speaker, even when both speakers are unseen during training. Although great progress has been made in one-shot voice conversion, the naturalness of the converted speech remains a challenging problem. To further improve naturalness, this paper proposes a voice conversion algorithm based on a two-level nested U-structure (U2-Net), called U2-VC. The U2-Net can extract both local and multi-scale features of the log-mel spectrogram, which helps to learn the time-frequency structures of the source speech and the target speech. Moreover, we adopt sandwich adaptive instance normalization (SaAdaIN) in the decoder for speaker identity transformation, to retain more content information of the source speech while maintaining the speaker similarity between the converted speech and the target speech. Experiments on the VCTK dataset show that U2-VC outperforms many state-of-the-art (SOTA) approaches, including AGAIN-VC and AdaIN-VC, in both objective and subjective measurements.
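The abstract builds on adaptive instance normalization for speaker identity transformation. As a minimal sketch of plain AdaIN only (not the sandwich variant, SaAdaIN, whose details the abstract does not give), each channel of the content features is normalized over time and then rescaled with the per-channel statistics of the target speaker's features. The function name `adain`, the list-of-channels layout, and the `eps` value are assumptions made for illustration.

```python
from statistics import fmean, pstdev


def adain(content, style, eps=1e-5):
    """Plain adaptive instance normalization (illustrative sketch).

    content, style: lists of channels; each channel is a list of values
    over time (e.g. one row of a feature map). Both must have the same
    number of channels, but may differ in length along time.
    """
    out = []
    for c_chan, s_chan in zip(content, style):
        # Per-channel statistics computed over the time axis.
        c_mu, c_sigma = fmean(c_chan), pstdev(c_chan)
        s_mu, s_sigma = fmean(s_chan), pstdev(s_chan)
        # Remove the source statistics, then apply the target statistics.
        out.append([s_sigma * (x - c_mu) / (c_sigma + eps) + s_mu for x in c_chan])
    return out


# Hypothetical one-channel example: source features take on target statistics.
converted = adain([[1.0, 2.0, 3.0]], [[10.0, 20.0, 30.0, 40.0]])
```

After the call, the converted channel keeps the temporal shape of the source channel, while its mean and standard deviation match those of the target channel up to the `eps` term.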

2018 ◽  
Vol 2018 ◽  
pp. 1-14
Author(s):  
XueTing Wang ◽  
Cong Jin ◽  
Wei Zhao

Speech synthesis is an important research topic in the field of human-computer interaction and has a wide range of applications. Singing synthesis, one of its branches, plays an important role. Beijing Opera is a famous traditional Chinese opera and is known as the quintessence of Chinese culture. The singing of Beijing Opera carries some features of speech, but it has its own unique pronunciation rules and rhythms, which differ from those of ordinary speech and singing. In this paper, we propose three models for the synthesis of Beijing Opera. First, the speech signals of the source speaker and the target speaker are extracted using the STRAIGHT algorithm. Then, by training a GMM, we complete the voice control model, which takes the voice to be converted as input and outputs the converted voice. Finally, by modeling the fundamental frequency, duration, and frequency separately, a melodic control model is constructed using a GAN to synthesize Beijing Opera fragments. We connect the fragments and superimpose the background music to achieve the synthesis of Beijing Opera. The experimental results show that the synthesized Beijing Opera has some audibility and can basically complete the composition of Beijing Opera. We also extend our models to human-AI cooperative music generation: given a human target voice, we can generate a Beijing Opera piece sung by that new target voice.


2020 ◽  
Vol 5 (3) ◽  
pp. 229-233
Author(s):  
Olaide Ayodeji Agbolade

This research presents a neural-network-based voice conversion model. While it is known that voiced sounds and prosody are the most important components of the voice conversion framework, their objective contributions are not known, particularly in a noisy and uncontrolled environment. This model uses a three-layer feedforward neural network to map the linear prediction analysis coefficients of a source speaker to the acoustic vector space of the target speaker, with a view to objectively determining the contributions of the voiced, unvoiced and supra-segmental components of sounds to the voice conversion model. Results showed that the vowels “a”, “i” and “o” make the most significant contribution to conversion success. The voiceless sounds were also found to be the most affected by noisy training data. An average noise level of 40 dB above the noise floor was found to degrade the voice conversion success by 55.14 percent relative to the voiced sounds. The results also show that for cross-gender voice conversion, prosody conversion is more significant in scenarios where a female is the target speaker.


Author(s):  
Songxiang Liu ◽  
Jinghua Zhong ◽  
Lifa Sun ◽  
Xixin Wu ◽  
Xunying Liu ◽  
...  

2019 ◽  
Vol 16 (1) ◽  
pp. 3-16 ◽  
Author(s):  
Reshma Nagpal ◽  
Jitender Bhalla ◽  
Shamsher S. Bari

Background: Considerable advances have been made in the area of β-lactams in recent times. Most of the research is targeted towards the synthesis of novel β-lactams, their functionalization and the exploration of their biological potential. The C-3 functionalization of β-lactams has continued to attract considerable interest from the scientific community due to their utility as versatile intermediates in organic synthesis and their therapeutic applications. This has led to a significant increase in efforts towards developing efficient and economical strategies for C-3 functionalized β-lactams. Objective: The present review aims to highlight recent advances in the C-3 functionalization of β-lactams. Conclusion: To summarize, functionalization of β-lactams at C-3 is an essential aspect of β-lactam chemistry for improving or modifying their synthetic utility as well as their biological potential. The C-3 carbocation equivalent method has emerged as an important and convenient strategy for C-3 functionalization of β-lactam heterocycles, providing a wide range of β-lactams, viz. 3-alkylated β-lactams, 3-aryl/heteroarylated β-lactams and 3-alkoxylated β-lactams. Base-mediated functionalization of β-lactams via a carbanion intermediate is another useful approach, but its scope is limited by the requirement for stringent reaction conditions. In addition, organometallic-reagent-mediated α-alkylation of 3-halo/3-keto-β-lactams has also emerged as an interesting method for the synthesis of functionalized β-lactams with good yields and diastereoselectivities.


2004 ◽  
Vol 50 (11) ◽  
pp. 2019-2027 ◽  
Author(s):  
Scott C Johnson ◽  
David J Marshall ◽  
Gerda Harms ◽  
Christie M Miller ◽  
Christopher B Sherrill ◽  
...  

Abstract Background: All states require some kind of testing for newborns, but the policies are far from standardized. In some states, newborn screening may include genetic tests for a wide range of targets, but the costs and complexities of the newer genetic tests inhibit expansion of newborn screening. We describe the development and technical evaluation of a multiplex platform that may foster increased newborn genetic screening. Methods: MultiCode® PLx involves three major steps: PCR, target-specific extension, and liquid chip decoding. Each step is performed in the same reaction vessel, and the test is completed in ∼3 h. For site-specific labeling and room-temperature decoding, we use an additional base pair constructed from isoguanosine and isocytidine. We used the method to test for mutations within the cystic fibrosis transmembrane conductance regulator (CFTR) gene. The developed test was performed manually and by automated liquid handling. Initially, 225 samples with a range of genotypes were tested retrospectively with the method. A prospective study used samples from >400 newborns. Results: In the retrospective study, 99.1% of samples were correctly genotyped, with no incorrect calls made. In the prospective study, 95% of the samples were correctly genotyped for all targets, and there were no incorrect calls. Conclusions: The unique genetic multiplexing platform was able to test for 31 targets within the CFTR gene and provides accurate genotype assignments in a clinical setting.


Pharmaceutics ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 189
Author(s):  
Zhanying Zheng ◽  
Sharon Shui Yee Leung ◽  
Raghvendra Gupta

Dry powder inhaler (DPI) is a device used to deliver a drug in dry powder form to the lungs. A wide range of DPI products is currently available, with the choice of DPI device largely depending on the dose, dosing frequency and powder properties of formulations. Computational fluid dynamics (CFD), together with various particle motion modelling tools, such as discrete particle methods (DPM) and discrete element methods (DEM), have been increasingly used to optimise DPI design by revealing the details of flow patterns, particle trajectories, de-agglomerations and depositions within the device and the delivery paths. This review article focuses on the development of the modelling methodologies of flow and particle behaviours in DPI devices and their applications to device design in several emerging fields. Various modelling methods, including the most recent multi-scale approaches, are covered and the latest simulation studies of different devices are summarised and critically assessed. The potential and effectiveness of the modelling tools in optimising designs of emerging DPI devices are specifically discussed, such as those with the features of high-dose, pediatric patient compatibility and independency of patients’ inhalation manoeuvres. Lastly, we summarise the challenges that remain to be addressed in DPI-related fluid and particle modelling and provide our thoughts on future research direction in this field.


Micromachines ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 284
Author(s):  
Yihsiang Chiu ◽  
Chen Wang ◽  
Dan Gong ◽  
Nan Li ◽  
Shenglin Ma ◽  
...  

This paper presents a high-accuracy complementary metal oxide semiconductor (CMOS) driven ultrasonic ranging system that uses time of flight (TOF) and is based on air-coupled aluminum nitride (AlN) piezoelectric micromachined ultrasonic transducers (PMUTs). The mode shape and the time-frequency characteristics of the PMUTs are simulated and analyzed. Two PMUTs with frequencies of 97 kHz and 96 kHz are used: one transmits and the other receives ultrasonic waves. A time-to-digital converter (TDC) circuit, which correlates the clock frequency with the sound velocity, is used for range finding via the TOF calculated from the system clock cycle. An application-specific integrated circuit (ASIC) chip is designed and fabricated in a 0.18 μm CMOS process to acquire data from the PMUT. Compared with the state of the art, the developed ranging system features a wide range and high accuracy, measuring a range of 50 cm with an average error of 0.63 mm. The AlN-based PMUT is a promising candidate for an integrated portable ranging system.
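The TOF principle described in the abstract can be illustrated with a short calculation: a clock-cycle count is converted to a time, and the time to a distance through the speed of sound. This is a sketch under stated assumptions, not the paper's circuit. The 1 MHz clock, the 343 m/s sound speed (air at roughly 20 °C), the cycle count, and the `round_trip` flag are all hypothetical, and the abstract does not say whether the measured path is one-way or an echo.

```python
def tof_distance(cycles, clock_hz, sound_speed=343.0, round_trip=True):
    """Convert a counted number of clock cycles into a distance in metres.

    cycles:      clock cycles counted between transmit and receive (assumed)
    clock_hz:    counter clock frequency in Hz (assumed)
    sound_speed: speed of sound in m/s; 343 m/s is typical for air near 20 °C
    round_trip:  True if the wave travels to a target and back (echo),
                 False if it travels one way from transmitter to receiver
    """
    time_of_flight = cycles / clock_hz          # seconds
    path_length = sound_speed * time_of_flight  # metres travelled by the wave
    return path_length / 2 if round_trip else path_length


# Hypothetical reading: 2915 cycles of a 1 MHz clock, echo configuration.
distance_m = tof_distance(cycles=2915, clock_hz=1_000_000)
```

In this sketch one clock cycle corresponds to 343 / 1 000 000 / 2 ≈ 0.17 mm of range in the echo configuration, which shows why the clock frequency bounds the achievable resolution.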


2021 ◽  
Vol 13 (11) ◽  
pp. 2233
Author(s):  
Rasa Janušaitė ◽  
Laurynas Jukna ◽  
Darius Jarmalavičius ◽  
Donatas Pupienis ◽  
Gintautas Žilinskas

Satellite remote sensing is a valuable tool for coastal management, enabling repeated observation of nearshore sandbars. However, a lack of methodological approaches for sandbar detection prevents the wider use of satellite data in sandbar studies. In this paper, a novel fully automated approach to extract nearshore sandbars in high–medium-resolution satellite imagery using a GIS-based algorithm is proposed. The method is composed of a multi-step workflow providing a wide range of data on morphological nearshore characteristics, which include nearshore local relief, extracted sandbars, their crests and the shoreline. The proposed processing chain involves a combination of spectral indices, ISODATA unsupervised classification, the multi-scale Relative Bathymetric Position Index (RBPI), criteria-based selection operations, spatial statistics and filtering. The algorithm has been tested with 145 dates of PlanetScope and RapidEye imagery using a case study of the complex multiple sandbar system on the Curonian Spit coast, Baltic Sea. The comparison of results against 4 years of in situ bathymetric surveys shows a strong agreement between measured and derived sandbar crest positions (R² = 0.999 and 0.997), with an average RMSE of 5.8 and 7 m for the PlanetScope and RapidEye sensors, respectively. The accuracy of the proposed approach implies its feasibility for studying inter-annual and seasonal sandbar behaviour and short-term changes related to high-impact events. The outputs of the algorithm make it possible to evaluate a range of sandbar characteristics, such as distance from shoreline, length, width, count or shape, at a relevant spatiotemporal scale. The design of the method makes it compatible with most sandbar morphologies and suitable for other sandy nearshores. Tests of the described technique with Sentinel-2 MSI and Landsat-8 OLI data show that it can be applied to publicly available medium-resolution satellite imagery from other sensors.
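The agreement statistics quoted in the abstract (R² and RMSE between measured and derived crest positions) can be computed as in the sketch below. It assumes R² means the squared Pearson correlation, which the abstract does not specify, and the crest positions are made-up numbers for illustration, not data from the study.

```python
from math import sqrt


def rmse(measured, derived):
    """Root-mean-square error between paired position values (same units)."""
    n = len(measured)
    return sqrt(sum((m - d) ** 2 for m, d in zip(measured, derived)) / n)


def r_squared(measured, derived):
    """Squared Pearson correlation between paired values (assumed meaning of R²)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_d = sum(derived) / n
    s_md = sum((m - mean_m) * (d - mean_d) for m, d in zip(measured, derived))
    s_mm = sum((m - mean_m) ** 2 for m in measured)
    s_dd = sum((d - mean_d) ** 2 for d in derived)
    return s_md ** 2 / (s_mm * s_dd)


# Hypothetical cross-shore crest positions in metres (not from the paper).
surveyed = [100.0, 150.0, 200.0, 250.0]
from_imagery = [103.0, 147.0, 204.0, 246.0]

error_m = rmse(surveyed, from_imagery)
agreement = r_squared(surveyed, from_imagery)
```

RMSE is expressed in the units of the positions (metres here), whereas R² is dimensionless, so the two values describe absolute error and linear agreement respectively.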


2009 ◽  
Vol 2009 ◽  
pp. 1-12 ◽  
Author(s):  
Rolf K. Eckhoff

Since the early days of the process industries, continuous efforts have been made to develop and improve measures for the prevention and mitigation of dust explosions in these industries. Nevertheless, this hazard continues to threaten industries that manufacture, use and/or handle powders and dusts of a wide range of combustible materials. Improving methods for predicting explosion development in real industrial plants has been one major challenge. Hence, in recent years, comprehensive numerical simulation codes have been developed to address this problem. Progress has also been made in other areas, for example, ignition source prevention. The importance of adopting inherently safer process design, by building on firm knowledge in powder science and technology, and of systematic education and training of personnel, is also emphasized.

