voice data
Recently Published Documents


TOTAL DOCUMENTS

742
(FIVE YEARS 82)

H-INDEX

28
(FIVE YEARS 4)

Author(s):  
Muneera Altayeb ◽  
Amani Al-Ghraibah

Detecting and classifying pathological human voices remains an active area of research in speech processing. This paper explores three methods of voice feature extraction: Mel-frequency cepstral coefficients (MFCCs), zero-crossing rate (ZCR), and discrete wavelet transform (DWT). The methods are compared on their ability to classify an input sound as a normal or pathological voice using a support vector machine (SVM). First, the voice signal is processed and filtered; then vocal features are extracted using the proposed methods; finally, six groups of features are used to classify the voice data as healthy, hyperkinetic dysphonia, hypokinetic dysphonia, or reflux laryngitis using separate classification processes. Classification accuracy reaches 100% with the MFCC and kurtosis feature group, while the other feature groups achieve accuracies between 60% and 97%. The wavelet features provide very good classification results in comparison with other common voice features such as MFCC and ZCR. This paper aims to improve the diagnosis of voice disorders without the need for surgical interventions and endoscopic procedures, which consume time and burden patients. The comparison between the proposed feature extraction methods also offers a good reference for further research in voice classification.
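As a rough illustration of two of the simpler features named in this abstract, the sketch below computes a zero-crossing rate and a kurtosis value for a short frame of samples. It is not the authors' implementation: the function names, the sign convention for zero samples, and the use of the non-excess (Pearson) kurtosis are all assumptions made for the example.

```python
def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ.

    Assumption: a sample equal to zero is treated as non-negative.
    """
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:])
        if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def kurtosis(samples):
    """Pearson (non-excess) kurtosis: fourth central moment / variance squared."""
    n = len(samples)
    if n == 0:
        raise ValueError("kurtosis needs at least one sample")
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for a constant signal")
    return m4 / (m2 ** 2)


# Hypothetical frame: an alternating signal crosses zero at every step.
frame = [1.0, -1.0, 1.0, -1.0]
zcr = zero_crossing_rate(frame)
kurt = kurtosis(frame)
```

In a pipeline like the one described, values of this kind would be computed per frame and passed to a classifier such as an SVM alongside other features.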


Author(s):  
Taiwo Samuel Aina

Abstract: The goal of this project is to design and analyse a radio over fibre system for a four-story hospital with 20 rooms on each floor. Each floor has 20 ONUs, and each room is assumed to have an ONU capable of providing network access for voice, data, video, and biometrics. An 80-channel optical transmitter is built using wavelength-division multiplexing (WDM). The proposed system includes a transmitter with 20 input channels, a multiplexer, a demultiplexer, a 45-kilometer optical fibre, and an amplifier. The proposed model was simulated, and the results were evaluated for WDM systems using an optical amplifier. Receiver performance is analysed using the bit error rate (BER) simulation and the eye diagram, with the threshold set at 0.00120739. The eye height is 0.00141402, and the minimum BER is 5.59009e-006. Comparing the simulated and calculated values of received power and total power loss indicates that the system is efficient. Keywords: Radio over fibre, Optical Amplifier, WDM system, Demultiplexer, Multiplexer, BER, Optical transmitter
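The abstract mentions comparing calculated and simulated received power and total power loss. A minimal sketch of such a calculation, in decibel units, might look like the following. Only the 45 km fibre length, the four floors, and the 20 rooms per floor come from the abstract; the launch power, attenuation per kilometre, multiplexer and demultiplexer losses, and amplifier gain are placeholder assumptions.

```python
def link_budget(tx_power_dbm, fibre_km, atten_db_per_km,
                component_losses_db, amp_gain_db):
    """Return (received_power_dbm, total_loss_db) for a simple optical link."""
    total_loss_db = fibre_km * atten_db_per_km + sum(component_losses_db)
    received_dbm = tx_power_dbm - total_loss_db + amp_gain_db
    return received_dbm, total_loss_db


# From the abstract: 4 floors x 20 rooms gives the 80 channels.
floors, rooms_per_floor = 4, 20
channels = floors * rooms_per_floor

# Assumed values, NOT from the abstract: 0 dBm launch power,
# 0.2 dB/km fibre attenuation, 3 dB each for mux and demux, 10 dB gain.
received_dbm, total_loss_db = link_budget(
    tx_power_dbm=0.0,
    fibre_km=45.0,          # from the abstract
    atten_db_per_km=0.2,    # assumption
    component_losses_db=[3.0, 3.0],  # assumption
    amp_gain_db=10.0,       # assumption
)
```

With these placeholder numbers the loss is 15 dB and the received power is about -5 dBm; the figures in the paper itself would differ.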


Author(s):  
Zhiwu Cui ◽  
Ke Zhou ◽  
Jian Chen

Existing acquisition systems suffer from imperfect communication links, which leads to weak received signal strength. This paper designs an intelligent voice acquisition system based on a cloud resource scheduling model. On the hardware side, the S3C6410 is selected as the platform, the audio access port is optimized, and the IIS serial bus and other components are connected. On the software side, the frequency agility characteristics of the intelligent voice signal are extracted, future sample values are predicted, a communication link is established with the cloud resource scheduling model, communication rate information is obtained, digital voice data are coded and generated, and the transmission function of the acquisition system is set with an overlay algorithm. Experimental results show that the average received signal strength of the designed system and of two other intelligent voice acquisition systems is 106.40 dBm, 91.33 dBm, and 90.23 dBm, respectively, which indicates that the acquisition system integrated with the cloud resource scheduling model has higher practical value.


2022 ◽  
Vol 2 ◽  
Author(s):  
Laura Michaella B. Ribeiro ◽  
Ivan Müller ◽  
Leandro Buss Becker

Different types of service (ToS), such as voice, data, and video, are increasingly present in applications involving networks composed of multiple UAVs. These applications usually require the UAVs to share different ToS in a dynamic and ad-hoc manner, so that they can support the execution of cooperative/collaborative tasks. Heterogeneous communication has shown gains in maintaining the connection among highly mobile nodes while increasing the reliable transmission of data, as is necessary in MANETs, VANETs and, more recently, FANETs. The aim of this paper is to present a performance evaluation of a heterogeneous interface manager (IM), which applies a heuristic to choose the best among several single- and multi-band wireless communication interfaces, including IEEE 802.11n, IEEE 802.11p, IEEE 802.11ac, and IEEE 802.11ax. Simulated scenarios with three, five, and eight UAV nodes are developed by integrating the NS-3 and Gazebo simulation tools. The IM performance is analyzed by applying different numbers of interfaces and comparing against interfaces applied homogeneously, with two sets of results defined in terms of application metrics and of MAC and PHY metrics, respectively. Finally, we also evaluate the associated performance for voice, data, and video streaming ToS. The results indicate that combining different interfaces has a strong effect on maintaining or increasing the communication intensity.


Author(s):  
Akihito Yamauchi ◽  
Hiroshi Imagawa ◽  
Hisayuki Yokonishi ◽  
Ken-Ichi Sakakibara ◽  
Niro Tayama

2021 ◽  
Vol 6 (9 (114)) ◽  
pp. 15-23
Author(s):  
Saif Mohammed Ali ◽  
Haider Mshali ◽  
Amer S. Elameer ◽  
Mustafa Musa Jaber ◽  
Sura Khalil Abd

Created in the telecommunications (telephone) industry as an effective, simple wireless equivalent, Wireless Asynchronous Transfer Mode (WATM) is used to stream unified traffic such as video, data, and voice. In asynchronous transfer mode, voice and data packets share the same medium and network, including bursty data. Effective WATM data transmission requires an extensive array of designs, control techniques, and simulation methodologies. Network congestion is among the key challenges that lower overall WATM performance, in addition to cell delay and traffic overload. Congestion causes cell loss, and WATM requires expensive switches compared to a LAN. Consequently, this study applies an effective switching model together with a multiple-access control mechanism. The multiple-access process and switching model are used to establish an effective data sharing process with minimum complexity. The switching model uses synchronous input and output ports with buffering to ensure the data sharing process. The proposed technique decreases network traffic and keeps the loss of packets in the cells to a minimum. The proposed system is implemented in an OPNET 10.5 simulation, and WATM is evaluated on the experimental outcomes. The system's efficiency is assessed by throughput, latency, cell loss probability (CLP), network overhead, and packet loss. The system achieves minimum packet loss (0.1%) and a high data transmission rate (96.6%).


Author(s):  
Shreyashi Chowdhury ◽  
Asoke Nath

Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyse large amounts of natural language data. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. NLP combines computational linguistics (rule-based modelling of human language) with statistical, machine learning, and deep learning models. Together, these technologies enable computers to process human language in the form of text or voice data and to "understand" its full meaning, complete with the speaker's or writer's intent and sentiment. Challenges in natural language processing frequently involve speech recognition, natural language understanding, and natural language generation. This paper discusses the scope, challenges, current trends, and future directions of natural language processing.


2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Maria Faurholt-Jepsen ◽  
Darius Adam Rohani ◽  
Jonas Busk ◽  
Maj Vinberg ◽  
Jakob Eyvind Bardram ◽  
...  

Background: Voice features have been suggested as objective markers of bipolar disorder (BD). Aims: To investigate whether voice features from naturalistic phone calls could discriminate between (1) BD, unaffected first-degree relatives (UR) and healthy control individuals (HC); (2) affective states within BD. Methods: Voice features were collected daily during naturalistic phone calls for up to 972 days. A total of 121 patients with BD, 21 UR and 38 HC were included. A total of 107,033 voice data entries were collected [BD (n = 78,733), UR (n = 8,004), and HC (n = 20,296)]. Patients evaluated symptoms daily using a smartphone-based system. Affective states were defined according to these evaluations. Data were analyzed using random forest machine learning algorithms. Results: Compared to HC, BD was classified with a sensitivity of 0.79 (SD 0.11)/AUC = 0.76 (SD 0.11) and UR with a sensitivity of 0.53 (SD 0.21)/AUC of 0.72 (SD 0.12). Within BD, compared to euthymia, mania was classified with a specificity of 0.75 (SD 0.16)/AUC = 0.66 (SD 0.11). Compared to euthymia, depression was classified with a specificity of 0.70 (SD 0.16)/AUC = 0.66 (SD 0.12). In all models, the user-dependent models outperformed the user-independent models. Models combining increased mood, increased activity and insomnia, compared to periods without, performed best, with a specificity of 0.78 (SD 0.16)/AUC = 0.67 (SD 0.11). Conclusions: Voice features from naturalistic phone calls may represent a supplementary objective marker discriminating BD from HC and a state marker within BD.
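For readers less familiar with the metrics reported here, the sketch below shows how sensitivity and specificity follow from confusion-matrix counts, and checks that the per-group entry counts in the abstract sum to the stated total. The confusion-matrix counts are hypothetical values chosen only to reproduce a sensitivity of 0.79; they are not taken from the study.

```python
def sensitivity(true_pos, false_neg):
    """True positive rate: share of actual positives classified as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """True negative rate: share of actual negatives classified as negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (assumption): 100 BD cases, 79 correctly flagged;
# 100 HC cases, 75 correctly cleared.
sens = sensitivity(true_pos=79, false_neg=21)
spec = specificity(true_neg=75, false_pos=25)

# Entry counts as reported in the abstract.
entries = {"BD": 78_733, "UR": 8_004, "HC": 20_296}
total_entries = sum(entries.values())
```

The group counts add up to the 107,033 entries reported in the abstract.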


2021 ◽  
Vol 18 ◽  
pp. 192-198
Author(s):  
Meili Dai

With international exchanges becoming increasingly frequent, English has become a common language for communication between countries. Against this background, an intelligent system for correcting students' English pronunciation errors, based on speech recognition technology, is designed. To provide a relatively stable hardware correction platform for voice data, the sensor equipment is optimized and combined with the processor and the intelligent correction circuit. On this basis, the MLP (multilayer perceptron) error correction function is defined; using the known recognition confusion calculation results, the actual input speech error is processed by gain mismatch, and the software execution environment of the system is built. Combined with the related hardware structure, the system is successfully applied, and a comparative experiment is designed to highlight its practical application value.


2021 ◽  
Vol 2089 (1) ◽  
pp. 012066
Author(s):  
Rajeev Shrivastava ◽  
Mangal Singh ◽  
Rakhi Thakur ◽  
Kalluri Saidatta Subrahmanya Ravi Teja

Abstract: Steganography can be described as the practice of hiding a secret message within an ordinary message, known as the carrier signal. DSP techniques such as LSB encoding have historically been used for hiding secret information. This paper presents the use of deep neural networks as steganographic functions for voice data. It also demonstrates that steganography techniques proposed for vision are less suitable for speech signals, and presents an implementation that uses the STFT and ISTFT as differentiable layers in the network. The efficacy of the proposed methods is demonstrated empirically on multiple speech datasets, and the outcomes are examined quantitatively and qualitatively. Using multiple decoders or a single conditional decoder makes it possible to hide multiple signals in a single carrier signal. Finally, the model is evaluated under various channel distortion conditions. Qualitative studies indicate that human listeners cannot detect the changes made to the carrier and that the decoded messages are highly intelligible.
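The LSB encoding mentioned above as the historical baseline can be sketched in a few lines. This illustrates the classic technique only, not the neural method the paper proposes. The function names are hypothetical, and the carrier is assumed to be a list of integer PCM samples.

```python
def embed_lsb(carrier, message_bits):
    """Overwrite the least significant bit of the first samples with message bits."""
    if len(message_bits) > len(carrier):
        raise ValueError("message is longer than the carrier")
    stego = list(carrier)
    for i, bit in enumerate(message_bits):
        # Clear the lowest bit, then set it to the message bit (0 or 1).
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(stego, n_bits):
    """Read back the least significant bit of the first n_bits samples."""
    return [sample & 1 for sample in stego[:n_bits]]


# Hypothetical carrier samples and message bits.
carrier = [100, -3, 257, 0, 42]
message = [1, 0, 1, 1]
stego = embed_lsb(carrier, message)
recovered = extract_lsb(stego, len(message))
max_change = max(abs(a - b) for a, b in zip(carrier, stego))
```

Each sample changes by at most one quantization step, which is why the modification is hard to hear. The same property makes the hidden bits fragile under channel distortion.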

