sound files
Recently Published Documents


TOTAL DOCUMENTS: 71 (FIVE YEARS: 30)

H-INDEX: 6 (FIVE YEARS: 1)

2021 ◽  
Vol 9 (2) ◽  
pp. 1-17
Author(s):  
Edward Kelly

This paper presents a family of objects for manipulating polyrhythmic sequences and isorhythmic relationships, in both the signal and event domains. These work together and are tightly synchronised to an audio phase signal, so that relative temporal relationships can be tempo-manipulated in a linear fashion. Many permutations of polyrhythmic sequences, including incomplete tuplets, scrambled elements, interleaved tuplets and any complex fractional relation, can be realised. Similarly, these may be driven with controllable isorhythmic generators derived from a single driver, so that sequences of different fractionally related lengths may be combined and synchronised. Directly generated signals can also drive audio playback, so that disparate sound files may be combined into sequences. A set of sequenced parameters is included to facilitate this process.
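As a rough illustration of the phase-driven idea, here is a minimal Python sketch, assuming a master phase ramp in [0, 1); the names and interfaces are illustrative and do not reproduce the author's actual objects:

```python
# Minimal sketch (assumed names, not the author's actual objects): event
# triggers for fractionally related tuplets derived from one master phase
# ramp in [0, 1), so every layer tempo-scales together.

def step_of(phase, divisions):
    """Subdivision index for a phase value in [0, 1)."""
    return int(phase * divisions) % divisions

def trigger(prev_phase, phase, divisions, mask=None):
    """Fire when the ramp crosses into a new subdivision; an optional
    mask mutes steps, modelling incomplete or scrambled tuplets."""
    prev_step, step = step_of(prev_phase, divisions), step_of(phase, divisions)
    if step == prev_step:
        return None                        # still inside the same subdivision
    if mask is not None and not mask[step]:
        return None                        # step muted: incomplete tuplet
    return step

# One driver, two layers in a 3:5 relation; changing the rate of the phase
# ramp (the tempo) preserves their relative timing exactly.
prev = 0.0
for i in range(1, 51):
    ph = i / 50.0 % 1.0
    t3 = trigger(prev, ph, 3)
    t5 = trigger(prev, ph, 5, mask=[True, False, True, True, True])
    if t3 is not None or t5 is not None:
        print(f"phase={ph:.2f}  triplet={t3}  quintuplet={t5}")
    prev = ph
```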


2021 ◽  
Author(s):  
Shanmuk Srinivas Amiripalli ◽  
Potnuru Likhitha ◽  
Sisankita Patnaik ◽  
Suresh Babu K ◽  
Rampay Venkatarao

Speech emotion detection has become extremely relevant in today's digital culture. We trained our model on the RAVDESS, TESS, and SAVEE datasets. To determine the accuracy of each algorithm on each dataset, we evaluated ten separate machine learning algorithms. We then cleaned the datasets, using a mask feature to eliminate unnecessary background noise, and applied all ten algorithms to the cleaned speech data to improve accuracy. Finally, we compared the accuracies of all ten algorithms to identify the best one, and used it to count the number of sound files associated with each of the emotions labelled in those datasets.
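A hypothetical sketch of the comparison loop described above; the feature choice (mean MFCC vectors via librosa) and the particular scikit-learn models are assumptions standing in for the ten algorithms the authors evaluated, not their exact pipeline:

```python
# Hypothetical sketch: extract a fixed-length feature vector per clip and
# cross-validate several candidate classifiers, keeping the best scorer.
import numpy as np
import librosa
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

def mfcc_features(path, n_mfcc=40):
    """Fixed-length summary of one clip: the mean MFCC vector."""
    y, sr = librosa.load(path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

def best_model(paths, labels):
    """Cross-validate each candidate and return the highest-scoring one."""
    X = np.array([mfcc_features(p) for p in paths])
    y = np.array(labels)
    models = {
        "svm": SVC(),
        "random_forest": RandomForestClassifier(),
        "knn": KNeighborsClassifier(),
    }
    scores = {name: cross_val_score(m, X, y, cv=5).mean()
              for name, m in models.items()}
    return max(scores, key=scores.get), scores
```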


Journal ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 24-36
Author(s):  
Subhashim Goswami

This article is an ethnographic account of a course I designed and taught at my university to mostly non-humanities, engineering and science undergraduate students from diverse backgrounds. In it, I consider the possibility of a pedagogical approach to teaching what it means to construct a field in anthropological terms during a classroom-based teaching module. I suggest that one can approach the construction of a field within the classroom by using disturbance as a pedagogical tool. Drawing on Anna Tsing’s formulation of “disturbance as an analytical tool,” I demonstrate how we can construct a field pedagogically by disturbing the certitude of the known and by reimagining the modes of seeing and hearing the familiar. The ethnographic elucidation of this paper is essentially work produced from this class: images created from within the university, influenced by a question asked by students, and accompanying soundscapes produced by the students themselves, which together demonstrate the possibility of constructing a field by, in a sense, hearing images and seeing sounds. This article contains embedded sound files and is best downloaded and opened with Adobe Acrobat or similar. A link is also provided in the text for viewing the sounds and images online for those who open the file through online PDF viewers.


2021 ◽  
pp. 105566562110515
Author(s):  
Tim Bressmann

The Nasometer is a popular instrument for the acoustic assessment of nasality. In light of the ongoing COVID-19 global pandemic, clinicians may have wondered about infection control procedures for the Nasometer. The current research investigated whether nasalance scores are affected if the Nasometer 6450 microphone casings are covered with a material such as rolled polyvinyl chloride household wrap. For the experiment, pre-recorded sound files from two speakers were played back through a set of small loudspeakers. Nasalance scores from two baselines and three wrap cover conditions were compared. While there was no statistically significant condition effect in a repeated-measures analysis of variance, the within-condition cumulative differences in nasalance scores were 2 for the initial baseline, 42 for wrap cover 1, 24 for wrap cover 2, 78 for wrap cover 3, and 8 for the final baseline. Mean differences between the wrap cover and the baseline conditions were 8.2 to 15.3 times larger, and cumulative differences 8.3 to 16.6 times larger, than those between the two baselines. Given these higher cumulative and mean differences, clinicians should not cover Nasometer microphones with household wrap, as this increases the variability of nasalance scores. Since there is evidence that the virus that causes COVID-19 can survive for some time on metal surfaces, clinicians should be mindful that the Nasometer microphone housings can only be cleaned superficially, and should handle them with gloves to minimize any possible risk of touch transfer of pathogens to the next speaker or the clinician.


2021 ◽  
Author(s):  
Andrew Caines ◽  
Joseph Waters ◽  
Sherry Xu ◽  
Mark Elliott ◽  
Hye-won Lee ◽  
...  

We develop a web application for the practice of listening skills by learners of English, which allows users to listen to pre-recorded sound files and respond to multiple-choice questions. They receive feedback on the accuracy of their responses, and are navigated through the set of items in one of two ways, according to the group they are randomly assigned to. Members of the control group are guided from one item to the next depending on success or failure with each new item and on the difficulty ratings of the remaining items. For the experiment group, item selection is adaptive: items are selected through automatic predictions based on individual performance and on observations of other students’ interactions with the platform, as well as on known item attributes obtained through tagging. Based on the cognitive literature, we also give listeners the option of controlling the presentation speed of the listening items.
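A minimal sketch of the control group's navigation rule, under the assumption that difficulty is stepped up after a correct response and down after an incorrect one; the function and step size are illustrative, and the experiment group's adaptive model would replace this rule with learned, per-student predictions:

```python
# Illustrative rule: move toward a slightly harder item after success,
# a slightly easier one after failure, always choosing the closest-rated
# remaining item. pick_next and step are assumptions, not the paper's code.

def pick_next(remaining, current_difficulty, last_correct, step=0.5):
    """remaining maps item_id -> difficulty rating."""
    target = current_difficulty + (step if last_correct else -step)
    return min(remaining, key=lambda item: abs(remaining[item] - target))

items = {"a": 1.0, "b": 1.5, "c": 2.0, "d": 2.5}
print(pick_next(items, current_difficulty=1.5, last_correct=True))   # -> "c"
print(pick_next(items, current_difficulty=1.5, last_correct=False))  # -> "a"
```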


2021 ◽  
Author(s):  
Karanvir Singh Gill ◽  
Chantal Percival ◽  
Meighen Roes ◽  
Leo Arreaza ◽  
Abhijit Chinchani ◽  
...  

An analysis of internationally shared functional magnetic resonance imaging (fMRI) data from healthy participants and schizophrenia patients extracted the brain networks involved in listening to radio speech and in capturing hallucination experiences. A multidimensional analysis technique demonstrated that for the radio-speech sound files, a brain network matching known auditory perception networks emerged and, importantly, displayed speech-duration-dependent hemodynamic responses (HDRs), confirming fMRI detection of these speech events. In the hallucination-capture data, although a sensorimotor (response) network emerged, it did not show hallucination-duration-dependent HDRs. We conclude that although fMRI retrieved the brain network involved in generating the motor responses indicating the start and end of an experienced hallucination, the hallucination event itself was not detected. Previous reports on brain networks detected by fMRI during hallucination capture are reviewed in this context.


Author(s):  
M. HOLOVIN ◽  
N. HOLOVINA

The paper presents a steganographic method for hiding textual information in an audio file. Hiding is implemented by a program in Python: individual letters of the text are embedded into the sound using the least-significant-bit method. The program can be used for both educational and practical purposes. The widely used wave library handles the sound files; it is not a library specialized for cryptographic or steganographic needs, but its use, together with the conciseness of the program code, makes it possible to visualize the mechanism of hiding information in the classroom and to demonstrate debugging and testing while the program is being written. It is also valuable for teaching that working with this library lets one inspect the state of an empty and a filled audio container at the level of individual bits. To assess the practical value of the program, it was tested with texts of different lengths and with sound containers of different kinds: the tone of a tuning fork, the sound of a guitar string, classical music, rap, jazz, and an audiobook. The experiment showed correct reproduction of the texts. It was found that when the container is overloaded with information, careful listening to the "pure" tone of the tuning fork can raise suspicion of an embedded text, whereas text embedded in sound whose volume, tempo and frequency change rapidly arouses no such suspicion. However, if the party who intercepts the masked message can guess how the text was embedded, the text is easily extracted. Practical use of the program therefore requires additional manipulations in the code, in particular concerning the order of text input and the choice of embedding locations; additional encryption of the text is also desirable. Analyzing and manipulating sound at the level of individual bits also has educational value, since it gives an idea of the noise level, the magnitude of the useful physical signal, and the sensitivity of the human ear. Key words: Python language, steganography, hiding information, masking information in an audio file, educational example.
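A minimal sketch of the least-significant-bit scheme described above, using only the standard wave library; the function names are illustrative and the paper's own code is not reproduced:

```python
# Minimal LSB sketch with the standard wave library: each bit of the text
# overwrites the least significant bit of one frame byte.
import wave

def hide_text(container, stego_out, text):
    with wave.open(container, "rb") as src:
        params = src.getparams()
        frames = bytearray(src.readframes(src.getnframes()))
    payload = text.encode("utf-8") + b"\x00"       # NUL marks the end of the text
    bits = [(byte >> i) & 1 for byte in payload for i in range(8)]
    if len(bits) > len(frames):
        raise ValueError("text too long for this container")
    for i, bit in enumerate(bits):
        frames[i] = (frames[i] & 0xFE) | bit        # overwrite the lowest bit
    with wave.open(stego_out, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(bytes(frames))

def reveal_text(stego):
    with wave.open(stego, "rb") as src:
        frames = src.readframes(src.getnframes())
    out = bytearray()
    for i in range(0, len(frames) - 7, 8):
        byte = sum((frames[i + j] & 1) << j for j in range(8))
        if byte == 0:                               # hit the terminator
            break
        out.append(byte)
    return out.decode("utf-8")
```

The NUL terminator tells the receiver where the hidden text ends; as the abstract notes, anyone who guesses this fixed layout can extract the text, which is why varying the embedding locations and encrypting the payload are recommended for practical use.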


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Emad A. Mohammed ◽  
Mohammad Keyhani ◽  
Amir Sanati-Nezhad ◽  
S. Hossein Hejazi ◽  
Behrouz H. Far

This work develops a robust classifier for a COVID-19 pre-screening model from crowdsourced cough sound data. The crowdsourced cough recordings contain a variable number of coughs, with some input sound files more informative than others. Accurate detection of COVID-19 from the sound datasets requires overcoming two main challenges: (i) the variable number of coughs in each recording and (ii) the low number of COVID-positive cases compared to healthy coughs in the data. We use two open datasets of crowdsourced cough recordings and segment each recording into non-overlapping coughs. The segmentation enriches the original data without oversampling: splitting the original cough sound files into non-overlapping segments increases the number of minority-class (COVID-19) samples without the change in feature distribution that oversampling techniques would introduce. Each cough sound segment is transformed into six image representations for further analysis. We conduct extensive experiments with shallow machine learning, Convolutional Neural Network (CNN), and pre-trained CNN models, and compare our results to other recently published work applying machine learning to cough sound data for COVID-19 detection. Our method demonstrated high performance using an ensemble model on the testing dataset, with area under the receiver operating characteristic curve = 0.77, precision = 0.80, recall = 0.71, F1 measure = 0.75, and Kappa = 0.53. The results show an improvement in the prediction accuracy of our COVID-19 pre-screening model compared to the other models.
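The paper does not spell out its segmentation algorithm, so the following sketch makes assumptions: librosa's silence-based splitting stands in for the cough segmenter, and a log-mel spectrogram stands in for one of the six image representations:

```python
# Sketch under stated assumptions: split a recording at silences into
# non-overlapping segments, then render each segment as an image.
import numpy as np
import librosa

def segment_coughs(path, top_db=30):
    """Split one recording into non-overlapping segments at silences."""
    y, sr = librosa.load(path, sr=None)
    intervals = librosa.effects.split(y, top_db=top_db)  # (start, end) samples
    return [y[start:end] for start, end in intervals], sr

def to_mel_image(segment, sr, n_mels=128):
    """One possible image representation of a single cough segment."""
    mel = librosa.feature.melspectrogram(y=segment, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)
```

Because every segment comes from a real recording, each COVID-positive file contributes several genuine minority-class samples, which is the enrichment-without-oversampling idea the abstract describes.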


Author(s):  
Diauddin Ismail

In everyday life, we often hear the holy verses of the Qur’an recited in mosques before prayer time, or in other settings, and find ourselves interested in knowing which Surah and which verse is being recited. This stems from Muslims’ love for the Qur’an, yet not all Muslims memorize its entire contents. Motivated by this limitation and by the strong curiosity about Surah and verse information, the writer is interested in developing a computer system that can recognize and provide information on the recited Surah and verse. Advances in computer technology do more than ease everyday activities: one human ability that can be implanted into computer technology is recognizing the verses of Surah Al-Falaq of the Qur’an through voice. AdaBoost is one method for voice classification, and using this method the success rate in recognizing verse numbers reaches 72%. The system can only recognize the verse numbers of Surah Al-Falaq of the Qur’an, from recorded sound files with the .wav file extension, and was built using the Delphi programming language.
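The original system was built in Delphi; the following is a hypothetical Python analogue of the AdaBoost classification step, with mean MFCC features assumed as input since the paper's feature set is not described here:

```python
# Hypothetical analogue (the original was Delphi): MFCC features from .wav
# recordings, one class per verse number, fed to AdaBoostClassifier.
import numpy as np
import librosa
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

def verse_features(path, n_mfcc=20):
    y, sr = librosa.load(path, sr=None)          # reads the .wav recording
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

def train_verse_classifier(wav_paths, verse_numbers):
    X = np.array([verse_features(p) for p in wav_paths])
    y = np.array(verse_numbers)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              random_state=0)
    clf = AdaBoostClassifier(n_estimators=100).fit(X_tr, y_tr)
    return clf, clf.score(X_te, y_te)             # held-out accuracy
```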


Author(s):  
Pakhshan I. Hamad ◽  
Dlakhshan Y. Othman ◽  
Himdad A. Muhammad

This study investigates the pronunciation material, tasks and activities, and teaching methods in the Sunrise 7-9 series adopted by the Ministry of Education in the Kurdistan Regional Government (KRG) for basic schools across the Kurdistan Region of Iraq. The rationale for the study is the obvious deficiency in basic school students’ performance, more specifically in their pronunciation and speaking skills. Pronunciation is regarded as one of the basic components of learning English, hence conducting this study was deemed necessary. The paper examines the pronunciation material, tasks and exercises, and teaching methods from the teachers’ and the researchers’ perspectives. To this end, a questionnaire was administered to 51 English teachers who had taught these books for at least three years in basic schools in Erbil city, and an observation checklist was designed to observe 30 lesson periods. The data were analyzed with SPSS to obtain frequencies, percentages, means, and standard deviations. The results reveal that most of the teachers are not quite satisfied with the pronunciation material; more than half are not satisfied with the tasks and exercises, owing to time constraints and a lack of equipment; and almost none of the teachers use effective and suitable teaching methods or the sound files provided with the teachers’ book of the series. Based on the results, new pronunciation material, tasks, exercises, and teaching strategies are presented, along with some recommendations.

