Effective word count estimation for long duration daily naturalistic audio recordings

2016 ◽  
Vol 84 ◽  
pp. 15-23 ◽  
Author(s):  
Ali Ziaei ◽  
Abhijeet Sangwan ◽  
John H.L. Hansen
2020 ◽  
Author(s):  
Okko Räsänen ◽  
Shreyas Seshadri ◽  
Marvin Lavechin ◽  
Alejandrina Cristia ◽  
Marisa Casillas

Recordings captured by wearable microphones are a standard method for investigating young children’s language environments. A key measure to quantify from such data is the amount of speech present in children’s home environments. To this end, the LENA recorder and software—a popular system for measuring linguistic input—estimates the number of adult words that children may hear over the course of a recording. However, word count estimation is challenging to do in a language-independent manner; the relationship between observable acoustic patterns and language-specific lexical entities is far from uniform across human languages. In this paper, we ask whether some alternative linguistic units, namely phone(me)s or syllables, could be measured instead of, or in parallel with, words in order to achieve improved cross-linguistic applicability and comparability of an automated system for measuring child language input. We discuss the advantages and disadvantages of measuring different units from theoretical and technical points of view. We also investigate the practical applicability of measuring such units using a novel system called Automatic LInguistic unit Count Estimator (ALICE) together with audio from seven child-centered daylong audio corpora from diverse cultural and linguistic environments. We show that language-independent measurement of phoneme counts is somewhat more accurate than syllables or words, but all three are highly correlated with human annotations on the same data. We share an open-source implementation of ALICE for use by the language research community, allowing automatic phoneme, syllable, and word count estimation from child-centered audio recordings.


PLoS ONE ◽  
2018 ◽  
Vol 13 (3) ◽  
pp. e0193345 ◽  
Author(s):  
Yvonne F. Phillips ◽  
Michael Towsey ◽  
Paul Roe

2019 ◽  
Author(s):  
Okko Räsänen ◽  
Shreyas Seshadri ◽  
Julien Karadayi ◽  
Eric Riebling ◽  
John P Bunce ◽  
...  

Automatic word count estimation (WCE) from audio recordings can be used to quantify the amount of verbal communication in a recording environment. One key application of WCE is to measure language input heard by infants and toddlers in their natural environments, as captured by daylong recordings from microphones worn by the infants. Although WCE is nearly trivial for high-quality signals in high-resource languages, daylong recordings are substantially more challenging due to the unconstrained acoustic environments and the presence of near- and far-field speech. Moreover, many use cases of interest involve languages for which reliable ASR systems or even well-defined lexicons are not available. A good WCE system should also perform similarly for low- and high-resource languages in order to enable unbiased comparisons across different cultures and environments. Unfortunately, the current state-of-the-art solution, the LENA system, is based on proprietary software and has only been optimized for American English, limiting its applicability. In this paper, we build on existing work on WCE and present the steps we have taken towards a freely available system for WCE that can be adapted to different languages or dialects with a limited amount of orthographically transcribed speech data. Our system is based on language-independent syllabification of speech, followed by a language-dependent mapping from syllable counts (and a number of other acoustic features) to the corresponding word count estimates. We evaluate our system on samples from daylong infant recordings from six different corpora consisting of several languages and socioeconomic environments, all manually annotated with the same protocol to allow direct comparison. We compare a number of alternative techniques for the two key components in our system: speech activity detection and automatic syllabification of speech.
As a result, we show that our system can reach relatively consistent WCE accuracy across multiple corpora and languages (with some limitations). In addition, the system outperforms LENA on three of the four corpora consisting of different varieties of English. We also demonstrate how an automatic neural network-based syllabifier, when trained on multiple languages, generalizes well to novel languages beyond the training data, outperforming two previously proposed unsupervised syllabifiers as a feature extractor for WCE.
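The language-dependent mapping from syllable counts to word counts described in the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible form, an ordinary least-squares line fitted on a small set of transcribed segments, and it ignores the additional acoustic features the abstract mentions. The function names and the toy numbers are hypothetical and are not taken from the authors' system.

```python
import numpy as np


def fit_syllable_to_word_mapping(syllable_counts, word_counts):
    """Fit word_count ~ slope * syllable_count + intercept by least squares.

    Assumption: a single linear predictor. The real system may use more
    features and a different regression model.
    """
    syllables = np.asarray(syllable_counts, dtype=float)
    words = np.asarray(word_counts, dtype=float)
    # Design matrix with a column of ones for the intercept term.
    design = np.column_stack([syllables, np.ones(len(syllables))])
    coef, *_ = np.linalg.lstsq(design, words, rcond=None)
    return coef  # array([slope, intercept])


def estimate_word_counts(syllable_counts, coef):
    """Apply the fitted mapping; clip at zero since counts cannot be negative."""
    slope, intercept = coef
    syllables = np.asarray(syllable_counts, dtype=float)
    return np.maximum(0.0, slope * syllables + intercept)


# Toy adaptation data (hypothetical): automatic syllable counts per segment
# paired with word counts from orthographic transcripts.
train_syllables = [10, 20, 30, 40]
train_words = [6, 11, 16, 21]  # exactly 0.5 * syllables + 1

coef = fit_syllable_to_word_mapping(train_syllables, train_words)
estimates = estimate_word_counts([50, 0], coef)
```

Because only the slope and intercept depend on the language, adapting such a mapping to a new language would need only a small amount of transcribed speech, which matches the motivation given in the abstract.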


1999 ◽  
Vol 13 (4) ◽  
pp. 418-443 ◽  
Author(s):  
Rebecca J. Lloyd ◽  
Pierre Trudel

A case study design was used to (a) describe the process and identify the content of the verbal interactions between an eminent mental training consultant and five elite level athletes during ten sessions, and (b) compare the analyzed sessions with the consultant’s published approach on mental training. The sources of information included the audio recordings of the mental training sessions, the interviews with the consultant, the interviews with the athletes, and two articles published by the consultant. An adapted version of the Flanders’ (1965) Interaction Analysis in the Classroom was used to systematically code the process, and a content analysis was performed on the transcripts of the mental training sessions and interviews. During the sessions, the consultant’s verbal behaviors accounted for 39% of the total coded behaviors, leaving 60% for the athletes and 1% for silence. The content analysis revealed that up to 24 topics were addressed in each session (often the athletes would “unload”), with certain issues having a higher word count than others. The analysis of the content and process revealed that the consultant follows an athlete-centered approach that corresponds to the consultant’s published perspective.


2018 ◽  
Vol 9 (9) ◽  
pp. 1948-1958 ◽  
Author(s):  
Mahnoosh Kholghi ◽  
Yvonne Phillips ◽  
Michael Towsey ◽  
Laurianne Sitbon ◽  
Paul Roe

2019 ◽  
Vol 113 ◽  
pp. 63-80 ◽  
Author(s):  
Okko Räsänen ◽  
Shreyas Seshadri ◽  
Julien Karadayi ◽  
Eric Riebling ◽  
John Bunce ◽  
...  
