Joint Prediction of Punctuation and Disfluency in Speech Transcripts

Author(s):  
Binghuai Lin ◽  
Liyuan Wang
Author(s):  
Maria Lucia Parrella ◽  
Giuseppina Albano ◽  
Cira Perna ◽  
Michele La Rocca

Abstract: Missing data reconstruction is a critical step in the analysis and mining of spatio-temporal data. However, few studies comprehensively consider missing data patterns, sample selection and spatio-temporal relationships. To account for the uncertainty in the point forecast, prediction intervals may be of interest. In particular, for (possibly long) missing sequences of consecutive time points, joint prediction regions are desirable. In this paper we propose a bootstrap resampling scheme to construct joint prediction regions that approximately contain missing paths of the time components in a spatio-temporal framework, with global probability $1-\alpha$. In many applications, requiring coverage of the whole missing sample path might appear too restrictive. To obtain more informative inference, we also derive smaller joint prediction regions that contain all elements of the missing paths, except for at most a small number k of them, with probability $1-\alpha$. A simulation experiment is performed to validate the empirical performance of the proposed joint bootstrap prediction and to compare it with some alternative procedures based on a simple nominal coverage correction, loosely inspired by the Bonferroni approach, which are expected to work well in standard scenarios.
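The idea of a joint prediction region that may miss at most k points can be illustrated with a minimal sketch. This is not the authors' resampling scheme: it assumes a matrix of bootstrap prediction errors is already available, and it uses a generic max-type calibration on standardized errors. The function name `joint_prediction_region` and the inputs `point_forecast` and `boot_errors` are hypothetical.

```python
import numpy as np


def joint_prediction_region(point_forecast, boot_errors, alpha=0.1, k=0):
    """Band around point_forecast calibrated on bootstrap error paths.

    point_forecast: length-H array of point predictions for the missing path.
    boot_errors:    (B, H) array of bootstrap prediction errors (assumed to
                    come from some resampling scheme, not specified here).
    k:              number of points the band is allowed to miss.
    """
    point_forecast = np.asarray(point_forecast, dtype=float)
    boot_errors = np.asarray(boot_errors, dtype=float)
    n_boot, horizon = boot_errors.shape
    if not 0 <= k < horizon:
        raise ValueError("k must satisfy 0 <= k < H")

    # Standardize errors per time point so all horizons are comparable.
    scale = boot_errors.std(axis=0, ddof=1)
    scale = np.where(scale > 0, scale, 1.0)
    z = np.abs(boot_errors) / scale

    # For each bootstrap path, take the (k+1)-th largest standardized error:
    # the band covers all but at most k points iff this value is <= d.
    stat = np.sort(z, axis=1)[:, horizon - 1 - k]
    d = np.quantile(stat, 1 - alpha)

    return point_forecast - d * scale, point_forecast + d * scale, d
```

With `k=0` the band targets coverage of the whole missing path. Raising `k` can only shrink the multiplier `d`, which mirrors the narrower regions described in the abstract.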


2021 ◽  
Author(s):  
Hong-hui Xu ◽  
Xin-qing Wang ◽  
Dong Wang ◽  
Bao-guo Duan ◽  
Ting Rui

2019 ◽  
Vol 3 (1) ◽  
pp. 102-117
Author(s):  
Ika Meilyana Warohmah ◽  
Atiqa Sabardila

This study aims to describe the linguistic forms in the speeches of students in the Master of Indonesian Language Education program at Muhammadiyah University of Surakarta (MPBI-UMS) who portray themselves as junior high school principals. The data in this study are words, phrases, clauses, and sentences in the speeches of students acting as principals. The source of the research data is the transcripts of the students' speeches. The data were collected using the observe-and-note technique. The data analysis technique uses the equivalent and final method, while the data validity technique uses theory and source validation. The results show that the language errors in the speeches of these students cover six fields. First, the field of phonology includes (a) 10 forms of pronunciation error, (b) two forms of capital letter misuse, (c) five forms of italics, and (d) six forms of spelling error. Second, the field of morphology includes five forms involving prepositions. Third, the field of syntax includes (a) four forms of pleonasm errors, (b) four forms of conjunction errors, and (c) four forms of misuse of redundant words. Fourth, the field of pragmatics includes (a) one form of implicature, (b) one form of expressive speech act, and (c) two forms of directive speech acts. Fifth, the field of sociolinguistics includes (a) five forms of code switching, and (b) two forms of code mixing. Sixth, the nonformal variety field includes one form.

