An automatic scoring method for Chinese-English spoken translation based on attention LSTM

Author(s):  
Xiaobin Guo
2020 ◽  
Vol 51 (2) ◽  
pp. 479-493
Author(s):  
Jenny A. Roberts ◽  
Evelyn P. Altenberg ◽  
Madison Hunter

Purpose The results of automatic machine scoring of the Index of Productive Syntax from the Computerized Language ANalysis (CLAN) tools of the Child Language Data Exchange System of TalkBank (MacWhinney, 2000) were compared to manual scoring to determine the accuracy of the machine-scored method. Method Twenty transcripts of 10 children from archival data of the Weismer Corpus from the Child Language Data Exchange System at 30 and 42 months were examined. Measures of absolute point difference and point-to-point accuracy were compared, as well as points erroneously given and missed. Two new measures for evaluating automatic scoring of the Index of Productive Syntax were introduced: Machine Item Accuracy (MIA) and Cascade Failure Rate. These measures further analyze points erroneously given and missed. Differences in total scores, subscale scores, and individual structures were also reported. Results Mean absolute point difference between machine and hand scoring was 3.65, point-to-point agreement was 72.6%, and MIA was 74.9%. There were large differences in subscales, with Noun Phrase and Verb Phrase subscales generally providing greater accuracy and agreement than Question/Negation and Sentence Structures subscales. There were significantly more erroneous than missed items in machine scoring, attributed to problems of mistagging of elements, imprecise search patterns, and other errors. Cascade failure resulted in an average of 4.65 points lost per transcript. Conclusions The CLAN program showed relatively inaccurate outcomes in comparison to manual scoring on both traditional and new measures of accuracy. Recommendations for improvement of the program include accounting for second exemplar violations and applying cascaded credit, among other suggestions. It was proposed that research on machine-scored syntax routinely report accuracy measures detailing erroneous and missed scores, including MIA, so that researchers and clinicians are aware of the limitations of a machine-scoring program. Supplemental Material https://doi.org/10.23641/asha.11984364
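The traditional comparison measures named in the abstract above (absolute point difference, point-to-point agreement, and points erroneously given or missed) can be illustrated with a minimal sketch. The definitions used here are assumptions, since the abstract does not spell them out: each item is taken to carry an integer score, agreement means identical item scores, a point is "erroneously given" when the machine scores an item higher than the hand scorer, and "missed" when it scores lower. The function name `compare_scores` and the sample scores are hypothetical, and MIA and Cascade Failure Rate are not computed because the abstract does not define them.

```python
def compare_scores(machine, hand):
    """Compare machine and hand item scores for one transcript.

    Assumed definitions (not taken from the article):
    - absolute point difference: |machine total - hand total|
    - point-to-point agreement: percent of items with identical scores
    - erroneous points: points the machine gave beyond the hand score
    - missed points: points the hand scorer gave that the machine did not
    """
    if len(machine) != len(hand):
        raise ValueError("score lists must have the same length")
    if not hand:
        raise ValueError("score lists must not be empty")

    pairs = list(zip(machine, hand))
    agreeing_items = sum(1 for m, h in pairs if m == h)

    return {
        "absolute_point_difference": abs(sum(machine) - sum(hand)),
        "point_to_point_agreement": 100.0 * agreeing_items / len(pairs),
        "erroneous_points": sum(max(m - h, 0) for m, h in pairs),
        "missed_points": sum(max(h - m, 0) for m, h in pairs),
    }


# Hypothetical item scores (0, 1, or 2 per item) for a single transcript.
machine_scores = [2, 1, 0, 2, 1, 0, 2, 2]
hand_scores = [2, 2, 0, 1, 1, 1, 2, 0]

result = compare_scores(machine_scores, hand_scores)
print(result)
```

Note that totals can look close (a small absolute point difference) even when item-level agreement is low, because erroneous and missed points offset each other in the total. This is one reason to report both kinds of error separately.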


2020 ◽  
Vol 63 (4) ◽  
pp. 1240-1253
Author(s):  
Victoria S. Henbest ◽  
Lisa Fitton ◽  
Krystal L. Werfel ◽  
Kenn Apel

Purpose Spelling is a skill that relies on an individual's linguistic awareness, the ability to overtly manipulate language. The ability to accurately spell is important for academic and career success into adulthood. The spelling skills of adults have received some attention in the literature, but there is limited information regarding which approach for analyzing adults' spelling is optimal for guiding instruction or intervention for those who struggle. Thus, we aimed to examine the concurrent validity of four different scoring methods for measuring adults' spellings (a dichotomous scoring method and three continuous methods) and to determine whether adults' linguistic awareness skills differentially predict spelling outcomes based on the scoring method employed. Method Sixty undergraduate college students who were determined to be average readers as measured by a word reading and contextual word reading task were administered a spelling task as well as morphological, orthographic, phonemic, and syntactic awareness tasks. Results All four scoring methods were highly correlated, suggesting high concurrent validity among the measures. Two linguistic awareness skills, morphological awareness and syntactic awareness, predicted spelling performance on both the dichotomous and continuous scoring methods. In contrast, phonemic awareness and orthographic awareness predicted spelling performance only when spelling was scored using a continuous measure error analysis. Conclusions The results of this study confirm that multiple linguistic awareness skills are important for spelling in adults who are average readers. The results also highlight the need for using continuous measures of spelling when planning intervention or instruction, particularly in the areas of orthographic and phonemic awareness.


2017 ◽  
Vol 38 (2) ◽  
pp. 83-93
Author(s):  
Jeffrey M. Cucina ◽  
Nicholas L. Vasilopoulos ◽  
Arwen H. DeCostanza

Abstract. Varimax rotated principal component scores (VRPCS) have previously been offered as a possible solution to the non-orthogonality of scores for the Big Five factors. However, few researchers have examined the reliability and validity of VRPCS. To address this gap, we use a lab study and a field study to investigate whether using VRPCS increases orthogonality, reliability, and criterion-related validity. Compared to the traditional unit-weighting scoring method, the use of VRPCS enhanced the reliability and discriminant validity of the Big Five factors, although there was little improvement in criterion-related validity. Results are discussed in terms of the benefit of using VRPCS instead of traditional unit-weighted sum scores.


2019 ◽  
Vol 35 (6) ◽  
pp. 855-867
Author(s):  
John T. Kulas ◽  
Rachael Klahr ◽  
Lindsey Knights

Abstract. Many investigators have noted “reverse-coding” method factors when exploring response pattern structure with psychological inventory data. The current article probes for the existence of a confound in these investigations, whereby an item’s level of saturation with socially desirable content tends to covary with the item’s substantive scale keying. We first investigate its existence, demonstrating that 15 of 16 measures that have been previously implicated as exhibiting a reverse-scoring method effect can also be reasonably characterized as exhibiting a scoring key/social desirability confound. A second set of analyses targets the extent to which the confounding variable may confuse interpretation of factor analytic results and documents strong social desirability associations. The results suggest that assessment developers should perhaps consider the social desirability scale value of indicators when constructing scale aggregates (and possibly scales when investigating inter-construct associations). Future investigations would ideally disentangle the confound via experimental manipulation.

