Automatically Measuring Question Authenticity in Real-World Classrooms

2018 ◽  
Vol 47 (7) ◽  
pp. 451-464 ◽  
Author(s):  
Sean Kelly ◽  
Andrew M. Olney ◽  
Patrick Donnelly ◽  
Martin Nystrand ◽  
Sidney K. D’Mello

Analyzing the quality of classroom talk is central to educational research and improvement efforts. In particular, the presence of authentic teacher questions, where answers are not predetermined by the teacher, helps constitute and serves as a marker of productive classroom discourse. Further, authentic questions can be cultivated to improve teaching effectiveness and, consequently, student achievement. Unfortunately, current methods to measure question authenticity do not scale because they rely on human observations or coding of teacher discourse. To address this challenge, we set out to use automatic speech recognition, natural language processing, and machine learning to train computers to detect authentic questions in real-world classrooms automatically. Our methods were iteratively refined using classroom audio and human-coded observational data from two sources: (a) a large archival database of text transcripts of 451 observations from 112 classrooms; and (b) a newly collected sample of 132 high-quality audio recordings from 27 classrooms, obtained under technical constraints that anticipate large-scale automated data collection and analysis. Correlations between human-coded and computer-coded authenticity at the classroom level were sufficiently high (r = .602 for archival transcripts and r = .687 for audio recordings) to provide a valuable complement to human coding in research efforts.
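As a minimal sketch of the validation step reported above, the classroom-level agreement between human and machine coding can be computed as a Pearson correlation; the per-classroom proportions below are invented placeholders, not the study's data.

```python
# Sketch: classroom-level correlation between human-coded and computer-coded
# question authenticity. The proportions are hypothetical illustrations.
from scipy.stats import pearsonr

human_coded    = [0.32, 0.18, 0.45, 0.27, 0.51, 0.09, 0.38]  # per classroom
computer_coded = [0.29, 0.22, 0.41, 0.31, 0.47, 0.14, 0.35]

r, p_value = pearsonr(human_coded, computer_coded)
print(f"classroom-level correlation: r = {r:.3f} (p = {p_value:.3f})")
```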

Author(s):  
Cao Liu ◽  
Shizhu He ◽  
Kang Liu ◽  
Jun Zhao

Because they provide natural language responses, natural answers are favored in real-world Question Answering (QA) systems. Generative models learn to automatically generate natural answers from large-scale question-answer pairs (QA-pairs). However, they suffer from the uncontrollable and uneven quality of QA-pairs crawled from the Internet. To address this problem, we propose a curriculum learning based framework for natural answer generation (CL-NAG), which is able to take full advantage of the valuable learning data in a noisy, uneven-quality corpus. Specifically, we employ two practical measures to automatically assess the quality (complexity) of QA-pairs. Based on these measurements, CL-NAG first uses simple, low-quality QA-pairs to learn a basic model, and then gradually learns to produce better answers, with richer content and more complete syntax, from more complex, higher-quality QA-pairs. In this way, all valuable information in the noisy, uneven-quality corpus can be fully exploited. Experiments demonstrate that CL-NAG outperforms the state of the art, improving accuracy by 6.8% and 8.7% on simple and complex questions, respectively.
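The abstract does not detail the paper's quality measures or generation model; the sketch below only illustrates the curriculum idea under stated assumptions: score each QA-pair with a stand-in complexity proxy, order the corpus from simple to complex, and train on progressively larger slices. `score` and `train_one_epoch` are hypothetical placeholders.

```python
# Illustrative curriculum-learning loop over QA-pairs; not CL-NAG's actual
# implementation. Answer length stands in for the paper's quality measures.
def score(pair):
    question, answer = pair
    return len(answer.split())  # crude proxy: longer answer = more complex

def train_one_epoch(model, pairs):
    """Hypothetical placeholder: plug in any seq2seq training step."""
    pass

def curriculum_train(model, qa_pairs, stages=3, epochs_per_stage=2):
    ordered = sorted(qa_pairs, key=score)  # simple -> complex
    for stage in range(1, stages + 1):
        subset = ordered[: len(ordered) * stage // stages]  # grow the slice
        for _ in range(epochs_per_stage):
            train_one_epoch(model, subset)
    return model
```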


1992 ◽  
Vol 338 (1285) ◽  
pp. 329-334

The purpose of this contribution is to summarize the papers and discussions, to bring out the highlights, and to focus on outstanding problems and uncertainties. Sixteen years ago Sir Vivian Fuchs and I organized a similar meeting on research in the Antarctic. Since then there has been an explosion of interest in all branches of environmental science in this region. There have been major advances in theory, and improved technology made possible by the rapid development of electronics has made data collection and analysis easier; but above all the difference between the two meetings is in the development of large-scale numerical modelling as a tool. There has also been an increasing realization of the value of comparisons between the two polar regions, which is brought out by the contributions to this meeting. The meeting has been distinguished by the quality of the science, the clarity of exposition and the excellent visual presentations. It is also striking how much cross-fertilization between disciplines has occurred.


2007 ◽  
Vol 3 (6) ◽  
pp. 603-606
Author(s):  
Dale Joachim ◽  
Eben Goodale

Playback is an important method of surveying animals, assessing habitats and studying animal communication. However, conventional playback methods require on-site observers and therefore become labour-intensive when covering large areas. Such limitations could be circumvented by the use of cellular telephony, a ubiquitous technology with increasing biological applications. In addressing concerns about the low audio quality of cellular telephones, this paper presents experimental data to show that owls of two species (Strix varia and Megascops asio) respond similarly to calls played through cellular telephones as to calls played through conventional playback technology. In addition, the telephone audio recordings are of sufficient quality to detect most of the two owl species' responses. These findings are a first important step towards large-scale applications where networks of cellular phones conduct real-time monitoring tasks.


Author(s):  
Carla Marchetti ◽  
Massimo Mecella ◽  
Monica Scannapieco ◽  
Antonino Virgillito

A Cooperative Information System (CIS) is a large-scale information system that interconnects the systems of different, autonomous organizations that are geographically distributed and share common objectives (De Michelis et al., 1997). Among the resources that organizations share, data are fundamental; in real-world scenarios, organization A may not request data from organization B if it does not trust B's data (i.e., if A does not know that the quality of the data B can provide is high). As an example, in an e-government scenario in which public administrations cooperate to fulfill service requests from citizens and enterprises (Batini & Mecella, 2001), administrations very often prefer asking citizens for data rather than requesting it from other administrations that store the same data, because the quality of those data is not known. Therefore, a lack of cooperation may occur due to a lack of quality certification.


2019 ◽  
Vol 3 (1) ◽  
pp. 63-86 ◽  
Author(s):  
Yanan Wang ◽  
Jianqiang Li ◽  
Sun Hongbo ◽  
Yuan Li ◽  
Faheem Akhtar ◽  
...  

Purpose: Simulation is a well-known technique for using computers to imitate the operations of various kinds of real-world facilities or processes. The facility or process of interest is usually called a system, and to study it scientifically we often have to make a set of assumptions about how it works. These assumptions, which usually take the form of mathematical or logical relationships, constitute a model that is used to gain some understanding of how the corresponding system behaves; the quality of that understanding depends essentially on the credibility of the assumptions or model, the assessment of which is known as VV&A (verification, validation and accreditation). The main purpose of this paper is to present an in-depth theoretical review and analysis of the application of VV&A to large-scale simulations.
Design/methodology/approach: After summarizing related VV&A research, the relevant standards, frameworks, techniques, methods and tools are discussed in light of the characteristics of large-scale simulations (such as crowd network simulations).
Findings: The contributions of this paper will be useful to both academics and practitioners formulating VV&A for large-scale simulations.
Originality/value: This paper helps researchers by providing recommendations for formulating VV&A in large-scale simulations (such as crowd network simulations).


10.2196/20545 ◽  
2021 ◽  
Vol 23 (2) ◽  
pp. e20545
Author(s):  
Paul J Barr ◽  
James Ryan ◽  
Nicholas C Jacobson

COVID-19 cases are increasing exponentially worldwide; however, the clinical phenotype of the disease remains unclear. Natural language processing (NLP) and machine learning approaches may yield key methods to rapidly identify individuals at high risk of COVID-19 and to understand key symptoms upon clinical manifestation and presentation. Data on such symptoms may not be accurately synthesized into patient records owing to the pressing need to treat patients in overburdened health care settings. In this scenario, clinicians may focus on documenting widely reported symptoms that indicate a confirmed diagnosis of COVID-19, at the expense of infrequently reported symptoms. While NLP solutions can play a key role in generating clinical phenotypes of COVID-19, they are constrained by the limitations of data in electronic health records (EHRs). A comprehensive record of clinic visits is required, and audio recordings may be the answer: a recording of a clinic visit represents a more complete record of patient-reported symptoms. If done at scale, a combination of EHR data and recordings of clinic visits can be used to power NLP and machine learning models, thus rapidly generating a clinical phenotype of COVID-19. We propose a pipeline extending from audio or video recordings of clinic visits to a model that factors in clinical symptoms and predicts COVID-19 incidence. With vast amounts of available data, we believe that a prediction model can be rapidly developed to promote the accurate screening of individuals at high risk of COVID-19 and to identify patient characteristics that predict a greater risk of more severe infection. If clinical encounters are recorded and our NLP model is adequately refined, benchtop virologic findings would be better informed. While clinic visit recordings are not a panacea for this pandemic, they are a low-cost option with many potential benefits, which have recently begun to be explored.
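The proposed pipeline is described only at a high level; as a hedged sketch of its NLP stage alone, once visits are transcribed, even a simple bag-of-words classifier over transcript text could serve as a baseline risk screen. The transcripts and labels below are invented placeholders, not the authors' model or data.

```python
# Sketch of a baseline NLP stage: classify already-transcribed clinic-visit
# text for COVID-19 risk. All data here are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

transcripts = [
    "patient reports fever, dry cough, and loss of smell",
    "follow-up for knee pain after a running injury",
    "shortness of breath and fatigue for three days",
    "routine blood pressure check, no new complaints",
]
labels = [1, 0, 1, 0]  # 1 = high COVID-19 risk (hypothetical labels)

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(transcripts, labels)
print(model.predict(["new loss of taste and persistent cough"]))
```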


2021 ◽  
pp. 193229682110008
Author(s):  
Alexander Turchin ◽  
Luisa F. Florez Builes

Background: Real-world evidence research plays an increasingly important role in diabetes care. However, a large fraction of real-world data are "locked" in narrative format. Natural language processing (NLP) technology offers a solution for the analysis of narrative electronic data.
Methods: We conducted a systematic review of studies of NLP technology focused on diabetes. Articles published prior to June 2020 were included.
Results: We included 38 studies in the analysis. The majority (24; 63.2%) described only the development of NLP tools; the remainder used NLP tools to conduct clinical research. A large fraction (17; 44.7%) of the studies focused on identification of patients with diabetes; the rest covered a broad range of subjects, including hypoglycemia, lifestyle counseling, diabetic kidney disease, insulin therapy, and others. The mean F1 score across all studies where it was reported was 0.882. It tended to be lower (0.817) in studies of more linguistically complex concepts. Seven studies reported findings with potential implications for improving the delivery of diabetes care.
Conclusion: Research in NLP technology to study diabetes is growing quickly, although challenges (e.g., in the analysis of more linguistically complex concepts) remain. Its potential to deliver evidence on treatment and on improving the quality of diabetes care is demonstrated by a number of studies. Further growth in this area would be aided by deeper collaboration between the developers and end users of NLP tools, as well as by broader sharing of the tools themselves and related resources.
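For reference, the F1 score reported throughout this review is the harmonic mean of precision and recall; a short worked example with invented counts, chosen only to land near the review's mean of 0.882:

```python
# F1 = harmonic mean of precision and recall. The counts are hypothetical:
# a tool flags 90 of 100 true diabetes patients, with 78 of 90 flags correct.
def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

precision = 78 / 90   # ~0.867: fraction of flagged patients truly positive
recall = 90 / 100     # 0.90: fraction of true positives that were flagged
print(f"F1 = {f1(precision, recall):.3f}")  # -> F1 = 0.883
```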


2020 ◽  
Vol 34 (05) ◽  
pp. 9346-9353
Author(s):  
Bingcong Xue ◽  
Sen Hu ◽  
Lei Zou ◽  
Jiashu Cheng

Paraphrases, i.e., differing textual realizations of the same meaning, have proven useful for many natural language processing (NLP) applications. Collecting paraphrases for predicates in knowledge bases (KBs) is key to comprehending the RDF triples in KBs. Existing works have published paraphrase datasets automatically extracted from large corpora, but these contain too many redundant pairs or do not cover enough predicates, shortcomings that cannot be remedied by automatic methods alone and require human input. This paper presents a full process for collecting large-scale, high-quality paraphrase dictionaries for predicates in knowledge bases, one that takes advantage of existing datasets and combines machine mining with crowdsourcing. Our dataset comprises 2284 distinct predicates in DBpedia and 31130 paraphrase pairs in total, a substantial leap in quality over previous works. We then demonstrate that such paraphrase dictionaries can greatly help natural language processing tasks such as question answering and language generation. We also publish our dictionary for further research.
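The paper's mining and crowdsourcing pipeline is not reproduced here; as an assumed illustration of the machine-mining side, one simple step is to drop near-duplicate paraphrase candidates for a predicate before sending them for crowd validation:

```python
# Illustrative pre-filter (an assumed step, not the paper's procedure):
# remove near-duplicate paraphrase candidates for a predicate.
from difflib import SequenceMatcher

def dedupe(candidates, threshold=0.8):
    kept = []
    for phrase in candidates:
        # Keep a phrase only if it is not too similar to anything kept so far.
        if all(SequenceMatcher(None, phrase, k).ratio() < threshold
               for k in kept):
            kept.append(phrase)
    return kept

candidates = ["was born in", "is born in", "birthplace is", "was born at"]
print(dedupe(candidates))  # -> ['was born in', 'birthplace is']
```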


AI Magazine ◽  
2020 ◽  
Vol 41 (3) ◽  
pp. 18-27
Author(s):  
Mikhail Burtsev ◽  
Varvara Logacheva

Development of conversational systems is one of the most challenging tasks in natural language processing, and it is especially hard in the case of open-domain dialogue. The main factors that hinder progress in this area are the lack of training data and the difficulty of automatic evaluation. Thus, to reliably evaluate the quality of such models, one needs to resort to time-consuming and expensive human evaluation. We tackle these problems by organizing the Conversational Intelligence Challenge (ConvAI), an open competition of dialogue systems. Our goals are threefold: to work out a good design for human evaluation of open-domain dialogue, to grow an open-source code base for conversational systems, and to harvest and publish new datasets. Over the course of the ConvAI1 and ConvAI2 competitions, we developed a framework for evaluating chatbots in messaging platforms and used it to evaluate over 30 dialogue systems in two conversational tasks: discussion of short text snippets from Wikipedia and personalized small talk. These large-scale evaluation experiments were performed by recruiting volunteers as well as paid workers. As a result, we succeeded in collecting a dataset of around 5,000 long, meaningful human-to-bot dialogues and gained many insights into the organization of human evaluation. This dataset can be used to train an automatic evaluation model or to improve the quality of dialogue systems. Our analysis of the ConvAI1 and ConvAI2 competitions shows that future work in this area should center on more active participation of volunteers in the assessment of dialogue systems. To achieve that, we plan to make the evaluation setup more engaging.

