Table-to-Text Generation with Accurate Content Copying

Author(s):  
Yang Yang ◽  
Juan Cao ◽  
Yujun Wen ◽  
Pengzhou Zhang

Abstract: Table-to-text generation is an important task in natural language generation that aims to generate fluent, informative text from structured data. In this paper, we propose a novel transformer-based autoregressive model that combines table content copying with language-model-based generation. First, we propose a word transformation method for processing the target text. By augmenting the target text with field and position information, we help the model learn the relationship between the target text and the table and learn where to copy from. We then propose two auxiliary learning objectives: table-text constraint loss and copy loss. Table-text constraint loss is introduced to model table inputs effectively, whereas copy loss is used to copy word fragments from a table precisely. In addition, we modify the maximization-based text search strategy to reduce the probability of problems such as sentence repetition and inconsistency. On the WIKIBIO dataset, our model improves BLEU from 45.47 to 46.87 and ROUGE from 41.54 to 42.28, outperforming state-of-the-art baseline models on automatic evaluation metrics. On the ROTOWIRE test set, compared with the best baseline model, our model scores 4.29% higher on the CO metric and 1.93 points higher on BLEU.

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yang Yang ◽  
Juan Cao ◽  
Yujun Wen ◽  
Pengzhou Zhang

Abstract: Generating fluent, coherent, and informative text from structured data is called table-to-text generation. Copying words from the table is a common way to address the “out-of-vocabulary” problem, but accurate copying is difficult to achieve. To overcome this problem, we propose an autoregressive framework based on the transformer that combines a copying mechanism and language modeling to generate target texts. First, to help the model learn the semantic relevance between table and text, we apply a word transformation method that incorporates field and position information into the target text so that the model can learn where to copy from. We then propose two auxiliary learning objectives, namely table-text constraint loss and copy loss. Table-text constraint loss is used to model table inputs effectively, whereas copy loss is used to copy word fragments from a table precisely. Furthermore, we improve the text search strategy to reduce the probability of generating incoherent and repetitive sentences. Experiments on two datasets show better results than the baseline models. On WIKIBIO, BLEU improves from 45.47 to 46.87 and ROUGE from 41.54 to 42.28. On ROTOWIRE, the CO metric increases by 4.29% and BLEU by 1.93 points.


Author(s):  
Yana Zemlyanskaya ◽  
Martina Valente ◽  
Elena V. Syurina

Abstract: This mixed-methods study explored the conversation around orthorexia nervosa (ON) on Instagram from a Russian-speaking perspective. Two quantitative data sources were used: a comparative content analysis of posts tagged with #орторексия (n = 234) and #orthorexia (n = 243), and an online questionnaire completed by Russian speakers (n = 96) sharing ON-related content on Instagram. Additionally, five questionnaire participants were interviewed, four of whom identified as having, or having had, ON. Russian speakers who share ON-related content on Instagram are primarily female, in their late twenties, and prefer Instagram over other platforms. They describe people with ON as obsessed with correct eating, rather than healthy or clean eating. Instagram appears to have a dual effect: it has the potential both to trigger the onset of ON and to encourage recovery. Positive content encourages a healthy relationship with food, promotes intuitive eating, and spreads recovery advice. Harmful content, in turn, emphasizes specific diet and beauty ideals. Russian-speaking users mainly post pictures of food, followed by largely informative text that explains what ON is and what recovery may look like. Their reasons for posting ON-related content are to share personal experiences, support others in recovery, and raise awareness about ON. The two main target audiences were people unaware of ON and people seeking recovery support. The relationship between ON and social media is not strictly limited to the global north. Thus, it may be valuable to further investigate non-English-speaking populations currently underrepresented in ON research. Level of evidence: Level V, descriptive study.


2021 ◽  
pp. 009862832110088
Author(s):  
Todd D. Watson

Background: Student anxiety about statistics may lead to poorer learning outcomes. Objective: The purpose of this study was to evaluate an exercise designed to teach students in an introductory statistics class the principles of bivariate regression and to emphasize how statistical tools used by psychologists are also implemented in other fields. Method: Students used a published model on the relationship between tooth size and the length of great white sharks to estimate the length of extinct sharks and to explore factors that could affect the accuracy or validity of regression analyses. Data from an anonymous self-report scale were used to assess the activity. Results: More than 95% of respondents agreed or strongly agreed that the activity was engaging, approximately 95% of students agreed or strongly agreed that the activity helped them learn about factors that can lead to problems with bivariate correlation/regression, and approximately 91% of respondents correctly answered a question designed to assess basic content acquisition. Conclusion: Feedback data suggest that the exercise was successful in achieving its content and process learning goals. Teaching Implications: Implementation of similar exercises may improve student engagement and outcomes in psychology statistics courses.
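As a minimal sketch of the kind of bivariate regression the exercise is built around, the Python snippet below fits an ordinary least-squares line and uses it to predict a length from a tooth size. The data points, units, and function names are invented for illustration and are not taken from the published shark model used in the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted y for a given x under the fitted line."""
    return intercept + slope * x


# Hypothetical, made-up observations: tooth height (mm) and body length (m).
tooth_mm = [20.0, 30.0, 40.0, 50.0]
length_m = [2.0, 3.1, 3.9, 5.0]

slope, intercept = fit_line(tooth_mm, length_m)

# Predicting for a tooth far larger than any in the sample is an
# extrapolation, one of the factors that can undermine a regression estimate.
estimate = predict(slope, intercept, 120.0)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, estimate={estimate:.2f} m")
```

The prediction at 120 mm lies well outside the 20 to 50 mm range of the toy sample, which is the sort of accuracy and validity question the abstract says students were asked to explore.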


Author(s):  
Jose Camacho-Collados ◽  
Luis Espinosa-Anke ◽  
Shoaib Jameel ◽  
Steven Schockaert

Recently a number of unsupervised approaches have been proposed for learning vectors that capture the relationship between two words. Inspired by word embedding models, these approaches rely on co-occurrence statistics that are obtained from sentences in which the two target words appear. However, the number of such sentences is often quite small, and most of the words that occur in them are not relevant for characterizing the considered relationship. As a result, standard co-occurrence statistics typically lead to noisy relation vectors. To address this issue, we propose a latent variable model that aims to explicitly determine what words from the given sentences best characterize the relationship between the two target words. Relation vectors then correspond to the parameters of a simple unigram language model which is estimated from these words.


2018 ◽  
Vol 17 (2) ◽  
pp. 141-160
Author(s):  
ANELISE SABBAG ◽  
JOAN GARFIELD ◽  
ANDREW ZIEFFLER

Statistical literacy and statistical reasoning are important learning goals that instructors aim to develop in statistics students. However, there is a lack of clarity regarding the relationship among these learning goals and to what extent they overlap. The REasoning and Literacy Instrument (REALI) was designed to concurrently measure statistical literacy and reasoning. This paper reports the development process of the REALI assessment, which included test blueprint, expert review, item categorization, pilot and field testing, and data analysis to identify what measurement model best represents the constructs of statistical literacy and reasoning given the criteria of fit and parsimony. The results suggested that statistical literacy and reasoning can be measured effectively by the REALI assessment with high score precision. First published November 2018 at Statistics Education Research Journal Archives


Entropy ◽  
2020 ◽  
Vol 22 (5) ◽  
pp. 533
Author(s):  
Qin Zhao ◽  
Chenguang Hou ◽  
Changjian Liu ◽  
Peng Zhang ◽  
Ruifeng Xu

Quantum-inspired language models have been introduced to Information Retrieval due to their transparency and interpretability. While exciting progress has been made, current studies mainly investigate the relationship between density matrices of different sentence subspaces of a semantic Hilbert space. The Hilbert space as a whole, which has a unique density matrix, remains underexplored. In this paper, we propose a novel Quantum Expectation Value based Language Model (QEV-LM). A unique shared density matrix is constructed for the Semantic Hilbert Space. Words and sentences are viewed as different observables in this quantum model. In this setting, a matching score describing the similarity between a question-answer pair is naturally explained as the quantum expectation value of a joint question-answer observable. In addition to its theoretical soundness, experimental results on the TREC-QA and WIKIQA datasets demonstrate the computational efficiency of our proposed model, with excellent performance and low time consumption.
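For readers unfamiliar with the formalism, the quantum expectation value of an observable O in a state with density matrix ρ is tr(ρO). The sketch below computes that quantity for a toy two-dimensional, real-valued case. The matrices and helper names are illustrative assumptions; the abstract does not specify how QEV-LM builds its shared density matrix or its joint question-answer observable.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


def expectation_value(rho, observable):
    """Quantum expectation value tr(rho @ observable)."""
    return trace(matmul(rho, observable))


# Toy density matrix (assumed): symmetric, non-negative eigenvalues, trace 1.
rho = [[0.75, 0.0],
       [0.0, 0.25]]

# Toy observable (assumed): projector onto the first basis vector.
observable = [[1.0, 0.0],
              [0.0, 0.0]]

score = expectation_value(rho, observable)
print(score)
```

Because the observable here is a projector and ρ has unit trace, the resulting score lies between 0 and 1, which makes it easy to read as a similarity-like quantity in this toy setting.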


2010 ◽  
Vol 171-172 ◽  
pp. 94-97
Author(s):  
Rui Liu ◽  
Ming Hu Jiang

Image search engines have become effective tools for finding pictures on the Internet. They provide a list of image items in response to a user’s query and rank the items according to their relevance to the query. An image item is often accompanied by a short descriptive text, a brief summary extracted from the webpage title, content, image caption, or metadata, that provides auxiliary information about the image. In this paper, we present a new and effective method for generating descriptive text by summarizing an image’s surrounding text, using the text’s position information, and finding the image’s nearest neighbors.


2012 ◽  
Vol 65 (3) ◽  
pp. 561-570 ◽  
Author(s):  
Wantong Chen ◽  
Yanzhong Zhang

GNSS relative positioning is an important field of study in which the standard ‘GNSS Baseline Model’ is often used. Differencing between observation equations is used to construct the mathematical model, since this method can eliminate some common errors in the GNSS signal measurements. The ‘Orthogonal Transformation’ method can also be used to construct the GNSS Baseline Model. As some scholars have described, this model may avoid some drawbacks of Double Differencing (DD) while retaining all of its advantages. For comparison purposes, this model is evaluated, and the theoretical equivalence of both approaches is proved for the short baseline from two aspects: the Integer Ambiguity Resolution and the conditional least-squares baseline vector.
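The error-cancelling effect of differencing can be illustrated with a toy calculation. In the sketch below, each simulated measurement is a geometric range plus a receiver clock term plus a satellite clock term, and forming the double difference across two receivers and two satellites removes both clock terms. All numbers, names, and the simple additive error model are assumptions made for illustration; the sketch ignores atmospheric delays, noise, and integer ambiguities, and it does not implement the Orthogonal Transformation method.

```python
def double_difference(obs):
    """Double difference between receivers 'A', 'B' and satellites 1, 2.

    obs maps (receiver, satellite) to a measurement in metres.
    """
    # Single differences between receivers cancel each satellite's clock term.
    sd_sat1 = obs[("A", 1)] - obs[("B", 1)]
    sd_sat2 = obs[("A", 2)] - obs[("B", 2)]
    # Differencing between satellites cancels the receiver clock terms.
    return sd_sat1 - sd_sat2


# Hypothetical geometric ranges in metres.
true_range = {
    ("A", 1): 20000100.0, ("B", 1): 20000050.0,
    ("A", 2): 21000300.0, ("B", 2): 21000200.0,
}

# Hypothetical clock errors, already converted to metres.
receiver_clock = {"A": 150.0, "B": -40.0}
satellite_clock = {1: 12.0, 2: -7.0}

# Simulated measurements under the assumed additive error model.
measured = {
    (rcv, sat): rng + receiver_clock[rcv] + satellite_clock[sat]
    for (rcv, sat), rng in true_range.items()
}

dd_measured = double_difference(measured)
dd_true = double_difference(true_range)
print(dd_measured, dd_true)
```

In this toy model the double difference of the measurements equals the double difference of the geometric ranges, since every clock term appears once with a plus sign and once with a minus sign.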


2017 ◽  
Vol 3 (2) ◽  
pp. 33-51
Author(s):  
Siti Bariroh

Discipline is one of the factors of success in achieving learning goals. A disciplined teacher has a positive impact on students' development, which requires dedication and a high sense of responsibility. A teacher is required to cultivate the right mindset, to have extensive knowledge, and to have qualified competencies in pedagogy, methodology, and the discipline being taught. The teaching profession is a noble profession; although it is sometimes underestimated, it is crucial in preparing the next generation, which will continue the nation's leadership in the future. Teachers are also required to serve as good role models in order to produce better generations in the future. The benchmark of whether a community is good or bad is its education. The main figure in the world of education is the teacher, because the teacher can single-handedly shape a student's future, for better or worse. The public's expectations of the education system's outcomes can be seen in students' academic and non-academic achievements. This research uses a quantitative descriptive method, with a partial correlation approach that analyzes the relationship (influence) between teachers' work discipline and student achievement. Data were collected by distributing a Likert-scale questionnaire containing a number of questions about the indicators of work discipline and student achievement. The data obtained were then entered into the partial correlation formula for each variable to test whether there is a significant relationship between teachers' work discipline and student achievement at SMA Negeri 1 Bumiayu, Brebes Regency, and the author used the coefficient of determination to analyze this relationship. The results show that teachers' work discipline affects student achievement: the relationship is 0.786, or 78.6%, which is classified as very strong.


2019 ◽  
Author(s):  
Sheng Shen ◽  
Daniel Fried ◽  
Jacob Andreas ◽  
Dan Klein
