Knowledge-Enhanced Graph Attention Network for Fact Verification

Mathematics ◽  
2021 ◽  
Vol 9 (16) ◽  
pp. 1949
Author(s):  
Chonghao Chen ◽  
Jianming Zheng ◽  
Honghui Chen

Fact verification aims to evaluate the authenticity of a given claim based on evidence sentences retrieved from Wikipedia articles. Existing works mainly leverage natural language inference methods to model the semantic interaction of claim and evidence, or further employ a graph structure to capture the relations between multiple pieces of evidence. However, previous methods have limited representation ability in encoding complicated units of claim and evidence, and thus cannot support sophisticated reasoning. In addition, the limited supervisory signal means the graph encoder cannot distinguish between different graph structures, which weakens its encoding ability. To address these issues, we propose a Knowledge-Enhanced Graph Attention network (KEGA) for fact verification, which introduces a knowledge integration module that enhances the representation of claims and evidence by incorporating external knowledge. Moreover, KEGA leverages an auxiliary loss based on contrastive learning to fine-tune the graph attention encoder and learn discriminative features for the evidence graph. Comprehensive experiments on FEVER, a large-scale benchmark dataset for fact verification, demonstrate the superiority of our proposal in both the multi-evidence and single-evidence scenarios. In addition, our findings show that background knowledge for words can effectively improve model performance.
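As a hedged illustration of the auxiliary contrastive objective (the abstract does not give the exact formulation), an NT-Xent-style loss over graph-level embeddings can be sketched as follows, where `z1[i]` and `z2[i]` are embeddings of two views of the same evidence graph and all names are hypothetical:

```python
import numpy as np

def contrastive_loss(z1, z2, tau=0.5):
    """NT-Xent-style contrastive loss: z1[i] and z2[i] are embeddings of
    two views of the same evidence graph; other rows act as negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau  # pairwise cosine similarities, temperature-scaled
    # log-softmax over each row; the positive pair sits on the diagonal
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

Minimising this loss pulls matched graph embeddings together and pushes mismatched ones apart, which is the discriminative pressure the paper attributes to its auxiliary objective.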


Author(s):  
Pauline Jacobson

This chapter examines the currently fashionable notion of ‘experimental semantics’, and argues that most work in natural language semantics has always been experimental. The oft-cited dichotomy between ‘theoretical’ (or ‘armchair’) and ‘experimental’ is bogus and should be dropped from the discourse. The same holds for dichotomies like ‘intuition-based’ (or ‘thought experiments’) vs. ‘empirical’ work (and ‘real experiments’). The so-called new ‘empirical’ methods are often nothing more than collecting large-scale ‘intuitions’, or doing multiple thought experiments. Of course the use of multiple subjects could well allow for a better experiment than the more traditional single- or few-subject methodologies. But whether or not this is the case depends entirely on the question at hand. In fact, the chapter considers several multiple-subject studies and shows that the particular methodology in those cases does not necessarily provide important insights, and argues that some of its claimed benefits are incorrect.



2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Olga Majewska ◽  
Charlotte Collins ◽  
Simon Baker ◽  
Jari Björne ◽  
Susan Windisch Brown ◽  
...  

Abstract Background Recent advances in representation learning have enabled large strides in natural language understanding; however, verbal reasoning remains a challenge for state-of-the-art systems. External sources of structured, expert-curated verb-related knowledge have been shown to boost model performance in different Natural Language Processing (NLP) tasks where accurate handling of verb meaning and behaviour is critical. The costliness and time required for manual lexicon construction have been a major obstacle to porting the benefits of such resources to NLP in specialised domains, such as biomedicine. To address this issue, we combine a neural classification method with expert annotation to create BioVerbNet. This new resource comprises 693 verbs assigned to 22 top-level and 117 fine-grained semantic-syntactic verb classes. We make this resource available complete with semantic roles and VerbNet-style syntactic frames. Results We demonstrate the utility of the new resource in boosting model performance in document- and sentence-level classification in biomedicine. We apply an established retrofitting method to harness the verb class membership knowledge from BioVerbNet and transform a pretrained word embedding space by pulling together verbs belonging to the same semantic-syntactic class. The BioVerbNet knowledge-aware embeddings surpass the non-specialised baseline by a significant margin on both tasks. Conclusion This work introduces the first large, annotated semantic-syntactic classification of biomedical verbs, providing a detailed account of the annotation process, the key differences in verb behaviour between the general and biomedical domain, and the design choices made to accurately capture the meaning and properties of verbs used in biomedical texts. The demonstrated benefits of leveraging BioVerbNet in text classification suggest the resource could help systems better tackle challenging NLP tasks in biomedicine.
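The retrofitting step can be illustrated with a Faruqui-style update that pulls together vectors of verbs sharing a class while staying anchored to the original embeddings. This is a minimal sketch under assumed toy inputs, not the authors' exact procedure:

```python
import numpy as np

def retrofit(embeddings, classes, alpha=1.0, iters=10):
    """Pull vectors of verbs in the same class together (a Faruqui-style
    retrofitting sketch). embeddings: dict word -> vector; classes: list
    of word lists, each list one semantic-syntactic class."""
    # build neighbour sets from shared class membership
    neighbours = {w: set() for w in embeddings}
    for cls in classes:
        for w in cls:
            if w in neighbours:
                neighbours[w].update(x for x in cls if x != w and x in embeddings)
    new = {w: v.copy() for w, v in embeddings.items()}
    for _ in range(iters):
        for w, nbrs in neighbours.items():
            if not nbrs:
                continue  # words outside any class keep their original vector
            # weighted average of the original vector and current neighbours
            new[w] = (alpha * embeddings[w] + sum(new[n] for n in nbrs)) / (alpha + len(nbrs))
    return new
```

After retrofitting, class-mates end up closer in the space, which is the property the knowledge-aware embeddings exploit in the classification tasks.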



2021 ◽  
Author(s):  
Maxwell Adam Levinson ◽  
Justin Niestroy ◽  
Sadnan Al Manir ◽  
Karen Fairchild ◽  
Douglas E. Lake ◽  
...  

Abstract Results of computational analyses require transparent disclosure of their supporting resources, while the analyses themselves often can be very large scale and involve multiple processing steps separated in time. Evidence for the correctness of any analysis should include not only a textual description, but also a formal record of the computations which produced the result, including accessible data and software with runtime parameters, environment, and personnel involved. This article describes FAIRSCAPE, a reusable computational framework, enabling simplified access to modern scalable cloud-based components. FAIRSCAPE fully implements the FAIR data principles and extends them to provide fully FAIR Evidence, including machine-interpretable provenance of datasets, software and computations, as metadata for all computed results. The FAIRSCAPE microservices framework creates a complete Evidence Graph for every computational result, including persistent identifiers with metadata, resolvable to the software, computations, and datasets used in the computation; and stores a URI to the root of the graph in the result’s metadata. An ontology for Evidence Graphs, EVI (https://w3id.org/EVI), supports inferential reasoning over the evidence. FAIRSCAPE can run nested or disjoint workflows and preserves provenance across them. It can run Apache Spark jobs, scripts, workflows, or user-supplied containers. All objects are assigned persistent IDs, including software. All results are annotated with FAIR metadata using the evidence graph model for access, validation, reproducibility, and re-use of archived data and software.
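A minimal sketch of what one node of such an evidence graph might look like as linked metadata; the `evi:` field names and `ark:` identifiers below are illustrative assumptions in the spirit of the EVI ontology, not the actual FAIRSCAPE schema:

```python
import json

def evidence_graph(result_id, software, datasets, computation_id):
    """Build an illustrative evidence-graph record: a result node linked,
    via its generating computation, to the software and datasets used.
    Field names and identifiers are hypothetical."""
    return {
        "@id": result_id,
        "@type": "evi:Result",
        "evi:wasGeneratedBy": {
            "@id": computation_id,
            "@type": "evi:Computation",
            "evi:usedSoftware": [{"@id": s} for s in software],
            "evi:usedDataset": [{"@id": d} for d in datasets],
        },
    }

graph = evidence_graph(
    "ark:99999/result-1",
    software=["ark:99999/script-1"],
    datasets=["ark:99999/data-1", "ark:99999/data-2"],
    computation_id="ark:99999/run-1",
)
print(json.dumps(graph, indent=2))
```

Storing the `@id` of such a root node in a result's metadata is what makes the full provenance chain resolvable from any published result.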



2021 ◽  
Vol 9 (6) ◽  
pp. 635
Author(s):  
Hyeok Jin ◽  
Kideok Do ◽  
Sungwon Shin ◽  
Daniel Cox

Coastal dunes are important morphological features for both ecosystems and coastal hazard mitigation. Because understanding and predicting dune erosion phenomena is very important, various numerical models have been developed to improve predictive accuracy. In the present study, a process-based model (XBeachX) was tested and calibrated to improve the accuracy of the simulation of dune erosion from a storm event by adjusting the coefficients in the model and comparing it with large-scale experimental data. The breaker slope coefficient was calibrated to predict cross-shore wave transformation more accurately. To improve the prediction of the dune erosion profile, the coefficients related to skewness and asymmetry were adjusted. Moreover, the bermslope coefficient was calibrated to improve the simulation performance of the bermslope near the dune face. Model performance was assessed based on model-data comparisons. The calibrated XBeachX successfully predicted wave transformation and dune erosion phenomena. In addition, the results obtained from two other similar dune erosion experiments with the same calibrated set matched well with the observed wave and profile data. However, the prediction of underwater sandbar evolution remains a challenge.
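Model-data comparisons for beach profile evolution are often summarised with the Brier Skill Score. As an illustration only (the abstract does not specify which skill metrics were used), it compares model error against the error of leaving the initial profile unchanged:

```python
import numpy as np

def brier_skill_score(observed, modelled, baseline):
    """Brier Skill Score for profile validation (illustrative metric).
    observed/modelled/baseline: bed elevations at the same cross-shore
    points; baseline is typically the pre-storm profile.
    1 = perfect agreement; <= 0 = no better than the baseline."""
    return 1.0 - np.mean((modelled - observed) ** 2) / np.mean((baseline - observed) ** 2)
```

A calibrated coefficient set that raises this score across independent experiments, as described above, is evidence that the calibration generalises rather than overfitting one storm event.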



Author(s):  
Lianli Gao ◽  
Pengpeng Zeng ◽  
Jingkuan Song ◽  
Yuan-Fang Li ◽  
Wu Liu ◽  
...  

To date, visual question answering (VQA) (i.e., image QA and video QA) is still a holy grail in vision and language understanding, especially for video QA. Compared with image QA, which focuses primarily on understanding the associations between image region-level details and corresponding questions, video QA requires a model to jointly reason across both spatial and long-range temporal structures of a video as well as text to provide an accurate answer. In this paper, we specifically tackle the problem of video QA by proposing a Structured Two-stream Attention network, namely STA, to answer a free-form or open-ended natural language question about the content of a given video. First, we infer rich long-range temporal structures in videos using our structured segment component and encode text features. Then, our structured two-stream attention component simultaneously localizes important visual instances, reduces the influence of background video and focuses on the relevant text. Finally, the structured two-stream fusion component incorporates different segments of query and video aware context representation and infers the answers. Experiments on the large-scale video QA dataset TGIF-QA show that our proposed method significantly surpasses the best counterpart (i.e., with one representation for the video input) by 13.0%, 13.5%, 11.0% and 0.3 for the Action, Trans., FrameQA and Count tasks. It also outperforms the best competitor (i.e., with two representations) on the Action, Trans., and FrameQA tasks by 4.1%, 4.7%, and 5.1%.



Author(s):  
Siva Reddy ◽  
Mirella Lapata ◽  
Mark Steedman

In this paper we introduce a novel semantic parsing approach to query Freebase in natural language without requiring manual annotations or question-answer pairs. Our key insight is to represent natural language via semantic graphs whose topology shares many commonalities with Freebase. Given this representation, we conceptualize semantic parsing as a graph matching problem. Our model converts sentences to semantic graphs using CCG and subsequently grounds them to Freebase guided by denotations as a form of weak supervision. Evaluation experiments on a subset of the Free917 and WebQuestions benchmark datasets show our semantic parser improves over the state of the art.



2021 ◽  
Author(s):  
Moctar Dembélé ◽  
Bettina Schaefli ◽  
Grégoire Mariéthoz

The diversity of remotely sensed or reanalysis-based rainfall data steadily increases, which on the one hand opens new perspectives for large-scale hydrological modelling in data-scarce regions, but on the other hand poses challenging questions regarding parameter identification and transferability under multiple input datasets. This study analyzes the variability of hydrological model performance when (1) a set of parameters is transferred from the calibration input dataset to a different meteorological dataset and, reversely, when (2) an input dataset is used with a parameter set originally calibrated for a different input dataset.

The research objective is to highlight the uncertainties related to input data and the limitations of hydrological model parameter transferability across input datasets. An ensemble of 17 rainfall datasets and 6 temperature datasets from satellite and reanalysis sources (Dembélé et al., 2020), corresponding to 102 combinations of meteorological data, is used to force the fully distributed mesoscale Hydrologic Model (mHM). The mHM model is calibrated for each combination of meteorological datasets, resulting in 102 calibrated parameter sets, almost all of which give similar model performance. Each of the 102 parameter sets is used to run the mHM model with each of the 102 input datasets, yielding 10,404 scenarios that serve for the transferability tests. The experiment is carried out for a decade from 2003 to 2012 in the large and data-scarce Volta River basin (415,600 km²) in West Africa.

The results show that there is a high variability in model performance for streamflow (mean CV = 105%) when the parameters are transferred from the original input dataset to other input datasets (test 1 above). Moreover, the model performance is in general lower and can drop considerably when parameters obtained under all other input datasets are transferred to a selected input dataset (test 2 above). This underlines the need for model performance evaluation when different input datasets and parameter sets than those used during calibration are used to run a model. Our results represent a first step to tackle the question of parameter transferability to climate change scenarios. An in-depth analysis of the results at a later stage will shed light on which model parameterizations might be the main source of performance variability.

Dembélé, M., Schaefli, B., van de Giesen, N., & Mariéthoz, G. (2020). Suitability of 17 rainfall and temperature gridded datasets for large-scale hydrological modelling in West Africa. Hydrology and Earth System Sciences (HESS). https://doi.org/10.5194/hess-24-5379-2020



2021 ◽  
Author(s):  
Xinxu Shen ◽  
Troy Houser ◽  
David Victor Smith ◽  
Vishnu P. Murty

The use of naturalistic stimuli, such as narrative movies, is gaining popularity in many fields characterizing memory, affect, and decision-making. Narrative recall paradigms are often used to capture the complexity and richness of memory for naturalistic events. However, scoring narrative recalls is time-consuming and prone to human biases. Here, we show the validity and reliability of using a natural language processing tool, the Universal Sentence Encoder (USE), to automatically score narrative recall. We compared the reliability of scoring between two independent raters (i.e., hand-scored) and between our automated algorithm and individual raters (i.e., automated) on trial-unique video clips of magic tricks. Study 1 showed that our automated segmentation approaches yielded high reliability and reflected measures yielded by hand-scoring, and further that the results using USE outperformed another popular natural language processing tool, GloVe. In Study 2, we tested whether our automated approach remained valid when testing individuals varying on clinically relevant dimensions that influence episodic memory: age and anxiety. We found that our automated approach was equally reliable across both age groups and anxiety groups, which shows the efficacy of our approach for assessing narrative recall in large-scale individual difference analyses. In sum, these findings suggest that machine learning approaches implementing USE are a promising tool for scoring large-scale narrative recalls and performing individual difference analyses in research using naturalistic stimuli.
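Automated scoring of this kind rests on comparing an embedding of a participant's recall against an embedding of the reference event description. A minimal sketch, using a toy bag-of-words embedder as a stand-in for USE (which would normally be loaded from TensorFlow Hub); all names here are illustrative:

```python
import numpy as np

def embed(text, vocab):
    """Toy bag-of-words embedding standing in for the Universal
    Sentence Encoder: one count per vocabulary word."""
    words = text.lower().split()
    return np.array([words.count(w) for w in vocab], dtype=float)

def recall_score(recall, reference, vocab):
    """Cosine similarity between a participant's recall and the
    reference description: the core quantity behind automated scoring."""
    a, b = embed(recall, vocab), embed(reference, vocab)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0
```

With a real sentence encoder in place of `embed`, the same cosine comparison can be applied to every recall at scale, which is what removes the hand-scoring bottleneck.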





2009 ◽  
Vol 137 (11) ◽  
pp. 4030-4046 ◽  
Author(s):  
Daniel F. Steinhoff ◽  
Saptarshi Chaudhuri ◽  
David H. Bromwich

Abstract A case study illustrating cloud processes and other features associated with the Ross Ice Shelf airstream (RAS), in Antarctica, is presented. The RAS is a semipermanent low-level wind regime primarily over the western Ross Ice Shelf, linked to the midlatitude circulation and formed from terrain-induced and large-scale forcing effects. An integrated approach utilizes Moderate Resolution Imaging Spectroradiometer (MODIS) satellite imagery, automatic weather station (AWS) data, and Antarctic Mesoscale Prediction System (AMPS) forecast output to study the synoptic-scale and mesoscale phenomena involved in cloud formation over the Ross Ice Shelf during a RAS event. A synoptic-scale cyclone offshore of Marie Byrd Land draws moisture across West Antarctica to the southern base of the Ross Ice Shelf. Vertical lifting associated with flow around the Queen Maud Mountains leads to cloud formation that extends across the Ross Ice Shelf to the north. The low-level cloud has a warm signature in thermal infrared imagery, resembling a surface feature of turbulent katabatic flow typically ascribed to the RAS. Strategically placed AWS sites allow assessment of model performance within and outside of the RAS signature. AMPS provides realistic simulation of conditions aloft but experiences problems at low levels due to issues with the model PBL physics. Key meteorological features of this case study, within the context of previous studies on longer time scales, are inferred to be common occurrences. The assumption that warm thermal infrared signatures are surface features is found to be too restrictive.


