semantic coherence
Recently Published Documents


Total documents: 109 (five years: 45)
H-index: 10 (five years: 2)

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sarah E. Morgan ◽  
Kelly Diederen ◽  
Petra E. Vértes ◽  
Samantha H. Y. Ip ◽  
Bo Wang ◽  
...  

Recent work has suggested that disorganised speech might be a powerful predictor of later psychotic illness in clinical high risk subjects. To that end, several automated measures to quantify disorganisation of transcribed speech have been proposed. However, it remains unclear which measures are most strongly associated with psychosis, how different measures are related to each other and what the best strategies are to collect speech data from participants. Here, we assessed whether twelve automated Natural Language Processing markers could differentiate transcribed speech excerpts from subjects at clinical high risk for psychosis, first episode psychosis patients and healthy control subjects (total N = 54). In line with previous work, several measures showed significant differences between groups, including semantic coherence, speech graph connectivity and a measure of whether speech was on-topic, the latter of which outperformed the related measure of tangentiality. Most NLP measures examined were only weakly related to each other, suggesting they provide complementary information. Finally, we compared the ability of transcribed speech generated using different tasks to differentiate the groups. Speech generated from picture descriptions of the Thematic Apperception Test and a story re-telling task outperformed free speech, suggesting that choice of speech generation method may be an important consideration. Overall, quantitative speech markers represent a promising direction for future clinical applications.
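The abstract does not define its semantic coherence marker. As a rough illustration only, one common operationalisation scores a transcript by the mean cosine similarity between consecutive sentences. The sketch below assumes that definition and uses toy bag-of-words vectors in place of real sentence embeddings; the function names are hypothetical and this is not necessarily the measure used in the study.

```python
import math
from collections import Counter


def bow_vector(sentence):
    """Toy bag-of-words vector (token counts); a stand-in for real embeddings."""
    return Counter(sentence.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def semantic_coherence(sentences):
    """Mean cosine similarity between each sentence and the one that follows it."""
    vectors = [bow_vector(s) for s in sentences]
    sims = [cosine(a, b) for a, b in zip(vectors, vectors[1:])]
    return sum(sims) / len(sims) if sims else 0.0
```

Under this assumed definition, a transcript whose adjacent sentences share vocabulary scores higher than one that jumps between unrelated topics.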


2021 ◽  
Author(s):  
John Meredith ◽  
Nik Whitehead ◽  
Michael Dacey

A FOXS stack assembles HL7 FHIR, openEHR, IHE XDS and SNOMED CT as an operational clinical data platform to build digital systems. This paper analyses its applicability for FAIR-enabled medical research based on a summary of key principles. It highlights the benefit of the blended approach to operational technology stacks for health systems, and a need for industry standard technologies to enable greater semantic coherence for primary/secondary data use.


2021 ◽  
Vol 28 (1) ◽  
pp. 219-230
Author(s):  
Michał Klementowicz

Every public speech, the homily included, should show the grounds on which its statements are to be accepted. This is one of the fundamental features of a correctly constructed statement. The theological content of a homily can be presented using the classical form of argumentum ex auctoritate. As an inductive structure, this argument may promote specific ways of organising the whole speech. Firstly, in the rhetorical structure of inventio, it allows the author to control the semantic coherence of the text. Secondly, in the structure of dispositio, the authority can make for an interesting use of rhetorical narrative in the homily text. Thirdly, argumentum ex auctoritate can be used to build the ethos of the preacher. Thanks to the above proposals, it is possible to influence the processuality of the text, which may determine the recognition and assimilation of its key elements, both of which are crucial for preaching.


2021 ◽  
Vol 7 (10) ◽  
pp. 420-424

This article considers the peculiarities of deixis and anaphora in speech discourse. The author argues that deictic and anaphoric relations have become independent objects of linguistic research because deixis is treated as a category of general activity theory and of communicative-functional, pragmatic and cognitive linguistics, whereas anaphora is treated as a category of text/discourse linguistics that provides structural-syntactic and semantic coherence. This explains the transition from studying syntactic anaphora, mainly pronouns, within the sentence/utterance to considering text/discourse anaphora.


2021 ◽  
pp. 174702182110463
Author(s):  
Mahmoud Elsherif ◽  
Linda Ruth Wheeldon ◽  
Steven Frisson

According to the lexical quality hypothesis (Perfetti, 2007), differences in the orthographic, semantic, and phonological representations of words will affect individual reading performance. Whilst several studies have focused on orthographic precision and semantic coherence, few have considered phonological precision. The present study used a suite of individual difference measures to assess which components of lexical quality contributed to competition resolution in a masked priming experiment. The experiment measured form priming for word and pseudoword targets with dense and sparse neighbourhoods in 84 university students. Individual difference measures of language and cognitive skills were also collected and a principal component analysis was used to group these data into components. The data showed that phonological precision and neighbourhood density (NHD) interacted with form priming. In participants with high phonological precision, the direction of priming for word targets with sparse neighbourhoods was facilitatory, while the direction for those with dense neighbourhoods was inhibitory. In contrast, people with low phonological precision showed the opposite pattern, but the interaction was non-significant. These results suggest that the component of phonological precision is linked to lexical competition for word recognition and that access to the mental lexicon during reading is affected by differing levels of phonological processing.


2021 ◽  
Vol 1 ◽  
pp. 651-660
Author(s):  
Joshua T. Gyory ◽  
Binyang Song ◽  
Jonathan Cagan ◽  
Christopher McComb

Human-artificial intelligence (AI)-assisted teaming is becoming a strategy for coalescing the complementary strengths of humans and computers to solve difficult tasks. Yet, there is still much to learn regarding how the integration of humans with AI agents into a team affects human behavior. Accordingly, this work begins to address this research gap by focusing specifically on how communication structure and interaction change within AI-assisted human teams. The underlying discourse data for this work originates from a prior research study in which teams solve an interdisciplinary drone design and path-planning problem. Several metrics are employed in this work to study team discourse, including count, diversity, content richness, and semantic coherence. Results show significant differences in communication behavior in AI-assisted teams, including more diversity and frequency in communication, more exchange of information regarding principal design parameters and problem-solving strategies, and more cohesion. Overall, this work takes meaningful steps towards understanding the effects of AI agents on human behavior in teams, which is critical for building effective human-AI hybrid teams in the future.


2021 ◽  
Author(s):  
Alejandro Garcia-Rudolph ◽  
Blanca Cegarra ◽  
Joan Sauri ◽  
John D. Kelleher ◽  
Katryna Cisek ◽  
...  

BACKGROUND: Topic modeling and word-embedding studies of Twitter data related to COVID-19 have been extensively reported. Reddit, another social media platform that saw a tremendous increase in new users and posts due to COVID-19, offers a much less explored alternative, especially its submission titles, because of their format (≤ 300 characters) and content rules. The positivity of self-presentation on social media influences both the quantity and quality of reactions (upvotes) from other social media contacts.
OBJECTIVE: 1) Expand on the concept of resilience by identifying possible related topics, considering their number of upvotes and their closest terms, and 2) associate specific emotions obtained from the state-of-the-art literature with their closest terms in order to relate such emotions to experienced situations.
METHODS: Reddit data were collected from pushshift.io with the pushshiftr R package; data cleaning and preprocessing were performed using the quanteda, tidyverse and tidytext R packages. A word2vec model (W2V) was trained on submission titles with the wordVectors R package; preliminary validation was performed using a subset of Mikolov’s analogies and a COVID-19 glossary. Main topics (represented as sets of words), with the number of upvotes as a covariate, were extracted using structural topic modelling (STM) with the spectral method in the stm R package. Topics were validated using semantic coherence and exclusivity. Clusters were assessed using the Dunn index.
RESULTS: We collected all 374,421 titles submitted by 104,351 different redditors to the r/Coronavirus subreddit between 20th January 2020 and 14th May 2021. We trained W2V and identified more than 20 valid analogies (e.g. doctor – hospital + teacher = school). We further validated W2V with representative terms extracted from a COVID-19 glossary; all closest terms retrieved by W2V were verified using state-of-the-art publications. STM retrieved 20 topics (with 20 words each) ordered by their number of upvotes. We ran W2V on a representative topic (addressing vaccines) and used two terms as seeds, leading to other related terms (represented using cluster analysis) that we validated using scientific publications. STM did not retrieve any topic containing the term “resilience”, which appeared in less than 0.02% of all titles. Nevertheless, we identified several closest terms (e.g. wellbeing, roadmap) and combined terms (e.g. resilience and elderly, resilience and indigenous), as well as specific emotions that W2V related to lived experiences (e.g. the emotion of gratitude associated with applause and balconies).
CONCLUSIONS: We applied, for the first time, the combination of STM and a word2vec model trained on a relatively small Coronavirus dataset of Reddit titles, yielding immediate and accurate terms that can be used to expand our knowledge of topics associated with the pandemic (e.g. vaccines) or of specific aspects such as resilience.
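For readers unfamiliar with the topic validation step, topic semantic coherence is usually computed with the co-occurrence score of Mimno et al. (2011): for a topic's top words, sum log((D(v_i, v_j) + 1) / D(v_j)) over every pair with j ranked above i, where D counts documents containing the given words. The Python sketch below is an illustrative re-implementation under that assumption, not the code of the stm package; the function name and the toy corpus in the usage note are hypothetical.

```python
import math


def topic_semantic_coherence(top_words, documents):
    """Mimno-style coherence for one topic.

    top_words: the topic's words, ordered from most to least probable.
    documents: iterable of token lists.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            denom = doc_freq(top_words[j])
            if denom == 0:
                # Assumption: skip words absent from the corpus to avoid dividing by zero.
                continue
            co_occurrences = doc_freq(top_words[i], top_words[j])
            score += math.log((co_occurrences + 1) / denom)
    return score
```

Scores are typically zero or negative, and values closer to zero indicate that a topic's top words tend to appear in the same documents. For instance, in a toy corpus where "vaccine" and "dose" co-occur often, that pair scores higher than "vaccine" and "teacher".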


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Miao Teng

In this paper, we conduct an in-depth study of Japanese keyword extraction from news reports. After text preprocessing, we train the word sets of an external document corpus into word vectors using the Skip-gram model in the deep learning tool Word2Vec and calculate the cosine distance between word vectors. The sliding window in TextRank is designed to connect internal document information and so improve in-text semantic coherence. The main idea is to use not only the statistical and structural features of words but also the semantic features of words extracted through word-embedding techniques, i.e., multifeature fusion, to obtain the importance weights of words themselves and the attraction weights between words, and then to iteratively calculate the final weight of each word through the graph model algorithm to determine the extracted keywords. To verify the performance of the algorithm, extensive simulation experiments were conducted on three different types of datasets. The experimental results show that the proposed keyword extraction algorithm can improve the performance by a maximum of 6.45% and 20.36% compared with the existing word frequency statistics and graph model methods, respectively; MF-Rank can achieve a maximum performance improvement of 1.76% compared with PW-TF.
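The graph-model step described above can be sketched with plain TextRank: words that fall inside the same sliding window are linked, and a PageRank-style iteration assigns each word a final weight. The code below is a minimal, unweighted sketch under stated assumptions; it omits the paper's multifeature fusion (word-vector similarity and statistical features), and the function names, window size, damping factor and iteration count are illustrative choices rather than the authors' settings.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=3):
    """Link distinct words that occur within `window` consecutive tokens."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def textrank(tokens, window=3, damping=0.85, iterations=50):
    """Iteratively score words on the undirected co-occurrence graph."""
    graph = build_cooccurrence_graph(tokens, window)
    scores = {word: 1.0 for word in graph}
    for _ in range(iterations):
        new_scores = {}
        for word in graph:
            # Each neighbour shares its score equally among its own neighbours.
            incoming = sum(scores[n] / len(graph[n]) for n in graph[word])
            new_scores[word] = (1 - damping) + damping * incoming
        scores = new_scores
    return scores


def top_keywords(tokens, k=3, **kwargs):
    """Return the k highest-scoring words, breaking ties alphabetically."""
    scores = textrank(tokens, **kwargs)
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

A fused variant would replace the equal sharing in `incoming` with edge weights derived from, for example, cosine similarity between word vectors.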


Author(s):  
Uttam Chauhan ◽  
Apurva Shah

A topic model is one of the best stochastic models for summarizing an extensive collection of text. It has achieved considerable success in text analysis as well as text summarization. It can be applied to a set of documents represented as bags of words, without considering grammar or word order. We modeled the topics of a corpus of Gujarati news articles. Because the Gujarati language has a diverse morphological structure and is inflectionally rich, processing Gujarati text is more complex. The size of the vocabulary plays an important role in the inference process and in the quality of topics. As the vocabulary size increases, the inference process becomes slower and topic semantic coherence decreases. If the vocabulary size is reduced, the topic inference process can be accelerated, and the quality of topics may also improve. In this work, we prepared a list of suffixes that occur very frequently with words in Gujarati text. Inflectional forms were reduced to their root words using the suffixes in the list. Moreover, Gujarati single-letter words were eliminated for faster inference and better topic quality. The experiments showed that reducing inflectional forms to their root words shrinks the vocabulary to a significant extent and makes the topic formation process quicker. Moreover, the reduction of inflectional forms and the removal of single-letter words enhanced the interpretability of topics. The interpretability of topics was assessed on semantic coherence, word length, and topic size. The experimental results showed improvements in the topical semantic coherence score. Also, the topic size grew notably as the number of tokens assigned to the topics increased.
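The preprocessing described above, stripping frequent suffixes and dropping single-letter words before topic inference, can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not give the authors' suffix list, so the suffixes in the usage note are English placeholders, and the function names and the `min_stem_len` guard are hypothetical.

```python
def strip_suffix(word, suffixes, min_stem_len=2):
    """Remove the longest matching suffix, keeping at least `min_stem_len` characters."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if suffix and word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def reduce_vocabulary(documents, suffixes):
    """Drop single-letter tokens and map inflected forms to a shared root."""
    return [
        [strip_suffix(token, suffixes) for token in doc if len(token) > 1]
        for doc in documents
    ]


def vocabulary_size(documents):
    """Count distinct tokens across all documents."""
    return len({token for doc in documents for token in doc})
```

With the placeholder suffixes `["ing", "ed", "s"]`, the documents `[["report", "reports", "reported", "a"], ["reporting", "vote", "votes"]]` shrink from seven distinct tokens to two. Note that `len(token)` counts Unicode code points, so for Gujarati script a consonant carrying a vowel sign has length two; a real pipeline would need a script-aware definition of "single letter".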

