semantic distances
Recently Published Documents


TOTAL DOCUMENTS: 28 (FIVE YEARS: 12)

H-INDEX: 4 (FIVE YEARS: 1)

2021 ◽  
Vol 5 (4) ◽  
pp. 77
Author(s):  
Asra Fatima ◽  
Ying Li ◽  
Thomas Trenholm Hills ◽  
Massimo Stella

Most current affect scales and sentiment analysis on written text focus on quantifying valence/sentiment, the primary dimension of emotion. Distinguishing broader, more complex negative emotions of similar valence is key to evaluating mental health. We propose a semi-supervised machine learning model, DASentimental, to extract depression, anxiety, and stress from written text. We trained DASentimental to identify how N = 200 sequences of recalled emotional words correlate with recallers’ depression, anxiety, and stress from the Depression Anxiety Stress Scale (DASS-21). Using cognitive network science, we modeled every recall list as a bag-of-words (BOW) vector and as a walk over a network representation of semantic memory (in this case, free associations). This weights BOW entries according to their centrality (degree) in semantic memory and informs recalls using semantic network distances, thus embedding recalls in a cognitive representation. This embedding translated into state-of-the-art, cross-validated predictions for depression (R = 0.7), anxiety (R = 0.44), and stress (R = 0.52), equivalent to previous results employing additional human data. Powered by a multilayer perceptron neural network, DASentimental opens the door to probing the semantic organizations of emotional distress. We found that semantic distances between recalls (i.e., walk coverage) were key for estimating depression levels but redundant for anxiety and stress levels. Semantic distances from “fear” boosted anxiety predictions but were redundant when the “sad–happy” dyad was considered. We applied DASentimental to a clinical dataset of 142 suicide notes and found that the predicted depression and anxiety levels (high/low) corresponded to differences in valence and arousal as expected from a circumplex model of affect. We discuss key directions for future research enabled by artificial intelligence detecting stress, anxiety, and depression in texts.
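
As a rough illustration of semantic network distances between consecutive recalls, the sketch below sums shortest-path lengths along a recall list over a small word-association graph. The graph, the word list, and the reading of "walk coverage" as total path length are all assumptions made for this example; they are not taken from the DASentimental code.

```python
from collections import deque


def shortest_path_length(graph, source, target):
    """Breadth-first search for the number of edges between two words."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # target not reachable from source


def walk_coverage(graph, recalls):
    """Total network distance travelled between consecutive recalled words."""
    total = 0
    for current, following in zip(recalls, recalls[1:]):
        dist = shortest_path_length(graph, current, following)
        if dist is None:
            raise ValueError(f"no path between {current!r} and {following!r}")
        total += dist
    return total


# Hypothetical free-association network (undirected, stored as adjacency lists).
toy_graph = {
    "sad": ["cry", "happy"],
    "cry": ["sad", "tears"],
    "happy": ["sad", "joy"],
    "tears": ["cry"],
    "joy": ["happy"],
}

print(walk_coverage(toy_graph, ["tears", "sad", "joy"]))
```

A longer total suggests that the recall list jumps between more distant regions of the network.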


Entropy ◽  
2021 ◽  
Vol 23 (12) ◽  
pp. 1614
Author(s):  
Justyna Golec ◽  
Tomasz Hachaj ◽  
Grzegorz Sokal

We propose an algorithm to generate graphical summaries of longer text passages using a set of illustrative pictures (TIPS). TIPS uses a voting process over the results of individual “weak” algorithms. The proposed method includes a summarising algorithm that generates a digest of the input document. Each sentence of the text summary is used separately as the input for further processing by the sentence transformer. A sentence transformer performs text embedding, and a group of CLIP similarity-based algorithms trained on different image embeddings finds semantic distances between images in the illustration image database and the input text. A voting process extracts the images that best match the text. The TIPS algorithm allows the integration of the best (highest scored) results of the different recommendation algorithms by diminishing the influence of images that are a disjointed part of the recommendations of the component algorithms. TIPS returns a set of illustrative images that describe each sentence of the text summary. Three human judges found that the use of TIPS resulted in an increase in matching highly relevant images to text, ranging from 5% to 8%, and images relevant to text, ranging from 3% to 7%, compared to the approach based on single-embedding schema.
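
The voting idea can be sketched as follows. This is a minimal Borda-style vote over ranked lists, assuming each "weak" algorithm returns image identifiers in order of preference; the function name, the point scheme, and the image ids are hypothetical and are not the TIPS implementation.

```python
from collections import defaultdict


def vote(rankings, top_n=3):
    """Combine ranked image lists from several rankers by rank-weighted voting."""
    scores = defaultdict(float)
    for ranked in rankings:
        k = len(ranked)
        for position, image_id in enumerate(ranked):
            # Borda-style points: first place earns k, last place earns 1.
            scores[image_id] += k - position
    # Sort by total points (descending), then by id for a deterministic order.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [image_id for image_id, _ in ordered[:top_n]]


# Hypothetical output of three component algorithms for one summary sentence.
rankings = [
    ["img_a", "img_b", "img_c"],
    ["img_b", "img_a", "img_d"],
    ["img_b", "img_e", "img_a"],
]

print(vote(rankings, top_n=2))
```

Images that appear in only one list (here `img_c`, `img_d`, `img_e`) collect few points, which loosely mirrors the stated goal of diminishing recommendations that the component algorithms do not share.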


Healthcare ◽  
2021 ◽  
Vol 9 (9) ◽  
pp. 1115
Author(s):  
Yogi Tri Prasetyo ◽  
Ratna Sari Dewi ◽  
Naiomi M. Balatbat ◽  
Michael Lancelot B. Antonio ◽  
Thanatorn Chuenyindee ◽  
...  

Icons have been widely utilized to describe and promote COVID-19 prevention measures. The purpose of this study was to analyze the preference and subjective design features of 133 existing icons associated with COVID-19 prevention measures published by the health and medical organizations of different countries. The 133 icons represent nineteen different function names, such as “Wash Hands” and “Wear Face Mask”. A total of 57 participants were recruited to perform two different tests: a ranking test and a subjective rating test. The ranking test was conducted to elicit the preference ranking of seven icon designs representing each function name. It was followed by a subjective rating test using 13 semantic scales on the two most preferred icons to analyze their perceived quality. Spearman’s correlation was applied to derive the possible correlations between users’ rankings and the semantic scales, and Friedman’s test was also performed to determine the true difference between rankings in terms of each semantic scale to provide a fully meaningful interpretation of the data. Generally, findings from the current study showed that the image presented in the icon is the key point that affects the icons’ perceived quality. Interestingly, Spearman’s correlation analysis between preference ranking and semantic scales showed that vague–clear, weak–strong, incompatible–compatible, and ineffective–effective were the four strongest semantic scales that highly correlated with the preference ranking. Considering the significant relationships between the semantic distances and the functions, images depicted in an icon should be realistic and as close as possible to its respective function to cater to users’ preferences. In addition, the results of Spearman’s correlation and Friedman’s test also inferred that compatibility and clarity of icon elements are the main factors determining a particular icon’s preferability.
This study is the first comprehensive study to evaluate the icons associated with the COVID-19 prevention measures. The findings of this study can be utilized as the basis for redesigning icons, particularly for icons related to COVID-19 prevention measures. Furthermore, the approach can also be applied and extended for evaluating other medical icons.


Author(s):  
Yuxuan Shi ◽  
Gong Cheng ◽  
Trung-Kien Tran ◽  
Jie Tang ◽  
Evgeny Kharlamov

Exploring complex structured knowledge graphs (KGs) is challenging for non-experts as it requires knowledge of query languages and the underlying structure of the KGs. Keyword-based exploration is a convenient paradigm, and computing a group Steiner tree (GST) as an answer is a popular implementation. Recent studies suggested improving the cohesiveness of an answer where entities have small semantic distances from each other. However, how to efficiently compute such an answer is open. In this paper, to model cohesiveness in a generalized way, the quadratic group Steiner tree problem (QGSTP) is formulated where the cost function extends GST with quadratic terms representing semantic distances. For QGSTP we design a branch-and-bound best-first (B3F) algorithm where we exploit combinatorial methods to estimate lower bounds for costs. This exact algorithm shows practical performance on medium-sized KGs.
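
To make the quadratic cost concrete, the sketch below evaluates a candidate answer tree as the sum of its edge weights plus a pairwise semantic-distance term over its vertices. The exact form of the QGSTP cost function is not given here, so the weighting parameter `alpha`, the data layout, and the toy values are assumptions for illustration only.

```python
from itertools import combinations


def quadratic_tree_cost(edges, edge_weight, sem_dist, alpha=1.0):
    """Linear edge cost plus a quadratic term over all vertex pairs in the tree."""
    nodes = sorted({vertex for edge in edges for vertex in edge})
    linear = sum(edge_weight[edge] for edge in edges)
    quadratic = sum(
        sem_dist[frozenset(pair)] for pair in combinations(nodes, 2)
    )
    return linear + alpha * quadratic


# Hypothetical answer tree a - b - c with edge weights and semantic distances.
edges = [("a", "b"), ("b", "c")]
edge_weight = {("a", "b"): 1.0, ("b", "c"): 2.0}
sem_dist = {
    frozenset({"a", "b"}): 0.2,
    frozenset({"a", "c"}): 0.9,
    frozenset({"b", "c"}): 0.4,
}

print(quadratic_tree_cost(edges, edge_weight, sem_dist))
```

Setting `alpha=0` recovers a plain edge-weight cost, so the quadratic term is what penalises answers whose entities are semantically far apart.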


2021 ◽  
Vol 12 ◽  
Author(s):  
Sunghye Cho ◽  
Naomi Nevler ◽  
Natalia Parjane ◽  
Christopher Cieri ◽  
Mark Liberman ◽  
...  

The letter-guided naming fluency task is a measure of an individual’s executive function and working memory. This study employed a novel, automated, quantifiable, and reproducible method to investigate how language characteristics of words produced during a fluency task are related to fluency performance and inter-word response time (RT), and how they change over the task duration, using digitized F-letter-guided fluency recordings produced by 76 young healthy participants. Our automated algorithm counted the number of correct responses from the transcripts of the F-letter fluency data, and individual words were rated for concreteness, ambiguity, frequency, familiarity, and age of acquisition (AoA). Using a forced aligner, the transcripts were automatically aligned with the corresponding audio recordings. We measured inter-word RT, word duration, and word start time from the forced alignments. Articulation rate was also computed. Phonetic and semantic distances between two consecutive F-letter words were measured. We found that total F-letter score was significantly correlated with the mean values of word frequency, familiarity, AoA, word duration, phonetic similarity, and articulation rate; total score was also correlated with an individual’s standard deviation of AoA, familiarity, and phonetic similarity. RT was negatively correlated with frequency and ambiguity of F-letter words and was positively correlated with AoA, number of phonemes, and phonetic and semantic distances. Lastly, the frequency, ambiguity, AoA, number of phonemes, and semantic distance of words produced significantly changed over time during the task. The method employed in this paper demonstrates the successful implementation of our automated language processing pipelines in a standardized neuropsychological task. This novel approach captures subtle and rich language characteristics during test performance that enhance informativeness and cannot be extracted manually without massive effort.
This work will serve as the reference for letter-guided category fluency production similarly acquired in neurodegenerative patients.


PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0254034
Author(s):  
Sotaro Shibayama ◽  
Deyun Yin ◽  
Kuniko Matsumoto

Novelty is a core value in science, and a reliable measurement of novelty is crucial. This study proposes a new approach to measuring the novelty of scientific articles based on both citation data and text data. The proposed approach considers an article to be novel if it cites a combination of semantically distant references. To this end, we first assign a word embedding (a vector representation of each vocabulary item) to each cited reference on the basis of text information included in the reference. With these vectors, a distance between every pair of references is computed. Finally, the novelty of a focal document is evaluated by summarizing the distances between all references. The approach draws on limited text information (the titles of references) and a publicly shared library for word embeddings, which minimizes the requirement of data access and computational cost. We share the code, with which one can compute the novelty score of a document of interest only by having the focal document’s reference list. We validate the proposed measure through three exercises. First, we confirm that word embeddings can be used to quantify semantic distances between documents by comparing with an established bibliometric distance measure. Second, we confirm the criterion-related validity of the proposed novelty measure with self-reported novelty scores collected from a questionnaire survey. Finally, as novelty is known to be correlated with future citation impact, we confirm that the proposed measure can predict future citation.
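
The three steps (embed each reference, compute pairwise distances, summarize them) might look roughly like the sketch below. It assumes a reference vector is the mean of its title's word vectors and uses the maximum pairwise cosine distance as the summary; both choices, along with the toy vocabulary, are assumptions for illustration and not the authors' shared code.

```python
import math
from itertools import combinations


def title_vector(title, word_vectors):
    """Average the vectors of the title words found in the vocabulary."""
    found = [word_vectors[w] for w in title.lower().split() if w in word_vectors]
    if not found:
        return None
    dim = len(found[0])
    return [sum(vec[i] for vec in found) / len(found) for i in range(dim)]


def cosine_distance(u, v):
    """1 minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def novelty_score(reference_titles, word_vectors):
    """Summarize pairwise reference distances (here: the largest one)."""
    vectors = [title_vector(t, word_vectors) for t in reference_titles]
    vectors = [v for v in vectors if v is not None]
    distances = [cosine_distance(u, v) for u, v in combinations(vectors, 2)]
    return max(distances) if distances else 0.0


# Hypothetical two-dimensional word vectors standing in for a real embedding.
toy_vectors = {
    "gene": [1.0, 0.0],
    "expression": [1.0, 0.0],
    "market": [0.0, 1.0],
    "pricing": [0.0, 1.0],
}

print(novelty_score(["Gene expression", "Market pricing", "Gene pricing"], toy_vectors))
```

A reference list drawn from a single topic yields a score near zero, while one that combines distant topics yields a score near one.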


2021 ◽  
Vol 118 (25) ◽  
pp. e2022340118
Author(s):  
Jay A. Olson ◽  
Johnny Nahas ◽  
Denis Chmoulevitch ◽  
Simon J. Cropper ◽  
Margaret E. Webb

Several theories posit that creative people are able to generate more divergent ideas. If this is correct, simply naming unrelated words and then measuring the semantic distance between them could serve as an objective measure of divergent thinking. To test this hypothesis, we asked 8,914 participants to name 10 words that are as different from each other as possible. A computational algorithm then estimated the average semantic distance between the words; related words (e.g., cat and dog) have shorter distances than unrelated ones (e.g., cat and thimble). We predicted that people producing greater semantic distances would also score higher on traditional creativity measures. In Study 1, we found moderate to strong correlations between semantic distance and two widely used creativity measures (the Alternative Uses Task and the Bridge-the-Associative-Gap Task). In Study 2, with participants from 98 countries, semantic distances varied only slightly by basic demographic variables. There was also a positive correlation between semantic distance and performance on a range of problems known to predict creativity. Overall, semantic distance correlated at least as strongly with established creativity measures as those measures did with each other. Naming unrelated words in what we call the Divergent Association Task can thus serve as a brief, reliable, and objective measure of divergent thinking.
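
A minimal sketch of the scoring step, assuming each word has a vector and that "semantic distance" means cosine distance averaged over all word pairs. The two-dimensional vectors below are made up for the example; a real score would rely on a pretrained word-embedding model.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """1 minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def mean_pairwise_distance(words, vectors):
    """Average cosine distance over every unordered pair of words."""
    pairs = list(combinations(words, 2))
    if not pairs:
        raise ValueError("need at least two words")
    return sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)


# Hypothetical vectors: "cat" and "dog" point the same way, "thimble" does not.
toy_vectors = {
    "cat": [1.0, 0.1],
    "dog": [0.9, 0.2],
    "thimble": [0.0, 1.0],
}

related = mean_pairwise_distance(["cat", "dog"], toy_vectors)
unrelated = mean_pairwise_distance(["cat", "thimble"], toy_vectors)
print(related < unrelated)
```

With ten words there are 45 pairs, so a single near-synonym among the responses lowers the average only slightly.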


2021 ◽  
pp. 102986492110015
Author(s):  
Lindsey Reymore

This paper offers a series of characterizations of prototypical musical timbres, called Timbre Trait Profiles, for 34 musical instruments common in Western orchestras and wind ensembles. These profiles represent the results of a study in which 243 musician participants imagined the sounds of various instruments and used the 20-dimensional model of musical instrument timbre qualia proposed by Reymore and Huron (2020) to rate their auditory image of each instrument. The rating means are visualized through radar plots, which provide timbral-linguistic thumbprints, and are summarized through snapshot profiles, which catalog the six highest- and three lowest-rated descriptors. The Euclidean distances among instruments offer a quantitative operationalization of semantic distances; these distances are illustrated through hierarchical clustering and multidimensional scaling. Exploratory Factor Analysis is used to analyze the latent structure of the rating data. Finally, results are used to assess Reymore and Huron’s 20-dimensional timbre qualia model, suggesting that the model is highly reliable. It is anticipated that the Timbre Trait Profiles can be applied in future perceptual/cognitive research on timbre and orchestration, in music theoretical analysis for both close readings and corpus studies, and in orchestration pedagogy.
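
The use of Euclidean distance between rating profiles can be illustrated as below. The three-value profiles are invented stand-ins for the 20-dimensional mean ratings, and the instrument values are not taken from the study.

```python
import math

# Hypothetical mean ratings on three descriptors (the model itself has 20).
profiles = {
    "flute": [6.0, 2.0, 1.0],
    "piccolo": [6.5, 1.5, 1.0],
    "tuba": [1.0, 6.0, 5.5],
}


def nearest_instrument(name, profiles):
    """Return the other instrument whose profile is closest in Euclidean distance."""
    others = (other for other in profiles if other != name)
    return min(others, key=lambda other: math.dist(profiles[name], profiles[other]))


print(nearest_instrument("flute", profiles))
print(round(math.dist(profiles["flute"], profiles["tuba"]), 3))
```

The full matrix of such distances is the kind of input that hierarchical clustering and multidimensional scaling operate on.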


2021 ◽  
Author(s):  
Wim Pouw ◽  
Jan de Wit ◽  
Sara Bögels ◽  
Marlou Rasenberg ◽  
Branka Milivojevic ◽  
...  

Most manual communicative gestures that humans produce cannot be looked up in a dictionary, as these manual gestures inherit their meaning in large part from the communicative context and are not conventionalized. However, it is understudied to what extent the communicative signal as such (bodily postures in movement, or kinematics) can inform about gesture semantics. Can we construct, in principle, a distribution-based semantics of gesture kinematics, similar to how word vectorization methods in NLP (Natural Language Processing) are now widely used to study semantic properties in text and speech? For such a project to get off the ground, we need to know the extent to which semantically similar gestures are more likely to be kinematically similar. In study 1 we assess whether semantic word2vec distances between the concepts participants were explicitly instructed to convey in silent gestures relate to the kinematic distances of these gestures as obtained from Dynamic Time Warping (DTW). In a second director-matcher dyadic study we assess kinematic similarity between spontaneous co-speech gestures produced between interacting participants. Participants were asked before and after they interacted how they would name the objects. The semantic distances between the resulting names were related to the kinematic distances of gestures that were made in the context of conveying those objects in the interaction. We find that the gestures’ semantic relatedness is reliably predictive of kinematic relatedness across these highly divergent studies, which suggests that the development of an NLP method of deriving semantic relatedness from kinematics is a promising avenue for future developments in automated multimodal recognition. Deeper implications for statistical learning processes in multimodal language are discussed.
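
For readers unfamiliar with Dynamic Time Warping, the sketch below implements the textbook dynamic-programming form on one-dimensional sequences. Real gesture kinematics are multivariate time series, so the scalar inputs and the absolute-difference step cost are simplifying assumptions made for this example.

```python
def dtw_distance(a, b):
    """Classic DTW: minimal cumulative cost of aligning sequence a with sequence b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i items of a with the first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The same movement performed at two speeds aligns with zero cost.
print(dtw_distance([0, 1, 2], [0, 1, 1, 2]))
# A constant offset cannot be warped away.
print(dtw_distance([0, 0, 0], [1, 1, 1]))
```

Because the alignment may stretch or compress time, two gestures with the same trajectory but different tempo come out as kinematically close.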


2020 ◽  
Author(s):  
Jay A. Olson ◽  
Johnny Nahas ◽  
Denis Chmoulevitch ◽  
Margaret E Webb

Several theories posit that creative people are able to generate more divergent ideas. If this is correct, the simple act of naming unrelated words and then measuring the semantic distance between them could serve as an objective measure of creativity. To test this hypothesis, we asked 8,892 participants to name 10 words that are as different from each other as possible. A computational algorithm then estimated the average semantic distance between the words; related words (e.g., “cat” and “dog”) have shorter distances than unrelated ones (e.g., “cat” and “thimble”). We predicted that people producing greater semantic distances would also score higher on traditional creativity measures. In Study 1, there were moderate to strong correlations between semantic distance and two other creativity measures (the Alternative Uses Task and the Bridge-the-Associative-Gap Task). In Study 2, with participants from 98 countries, semantic distances varied only slightly by demographic variables, which suggests that the measure can be used without modification across diverse populations. There was also a positive correlation between semantic distance and performance on problem-solving tasks known to predict creativity. Overall, semantic distance correlated at least as strongly with established creativity measures as those measures did with each other. Naming unrelated words in what we call the Divergent Association Task can thus serve as a brief, reliable, and objective measure of creativity.

