Selectivity-Based Keyword Extraction Method

Author(s):  
Slobodan Beliga ◽  
Ana Meštrović ◽  
Sanda Martinčić-Ipšić

In this work the authors propose a novel Selectivity-Based Keyword Extraction (SBKE) method, which extracts keywords from the source text represented as a network. The node selectivity value is calculated from a weighted network as the average weight distributed on the links of a single node and is used to rank and extract keyword candidates. The authors show that selectivity-based keyword extraction slightly outperforms extraction based on the standard centrality measures: in/out-degree, betweenness and closeness. Therefore, they include selectivity and its modification, generalized selectivity, as node centrality measures in the SBKE method. Selectivity-based extraction does not require linguistic knowledge, as it is derived purely from statistical and structural information of the network. The experimental results indicate that selectivity-based keyword extraction has great potential for the collection-oriented keyword extraction task.
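As a rough illustration of the selectivity measure described above, the following sketch computes, for each node, the sum of its link weights divided by its number of links. The network construction (undirected links between adjacent words, weighted by co-occurrence count), the function names and the toy sentence are assumptions made for this example and are not taken from the SBKE paper, which also works with in/out variants.

```python
from collections import defaultdict


def build_cooccurrence_network(tokens):
    """Undirected weighted network (an assumption for this sketch):
    a link joins two adjacent words; its weight counts how often they co-occur."""
    weights = defaultdict(int)
    for u, v in zip(tokens, tokens[1:]):
        if u != v:  # ignore self-loops
            weights[tuple(sorted((u, v)))] += 1
    return dict(weights)


def selectivity(weights):
    """Average weight on the links of each node: strength / degree."""
    strength = defaultdict(int)  # sum of weights on a node's links
    degree = defaultdict(int)    # number of links of a node
    for (u, v), w in weights.items():
        for node in (u, v):
            strength[node] += w
            degree[node] += 1
    return {node: strength[node] / degree[node] for node in strength}


# Hypothetical toy input, not data from the paper.
tokens = "network analysis of text network analysis of keyword network".split()
weights = build_cooccurrence_network(tokens)
scores = selectivity(weights)

# Rank keyword candidates by selectivity, highest first.
ranking = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this toy network "analysis" has two links of weight 2 each, so its selectivity is 2.0 and it ranks first, while "network" has the same total weight spread over three links and scores lower.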

Author(s):  
Slobodan Beliga ◽  
Ana Meštrović ◽  
Sanda Martinčić-Ipšić

This chapter presents a novel Selectivity-Based Keyword Extraction (SBKE) method, which extracts keywords from the source text represented as a network. The node selectivity value is calculated from a weighted network as the average weight distributed on the links of a single node and is used to rank and extract keyword candidates. Selectivity-based extraction slightly outperforms extraction based on the standard centrality measures. Therefore, selectivity and its modification, generalized selectivity, are included as node centrality measures in the SBKE method. Selectivity-based extraction does not require linguistic knowledge, as it is derived purely from statistical and structural information of the network, and it can be easily ported to new languages and used in a multilingual scenario. The true potential of the proposed SBKE method is in its generality, portability and low computation costs, which position it as a strong candidate for preparing collections that lack human annotations for keyword extraction.


2021 ◽  
Vol 11 (4) ◽  
pp. 498
Author(s):  
Marcello Zanghieri ◽  
Giulia Menichetti ◽  
Alessandra Retico ◽  
Sara Calderoni ◽  
Gastone Castellani ◽  
...  

Autism spectrum disorders (ASDs) are a heterogeneous group of neurodevelopmental conditions characterized by impairments in social interaction and communication and restricted patterns of behavior, interests, and activities. Although the etiopathogenesis of idiopathic ASD has not been fully elucidated, compelling evidence suggests an interaction between genetic liability and environmental factors in producing early alterations of structural and functional brain development that are detectable by magnetic resonance imaging (MRI) at the group level. This work shows the results of a network-based approach to characterize not only variations in the values of the extracted features but also in their mutual relationships that might reflect underlying brain structural differences between autistic subjects and healthy controls. We applied a network-based analysis on sMRI data from the Autism Brain Imaging Data Exchange I (ABIDE-I) database, containing 419 features extracted with FreeSurfer software. Two networks were generated: one from subjects with autistic disorder (AUT) (DSM-IV-TR), and one from typically developing controls (TD), adopting a subsampling strategy to overcome class imbalance (235 AUT, 418 TD). We compared the distribution of several node centrality measures and observed significant inter-class differences in averaged centralities. Moreover, a single-node analysis allowed us to identify the most relevant features that distinguished the groups.


2014 ◽  
Vol 635-637 ◽  
pp. 1476-1479 ◽  
Author(s):  
Su Wang ◽  
Ming Ya Wang ◽  
Jun Zheng ◽  
Kai Zheng

Keyword extraction is important for information retrieval. This paper presents a hybrid keyword extraction method for Chinese documents based on term frequency (TF) and semantic strategies. A new-word detection method is proposed to find words that do not exist in the dictionary. In addition, semantic strategies are introduced to filter dependent words and remove synonyms. Experimental results show that the proposed method can improve the accuracy and performance of keyword extraction.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Douglas Guilbeault ◽  
Damon Centola

The standard measure of distance in social networks – average shortest path length – assumes a model of “simple” contagion, in which people only need exposure to influence from one peer to adopt the contagion. However, many social phenomena are “complex” contagions, for which people need exposure to multiple peers before they adopt. Here, we show that the classical measure of path length fails to define network connectedness and node centrality for complex contagions. Centrality measures and seeding strategies based on the classical definition of path length frequently misidentify the network features that are most effective for spreading complex contagions. To address these issues, we derive measures of complex path length and complex centrality, which significantly improve the capacity to identify the network structures and central individuals best suited for spreading complex contagions. We validate our theory using empirical data on the spread of a microfinance program in 43 rural Indian villages.
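The difference between simple and complex contagion can be illustrated with a small threshold model. This is only a sketch: the ring-lattice topology, the function names and the parameter values are assumptions chosen for the example, and the code does not implement the authors' complex path length or complex centrality measures.

```python
def ring_lattice(n, k):
    """Ring of n nodes where each node links to its k nearest neighbours on each side."""
    return {i: {(i + d) % n for d in range(-k, k + 1) if d != 0} for i in range(n)}


def spread(neighbors, seeds, threshold):
    """Threshold contagion: a node adopts once at least `threshold`
    of its neighbours have adopted. Returns the final set of adopters."""
    adopted = set(seeds)
    while True:
        new = {
            node
            for node, nbrs in neighbors.items()
            if node not in adopted and len(nbrs & adopted) >= threshold
        }
        if not new:
            return adopted
        adopted |= new


narrow = ring_lattice(10, 1)  # plain cycle: every path is a chain of single links
wide = ring_lattice(10, 2)    # each node also links to its second neighbours

simple_on_narrow = spread(narrow, {0, 1}, threshold=1)   # one exposure is enough
complex_on_narrow = spread(narrow, {0, 1}, threshold=2)  # two exposures required
complex_on_wide = spread(wide, {0, 1}, threshold=2)
```

On the plain cycle every pair of nodes is connected by a path, so the simple contagion reaches all ten nodes, yet the complex contagion never leaves the two seeds because no other node has two adopting neighbours. Adding second-neighbour links gives each node overlapping ties to the adopters, and the same complex contagion then reaches the whole ring.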


2021 ◽  
Vol 1955 (1) ◽  
pp. 012072
Author(s):  
Ruiheng Li ◽  
Xuan Zhang ◽  
Chengdong Li ◽  
Zhongju Zheng ◽  
Zihang Zhou ◽  
...  

Terminology ◽  
2000 ◽  
Vol 6 (2) ◽  
pp. 195-210 ◽  
Author(s):  
Hiroshi Nakagawa

The NTCIR1 TMREC group called for participation in the term recognition task, which was part of NTCIR1, held in 1999. As an activity of TMREC, they have provided us with the test collection of the term recognition task. The goal of this task is to automatically recognize and extract terms from a text corpus that consists of 1,870 abstracts gathered from the NACSIS Academic Conference Database. This article describes the term extraction method we have proposed to extract terms consisting of simple and compound nouns, and the experimental evaluation of the proposed method with this NTCIR TMREC test collection. The basic idea of scoring a simple noun N in our term extraction method is to count how many nouns are conjoined with N to make compound nouns. We then extend this score to compound nouns, because most technical terms are compound nouns. Our method has a parameter to tune the degree of preference for either longer or shorter compound nouns. As for term candidates, in addition to noun sequences, we may add variations such as patterns of "A no B", which roughly means "B of A" or "A's B", and/or "A na B", where "A na" is an adjective. Experimental results of our method are promising, namely recall of 0.83, precision of 0.46 and F-value of 0.59 for exactly matched extracted terms when we take into account the top-scoring 16,000 extracted terms.
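The basic scoring idea for a simple noun can be sketched as follows. This is a minimal illustration under stated assumptions: compounds are given as tuples of already-segmented simple nouns, distinct neighbours are counted separately on the left and on the right, and the function name and sample compounds are hypothetical. The extension of the score to compound nouns and the length-preference parameter are not specified in the abstract and are therefore not shown.

```python
from collections import defaultdict


def neighbour_counts(compounds):
    """For each simple noun, count the distinct nouns that directly precede it
    (left) and directly follow it (right) inside compound nouns.
    Counting distinct neighbours rather than frequencies is an assumption."""
    left = defaultdict(set)
    right = defaultdict(set)
    for compound in compounds:
        for a, b in zip(compound, compound[1:]):
            right[a].add(b)  # b follows a
            left[b].add(a)   # a precedes b
    nouns = set(left) | set(right)
    return {n: (len(left[n]), len(right[n])) for n in nouns}


# Hypothetical compound nouns, not drawn from the NTCIR test collection.
compounds = [
    ("information", "retrieval"),
    ("information", "extraction"),
    ("term", "extraction"),
    ("term", "extraction", "method"),
    ("information", "retrieval", "system"),
]
counts = neighbour_counts(compounds)
```

Here "extraction" is preceded by two distinct nouns and followed by one, so it combines with more nouns than "method" or "system", which only ever appear at the end of a single compound.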


BMC Medicine ◽  
2020 ◽  
Vol 18 (1) ◽  
Author(s):  
Tobias R. Spiller ◽  
Ofir Levi ◽  
Yuval Neria ◽  
Benjamin Suarez-Jimenez ◽  
Yair Bar-Haim ◽  
...  

Background: In the network approach to psychopathology, psychiatric disorders are considered networks of causally active symptoms (nodes), with node centrality hypothesized to reflect symptoms’ causal influence within a network. Accordingly, centrality measures have been used in numerous network-based cross-sectional studies to identify specific treatment targets, based on the assumption that deactivating highly central nodes would proliferate to other nodes in the network, thereby collapsing the network structure and alleviating the overall psychopathology (i.e., the centrality hypothesis).
Methods: Here, we summarize three types of evidence pertaining to the centrality hypothesis in psychopathology. First, we discuss the validity of the theoretical assumptions underlying the centrality hypothesis in psychopathology. We then summarize the methodological aspects of extant studies using centrality measures as predictors of symptom change following treatment, while delineating their main findings and several of their limitations. Finally, using a specific dataset of 710 treatment-seeking patients with posttraumatic stress disorder (PTSD) as an example, we empirically examine node centrality as a predictor of therapeutic change, replicating the approach taken by previous studies, while addressing some of their limitations. Specifically, we investigated whether three pre-treatment centrality indices (strength, predictability, and expected influence) were significantly correlated with the strength of the association between a symptom’s change and the change in the severity of all other symptoms in the network from pre- to post-treatment (Δnode-Δnetwork association). Using similar analyses, we also examine the predictive validity of two simple non-causal node properties (mean symptom severity and infrequency of symptom endorsement).
Results: Of the three centrality measures, only expected influence successfully predicted how strongly changes in nodes/symptoms were associated with change in the remainder of the nodes/symptoms. Importantly, when excluding the amnesia node, a well-documented outlier in the phenomenology of PTSD, none of the tested centrality measures predicted symptom change. Conversely, both mean symptom severity and infrequency of symptom endorsement, two standard non-network-derived indices, were found to be more predictive than expected influence and remained significantly predictive also after excluding amnesia from the network analyses.
Conclusions: The centrality hypothesis in its current form is ill-defined, showing no consistent supporting evidence in the context of cross-sectional, between-subject networks.

