A Distributed Framework for NLP-Based Keyword and Keyphrase Extraction From Web Pages and Documents

Author(s):  
Paolo Nesi ◽  
Gianni Pantaleo ◽  
Gianmarco Sanesi
2019 ◽  
Vol 3 (3) ◽  
pp. 58 ◽  
Author(s):  
Tim Haarman ◽  
Bastiaan Zijlema ◽  
Marco Wiering

Keyphrase extraction is an important part of natural language processing (NLP) research, although little of that research addresses web pages. The World Wide Web contains billions of pages that are potentially interesting for various NLP tasks, yet it remains largely untouched in scientific research. Current research is often applied only to clean corpora such as abstracts and articles from academic journals or sets of scraped texts from a single domain. However, textual data from web pages differs from normal text documents, as it is structured using HTML elements and often consists of many small fragments. These elements are furthermore used in a highly inconsistent manner and are likely to contain noise. We evaluated the keyphrases extracted by several state-of-the-art extraction methods and found that they did not transfer well to web pages. We therefore propose WebEmbedRank, an adaptation of a recently proposed extraction method that can make use of structural information in web pages in a robust manner. We compared this novel method to other baselines and state-of-the-art methods using a manually annotated dataset and found that WebEmbedRank achieved significant improvements over existing extraction methods on web pages.


Author(s):  
Chandrakala Arya ◽  
Sanjay K. Dwivedi

Author(s):  
Emre Tolga Ayan ◽  
Rabia Arslan ◽  
Muhammed Said Zengin ◽  
Haci Ali Duru ◽  
Sedat Salman ◽  
...  

Crisis ◽  
2018 ◽  
Vol 39 (3) ◽  
pp. 197-204 ◽  
Author(s):  
Hajime Sueki ◽  
Jiro Ito

Abstract. Background: Gatekeeper training is an effective suicide prevention strategy. However, the appropriate targets of online gatekeeping have not yet been clarified. Aim: We examined the association between the outcomes of online gatekeeping using the Internet and the characteristics of consultation service users. Method: An advertisement to encourage the use of e-mail-based psychological consultation services among viewers was placed on web pages that showed the results of searches using suicide-related keywords. All e-mails received between October 2014 and December 2015 were replied to as part of gatekeeping, and the obtained data (responses to an online questionnaire and the content of the received e-mails) were analyzed. Results: A total of 154 consultation service users were analyzed, 35.7% of whom were male. The median age range was 20–29 years. Online gatekeeping was significantly more likely to be successful when such users faced financial/daily life or workplace problems, or revealed their names (including online names). By contrast, the activity was more likely to be unsuccessful when it was impossible to assess the problems faced by consultation service users. Conclusion: It may be possible to increase the success rate of online gatekeeping by targeting individuals facing financial/daily life or workplace problems with marked tendencies for self-disclosure.


2012 ◽  
Vol 2 (9) ◽  
pp. 148-150 ◽  
Author(s):  
Marriboyina Rajendra ◽  
S. Suresh Babu
