The Modern Greek Language on the Social Web: A Survey of Data Sets and Mining Applications

Data ◽  
2021 ◽  
Vol 6 (5) ◽  
pp. 52
Author(s):  
Maria Nefeli Nikiforos ◽  
Yorghos Voutos ◽  
Anthi Drougani ◽  
Phivos Mylonas ◽  
Katia Lida Kermanidis

Mining social web text has been at the heart of the Natural Language Processing and Data Mining research community in the last 15 years. Though most of the reported work is on widely spoken languages, such as English, the significance of approaches that deal with less commonly spoken languages, such as Greek, is evident for reasons of preserving and documenting minority languages, cultural and ethnic diversity, and identifying intercultural similarities and differences. The present work aims at identifying, documenting and comparing social text data sets, as well as mining techniques and applications on social web text that target Modern Greek, focusing on the arising challenges and the potential for future research in the specific less widely spoken language.

Author(s):  
Athanasios Kokkos ◽  
Theodoros Tzouramanis

Online social networking services have come to dominate the dot-com world: countless online communities coexist on the social Web. Some characteristic user attributes, such as gender, age group, and sexual orientation, are not automatically part of the profile information. In some cases, user attributes can even be deliberately and maliciously falsified. This paper examines automated inference of gender on online social networks by analyzing written text with a combination of natural language processing and classification techniques. Extensive experimentation on LinkedIn and Twitter has shown that this gender identification technique achieves an accuracy of up to 98.4 percent.


Author(s):  
Puneetha KR

Research into cyberbullying detection has increased in recent years, due in part to the proliferation of cyberbullying across social media and its detrimental effect on young people. Cyberbullying is one of the most common problems faced by internet users and makes the internet a vulnerable space, so detection is needed on social media platforms. Detecting bullies online as early as possible makes these platforms safer for users, so that the internet can serve as a place to share information and pursue leisure activities. Although there has been some research on implementing detection and prevention of cyberbullying, it is not completely feasible due to certain limitations. In this paper, a lexicon-based approach using the NLTK SentiWordNet is used to differentiate positive and negative words and produce results. Positive words are given values greater than zero and negative words values less than zero. Lexicon-based systems utilize word lists and use the presence of words within the lists to detect cyberbullying. Lemmatization is used to find the root word. This paper essentially maps out the state of the art in cyberbullying detection research and serves as a resource for researchers to determine where to best direct their future research efforts in this field. Keywords: Abuse and crime involving computers, natural language processing, sentiment analysis, social networking
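The lexicon-based idea in the abstract above (look up each word's root form in a polarity word list and sum the scores) can be illustrated with a minimal sketch. This is not the paper's implementation: the lexicon values, the lemma table, the function names, and the flagging threshold are all hypothetical stand-ins chosen for illustration, and a small hand-written dictionary replaces SentiWordNet and a real lemmatizer so that the example runs without external data.

```python
import re

# Toy polarity lexicon. Scores are made-up illustrative values, NOT SentiWordNet data.
# Positive words score above zero, negative words below zero.
TOY_LEXICON = {
    "stupid": -0.8,
    "hate": -0.9,
    "ugly": -0.7,
    "loser": -0.8,
    "kind": 0.6,
    "great": 0.8,
    "love": 0.9,
}

# Crude stand-in for lemmatization: map a few inflected forms to their root word.
TOY_LEMMAS = {
    "hates": "hate",
    "hated": "hate",
    "losers": "loser",
    "loves": "love",
    "loved": "love",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lemmatize(token):
    """Return the root word if the token is in the toy lemma table, else the token."""
    return TOY_LEMMAS.get(token, token)


def polarity_score(text):
    """Sum lexicon scores over all tokens; words not in the lexicon count as 0."""
    return sum(TOY_LEXICON.get(lemmatize(tok), 0.0) for tok in tokenize(text))


def is_flagged(text, threshold=-0.5):
    """Flag a message when its total polarity is at or below the (hypothetical) threshold."""
    return polarity_score(text) <= threshold


if __name__ == "__main__":
    for message in ["Everyone hates you, loser", "Have a great day"]:
        print(message, "->", polarity_score(message), is_flagged(message))
```

A purely additive score like this ignores negation and context ("not stupid" still scores negative), which is one reason word-list methods are usually treated as a baseline.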


2022 ◽  
Author(s):  
Matej Gjurković ◽  
Iva Vukojević ◽  
Jan Šnajder

Automated text-based personality assessment (ATBPA) methods can analyze large amounts of text data and identify nuanced linguistic personality cues. However, current approaches lack the interpretability, explainability, and validity offered by standard questionnaire instruments. To address these weaknesses, we propose an approach that combines questionnaire-based and text-based approaches to personality assessment. Our Statement-to-Item Matching Personality Assessment (SIMPA) framework uses natural language processing methods to detect self-referencing descriptions of personality in a target’s text and utilizes these descriptions for personality assessment. The core of the framework is the notion of a trait-constrained semantic similarity between the target’s freely expressed statements and questionnaire items. The conceptual basis is provided by the realistic accuracy model (RAM), which describes the process of accurate personality judgments and which we extend with a feedback loop mechanism to improve the accuracy of judgments. We present a simple proof-of-concept implementation of SIMPA for ATBPA on the social media site Reddit. We show how the framework can be used directly for unsupervised estimation of a target’s Big 5 scores and indirectly to produce features for a supervised ATBPA model, demonstrating state-of-the-art results for the personality prediction task on Reddit.


2021 ◽  
Vol 2021 (4) ◽  
pp. 480-499
Author(s):  
Henry Hosseini ◽  
Martin Degeling ◽  
Christine Utz ◽  
Thomas Hupperich

Privacy policies have become a focal point of privacy research. With their goal to reflect the privacy practices of a website, service, or app, they are often the starting point for researchers who analyze the accuracy of claimed data practices, user understanding of practices, or control mechanisms for users. Due to vast differences in structure, presentation, and content, it is often challenging to extract privacy policies from online resources like websites for analysis. In the past, researchers have relied on scrapers tailored to the specific analysis or task, which complicates comparing results across different studies. To unify future research in this field, we developed a toolchain to process website privacy policies and prepare them for research purposes. The core part of this chain is a detector module for English and German, using natural language processing and machine learning to automatically determine whether given texts are privacy or cookie policies. We leverage multiple existing data sets to refine our approach, evaluate it on a recently published longitudinal corpus, and show that it contains a number of misclassified documents. We believe that unifying data preparation for the analysis of privacy policies can help make different studies more comparable and is a step towards more thorough analyses. In addition, we provide insights into common pitfalls that may lead to invalid analyses.


2010 ◽  
pp. 560-586
Author(s):  
Pankaj Kamthan

In the last decade, patterns have emerged as a notable problem-solving approach in various disciplines. This paper aims to address the communication requirements of the elements of pattern engineering (namely, actors, activities, and artifacts) in general and the pattern realization process in particular. In that regard, a theoretical framework using the Social Web as the medium is proposed and its implications are explored. The prospects of using the Social Web are analyzed by means of practical scenarios and concrete examples. The concerns of using the Social Web related to cost to actors, decentralization and distribution of control, and semiotic quality of representations of patterns are highlighted. The directions for future research, including the use of patterns for Social Web applications and the potential of the confluence of the Social Web and the Semantic Web for communicating the elements of pattern engineering, are briefly explored.


2011 ◽  
pp. 2250-2277
Author(s):  
Pankaj Kamthan

In the last decade, patterns have emerged as a notable problem-solving approach in various disciplines. This paper aims to address the communication requirements of the elements of pattern engineering (namely, actors, activities, and artifacts) in general and the pattern realization process in particular. In that regard, a theoretical framework using the Social Web as the medium is proposed and its implications are explored. The prospects of using the Social Web are analyzed by means of practical scenarios and concrete examples. The concerns of using the Social Web related to cost to actors, decentralization and distribution of control, and semiotic quality of representations of patterns are highlighted. The directions for future research, including the use of patterns for Social Web applications and the potential of the confluence of the Social Web and the Semantic Web for communicating the elements of pattern engineering, are briefly explored.


Author(s):  
Pankaj Kamthan

In this chapter, the affordances of the social Web in managing patterns are explored. For that, a classification of stakeholders of patterns and a process for producing patterns are proposed. The role of the stakeholders in carrying out the different workflows of the process is elaborated and, in doing so, the prospects presented by the technologies/applications underlying the social Web are highlighted. The directions for future research, including the potential of the convergence of the social Web and the Semantic Web, are briefly explored.


2019 ◽  
Vol 76 (1) ◽  
pp. 197-211
Author(s):  
Namjoo Choi ◽  
Lindsey M. Harper

Purpose: The purpose of this paper is to update Carlsson (2015), which examined the research on public libraries and the social web published from 2006 to 2012, and it also intends to go beyond Carlsson (2015) by including six additional variables.
Design/methodology/approach: Literature searches were performed against Web of Science Core Collection and EBSCOhost databases. By adapting Carlsson’s (2015) three-level key phrase searches, which were then complemented by chain searching, a total of 60 articles were identified and analyzed.
Findings: In comparison to Carlsson (2015), this study shows that the recent research, published between 2012 and 2018, leans toward a more general acceptance of the social web’s usage to improve the services provided by public libraries; that the public library is rarely premised to be in a state of crisis; and that the social web is mostly perceived as having a complementary relationship with librarianship and library services. The findings from analyzing the six additional variables are also presented.
Research limitations/implications: The findings from this study provide LIS professionals a greater understanding of where the research stands on the topic at present, and this study also identifies gaps in the literature to offer insight into the areas where future research can be directed.
Originality/value: Given the continued popularity of social web usage among public libraries, this study examines the literature published on the social web in the public library context between 2012 and 2018 and offers implications and future research suggestions.


2020 ◽  
Vol 50 (4) ◽  
pp. 517-545
Author(s):  
Myron P. Gutmann

From its very beginnings, the JIH published articles that embraced quantitative methods, but in its effort to engage as many disciplines as possible, it did much more. Over the nearly fifty years of its publishing history, it has continued to publish variegated interdisciplinary material and, in the process, to present leading-edge research. Within the last ten years, however, the journal has acquired a new role in a much more international context. The emergence of new quantitative methods has permitted the JIH to redefine interdisciplinarity. Immense data sets, with modes of interpretation drawn from the social sciences as well as from the humanities, natural sciences, and medicine, will certainly continue to revolutionize future research in history and cognate disciplines.

