Detecting Health-Related Privacy Leaks in Social Networks Using Text Mining Tools

Author(s):  
Kambiz Ghazinour ◽  
Marina Sokolova ◽  
Stan Matwin
PLoS ONE ◽  
2021 ◽  
Vol 16 (3) ◽  
pp. e0247319
Author(s):  
Noha Alnazzawi

Narrative information in electronic health records (EHRs) contains a wealth of information related to patient health conditions. In addition, people use Twitter to express their experiences regarding personal health issues, such as medical complaints, symptoms, treatments, lifestyle, and other factors. Both genres of text include different types of health-related information concerning disease complications and risk factors. Detailed knowledge of how to control disease risk factors has a great impact on modifying these risks and subsequently preventing disease complications. Text-mining tools provide efficient solutions to extract and integrate vital information related to disease complications hidden in large volumes of narrative text. However, the development of text-mining tools depends on the availability of an annotated corpus. In response, we have developed the PrevComp corpus, which is annotated with information relevant to the identification of disease complications, underlying risk factors, and prevention measures, in the context of the interaction between hypertension and diabetes. The corpus is novel both in its very specific biomedical topic and in its integration of information from EHRs and tweets collected from Twitter. The annotation scheme was designed with guidance from a domain expert, and two further domain experts performed the annotation, resulting in a high-quality annotation, with agreement rate F-scores as high as 0.60 and 0.75 for EHRs and tweets, respectively.


Author(s):  
Miquel Pans ◽  
Joaquin Madera ◽  
Luís-Millan González ◽  
Maite Pellicer-Chenoll

It is currently difficult to obtain a global, state-of-the-art view of certain scientific topics. In the field of physical activity (PA) and exercise, this is due to information overload. The present study aims to provide a solution by analysing a large mass of scientific articles using text mining (TM). The purpose was to analyse what is being investigated in the PA health field on young people from primary, secondary and higher education. Titles and abstracts published in the Web of Science (WOS) database were analysed using TM on 24 November 2020, and after removing duplicates, 85,368 remained. The results show 9960 unique words and the most frequently used bi-grams and tri-grams. A co-occurrence network was also generated. ‘Health’ was the most important term, and the most repeated bi-gram and tri-gram were ‘body_mass’ and ‘body_mass_index’, respectively. The analyses of the 20 topics identified focused on health-related terms, the social sphere, sports performance and research processes. The study also found that the terms health and exercise have become more important in recent years.
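The bi-gram and tri-gram counts mentioned above can be illustrated with a minimal sketch. It assumes plain whitespace tokenisation and underscore-joined n-grams (matching the ‘body_mass_index’ notation); the function names and the two sample strings are hypothetical and are not taken from the study, whose actual pipeline is not described here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the underscore-joined n-grams of a token list."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(texts, n):
    """Count n-grams over a collection of texts (lowercased, split on whitespace)."""
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()
        counts.update(ngrams(tokens, n))
    return counts


# Hypothetical sample data, for illustration only.
abstracts = [
    "body mass index and physical activity",
    "body mass index in children",
]
bigrams = count_ngrams(abstracts, 2)
trigrams = count_ngrams(abstracts, 3)
```

A real analysis would normally add stop-word removal and stemming or lemmatisation before counting; those steps are omitted to keep the sketch short.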


Author(s):  
Shivani Batra ◽  
Shelly Sachdeva

EHRs aid in maintaining longitudinal (lifelong) health records, which comprise a multitude of representations, in order to make health-related information accessible. However, storing EHR data is non-trivial due to the issues of semantic interoperability, sparseness, and frequent evolution. Standard-based EHRs are recommended to attain semantic interoperability. However, standard-based EHRs pose challenges (in terms of sparseness and frequent evolution) that need to be handled through a suitable data model. The traditional RDBMS is not well suited to standardized EHRs (due to sparseness and frequent evolution). Thus, modifications to the existing relational model are required. One widely adopted data model for EHRs is the entity-attribute-value (EAV) model. However, the EAV representation is not compatible with the mining tools available on the market. To reshape the EAV representation to meet the requirements of mining tools, pivoting is required. The chapter explains an architecture for organizing EAV data in order to prepare the dataset for use by existing mining tools.
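The pivoting step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's architecture: the function `pivot_eav`, the triple layout, and the sample patient data are all hypothetical.

```python
from collections import defaultdict


def pivot_eav(rows):
    """Pivot (entity, attribute, value) triples into one wide record per entity."""
    # The set of all attributes seen becomes the column list of the wide table.
    attributes = sorted({attr for _, attr, _ in rows})
    by_entity = defaultdict(dict)
    for entity, attr, value in rows:
        by_entity[entity][attr] = value
    # Attributes an entity never recorded become None, making sparseness explicit.
    table = [
        {"entity": entity, **{a: values.get(a) for a in attributes}}
        for entity, values in sorted(by_entity.items())
    ]
    return attributes, table


# Hypothetical EAV rows, for illustration only.
eav_rows = [
    ("patient_1", "systolic_bp", 142),
    ("patient_1", "glucose", 7.8),
    ("patient_2", "systolic_bp", 118),
]
columns, wide = pivot_eav(eav_rows)
```

Each entity ends up as one row with one column per attribute, which is the tabular shape most mining tools expect. If an entity records the same attribute more than once, this sketch keeps only the last value.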


Author(s):  
Manoel Vitor Santos ◽  
Amélia M. P. C. Brandão

The primary purpose of the present research is to develop a methodology that can accurately analyse online public reviews on Google using netnography studies combined with text mining analyses. By analysing the current techniques applied to a lifestyle hotel brand in nine properties in different countries and carefully studying how negative reviews are expressed online by customers, this study aims to create a pattern of lifestyle customer complaints. This research seeks to demonstrate patterns of behaviour among consumers who are not fully satisfied with the hotel service and how this dissatisfaction can negatively affect the brand. This study identifies the areas that five-star lifestyle hoteliers and hotel managers need to pay attention to in order to improve services, considering online reviews on online platforms, such as social networks and other tourism sites. Today, online reviews and customer experiences have a significant impact on the choice of a hotel.


Author(s):  
Antonina Durfee

Massive quantities of information continue accumulating at about 1.5 billion gigabytes per year in numerous repositories held at news agencies, at libraries, on corporate intranets, on personal computers, and on the Web. A large portion of all available information exists in the form of text. Researchers, analysts, editors, venture capitalists, lawyers, help desk specialists, and even students are faced with text analysis challenges. Text mining tools aim at discovering knowledge from textual databases by isolating key bits of information from large amounts of text and identifying relationships among documents. Text mining technology is used for plagiarism detection and authorship attribution, text summarization and retrieval, and deception detection.


2015 ◽  
pp. 1539-1556
Author(s):  
Dhiraj Murthy ◽  
Alexander Gross ◽  
Alex Takata

This chapter identifies a number of the most common data mining toolkits and evaluates their utility in the extraction of data from heterogeneous online social networks. It not only introduces the complexities of scraping the diverse forms of data found in these sources, but also critically evaluates currently available tools. This analysis is followed by a presentation and discussion of the development of a hybrid system, which builds upon the work of the open-source Web-Harvest framework, for the collection of information from online social networks. This tool, VoyeurServer, attempts to address the weaknesses of the tools identified in earlier sections, as well as to prototype the implementation of key functionalities thought to be missing from commonly available data extraction toolkits. The authors conclude the chapter with a case study and subsequent evaluation of the VoyeurServer system itself. This evaluation presents future directions, remaining challenges, and additional extensions thought to be important to the effective development of data mining tools for the study of online social networks.


Author(s):  
Shruti Kohli ◽  
Sonia Saini

Recent work in machine learning and natural language processing has studied the content of health-related information in tweets and demonstrated the potential for extracting useful public health information from their aggregation. Social intelligence derived from health content has become significantly important for various applications, including post-marketing drug surveillance, competitive intelligence, medicine reviews, and the assessment of health-related opinions and sentiments. Further, the quantity of medical information in media such as Twitter, Facebook, and medical blogs is growing at an exponential rate. Medical data such as health records and drug data have become major candidates for Big Data analysis, and thus exploring this content has become a necessity for organizations. However, the volume, velocity, variety, and quality of online health information present challenges, necessitating enhanced facilitation mechanisms for medical social computing. The objective of this chapter is to discuss the possibility of mining medical trends using social networks.

