Exploring the dominant features of social media for depression detection

2019 ◽  
Vol 46 (6) ◽  
pp. 739-759
Author(s):  
Jamil Hussain ◽  
Fahad Ahmed Satti ◽  
Muhammad Afzal ◽  
Wajahat Ali Khan ◽  
Hafiz Syed Muhammad Bilal ◽  
...  

Recently, social media have been used by researchers to detect depressive symptoms in individuals using linguistic data from users’ posts. In this study, we propose a framework to identify social information as a significant predictor of depression. Using the proposed framework, we develop an application called the Socially Mediated Patient Portal (SMPP), which detects depression-related markers in Facebook users by applying a data-driven approach with machine learning classification techniques. We examined a data set of 4350 users who were evaluated for depression using the Center for Epidemiological Studies Depression (CES-D) scale. From this analysis, we identified a set of features that can distinguish between individuals with and without depression. Finally, we identified the dominant features that adequately assess individuals with and without depression on social media. The model trained on these features will be helpful to physicians in diagnosing mental diseases and psychiatrists in analysing patient behaviour.

2020 ◽  
Vol 103 (3) ◽  
pp. 129-157
Author(s):  
Fabia Hultin Morger

Satire has been present in various media throughout the centuries. With the rise of television, satire made its way onto TV screens via various outlets, including news parodies. As these TV shows began using social media, new forms of satire appeared, among them satirical Internet Memes commenting on political events. The objects of interest in this study are Memes published by two German news parodies, Heute Show and Extra 3, on the platform Facebook that thematise the G-20 summit, which took place in Hamburg in 2017. My data set consists of 27 Memes from the platforms Facebook, Instagram and Twitter, as well as the public Facebook comments published alongside these Memes. Taking an empirical, data-driven approach, I address questions regarding the way Memes make use of satire and how they interact with the Internet as a medium and, in particular, with their affordances on the platform Facebook.


2018 ◽  
Vol 4 (4) ◽  
pp. 487-501 ◽  
Author(s):  
Kun Kuang ◽  
Meng Jiang ◽  
Peng Cui ◽  
Hengliang Luo ◽  
Shiqiang Yang

2014 ◽  
Vol 26 (2) ◽  
pp. 349-376 ◽  
Author(s):  
Motoaki Kawanabe ◽  
Wojciech Samek ◽  
Klaus-Robert Müller ◽  
Carmen Vidaurre

Electroencephalographic signals are known to be nonstationary and easily affected by artifacts; therefore, their analysis requires methods that can deal with noise. In this work, we present a way to robustify the popular common spatial patterns (CSP) algorithm under a maxmin approach. In contrast to standard CSP, which maximizes the variance ratio between two conditions based on a single estimate of the class covariance matrices, we propose to robustly compute spatial filters by maximizing the minimum variance ratio within a prefixed set of covariance matrices called the tolerance set. We show that this kind of maxmin optimization makes CSP robust to outliers and reduces its tendency to overfit. We also present a data-driven approach to construct a tolerance set that captures the variability of the covariance matrices over time, and show its ability to reduce the nonstationarity of the extracted features and significantly improve classification accuracy. We test the spatial filters derived with this approach and compare them to standard CSP and a state-of-the-art method on a real-world brain-computer interface (BCI) data set in which we expect substantial fluctuations caused by environmental differences. Finally, we investigate the advantages and limitations of the maxmin approach with simulations.
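The contrast between standard CSP and the maxmin variant can be written compactly. The following is a sketch under assumed notation, not the paper's exact formulation: w is a spatial filter, Σ₁ and Σ₂ are the class covariance matrices, and S₁, S₂ are the tolerance sets around their estimates. The paper may normalize the ratio differently (for example, using Σ₁ + Σ₂ in the denominator).

```latex
% Standard CSP: a single covariance estimate per class
w_{\mathrm{CSP}} = \arg\max_{w} \frac{w^{\top} \Sigma_1 w}{w^{\top} \Sigma_2 w}

% Maxmin CSP: optimize the worst-case ratio over the tolerance sets
w_{\mathrm{maxmin}} = \arg\max_{w}
  \min_{\tilde{\Sigma}_1 \in \mathcal{S}_1,\; \tilde{\Sigma}_2 \in \mathcal{S}_2}
  \frac{w^{\top} \tilde{\Sigma}_1 w}{w^{\top} \tilde{\Sigma}_2 w}
```

Because the inner minimum picks the least favourable covariance pair, a filter that scores well only on one atypical estimate is penalized, which is the intuition behind the reduced overfitting described above.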


2021 ◽  
Vol 12 ◽  
Author(s):  
Akio Onogi ◽  
Daisuke Sekine ◽  
Akito Kaga ◽  
Satoshi Nakano ◽  
Tetsuya Yamada ◽  
...  

It is not yet fully understood which environmental stimuli cause genotype-by-environment (G × E) interactions in real fields, when these interactions occur, and which genes respond to the stimuli. Large-scale multi-environment data sets are attractive data sources for these purposes because they potentially cover various environmental conditions. Here we developed a data-driven approach termed Environmental Covariate Search Affecting Genetic Correlations (ECGC) to identify environmental stimuli and genes responsible for the G × E interactions from large-scale multi-environment data sets. ECGC was applied to a soybean (Glycine max) data set that consisted of 25,158 records collected at 52 environments. ECGC illustrated which meteorological factors shaped the G × E interactions in six traits, including yield, flowering time, and protein content, and when these factors were involved in the interactions. For example, it illustrated the relevance of precipitation around sowing dates and hours of sunshine just before maturity to the interactions observed for yield. Moreover, genome-wide association mapping on the sensitivities to the identified stimuli discovered candidate and known genes responsible for the G × E interactions. Our results demonstrate the capability of data-driven approaches to provide novel insights into the G × E interactions observed in fields.


Author(s):  
Emad Badawi ◽  
Guy-Vincent Jourdan ◽  
Gregor Bochmann ◽  
Iosif-Viorel Onut

The “Game Hack” Scam (GHS) is a mostly unreported cyberattack in which attackers attempt to convince victims that they will be provided with free, unlimited “resources” or other advantages for their favorite game. The endgame of the scammers ranges from monetizing for themselves the victims' time and resources by having them click through endless “surveys”, fill out “market research” forms, etc., to collecting personal information, getting the victims to subscribe to questionable services, and even installing questionable executable files on their machines. Other scams such as the “Technical Support Scam”, the “Survey Scam”, and the “Romance Scam” have been analyzed before, but to the best of our knowledge, GHS has not been well studied so far and is indeed mostly unknown. In this paper, our aim is to investigate and gain more knowledge on this type of scam by following a data-driven approach; we formulated GHS-related search queries and used multiple search engines to collect data about the websites to which GHS victims are directed when they search online for various game hacks and tricks. We analyze the collected data to provide new insight into GHS and research the extent of this scam. We show that despite its low profile, the click traffic generated by the scam is in the hundreds of millions. We also show that GHS attackers use social media, streaming sites, blogs, and even unrelated sites such as change.org or jeuxvideo.com to carry out their attacks and reach a large number of victims. Our data collection spans a year; in that time, we uncovered 65,905 different GHS URLs, mapped onto over 5,900 unique domains. We were able to link attacks to attackers and found that they routinely target a vast array of games. Furthermore, we find that GHS instances are on the rise, and so is the number of victims. Our low-end estimate is that these attacks have been clicked at least 150 million times in the last five years. Finally, in keeping with similar large-scale scam studies, we find that the current public blacklists are inadequate and suggest that our method is more effective at detecting these attacks.
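The step of mapping collected URLs onto unique domains can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name and the example URLs are hypothetical, and grouping by hostname is only an approximation of grouping by registrable domain, which would normally require a public suffix list.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_urls_by_host(urls):
    """Map each hostname to the set of distinct URLs observed for it."""
    by_host = defaultdict(set)
    for url in urls:
        # urlparse lowercases the hostname; it is None when no network location is present
        host = urlparse(url).hostname
        if host is None:
            continue  # skip strings that are not absolute URLs
        by_host[host].add(url)
    return dict(by_host)


# Hypothetical URLs, for illustration only
urls = [
    "https://hack.example.com/coins?game=a",
    "https://hack.example.com/gems?game=b",
    "http://HACK.example.com/coins?game=a",
    "https://free.example.org/generator",
    "not a url",
]

groups = group_urls_by_host(urls)
for host, found in sorted(groups.items()):
    print(host, len(found))
```

Counting the keys of the resulting dictionary gives the number of unique hosts, and the set sizes show how many distinct URLs point to each one.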


2020 ◽  
Vol 22 (12) ◽  
pp. 1137-1147 ◽  
Author(s):  
Masataka Enomoto ◽  
B Duncan X Lascelles ◽  
Margaret E Gruen

Objectives: The aim of this study was to develop an evidence-based, clinically expedient checklist to identify cats likely to have degenerative joint disease (DJD)-associated pain.
Methods: Data were compiled from previously conducted studies that employed a standardized subjective outcome measure consisting of a series of questions. These studies included a prevalence study (with DJD non-informed owners) and therapeutic trials (with DJD-informed owners). For each cat, and each question, response scores were converted to ‘impaired’ and ‘unimpaired’. Cats were categorized as ‘DJD pain’ and ‘non-DJD’ based on orthopedic pain and radiographic DJD scores. These binary data were compared between cat phenotypes (non-DJD and DJD pain) for each question. Sensitivity and specificity of each question were calculated using the binary data; based on this, potential questions for the checklist were selected. Sensitivity and specificity across this group of questions were calculated, and questions sequentially removed to optimize length, sensitivity and specificity. Finally, the proposed checklist was applied to a novel data set to evaluate its ability to identify cats with DJD pain.
Results: In total, 249 DJD pain cats and 53 non-DJD cats from five studies were included. Nine questions with adequate sensitivity and specificity were initially identified. Following sequential removal of questions, a checklist with six binary questions was proposed. Based on the data from the cohorts of DJD-informed and DJD non-informed owners, the sensitivity and specificity of the proposed checklist were approximately 99% and 100%, and 55% and 97%, respectively.
Conclusions and relevance: The proposed checklist represents a data-driven approach to construct a screening checklist for DJD pain in cats. This checklist provides a clinically expedient tool likely to increase veterinarians’ ability to screen for DJD pain in cats. The identified behaviors comprising the checklist may further provide a foundation for increasing awareness of DJD pain among cat owners.
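The per-question sensitivity and specificity calculation described in the methods can be sketched as follows. This is a minimal illustration using the standard definitions, not the authors' code; the function name and the toy responses are hypothetical.

```python
def sensitivity_specificity(flagged, has_condition):
    """Return (sensitivity, specificity) for paired binary observations.

    flagged[i] is True when the question scored the cat as 'impaired';
    has_condition[i] is True when the cat was categorized as 'DJD pain'.
    """
    pairs = list(zip(flagged, has_condition))
    tp = sum(1 for f, c in pairs if f and c)          # impaired, DJD pain
    fn = sum(1 for f, c in pairs if not f and c)      # unimpaired, DJD pain
    tn = sum(1 for f, c in pairs if not f and not c)  # unimpaired, non-DJD
    fp = sum(1 for f, c in pairs if f and not c)      # impaired, non-DJD
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one cat in each category")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy responses for one question, for illustration only
flagged = [True, True, True, False, False, False, True, False]
has_djd_pain = [True, True, True, True, False, False, False, False]

sens, spec = sensitivity_specificity(flagged, has_djd_pain)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Repeating this calculation for each candidate question, and then for the combined checklist, gives the figures used to decide which questions to keep or remove.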


Author(s):  
David Duran-Rodas ◽  
Emmanouil Chaniotakis ◽  
Constantinos Antoniou

Identifying the factors that influence ridership is necessary for policy-making, as well as for examining transferability and aspects of performance and reliability. In this work, a data-driven method is formulated to correlate arrivals and departures of station-based bike sharing systems with built environment factors in multiple cities. Ridership data from stations of multiple cities are pooled in one data set regardless of their geographic boundaries. The method bundles the collection, analysis, and processing of data, as well as the model’s estimation using statistical and machine learning techniques. The method was applied on a national level in six cities in Germany, and also on an international level in three cities in Europe and North America. The results suggest that the model’s performance did not depend on clustering cities by size but by the relative daily distribution of the rentals. Selected statistically significant factors were found to vary temporally (e.g., nightclubs were significant during the night). The most influential variables were related to the city population, distance to city center, leisure-related establishments, and transport-related infrastructure. This data-driven method can serve as a decision-support tool for implementing or expanding bike sharing systems.

