Clustering suicides: A data-driven, exploratory machine learning approach

2019 ◽  
Vol 62 ◽  
pp. 15-19 ◽  
Author(s):  
Birgit Ludwig ◽  
Daniel König ◽  
Nestor D. Kapusta ◽  
Victor Blüml ◽  
Georg Dorffner ◽  
...  

Abstract: Methods of suicide have received considerable attention in suicide research. The common approach to differentiating methods of suicide is to classify them as “violent” or “non-violent”. Interestingly, since this dichotomy was proposed, no further efforts have been made to question its validity. This study aimed to challenge the traditional separation into “violent” and “non-violent” suicides by performing a cluster analysis with a data-driven, machine learning approach. In a retrospective analysis, data on all officially confirmed suicides (N = 77,894) in Austria between 1970 and 2016 were assessed. Based on a defined distance metric between distributions of suicides over age group and month of the year, a standard hierarchical clustering method was applied to the five most frequent suicide methods. In the cluster analysis, poisoning emerged as distinct from all other methods, both in the entire sample and in the male subsample. Violent suicides could be further divided into sub-clusters: hanging, shooting, and drowning on the one hand and jumping on the other. In the female sample, two different clusters were revealed: hanging and drowning on the one hand and jumping, poisoning, and shooting on the other. Our data-driven results in this large epidemiological study confirmed the traditional dichotomization of suicide methods into “violent” and “non-violent” methods. On closer inspection, however, “violent” methods can be further divided into sub-clusters, and a different cluster pattern was identified for women; further research is required to support these refined suicide phenotypes.
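The clustering step described in this abstract can be illustrated with a minimal sketch. The abstract does not name the distance metric or the linkage rule, so the code below assumes total variation distance and average linkage purely for illustration. The category names (`method_A` and so on) and their counts are invented toy values, not data from the study.

```python
from itertools import combinations

def normalize(counts):
    """Turn raw counts per bin (e.g. age group x month) into a distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def total_variation(p, q):
    """Assumed distance between two distributions; the study's metric may differ."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def average_linkage(dist, a, b):
    """Mean pairwise distance between the members of clusters a and b."""
    return sum(dist[frozenset((i, j))] for i in a for j in b) / (len(a) * len(b))

def agglomerate(profiles):
    """Merge the two closest clusters repeatedly; return the merge history."""
    names = list(profiles)
    dist = {
        frozenset((i, j)): total_variation(profiles[i], profiles[j])
        for i, j in combinations(names, 2)
    }
    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(dist, *pair),
        )
        merges.append((a, b, average_linkage(dist, a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical toy counts over four bins; NOT taken from the study.
toy = {
    "method_A": normalize([10, 20, 30, 40]),
    "method_B": normalize([12, 18, 30, 40]),
    "method_C": normalize([40, 30, 20, 10]),
}
merges = agglomerate(toy)
for left, right, height in merges:
    print(left, "+", right, "at distance", round(height, 3))
```

In this toy input, the two categories with similar profiles merge first at a small distance, and the dissimilar one joins last. The merge heights are what a dendrogram would plot.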

2016 ◽  
Vol 11 (4) ◽  
pp. 791-799 ◽  
Author(s):  
Rina Kagawa ◽  
Yoshimasa Kawazoe ◽  
Yusuke Ida ◽  
Emiko Shinohara ◽  
Katsuya Tanaka ◽  
...  

Background: Phenotyping is an automated technique that can be used to distinguish patients based on electronic health records. To improve the quality of medical care and advance type 2 diabetes mellitus (T2DM) research, the demand for T2DM phenotyping has been increasing. Some existing phenotyping algorithms are not sufficiently accurate for screening or identifying clinical research subjects. Objective: We propose a practical phenotyping framework using both expert knowledge and a machine learning approach to develop 2 phenotyping algorithms: one is for screening; the other is for identifying research subjects. Methods: We employ expert knowledge as rules to exclude obvious control patients and machine learning to increase accuracy for complicated patients. We developed phenotyping algorithms on the basis of our framework and performed binary classification to determine whether a patient has T2DM. To facilitate development of practical phenotyping algorithms, this study introduces new evaluation metrics: area under the precision-sensitivity curve (AUPS) with a high sensitivity and AUPS with a high positive predictive value. Results: The proposed phenotyping algorithms based on our framework show higher performance than baseline algorithms. Our proposed framework can be used to develop 2 types of phenotyping algorithms depending on the tuning approach: one for screening, the other for identifying research subjects. Conclusions: We develop a novel phenotyping framework that can be easily implemented on the basis of proper evaluation metrics, which are in accordance with users’ objectives. The phenotyping algorithms based on our framework are useful for extraction of T2DM patients in retrospective studies.
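As a rough illustration of the evaluation metrics mentioned above, the sketch below computes the area under a precision-sensitivity curve and a variant restricted to a high-sensitivity region. The abstract does not define how the restriction is applied, so the `min_sensitivity` cutoff, the trapezoidal rule without boundary interpolation, and the toy labels and scores are all assumptions made for this example. Scores are assumed to be distinct.

```python
def ps_curve(labels, scores):
    """Return (sensitivity, precision) points, one per cutoff, highest score first."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    positives = sum(labels)
    points, true_positives = [], 0
    for k, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        points.append((true_positives / positives, true_positives / k))
    return points

def trapezoid_area(points, min_sensitivity=0.0):
    """Trapezoidal area over the points whose sensitivity is at least the cutoff."""
    kept = [p for p in points if p[0] >= min_sensitivity]
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(kept, kept[1:])
    )

# Hypothetical toy data: 1 = has the condition, 0 = control.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]

points = ps_curve(labels, scores)
full_area = trapezoid_area(points)
high_sensitivity_area = trapezoid_area(points, min_sensitivity=0.6)
print(round(full_area, 4), round(high_sensitivity_area, 4))
```

A screening-oriented algorithm would be tuned against the high-sensitivity area, while one aimed at identifying research subjects would favor the region where precision (positive predictive value) is high.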


2020 ◽  
Author(s):  
Bowen Wang ◽  
Biao Xie ◽  
Jin Xuan ◽  
Wen Gu ◽  
Dezong Zhao ◽  
...  

Author(s):  
Ranjan Raj Aryal ◽  
Ankit Bhattarai

Social media is a platform where people share their opinions and views on topics, services, and events around them. Since the COVID-19 pandemic began at the end of 2019, it has been a topic on which people express their sentiments, and COVID-19 vaccination programs have recently drawn many responses. In this paper, we propose two models, one based on a machine learning approach (Naive Bayes) and the other on deep learning (LSTM), whose goal is to determine the sentiment of tweets from the Asian region towards the vaccine through sentiment analysis. The data were extracted with the Twitter API from March 23, 2021, to April 2, 2021. The extraction used keywords with geocoding for some Asian countries, especially Nepal, India, and Singapore. After collection, the data were preprocessed by removing numbers, non-English words, stop words, special characters, and hyperlinks. The polarity of each tweet was assigned using the TextBlob library, and each tweet was classified as positive, negative, or neutral. The tweets were then split into training and testing sets. Both models were trained and tested using 10,767 unique tweets. The experiment shows that a number of people in these three countries (Nepal, India, and Singapore) have a positive sentiment towards the vaccine and are taking the first dose of the COVID-19 vaccine. Finally, the accuracy of the LSTM model was found to be 7% greater than that of the Naive Bayes-based model.
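The preprocessing and Naive Bayes steps described in this abstract can be sketched as follows. This is not the authors' implementation: it is a from-scratch multinomial Naive Bayes with add-one smoothing, the stop-word list is a tiny illustrative one, and the tweets and their labels are invented by hand rather than collected from Twitter or labeled with TextBlob.

```python
import math
import re
from collections import Counter, defaultdict

STOP_WORDS = {"the", "a", "is", "i", "my", "to", "was", "and"}  # illustrative only

def preprocess(tweet):
    """Lowercase, then drop hyperlinks, numbers, special characters, stop words."""
    text = re.sub(r"https?://\S+", " ", tweet.lower())
    text = re.sub(r"[^a-z\s]", " ", text)
    return [word for word in text.split() if word not in STOP_WORDS]

class NaiveBayes:
    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())

        def log_posterior(label):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)  # add-one smoothing
            score = math.log(self.class_counts[label] / total_docs)
            for word in tokens:
                if word in self.vocab:  # words unseen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            return score

        return max(self.class_counts, key=log_posterior)

# Hypothetical hand-labeled tweets; NOT from the paper's dataset.
train = [
    ("Got my first dose today, feeling great! https://example.com", "positive"),
    ("Vaccine rollout is going great, so happy", "positive"),
    ("Worried about side effects, feeling terrible", "negative"),
    ("Terrible queue and bad service at the centre", "negative"),
    ("Vaccination centre opens at 9 tomorrow", "neutral"),
]
model = NaiveBayes().fit(
    [preprocess(text) for text, _ in train],
    [label for _, label in train],
)
print(model.predict(preprocess("Feeling great after the dose")))
```

With only five training tweets the classifier is a toy; the point is the order of operations (clean, tokenize, count, score), which stays the same at the scale reported in the abstract.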

