Measuring happiness of large-scaled online Turkish unstructured data (Preprint)
BACKGROUND Measuring happiness in online Turkish text requires a Turkish happiness dictionary that reflects cultural and linguistic norms and social values, rather than one produced by a translation-oriented method. An analysis that neglects cultural characteristics will not be reliable. The Turkish translation of an English word in the Affective Norms for English Words (ANEW) dictionary does not necessarily express the same feeling as the Turkish word. In addition, existing emotion dictionaries were not developed specifically for social network content that includes emoticons.
OBJECTIVE This research presents the Turkish Happiness Index (THI), a set of psychological normative happiness scores for measuring the average level of happiness in large-scale unstructured online Turkish data. A well-being informatics analysis is also carried out using the THI.
METHODS The Turkish Happiness Index was generated entirely from social network data. A total of 20,000 words were extracted from social networks by web text mining, and natural language processing (NLP) algorithms were applied. After data reduction, a quantitative research methodology was applied. The happiness scores were derived from 667 participants’ subjective happiness levels and their evaluations of 1874 Turkish words. An alexithymia scale was also used to assess the emotional awareness of the participants. The words were evaluated on the valence dimension using the Self-Assessment Manikin on an online platform. NLP was then used to measure the happiness of online Turkish data. Data were collected from Facebook over one month using a third-party software tool, with #war as the negative hashtag and #family as the positive hashtag. After the data were converted to documents, NLP steps including tokenization, transformation, filtering, and stemming were applied. The happiness levels of the documents for each hashtag were determined using the THI dictionary.
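The document-scoring step described in METHODS can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lexicon entries and their scores, the stop-word list, the suffix list, and the function names are all hypothetical, and the toy suffix stripper only stands in for a real Turkish stemmer. Averaging the matched word scores is assumed here as one plausible way to obtain a document-level happiness value.

```python
import re

# Hypothetical valence scores on a 1-9 scale; these are NOT values from the THI.
TOY_LEXICON = {"aile": 8.0, "mutlu": 8.5, "savaş": 1.5, "kayıp": 2.0}

# Hypothetical stop-word list used for the filtering step.
STOP_WORDS = {"ve", "bir", "bu", "ile"}

# Toy plural suffixes standing in for a real Turkish stemmer.
SUFFIXES = ("lar", "ler")


def tokenize(text):
    """Lower-case the text (transformation) and split it into word tokens.

    Turkish-specific casing (dotted and dotless i) is not handled here.
    """
    return re.findall(r"\w+", text.lower())


def stem(token):
    """Strip one known suffix if enough of the word remains."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def happiness_score(text, lexicon=TOY_LEXICON):
    """Return the mean lexicon score of the matched words, or None if none match."""
    tokens = [stem(t) for t in tokenize(text) if t not in STOP_WORDS]
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return sum(scores) / len(scores) if scores else None


if __name__ == "__main__":
    print(happiness_score("mutlu aileler #aile"))
    print(happiness_score("savaş ve kayıp"))
```

Returning None for documents with no matched words keeps them distinguishable from documents that score low; whether such documents should be dropped or treated as neutral is a design choice left open in this sketch.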
RESULTS The THI, which contains 345 Turkish words and their happiness scores, was developed. The THI is given in Appendix 1. A comparison between the words of the dictionaries is also presented to show the cultural differences.
CONCLUSIONS The THI provides researchers with standard materials with which they can automatically measure online happiness in large-scale Turkish data. The THI can be used in real-time big data analytics.