An ontological artifact for classifying social media: Text mining analysis for financial data

Author(s):  
Zamil Alzamil ◽  
Deniz Appelbaum ◽  
Robert Nehmer
2021 ◽  
Author(s):  
Fei Shen ◽  
Wenting Yu ◽  
Chen Min ◽  
Qianying Ye ◽  
Chuanli Xia ◽  
...  

Text mining has been a dominant approach to extracting useful information from massive unstructured online data. However, existing tools for Chinese word segmentation are not well suited to Cantonese social media text. This project developed CyberCan (https://github.com/shenfei1010/CyberCan), a lexicon of contemporary Cantonese built from more than 100 million internet texts. We compared CyberCan with existing Mandarin and Cantonese lexicons on word segmentation performance. The findings suggest that CyberCan outperforms the existing lexicons it was compared with by a considerable margin.
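
To illustrate why lexicon coverage matters for word segmentation, here is a minimal sketch of dictionary-based forward maximum matching. This is a generic technique, not necessarily the segmentation procedure used in the study, and `toy_lexicon`, `segment`, and `max_len` are hypothetical names chosen for illustration; the three entries are not taken from CyberCan.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching against a word list.

    At each position, take the longest substring (up to max_len
    characters) found in the lexicon; otherwise emit one character.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, falling back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy lexicon (assumption for illustration only).
toy_lexicon = {"我哋", "今日", "食飯"}
tokens = segment("我哋今日去食飯", toy_lexicon)
print(tokens)
```

With a lexicon that lacks these Cantonese entries, the same sentence would fall back to single characters, which is the kind of degradation a purpose-built lexicon is meant to reduce.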


2020 ◽  
Vol 34 (5) ◽  
pp. 826-844 ◽  
Author(s):  
Louis Tay ◽  
Sang Eun Woo ◽  
Louis Hickman ◽  
Rachel M. Saef

In the age of big data, substantial research is now moving toward using digital footprints like social media text data to assess personality. Nevertheless, there are concerns and questions regarding the psychometric and validity evidence of such approaches. We seek to address this issue by focusing on social media text data and (i) conducting a review of psychometric validation efforts in social media text mining (SMTM) for personality assessment and discussing additional work that needs to be done; (ii) considering additional validity issues from the standpoint of reference (i.e. ‘ground truth’) and causality (i.e. how personality determines variations in scores derived from SMTM); and (iii) discussing the unique issues of generalizability when validating SMTM for personality assessment across different social media platforms and populations. In doing so, we explicate the key validity and validation issues that need to be considered as a field to advance SMTM for personality assessment, and, more generally, machine learning personality assessment methods. © 2020 European Association of Personality Psychology


2021 ◽  
Author(s):  
Shriphani Palakodety ◽  
Ashiqur R. KhudaBukhsh ◽  
Guha Jayachandran

Author(s):  
Danushka Bollegala ◽  
Simon Maskell ◽  
Richard Sloane ◽  
Joanna Hajne ◽  
Munir Pirmohamed

2021 ◽  
Vol 12 ◽  
Author(s):  
Gancheng Zhu ◽  
Yuci Zhou ◽  
Fengfeng Zhou ◽  
Min Wu ◽  
Xiangping Zhan ◽  
...  

This prospective study proposes a novel method of assessing proactive personality more efficiently by combining text mining with Item Response Theory (IRT). We collected freely expressed texts (an essay question text dataset and a social media text dataset) and item response data on proactive personality from 901 college students. To enhance validity and reliability, we employed three approaches. In Method 1, we used the item response data to develop a proactive personality evaluation model based on IRT. In Method 2, we used the freely expressed texts to develop a proactive personality evaluation model based on text mining. In Method 3, we used the text mining results as prior information for the IRT estimation and built a proactive personality evaluation model combining text mining and IRT. Finally, we evaluated the three approaches using confusion matrix indicators. The main results were that (1) the combined method using essay question text, micro-blog text, and pre-estimated IRT parameters achieved the highest accuracy, 0.849; (2) the combined method using essay question text and pre-estimated IRT parameters achieved the highest sensitivity, 0.821; (3) the text classification method based on essay question text achieved the highest specificity, 0.959; and (4) considering the models comprehensively, the combined method using essay question text, micro-blog text, and pre-estimated IRT parameters achieved the best overall performance. We therefore concluded that the novel combined method was significantly better than the two traditional methods based on IRT alone and on text mining alone.
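
The confusion matrix indicators named in this abstract (accuracy, sensitivity, specificity) can be computed as in the minimal sketch below. The labels are made-up toy data, not the study's data, and `confusion_indicators` is a hypothetical helper name; the sketch assumes binary labels with 1 as the positive class.

```python
def confusion_indicators(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Sensitivity: share of actual positives that were predicted positive.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of actual negatives that were predicted negative.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy labels (assumption for illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
scores = confusion_indicators(y_true, y_pred)
print(scores)
```

Because the three indicators weight errors differently, one model can lead on specificity while another leads on sensitivity, which is consistent with the abstract reporting a different best method for each indicator.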

