Predicting Educational Background using Text Mining

2021 ◽  
Author(s):  
Rense Corten ◽  
Shiva Nadi ◽  
Laurence Frank

We examine to what extent educational background can be inferred from written text, assuming that educational levels are associated with the style of writing and use of language. Using a large public dataset of almost 60,000 dating profiles, each containing written text, we look for a methodology to measure author style. We focus on the education and essay fields of each profile, from which we try to identify features of written text that reveal the level of education of the authors. Using different types of extracted features, we explore the level of education with three approaches: (i) classifying the level of education as elementary or higher education using lexical features; (ii) using Linguistic Inquiry and Word Count (LIWC) features; (iii) combining LIWC features and lexical features. For classification, we rely on regularized logistic regression. The joint model, which uses both lexical and LIWC features, predicts the education level better than the other text representation models, but the contribution of LIWC is marginal. Our results may be useful not only in the context of the platform economy and online markets, but also more generally to researchers who need to rely on written text as an indicator of educational background.
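The classification step can be illustrated with a minimal sketch of regularized logistic regression on word counts. This is not the authors' pipeline: the L2 penalty, the gradient-descent training loop, the four-word vocabulary, the example texts, and the label coding (1 = higher education) are all assumptions chosen to keep the example self-contained.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, l2=0.1, lr=0.5, epochs=500):
    """Batch gradient descent on the L2-penalized logistic loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        # The penalty shrinks the weights; the intercept is left unpenalized.
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(x, w, b):
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

# Hypothetical lexical features: counts of a tiny, made-up vocabulary.
VOCAB = ["thesis", "seminar", "lol", "gonna"]

def featurize(text):
    words = text.lower().split()
    return [words.count(term) for term in VOCAB]

# Made-up training texts; label 1 = higher education (assumed coding).
texts = ["my thesis and a seminar", "seminar on my thesis",
         "lol gonna sleep", "gonna party lol"]
labels = [1, 1, 0, 0]
X = [featurize(t) for t in texts]
w, b = train_logreg(X, labels)
```

In a joint model such as approach (iii), the LIWC category scores would presumably be appended to each feature vector before training; that step is omitted here.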

2021 ◽  
Author(s):  
Peter Boot

Linguistic Inquiry and Word Count (LIWC) is a text analysis program developed by James Pennebaker and colleagues. At the basis of LIWC is a dictionary that assigns words to categories. This dictionary is specific to English. Researchers who want to use LIWC on non-English texts have typically relied on translations of the dictionary into the language of the texts. Dictionary translation, however, is a labour-intensive procedure. In this paper, we investigate an alternative approach: to use Machine Translation (MT) to translate the texts that must be analysed into English, and then use the English dictionary to analyse the texts. We test several LIWC versions, languages and MT engines, and consistently find the machine-translated text approach performs better than the translated-dictionary approach. We argue that for languages for which effective MT technology is available, there is no need to create new LIWC dictionary translations.
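To make the dictionary mechanism concrete, the following sketch counts how often words from each category occur in a text. It is an illustration only: the toy dictionary, its category names, and the function `category_percentages` are hypothetical and are not taken from the LIWC software or its lexicon. A trailing `*` is treated as a prefix match, which is a simplifying assumption about how stem entries behave.

```python
import re
from collections import Counter

# Hypothetical toy dictionary, NOT the real LIWC lexicon.
# A trailing "*" marks a stem entry that matches any word with that prefix.
TOY_DICTIONARY = {
    "happy": ["posemo"],
    "love*": ["posemo"],
    "sad": ["negemo"],
    "angr*": ["negemo", "anger"],
    "think*": ["insight"],
}

def category_percentages(text, dictionary=TOY_DICTIONARY):
    """Return each category's share of all tokens, in percent."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        matched = set()
        for entry, categories in dictionary.items():
            if entry.endswith("*"):
                hit = token.startswith(entry[:-1])
            else:
                hit = token == entry
            if hit:
                matched.update(categories)
        # A token counts at most once per category.
        counts.update(matched)
    total = len(tokens)
    return {c: 100.0 * n / total for c, n in counts.items()} if total else {}

scores = category_percentages("I love thinking but angry people make me sad")
```

Under the machine-translation approach described above, a non-English text would first be translated into English and then scored against the English dictionary in this way, rather than being scored against a translated dictionary.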


2021 ◽  
Author(s):  
Janak Judd

Research on regulatory focus has often used hopes versus duties to operationalize promotion and prevention focus, respectively. The current research examined regulatory focus in terms of exploration versus self-control to determine whether people tend to bring different types of experiences to mind when thinking about these experiences. I used Linguistic Inquiry and Word Count software to analyze written descriptions of exploration and self-control and used t-tests to examine between-condition differences on word categories that participants used at least 0.5% of the time. Across two studies, descriptions of exploration had more positive emotional tone and used more insight words. In contrast, descriptions of self-control used more function words, more negative emotion words, including anger, more words about ingestion, and more words about power.
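The analysis described here (filter word categories by usage, then compare conditions with t-tests) can be sketched as follows. The abstract does not say which t-test variant was used or exactly how the 0.5% threshold was computed, so Welch's statistic and a threshold on the pooled mean rate are assumptions, and all data values are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def frequent_categories(cond_a, cond_b, threshold=0.5):
    """Keep categories whose pooled mean usage is at least `threshold` percent."""
    return [c for c in cond_a if mean(cond_a[c] + cond_b[c]) >= threshold]

# Hypothetical per-participant category rates (percent of words).
exploration = {"insight": [3.0, 4.0, 5.0], "ingest": [0.1, 0.2, 0.0]}
self_control = {"insight": [1.0, 2.0, 3.0], "ingest": [0.3, 0.1, 0.2]}

kept = frequent_categories(exploration, self_control)
t_values = {c: welch_t(exploration[c], self_control[c]) for c in kept}
```

Only the t statistic is computed; obtaining a p-value would additionally require the t distribution with the appropriate degrees of freedom.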


Crisis ◽  
2019 ◽  
Vol 40 (2) ◽  
pp. 125-133 ◽  
Author(s):  
Miriam Van den Nest ◽  
Benedikt Till ◽  
Thomas Niederkrotenthaler

Abstract. Background: Little is known about linguistic differences between nonprofessional suicide message boards that differ in regard to their predominant attitude to suicide. Aims: To compare linguistic indicators potentially related to suicidality between anti-suicide, neutral, and pro-suicide message boards, and between the types of posters (primary posters, who initiate the thread, and the respective respondents). Method: In all, 1,200 threads from seven German-language nonprofessional suicide message boards were analyzed using the software Linguistic Inquiry and Word Count (LIWC) with regard to wording related to suicidal fantasies, aggression, and indicators of so-called suicidal constriction. Data were analyzed with ANOVA. Results: There were fewer words related to affective, social, cognitive, and communicative processes in pro-suicide message boards than in other boards. Death-related wording and aggression as well as tentative wording appeared more prevalent in pro-suicide boards. Limitations: Complex language structures cannot be analyzed with LIWC. Conclusion: The results suggest fewer emotion words and wording related to social circumstances among primary posters and respondents in pro-suicide boards as compared with other boards, and a higher use of death- and aggression-related words. These findings might signal a higher degree of suicidality or sheer differences in matters of interest or social desirability. The differences require attention in practice and research.


2020 ◽  
Vol 13 (5) ◽  
pp. 884-892
Author(s):  
Sartaj Ahmad ◽  
Ashutosh Gupta ◽  
Neeraj Kumar Gupta

Background: Online shopping has become popular, and shoppers routinely consult feedback or reviews before buying. This feedback helps customers decide whether to buy a product or use a service. In a country like India, online shopping is growing rapidly as awareness and use of the internet increase day by day. As a result, the number of customers and the volume of their feedback are also increasing, which makes it impractical to read all reviews manually. A computerized mechanism is therefore needed that gives customers a summary without their having to spend time reading the feedback. Besides the large number of reviews, a further problem is that reviews are unstructured. Objective: In this paper, we design, implement, and compare two algorithms against a manual approach for cross-domain product reviews. Methods: A lexicon-based model is used, and different types of reviews are tested and analyzed to check the performance of these algorithms. Results: An opinion-based algorithm and a feature-based opinion algorithm are designed, implemented, applied, and compared with the manual results; algorithm #2 performs better than algorithm #1 and comes close to the manual results. Conclusion: Algorithm #2 performs better on reviews of different products and has yet to be applied to reviews of other products to broaden its scope. Ultimately, it will help automate the existing manual process.


2021 ◽  
Vol 32 (3) ◽  
pp. 326-339
Author(s):  
Heather L. Urry ◽  
Chelsea S. Crittle ◽  
Victoria A. Floerke ◽  
Michael Z. Leonard ◽  
Clinton S. Perry ◽  
...  

In this direct replication of Mueller and Oppenheimer’s (2014) Study 1, participants watched a lecture while taking notes with a laptop (n = 74) or longhand (n = 68). After a brief distraction and without the opportunity to study, they took a quiz. As in the original study, laptop participants took notes containing more words spoken verbatim by the lecturer and more words overall than did longhand participants. However, laptop participants did not perform better than longhand participants on the quiz. Exploratory meta-analyses of eight similar studies echoed this pattern. In addition, in both the original study and our replication, higher word count was associated with better quiz performance, and higher verbatim overlap was associated with worse quiz performance, but the latter finding was not robust in our replication. Overall, results do not support the idea that longhand note taking improves immediate learning via better encoding of information.
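One way to quantify verbatim overlap is the share of word n-grams in a participant's notes that also occur in the lecture transcript. The sketch below takes that approach. The abstract does not define the measure, so the choice of n = 3, whitespace tokenization, and the example strings are all assumptions rather than the study's actual scoring procedure.

```python
def ngrams(words, n):
    """All consecutive word n-grams, as tuples."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def verbatim_overlap(notes, transcript, n=3):
    """Fraction of the notes' n-grams that also appear in the transcript."""
    note_grams = ngrams(notes.lower().split(), n)
    if not note_grams:
        return 0.0
    transcript_grams = set(ngrams(transcript.lower().split(), n))
    shared = sum(1 for gram in note_grams if gram in transcript_grams)
    return shared / len(note_grams)

# Made-up example texts.
transcript = "the brain stores memories in a distributed network of neurons"
laptop_notes = "the brain stores memories in a distributed network"
longhand_notes = "memories are spread across many neurons"

laptop_overlap = verbatim_overlap(laptop_notes, transcript)
longhand_overlap = verbatim_overlap(longhand_notes, transcript)
word_counts = (len(laptop_notes.split()), len(longhand_notes.split()))
```

In this toy case the transcribed notes score the maximum overlap and the paraphrased notes score zero, which is the contrast the two note-taking styles are expected to produce.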


AI ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 261-273
Author(s):  
Mario Manzo ◽  
Simone Pellino

COVID-19 has been a great challenge for humanity since the year 2020. The whole world has made a huge effort to find an effective vaccine in order to save those not yet infected. The alternative solution is early diagnosis, carried out through real-time polymerase chain reaction (RT-PCR) tests or thorax Computer Tomography (CT) scan images. Deep learning algorithms, specifically convolutional neural networks, represent a methodology for image analysis. They optimize the classification design task, which is essential for an automatic approach with different types of images, including medical. In this paper, we adopt a pretrained deep convolutional neural network architecture in order to diagnose COVID-19 disease from CT images. Our idea is inspired by what the whole of humanity is achieving, as the set of multiple contributions is better than any single one for the fight against the pandemic. First, we adapt, and subsequently retrain for our assumption, some neural architectures that have been adopted in other application domains. Secondly, we combine the knowledge extracted from images by the neural architectures in an ensemble classification context. Our experimental phase is performed on a CT image dataset, and the results obtained show the effectiveness of the proposed approach with respect to the state-of-the-art competitors.
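The ensemble step can be illustrated with soft voting, in which the class probabilities produced by several networks are averaged before a label is chosen. The abstract does not state which combination rule the authors use, so soft voting is an assumption here, and the probability values below are made up rather than taken from the paper.

```python
def soft_vote(probabilities_per_model):
    """Average per-class probabilities across models and pick the top class."""
    n_models = len(probabilities_per_model)
    n_classes = len(probabilities_per_model[0])
    averaged = [
        sum(model[c] for model in probabilities_per_model) / n_models
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=averaged.__getitem__)
    return best, averaged

# Hypothetical outputs of three networks for one CT image:
# [P(non-COVID), P(COVID)].
model_outputs = [[0.60, 0.40], [0.30, 0.70], [0.20, 0.80]]
label, averaged = soft_vote(model_outputs)
```

Here the first network alone would predict the non-COVID class, but the averaged probabilities favor the COVID class, which shows how an ensemble can overrule a single dissenting model.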


Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 495
Author(s):  
Imayanmosha Wahlang ◽  
Arnab Kumar Maji ◽  
Goutam Saha ◽  
Prasun Chakrabarti ◽  
Michal Jasinski ◽  
...  

This article experiments with deep learning methodologies in echocardiogram (echo), a promising and vigorously researched technique in the preponderance field. This paper involves two different kinds of classification in the echo. Firstly, classification into normal (absence of abnormalities) or abnormal (presence of abnormalities) has been done, using 2D echo images, 3D Doppler images, and videographic images. Secondly, different types of regurgitation, namely Mitral Regurgitation (MR), Aortic Regurgitation (AR), Tricuspid Regurgitation (TR), and a combination of the three, are classified using videographic echo images. Two deep-learning methodologies are used for these purposes: a Recurrent Neural Network (RNN) based methodology (Long Short Term Memory (LSTM)) and an Autoencoder based methodology (Variational AutoEncoder (VAE)). The use of videographic images distinguishes this work from the existing work using SVM (Support Vector Machine), and the application of deep-learning methodologies is the first of many in this particular field. It was found that deep-learning methodologies perform better than the SVM methodology in normal or abnormal classification. Overall, VAE performs better on 2D and 3D Doppler images (static images), while LSTM performs better on videographic images.


2020 ◽  
Vol 35 (5) ◽  
pp. 336-343
Author(s):  
Katherine Guttmann ◽  
John Flibotte ◽  
Sara B. DeMauro ◽  
Holli Seitz

This study aimed to evaluate how parents of former neonatal intensive care unit patients with cerebral palsy perceive prognostic discussions following neuroimaging. Parent members of a cerebral palsy support network described memories of prognostic discussions after neuroimaging in the neonatal intensive care unit. We analyzed responses using Linguistic Inquiry and Word Count, manual content analysis, and thematic analysis. In 2015, a total of 463 parents met eligibility criteria and 266 provided free-text responses. Linguistic Inquiry and Word Count analysis showed that responses following neuroimaging contained negative emotion. The most common components identified through the content analysis included outcome, uncertainty, hope/hopelessness, and weakness in communication. Thematic analysis revealed 3 themes: (1) Information, (2) Communication, and (3) Impact. Parents of children with cerebral palsy report weakness in communication relating to prognosis, which persists in parents’ memories. Prospective work to develop interventions to improve communication between parents and providers in the neonatal intensive care unit is necessary.


1979 ◽  
Vol 57 (4) ◽  
pp. 400-403 ◽  
Author(s):  
Anne Le Narvor ◽  
Pierre Saumagne

The IR spectra of mixtures of methyl propionate/water and methyl propionate/Ba²⁺ in dimethylsulfoxide and in acetonitrile have been recorded in the region of the νCO mode of the ester. Evidence is presented to indicate the presence of different types of complexes; their concentration was determined as a function of the composition of the medium. The spectroscopic results are compared to those from the kinetics of the alkaline hydrolysis under the same conditions. It is demonstrated that orbital control explains the experimental results better than does the charge density on the carbon of the carbonyl group. [Journal translation]

