Natural language processing methods are sensitive to sub-clinical linguistic differences in schizophrenia spectrum disorders

Abstract. Computerized natural language processing (NLP) allows for objective and sensitive detection of speech disturbance, a hallmark of schizophrenia spectrum disorders (SSD). We explored several methods for characterizing speech changes in SSD (n = 20) compared to healthy control (HC) participants (n = 11) and approached linguistic phenotyping on three levels: individual words, parts-of-speech (POS), and sentence-level coherence. NLP features were compared with a clinical gold standard, the Scale for the Assessment of Thought, Language and Communication (TLC). We utilized Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art embedding algorithm incorporating bidirectional context. Through the POS approach, we found that SSD used more pronouns but fewer adverbs, adjectives, and determiners (e.g., “the,” “a”). Analysis of individual word usage was notable for more frequent use of first-person singular pronouns among individuals with SSD and first-person plural pronouns among HC. There was a striking increase in incomplete words among SSD. Sentence-level analysis using BERT reflected increased tangentiality among SSD with greater sentence embedding distances. The SSD sample had low speech disturbance on average and there was no difference in group means for TLC scores. However, NLP measures of language disturbance appear to be sensitive to these subclinical differences and showed greater ability to discriminate between HC and SSD than a model based on clinical ratings alone. These intriguing exploratory results from a small sample prompt further inquiry into NLP methods for characterizing language disturbance in SSD and suggest that NLP measures may yield clinically relevant and informative biomarkers.
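
The sentence-level measure mentioned in this abstract (greater distances between sentence embeddings as a sign of tangentiality) can be illustrated with a minimal sketch. The abstract does not say which distance or aggregation the authors used, so cosine distance averaged over consecutive sentences is an assumption here, and the 3-dimensional vectors are toy stand-ins for BERT sentence embeddings; the function names are hypothetical.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def mean_consecutive_distance(embeddings):
    """Average cosine distance between each sentence and the next one."""
    pairs = list(zip(embeddings, embeddings[1:]))
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)

# Toy vectors only (assumption): real inputs would be sentence embeddings
# produced by a model such as BERT.
on_topic = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.8, 0.2, 0.0]]
drifting = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(mean_consecutive_distance(on_topic))
print(mean_consecutive_distance(drifting))
```

Under this sketch, a response whose sentences drift from one topic to another scores higher than one that stays on topic, which is the direction of effect the abstract reports for SSD.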


Sentence embeddings in NLI with iterative refinement encoders

Natural Language Engineering ◽

10.1017/s1351324919000202 ◽

2019 ◽

Vol 25 (4) ◽

pp. 467-482 ◽

Cited By ~ 3

Author(s):

Aarne Talman ◽

Anssi Yli-Jyrä ◽

Jörg Tiedemann

Keyword(s):

Neural Networks ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Recurrent Neural Networks ◽

State Of The Art ◽

Iterative Refinement ◽

Learning Tasks ◽

Sentence Level ◽

Refinement Strategy

Abstract. Sentence-level representations are necessary for various natural language processing tasks. Recurrent neural networks have proven to be very effective in learning distributed representations and can be trained efficiently on natural language inference tasks. We build on top of one such model and propose a hierarchy of bidirectional LSTM and max pooling layers that implements an iterative refinement strategy and yields state-of-the-art results on the SciTail dataset as well as strong results for Stanford Natural Language Inference and Multi-Genre Natural Language Inference. We can show that the sentence embeddings learned in this way can be utilized in a wide variety of transfer learning tasks, outperforming InferSent on 7 out of 10 and SkipThought on 8 out of 9 SentEval sentence embedding evaluation tasks. Furthermore, our model beats the InferSent model in 8 out of 10 recently published SentEval probing tasks designed to evaluate sentence embeddings’ ability to capture some of the important linguistic properties of sentences.


Sentiment and opinion analysis

10.12681/eadd/44752 ◽

2018 ◽

Author(s):

Αγγελική-Σπυριδούλα Βλαχοστέργιου

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Context Aware ◽

Sentence Level ◽

Document Level

In recent years there has been an increase in the number of attempts to automatically recognize and classify human emotion using physiological signals, signals from the face and the voice, as well as personal interpretations drawn from texts in large-scale social data. Several research areas could benefit from such systems: interactive tutoring systems that let educators know how stressed their students are; accident prevention (e.g., detecting driver fatigue); military team tasks characterized by long periods of stress and pressure; and healthcare applications for the early diagnosis of neurodegenerative diseases (e.g., Parkinson's disease), where symptoms appear many years after neurodegeneration begins. However, despite the research efforts so far, the long-term goal of building a robust recognition framework for this research area, based on its analysis and interpretation, has not been achieved. There is no doubt that affect production is influenced by the context in which it takes place at a given moment, such as the task the user is performing, the people interacting with the user, and their identity and expressiveness. Any complementary form of contextual information about the research area under study therefore helps us answer the question of what is most likely to happen, thereby steering the classifier toward the most probable/relevant categories. Without context, even humans may misinterpret the observed expressions.
Thus, by addressing these challenges from the perspective of context-aware affect analysis, that is, by studying contextual information more closely, interpreting it in specific application domains, representing it, and modeling it, we can better approach real-time emotion recognition. Similarly, in the area of personal interpretations from text (sentiment analysis), and in natural language processing (NLP) more generally, the contribution of context lies in better recognizing, interpreting, and processing opinions and sentiments in texts, which are examined at the document level, the sentence level, and the aspect level, respectively. In this case, the semantics and the cognitive and affective information of individuals' subjective responses are taken into account. In this area specifically, our contribution lies in training robust feature representations from unlabeled data using neural networks, and in particular generative adversarial networks (GANs), which have shown impressive results in computer vision. The novelty of this method lies in how the model is implemented, the choice of hyperparameters, the use of unsupervised learning, and the experimental validation of the proposed model on text corpora drawn from sources that differ in genre and length.


ERPS as a probe of language processing difficulties in schizophrenia spectrum disorders

Biological Psychiatry ◽

10.1016/0006-3223(96)84462-3 ◽

1996 ◽

Vol 39 (7) ◽

pp. 654

Author(s):

M.A. Nizniklewicz ◽

P.G. Nestor ◽

B.F. O'Donnell ◽

L. Seidman ◽

C.C. Dickey ◽

...

Keyword(s):

Language Processing ◽

Schizophrenia Spectrum Disorders ◽

Schizophrenia Spectrum ◽

Spectrum Disorders


Identification of Adverse Drug Event–Related Japanese Articles: Natural Language Processing Analysis (Preprint)

10.2196/preprints.22661 ◽

2020 ◽

Author(s):

Shogo Ujiie ◽

Shuntaro Yada ◽

Shoko Wakamiya ◽

Eiji Aramaki

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Drug Safety ◽

Automated System ◽

Pharmaceutical Companies ◽

Manual Labor ◽

Sentence Level ◽

Medical Articles ◽

Document Level

Background: Medical articles covering adverse drug events (ADEs) are systematically reported by pharmaceutical companies for drug safety information purposes. Although policies governing reporting to regulatory bodies vary among countries and regions, all medical article reporting may be categorized as precision or recall based. Recall-based reporting, which is implemented in Japan, requires the reporting of any possible ADE. Therefore, recall-based reporting can introduce numerous false negatives or substantial amounts of noise, a problem that is difficult to address using limited manual labor. Objective: Our aim was to develop an automated system that could identify ADE-related medical articles, support recall-based reporting, and alleviate manual labor in Japanese pharmaceutical companies. Methods: Using medical articles as input, our system based on natural language processing applies document-level classification to extract articles containing ADEs (replacing manual labor in the first screening) and sentence-level classification to extract sentences within those articles that imply ADEs (thus supporting experts in the second screening). We used 509 Japanese medical articles annotated by a medical engineer to evaluate the performance of the proposed system. Results: Document-level classification yielded an F1 of 0.903. Sentence-level classification yielded an F1 of 0.413. These were averages of fivefold cross-validations. Conclusions: A simple automated system may alleviate the manual labor involved in screening drug safety–related medical articles in pharmaceutical companies. After improving the accuracy of the sentence-level classification by considering a wider context, we intend to apply this system toward real-world postmarketing surveillance.


Sentence-Level Multilingual Multi-modal Embedding for Natural Language Processing

RANLP 2017 - Recent Advances in Natural Language Processing Meet Deep Learning ◽

10.26615/978-954-452-049-6_020 ◽

2017 ◽

Cited By ~ 2

Author(s):

Iacer Calixto ◽

Qun Liu

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Sentence Level


Combining Lexico-semantic Features for Emotion Classification in Suicide Notes

Biomedical Informatics Insights ◽

10.4137/bii.s8960 ◽

2012 ◽

Vol 5s1 ◽

pp. BII.S8960 ◽

Cited By ~ 5

Author(s):

Bart Desmet ◽

Véronique Hoste

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Bag Of Words ◽

Semantic Features ◽

Emotion Classification ◽

Shared Task ◽

Suicide Notes ◽

Sentence Level ◽

Semantic Resources

This paper describes a system for automatic emotion classification, developed for the 2011 i2b2 Natural Language Processing Challenge, Track 2. The objective of the shared task was to label suicide notes with 15 relevant emotions on the sentence level. Our system uses 15 SVM models (one for each emotion) using the combination of features that was found to perform best on a given emotion. Features included lemmas and trigram bag of words, and information from semantic resources such as WordNet, SentiWordNet and subjectivity clues. The best-performing system labeled 7 of the 15 emotions and achieved an F-score of 53.31% on the test data.


Review-Based Sentiment Prediction of Rating Using Natural Language Processing Sentence-Level Sentiment Analysis with Bag-of-Words Approach

First International Conference on Sustainable Technologies for Computational Intelligence - Advances in Intelligent Systems and Computing ◽

10.1007/978-981-15-0029-9_63 ◽

2019 ◽

pp. 807-821

Author(s):

K. Venkata Raju ◽

M. Sridhar

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Bag Of Words ◽

Sentence Level


Electrocardiographic signs of autonomic imbalance in medicated patients with first-episode schizophrenia spectrum disorders – relations to first treatment discontinuation and five-year remission status

European Psychiatry ◽

10.1016/j.eurpsy.2010.12.002 ◽

2012 ◽

Vol 27 (3) ◽

pp. 213-218 ◽

Cited By ~ 2

Author(s):

Robert Bodén ◽

Leif Lindström ◽

Pentti Rautaharju ◽

Johan Sundström

Keyword(s):

Heart Rate ◽

Small Sample ◽

First Episode ◽

Autonomic Balance ◽

Schizophrenia Spectrum Disorders ◽

Prognostic Information ◽

Remission Status ◽

Schizophrenia Spectrum ◽

Spectrum Disorders ◽

First Episode Schizophrenia

Abstract. Purpose: To explore measures in electrocardiograms (ECG) influenced by autonomic balance in early schizophrenia spectrum disorders and to examine their relation to subsequent first antipsychotic pharmacotherapy discontinuation and five-year remission status. Subjects and methods: Twelve-lead ECGs were recorded at baseline in 58 patients with first-episode schizophrenia spectrum disorders and in 47 healthy controls of similar age. Selected ECG variables included heart rate and measures of repolarization. Pharmacotherapy data were extracted from medical records. At a five-year follow-up the patients were interviewed and assessed with the Positive and Negative Syndrome Scale. Results: Patients had higher heart rate and a different ST-T pattern than the controls. High T-wave amplitudes in the leads aVF and V5 and ST-elevations in V5 were associated both with higher risk of an earlier discontinuation of first antipsychotic pharmacotherapy and with non-remission five years later. Discussion and conclusion: In this longitudinal cohort study, simple ECG measures influenced by autonomic balance in the early phase of schizophrenia spectrum disorders contained prognostic information. As this is the first report of this association and is based on a relatively small sample, the results should be interpreted with caution.


Identification of Adverse Drug Event–Related Japanese Articles: Natural Language Processing Analysis

JMIR Medical Informatics ◽

10.2196/22661 ◽

2020 ◽

Vol 8 (11) ◽

pp. e22661

Author(s):

Shogo Ujiie ◽

Shuntaro Yada ◽

Shoko Wakamiya ◽

Eiji Aramaki

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Drug Safety ◽

Automated System ◽

Pharmaceutical Companies ◽

Manual Labor ◽

Sentence Level ◽

Medical Articles ◽

Document Level

Background: Medical articles covering adverse drug events (ADEs) are systematically reported by pharmaceutical companies for drug safety information purposes. Although policies governing reporting to regulatory bodies vary among countries and regions, all medical article reporting may be categorized as precision or recall based. Recall-based reporting, which is implemented in Japan, requires the reporting of any possible ADE. Therefore, recall-based reporting can introduce numerous false negatives or substantial amounts of noise, a problem that is difficult to address using limited manual labor. Objective: Our aim was to develop an automated system that could identify ADE-related medical articles, support recall-based reporting, and alleviate manual labor in Japanese pharmaceutical companies. Methods: Using medical articles as input, our system based on natural language processing applies document-level classification to extract articles containing ADEs (replacing manual labor in the first screening) and sentence-level classification to extract sentences within those articles that imply ADEs (thus supporting experts in the second screening). We used 509 Japanese medical articles annotated by a medical engineer to evaluate the performance of the proposed system. Results: Document-level classification yielded an F1 of 0.903. Sentence-level classification yielded an F1 of 0.413. These were averages of fivefold cross-validations. Conclusions: A simple automated system may alleviate the manual labor involved in screening drug safety–related medical articles in pharmaceutical companies. After improving the accuracy of the sentence-level classification by considering a wider context, we intend to apply this system toward real-world postmarketing surveillance.
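
The reported scores are F1 values averaged over five cross-validation folds. As a rough illustration of that arithmetic, the sketch below computes F1 from per-fold true-positive, false-positive, and false-negative counts and then takes the mean. The counts are invented for the example and are not the study's data; the function names are hypothetical, and averaging per-fold F1 (rather than pooling counts across folds) is an assumption based on the phrase "averages of fivefold cross-validations".

```python
def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mean_f1(fold_counts):
    """Average the per-fold F1 scores over all folds."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in fold_counts]
    return sum(scores) / len(scores)

# Hypothetical (tp, fp, fn) counts for five folds, for illustration only.
folds = [(18, 2, 2), (17, 3, 2), (19, 1, 3), (18, 2, 1), (16, 3, 3)]
print(round(mean_f1(folds), 3))
```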


Natural language processing of clinical mental health notes may add predictive value to existing suicide risk models

Psychological Medicine ◽

10.1017/s0033291720000173 ◽

2020 ◽

pp. 1-10 ◽

Cited By ~ 2

Author(s):

Maxwell Levis ◽

Christine Leonard Westgate ◽

Jiang Gui ◽

Bradley V. Watts ◽

Brian Shiner

Keyword(s):

Mental Health ◽

Natural Language Processing ◽

Natural Language ◽

Prediction Model ◽

Language Processing ◽

Predictive Value ◽

Suicide Risk ◽

Predictive Accuracy ◽

Small Sample ◽

Suicide Prediction

Abstract. Background: This study evaluated whether natural language processing (NLP) of psychotherapy note text provides additional accuracy over and above currently used suicide prediction models. Methods: We used a cohort of Veterans Health Administration (VHA) users diagnosed with post-traumatic stress disorder (PTSD) between 2004 and 2013. Using a case-control design, cases (those who died by suicide during the year following diagnosis) were matched to controls (those who remained alive). After selecting conditional matches based on having shared mental health providers, we chose controls using a 5:1 nearest-neighbor propensity match based on the VHA's structured Electronic Medical Records (EMR)-based suicide prediction model. For cases, psychotherapist notes were collected from diagnosis until death. For controls, psychotherapist notes were collected from diagnosis until the matched case's date of death. After ensuring similar numbers of notes, the final sample included 246 cases and 986 controls. Notes were analyzed using Sentiment Analysis and Cognition Engine, a Python-based NLP package. The output was evaluated using machine-learning algorithms. The area under the curve (AUC) was calculated to determine the models' predictive accuracy. Results: NLP-derived variables offered small but significant predictive improvement (AUC = 0.58) for patients who had longer treatment duration. A small sample size limited predictive accuracy. Conclusions: The study identifies a novel method for measuring suicide risk over time and potentially categorizing patient subgroups with distinct risk sensitivities. Findings suggest that leveraging NLP-derived variables from psychotherapy notes offers additional predictive value over and above the VHA's state-of-the-art structured EMR-based suicide prediction model. Replication with a larger non-PTSD-specific sample is required.
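
Since the abstract reports predictive accuracy as an area under the curve (AUC), a small sketch may help in reading a value such as 0.58. It uses the standard rank interpretation of the ROC AUC: the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with ties counted as one half. The scores are made up for illustration, the function name is hypothetical, and nothing here reflects how the authors actually computed their AUC.

```python
def auc(case_scores, control_scores):
    """ROC AUC as the share of case/control pairs where the case scores higher.

    Ties count as half a win, following the usual rank-based definition.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical risk scores (assumption), not data from the study.
cases = [0.6, 0.3]
controls = [0.5, 0.2]
print(auc(cases, controls))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so 0.58 indicates a modest gain over chance.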
