Natural language processing methods are sensitive to sub-clinical linguistic differences in schizophrenia spectrum disorders

2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Sunny X. Tang ◽  
Reno Kriz ◽  
Sunghye Cho ◽  
Suh Jung Park ◽  
Jenna Harowitz ◽  
...  

Abstract: Computerized natural language processing (NLP) allows for objective and sensitive detection of speech disturbance, a hallmark of schizophrenia spectrum disorders (SSD). We explored several methods for characterizing speech changes in SSD (n = 20) compared to healthy control (HC) participants (n = 11) and approached linguistic phenotyping on three levels: individual words, parts-of-speech (POS), and sentence-level coherence. NLP features were compared with a clinical gold standard, the Scale for the Assessment of Thought, Language and Communication (TLC). We utilized Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art embedding algorithm incorporating bidirectional context. Through the POS approach, we found that SSD used more pronouns but fewer adverbs, adjectives, and determiners (e.g., “the,” “a”). Analysis of individual word usage was notable for more frequent use of first-person singular pronouns among individuals with SSD and first-person plural pronouns among HC. There was a striking increase in incomplete words among SSD. Sentence-level analysis using BERT reflected increased tangentiality among SSD with greater sentence embedding distances. The SSD sample had low speech disturbance on average and there was no difference in group means for TLC scores. However, NLP measures of language disturbance appear to be sensitive to these subclinical differences and showed greater ability to discriminate between HC and SSD than a model based on clinical ratings alone. These intriguing exploratory results from a small sample prompt further inquiry into NLP methods for characterizing language disturbance in SSD and suggest that NLP measures may yield clinically relevant and informative biomarkers.
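
The idea of "greater sentence embedding distances" as a marker of tangentiality can be illustrated with a minimal sketch. It assumes each sentence has already been mapped to a vector and scores a transcript by the mean cosine distance between consecutive sentences. The abstract does not state which distance or which sentence pairing the authors used, so both choices, the function names, and the toy 3-dimensional vectors (stand-ins for real BERT embeddings) are assumptions made for illustration only.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def mean_adjacent_distance(sentence_vectors):
    """Average cosine distance between each sentence and the one after it."""
    if len(sentence_vectors) < 2:
        raise ValueError("need at least two sentence vectors")
    pairs = zip(sentence_vectors, sentence_vectors[1:])
    distances = [cosine_distance(u, v) for u, v in pairs]
    return sum(distances) / len(distances)


# Toy vectors only (assumption): real sentence embeddings have hundreds of dimensions.
on_topic = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1]]
drifting = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 1.0]]

print(mean_adjacent_distance(on_topic))
print(mean_adjacent_distance(drifting))
```

Under this sketch, a transcript whose sentences point in similar directions yields a small score, while one that changes direction at every sentence yields a larger one.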

2019 ◽  
Vol 25 (4) ◽  
pp. 467-482 ◽  
Author(s):  
Aarne Talman ◽  
Anssi Yli-Jyrä ◽  
Jörg Tiedemann

Abstract: Sentence-level representations are necessary for various natural language processing tasks. Recurrent neural networks have proven to be very effective in learning distributed representations and can be trained efficiently on natural language inference tasks. We build on top of one such model and propose a hierarchy of bidirectional LSTM and max pooling layers that implements an iterative refinement strategy and yields state-of-the-art results on the SciTail dataset as well as strong results for Stanford Natural Language Inference and Multi-Genre Natural Language Inference. We show that the sentence embeddings learned in this way can be utilized in a wide variety of transfer learning tasks, outperforming InferSent on 7 out of 10 and SkipThought on 8 out of 9 SentEval sentence embedding evaluation tasks. Furthermore, our model beats the InferSent model in 8 out of 10 recently published SentEval probing tasks designed to evaluate sentence embeddings’ ability to capture some of the important linguistic properties of sentences.
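
The pooling half of such a hierarchy can be sketched without a neural network library. The example below shows only max pooling over time, which turns a variable-length sequence of hidden states into one fixed-size vector, and then concatenates the pooled vector of each layer. The LSTM layers themselves are not implemented, the concatenation step is an assumption rather than something the abstract states, and the layer outputs are made-up numbers.

```python
def max_pool_over_time(hidden_states):
    """Element-wise maximum across time steps.

    hidden_states: list of per-token vectors, all of the same length.
    Returns one fixed-size vector regardless of sentence length.
    """
    if not hidden_states:
        raise ValueError("need at least one time step")
    return [max(column) for column in zip(*hidden_states)]


def concat_layer_summaries(layers):
    """Concatenate the max-pooled summary of every layer into one vector."""
    sentence_vector = []
    for hidden_states in layers:
        sentence_vector.extend(max_pool_over_time(hidden_states))
    return sentence_vector


# Hypothetical outputs of two stacked encoders for a 3-token sentence,
# each producing 2-dimensional states (real models use far more dimensions).
layer_1 = [[0.1, -0.4], [0.7, 0.2], [-0.3, 0.5]]
layer_2 = [[0.0, 0.9], [0.2, -0.1], [0.6, 0.3]]

print(concat_layer_summaries([layer_1, layer_2]))  # [0.7, 0.5, 0.6, 0.9]
```

Because the maximum is taken per dimension, the output length depends only on the hidden size and the number of layers, not on how many tokens the sentence contains.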


2018 ◽  
Author(s):  
Αγγελική-Σπυριδούλα Βλαχοστέργιου

In recent years there has been a growing number of attempts to automatically recognize and categorize human emotion using physiological signals, signals from the face and voice, and personal interpretations drawn from texts in large-scale social data. Several research areas could benefit from such systems: interactive teaching systems that let educators know how stressed their students are; accident prevention (e.g., detecting driver fatigue); military team tasks characterized by long periods of stress and pressure; and health applications for the early diagnosis of neurodegenerative diseases (e.g., Parkinson's disease), where symptoms appear many years after neurodegeneration begins. However, despite the research efforts to date, the long-term goal of building a robust recognition framework for this research area, grounded in the analysis and interpretation of context, has not been achieved. There is no doubt that affect production is influenced by the context in which it takes place at a given moment, such as the task the user is performing, the people interacting with the user, and their identity and expressiveness. Any complementary form of context information about the domain under study therefore helps us answer the question "what is most likely to happen?", thereby guiding the classifier with respect to the most likely/relevant categories. Without context, even humans may misinterpret the expressions they observe.
By addressing these challenges from the perspective of context-aware affect analysis, that is, by better studying context information, interpreting it in specific application domains, representing it, and modelling it, we can better approach emotion recognition in real time. Similarly, in sentiment analysis of text and, more generally, in natural language processing (NLP), the contribution of context lies in better recognition, interpretation, and processing of opinions and sentiments in texts, which are examined at the document level, sentence level, and aspect level respectively. In this case, the semantics and the cognitive and affective information in individuals' subjective responses are taken into account. In this area specifically, our contribution lies in training robust feature representations from unlabeled data using neural networks, in particular generative adversarial networks (GANs), which have shown impressive results in computer vision. The novelty of the method lies in how the model is implemented, the choice of hyperparameters, the use of unsupervised learning, and the experimental validation of the proposed model on text corpora from different sources that vary in genre and length.


1996 ◽  
Vol 39 (7) ◽  
pp. 654
Author(s):  
M.A. Nizniklewicz ◽  
P.G. Nestor ◽  
B.F. O'Donnell ◽  
L. Seidman ◽  
C.C. Dickey ◽  
...  

2012 ◽  
Vol 5s1 ◽  
pp. BII.S8960 ◽  
Author(s):  
Bart Desmet ◽  
Véronique Hoste

This paper describes a system for automatic emotion classification, developed for the 2011 i2b2 Natural Language Processing Challenge, Track 2. The objective of the shared task was to label suicide notes with 15 relevant emotions on the sentence level. Our system uses 15 SVM models (one for each emotion) using the combination of features that was found to perform best on a given emotion. Features included lemmas and trigram bag of words, and information from semantic resources such as WordNet, SentiWordNet and subjectivity clues. The best-performing system labeled 7 of the 15 emotions and achieved an F-score of 53.31% on the test data.
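
As a rough illustration of a trigram bag-of-words feature, the sketch below counts contiguous three-token sequences in a tokenised sentence. The abstract does not say whether the trigrams were built over words or characters, nor how the text was tokenised or lemmatised, so word-level trigrams over whitespace tokens are an assumption here, and `ngram_counts` is a hypothetical helper name.

```python
from collections import Counter


def ngram_counts(tokens, n=3):
    """Count contiguous n-token sequences in a tokenised sentence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # range() is empty when the sentence is shorter than n, giving an empty Counter.
    grams = (tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return Counter(grams)


tokens = "the cat sat on the mat and the cat sat down".split()
counts = ngram_counts(tokens)

print(counts[("the", "cat", "sat")])  # 2
print(sum(counts.values()))           # 9
```

Each distinct trigram would then become one feature dimension, with its count (or a binary presence flag) as the value passed to a per-emotion classifier.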


2012 ◽  
Vol 27 (3) ◽  
pp. 213-218 ◽  
Author(s):  
Robert Bodén ◽  
Leif Lindström ◽  
Pentti Rautaharju ◽  
Johan Sundström

Purpose: To explore measures in electrocardiograms (ECG) influenced by autonomic balance in early schizophrenia spectrum disorders and to examine their relation to subsequent first antipsychotic pharmacotherapy discontinuation and five-year remission status. Subjects and methods: Twelve-lead ECGs were recorded at baseline in 58 patients with first-episode schizophrenia spectrum disorders and in 47 healthy controls of similar age. Selected ECG variables included heart rate and measures of repolarization. Pharmacotherapy data were extracted from medical records. At a five-year follow-up the patients were interviewed and assessed with the Positive and Negative Syndrome Scale. Results: Patients had higher heart rate and a different ST-T pattern than the controls. High T-wave amplitudes in the leads aVF and V5 and ST-elevations in V5 were associated both with higher risk of an earlier discontinuation of first antipsychotic pharmacotherapy and with non-remission five years later. Discussion and conclusion: In this longitudinal cohort study, simple ECG measures influenced by autonomic balance in the early phase of schizophrenia spectrum disorders contained prognostic information. As this is the first report of this association and is based on a relatively small sample, the results should be interpreted with caution.


10.2196/22661 ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. e22661
Author(s):  
Shogo Ujiie ◽  
Shuntaro Yada ◽  
Shoko Wakamiya ◽  
Eiji Aramaki

Background: Medical articles covering adverse drug events (ADEs) are systematically reported by pharmaceutical companies for drug safety information purposes. Although policies governing reporting to regulatory bodies vary among countries and regions, all medical article reporting may be categorized as precision or recall based. Recall-based reporting, which is implemented in Japan, requires the reporting of any possible ADE. Therefore, recall-based reporting can introduce numerous false negatives or substantial amounts of noise, a problem that is difficult to address using limited manual labor. Objective: Our aim was to develop an automated system that could identify ADE-related medical articles, support recall-based reporting, and alleviate manual labor in Japanese pharmaceutical companies. Methods: Using medical articles as input, our system based on natural language processing applies document-level classification to extract articles containing ADEs (replacing manual labor in the first screening) and sentence-level classification to extract sentences within those articles that imply ADEs (thus supporting experts in the second screening). We used 509 Japanese medical articles annotated by a medical engineer to evaluate the performance of the proposed system. Results: Document-level classification yielded an F1 of 0.903. Sentence-level classification yielded an F1 of 0.413. These were averages of fivefold cross-validations. Conclusions: A simple automated system may alleviate the manual labor involved in screening drug safety–related medical articles in pharmaceutical companies. After improving the accuracy of the sentence-level classification by considering a wider context, we intend to apply this system toward real-world postmarketing surveillance.
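
For readers unfamiliar with the metric, the sketch below shows how an F1 score is computed from raw counts and how a cross-validated figure can be obtained by averaging per-fold scores. The per-fold counts are invented purely for illustration and are not taken from the study; averaging per-fold F1 values (rather than pooling counts across folds) is also an assumption about how "averages of fivefold cross-validations" was computed.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, computed from raw counts."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        # No positives predicted and none present: define F1 as 0 by convention.
        return 0.0
    return 2 * true_positives / denominator


def mean_f1(fold_counts):
    """Average the F1 score over folds given (TP, FP, FN) tuples."""
    if not fold_counts:
        raise ValueError("need at least one fold")
    return sum(f1_score(*counts) for counts in fold_counts) / len(fold_counts)


# Illustrative (TP, FP, FN) counts for five folds; NOT data from the paper.
folds = [(40, 5, 5), (38, 6, 4), (42, 3, 6), (39, 4, 5), (41, 5, 3)]

print(round(mean_f1(folds), 3))
```

The closed form 2·TP / (2·TP + FP + FN) is algebraically the same as 2·P·R / (P + R) and avoids dividing by zero when precision or recall is individually undefined.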


2020 ◽  
pp. 1-10 ◽  
Author(s):  
Maxwell Levis ◽  
Christine Leonard Westgate ◽  
Jiang Gui ◽  
Bradley V. Watts ◽  
Brian Shiner

Background: This study evaluated whether natural language processing (NLP) of psychotherapy note text provides additional accuracy over and above currently used suicide prediction models. Methods: We used a cohort of Veterans Health Administration (VHA) users diagnosed with post-traumatic stress disorder (PTSD) between 2004 and 2013. Using a case-control design, cases (those who died by suicide during the year following diagnosis) were matched to controls (those who remained alive). After selecting conditional matches based on having shared mental health providers, we chose controls using a 5:1 nearest-neighbor propensity match based on the VHA's structured Electronic Medical Records (EMR)-based suicide prediction model. For cases, psychotherapist notes were collected from diagnosis until death. For controls, psychotherapist notes were collected from diagnosis until the matched case's date of death. After ensuring similar numbers of notes, the final sample included 246 cases and 986 controls. Notes were analyzed using Sentiment Analysis and Cognition Engine, a Python-based NLP package. The output was evaluated using machine-learning algorithms. The area under the curve (AUC) was calculated to determine the models' predictive accuracy. Results: NLP-derived variables offered small but significant predictive improvement (AUC = 0.58) for patients who had longer treatment duration. A small sample size limited predictive accuracy. Conclusions: This study identifies a novel method for measuring suicide risk over time and potentially categorizing patient subgroups with distinct risk sensitivities. Findings suggest that leveraging NLP-derived variables from psychotherapy notes offers additional predictive value over and above the VHA's state-of-the-art structured EMR-based suicide prediction model. Replication with a larger non-PTSD-specific sample is required.
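
To make the AUC figure concrete, the sketch below computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen case receives a higher risk score than a randomly chosen control, with ties counted as one half. This is a generic illustration, not the authors' pipeline; the function name and the scores are hypothetical.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control (ties count half)."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical model scores; NOT data from the study.
print(auc([0.9, 0.8], [0.1, 0.2]))  # 1.0: every case ranks above every control
print(auc([0.6, 0.3], [0.5, 0.4]))  # 0.5: no better than chance
```

On this scale 0.5 corresponds to chance and 1.0 to perfect separation, which is why the reported AUC of 0.58 is described as a small improvement.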

