automated coding
Recently Published Documents

Total documents: 89 (five years: 43)
H-index: 11 (five years: 2)

Author(s): James Eynstone-Hinkins, Lauren Moran

Australian mortality data are a foundational health dataset that supports research, policy, and planning. The COVID-19 pandemic created a need for more timely mortality data that could assist in monitoring direct mortality from the virus as well as indirect mortality due to social and economic change. This paper discusses the evolution of mortality data in Australia during the pandemic and examines emerging opportunities associated with electronic infrastructure, such as electronic Medical Certificates of Cause of Death (eMCCDs), ICD-11, and automated coding tools, that will form the foundations of a more responsive and comprehensive future mortality dataset.


2021, pp. 1532673X2110556
Author(s): Vladislav Petkevic, Alessandro Nai

Negativity in election campaigns matters. To what extent can the content of social media posts provide a reliable indicator of candidates' campaign negativity? We introduce and critically assess an automated classification procedure that we trained to annotate more than 16,000 tweets of candidates competing in the 2018 Senate midterms. The algorithm is able to identify the presence of political attacks (both in general and specifically for character and policy attacks) and incivility. Given the novel nature of the instrument, the article discusses the external and convergent validity of these measures. Results suggest that automated classifications can provide reliable measurements of campaign negativity. Triangulations with independent data show that our automatic classification is strongly associated with experts' perceptions of the candidates' campaigns. Furthermore, variations in our measures of negativity can be explained by theoretically relevant factors at the candidate and context levels (e.g., incumbency status and candidate gender); theoretically meaningful trends are also found when replicating the analysis on tweets from the 2020 Senate election, coded using the automated classifier developed for 2018. The implications of these results for the automated coding of campaign negativity in social media are discussed.
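A minimal sketch of this kind of workflow: train a tweet-level attack classifier, aggregate predictions into a per-candidate negativity score, and check convergent validity against expert ratings. The model, features, and all data below are hypothetical placeholders, not the authors' actual classifier.

```python
# Sketch: tweet-level attack classification plus convergent-validity check.
# All tweets, labels, and scores are hypothetical placeholders.
import numpy as np
from scipy.stats import pearsonr
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hand-annotated training tweets: 1 = attack, 0 = no attack (placeholder data)
tweets = ["My opponent voted to gut healthcare", "Grateful for your support today",
          "Her record on taxes is a disgrace", "Join us at the rally tomorrow"]
labels = [1, 0, 1, 0]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(tweets, labels)

def negativity_score(candidate_tweets):
    # Candidate-level negativity = share of tweets classified as attacks
    return float(np.mean(clf.predict(candidate_tweets)))

# Convergent validity: correlate automated scores with expert perceptions
auto_scores = [0.42, 0.18, 0.65]   # hypothetical per-candidate attack shares
expert_scores = [3.9, 2.1, 4.4]    # hypothetical expert negativity ratings
r, p = pearsonr(auto_scores, expert_scores)
print(f"convergent validity r = {r:.2f} (p = {p:.3f})")
```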


2021, Vol 22 (1)
Author(s): Xingwang Li, Yijia Zhang, Faiz ul Islam, Deshi Dong, Hao Wei, et al.

Abstract
Background: Clinical notes are documents that contain detailed information about the health status of patients, and they are generally accompanied by medical codes. Manual diagnosis coding, however, is costly and error-prone, and large clinical datasets are susceptible to noisy labels arising from erroneous manual annotation. Machine learning has therefore been used to perform automatic coding. Previous state-of-the-art (SOTA) models used convolutional neural networks to build document representations for predicting medical codes, but clinical notes typically follow a long-tailed label distribution, and most models fail to handle noise during code allocation. A denoising mechanism and long-tailed classification are therefore key to automated coding at scale.
Results: In this paper, a new joint learning model is proposed that extends our attention model for predicting medical codes from clinical notes. On the MIMIC-III-50 dataset, our model outperforms all baselines and SOTA models on all quantitative metrics. On the MIMIC-III-full dataset, our model outperforms the most advanced models on macro-F1, micro-F1, macro-AUC, and precision at 8. In addition, after introducing the denoising mechanism, the model converges faster and its loss is reduced overall.
Conclusions: The innovations of our model are threefold: first, code-specific representations can be identified by adopting a self-attention mechanism together with a label attention mechanism; second, performance on long-tailed distributions can be boosted by introducing the joint learning mechanism; third, the denoising mechanism is suitable for reducing noise effects in medical code prediction. We evaluate the effectiveness of our model on the widely used MIMIC-III datasets and achieve new SOTA results.
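The label-attention idea at the core of such medical coding models can be sketched as follows: each code gets a learned query vector that attends over token representations, yielding one code-specific document vector per code. This is a simplified PyTorch illustration, not the authors' exact architecture; all dimensions and names are illustrative.

```python
# Sketch of per-label attention over token representations for multi-label
# medical code prediction (simplified illustration, not the paper's model).
import torch
import torch.nn as nn

class LabelAttention(nn.Module):
    def __init__(self, hidden_dim: int, num_codes: int):
        super().__init__()
        # One learned query vector per medical code
        self.label_queries = nn.Parameter(torch.randn(num_codes, hidden_dim))
        self.output = nn.Linear(hidden_dim, 1)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden_dim) from any text encoder
        scores = torch.einsum("bsh,ch->bcs", token_states, self.label_queries)
        weights = torch.softmax(scores, dim=-1)        # (batch, codes, seq)
        # Code-specific document representations: one vector per code
        code_repr = torch.einsum("bcs,bsh->bch", weights, token_states)
        return self.output(code_repr).squeeze(-1)      # (batch, codes) logits

encoder_out = torch.randn(2, 128, 256)                 # dummy encoder output
logits = LabelAttention(hidden_dim=256, num_codes=50)(encoder_out)
probs = torch.sigmoid(logits)                          # multi-label probabilities
```

Because each code attends to different tokens, rare (long-tailed) codes are not forced to share a single pooled document vector with frequent ones.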


2021
Author(s): Yotam Erel, Christine Potter, Sagi Jaffe-Dax, Casey Lew-Williams, Amit Bermano

Infants’ looking behaviors are often used for measuring attention, real-time processing, and learning – often using low-resolution videos. Despite the ubiquity of gaze-related methods in developmental science, current techniques usually involve laborious post hoc coding, imprecise real-time coding, or expensive eye trackers that may increase data loss and require a calibration phase. As a solution, we used computer-vision methods to perform automatic gaze estimation from low-resolution videos. At the core of this approach is an artificial neural network that classifies gaze directions in real time. We tested our method, called iCatcher, on data collected using the looking-while-listening procedure, where infants look at one of two locations on a screen. Using a large dataset of manually-annotated videos from prior research, we demonstrate that the accuracy of iCatcher approximates that of human annotators and replicates the results. Our method is publicly available as an open-source repository at https://github.com/yoterel/iCatcher.
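At its core, such a system maps a low-resolution face crop to one of a few gaze classes with a convolutional network. A toy sketch of that classification step (not iCatcher's actual architecture; the class set and input size are assumptions):

```python
# Toy sketch of a gaze-direction classifier over low-resolution face crops
# (illustrative only; see the iCatcher repository for the real model).
import torch
import torch.nn as nn

GAZE_CLASSES = ["left", "right", "away"]  # typical looking-while-listening labels

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, len(GAZE_CLASSES)),  # assumes 64x64 input crops
)

face_crop = torch.randn(1, 3, 64, 64)  # one dummy low-resolution face crop
pred = model(face_crop).argmax(dim=1)
print(GAZE_CLASSES[pred.item()])       # predicted gaze direction for the frame
```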


2021
Author(s): Wim Bernasco, Eveline Hoeben, Dennis Koelma, Lasse Suonperä Liebst, Josephine Thomas, et al.

Social scientists increasingly use video data, but large-scale analysis of its content is often constrained by scarce manual coding resources. Upscaling may be possible through automated coding procedures, which are being developed in the field of computer vision. Here, we introduce computer vision to social scientists, review the state of the art in relevant subfields, and provide a working example of how computer vision can be applied in empirical sociological work. Our application involves having human coders define a ground truth, developing an algorithm for automated coding, testing the performance of the algorithm against the ground truth, and running the algorithm on a large-scale dataset of CCTV images. The working example concerns monitoring social distancing behavior in public space over more than a year of the COVID-19 pandemic. Finally, we discuss prospects for the use of computer vision in empirical social science research and address technical and ethical limitations.
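Once a detector has located pedestrians in a frame, the social-distancing measurement itself reduces to pairwise distance checks. A minimal sketch, assuming an upstream model has already mapped detections to ground-plane coordinates in metres (the threshold and data are placeholders):

```python
# Sketch: count social-distancing violations among detected pedestrians.
# Assumes an upstream detector has produced ground-plane (x, y) positions
# in metres for each person in a CCTV frame.
import numpy as np

def count_violations(positions: np.ndarray, threshold_m: float = 1.5) -> int:
    """Number of pedestrian pairs closer than threshold_m metres."""
    diffs = positions[:, None, :] - positions[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)   # all pairwise distances
    close = dists < threshold_m
    # Count each pair once; np.triu with k=1 excludes the diagonal
    return int(np.triu(close, k=1).sum())

frame_positions = np.array([[0.0, 0.0], [1.0, 0.5], [5.0, 5.0]])  # dummy frame
print(count_violations(frame_positions))  # -> 1 violation in this frame
```

Testing such a function against human-coded frames (the ground truth) is then a standard precision/recall comparison.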


2021
Author(s): Luke T Slater, Sophie Russell, Silver Makepeace, Alexander Carberry, Andreas Karwath, et al.

Semantic similarity is a valuable tool for analysis in biomedicine. When applied to phenotype profiles derived from clinical text, semantic similarity measures can enable and enhance 'patient-like-me' analyses, automated coding, differential diagnosis, and outcome prediction by leveraging the wealth of background knowledge provided by biomedical ontologies. While a large body of work explores the use of semantic similarity for multiple tasks, including protein interaction prediction and rare disease differential diagnosis, there is less work comparing patient phenotype profiles for clinical tasks, and there are no experimental explorations of optimal parameters or methods in this area. In this work, we develop a reproducible platform for benchmarking experimental conditions for patient phenotype similarity. Using the platform, we evaluate the task of ranking shared primary diagnoses from uncurated phenotype profiles derived from the text narratives associated with admissions in MIMIC-III. In doing so, we identify and interpret the performance of a large number of semantic similarity measures for this task and provide a basis for further research on related tasks in the area.
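One common configuration among the measures benchmarked in this space combines Resnik term similarity (the information content of the most informative common ancestor) with best-match-average aggregation over two term sets. A toy sketch with a placeholder ontology and made-up information-content values:

```python
# Sketch: Resnik term similarity + best-match average over phenotype profiles.
# The ontology and information-content (IC) values are toy placeholders.
import math

ancestors = {  # toy ontology: term -> set of ancestors (including itself)
    "fever":   {"fever", "inflammation", "phenotype"},
    "rash":    {"rash", "inflammation", "phenotype"},
    "seizure": {"seizure", "neurological", "phenotype"},
}
ic = {"phenotype": 0.0, "inflammation": 1.2, "neurological": 1.5,
      "fever": 2.3, "rash": 2.6, "seizure": 2.8}

def resnik(t1: str, t2: str) -> float:
    # IC of the most informative common ancestor of the two terms
    common = ancestors[t1] & ancestors[t2]
    return max(ic[a] for a in common)

def best_match_average(profile_a, profile_b) -> float:
    # Symmetric average of each term's best match in the other profile
    ab = sum(max(resnik(a, b) for b in profile_b) for a in profile_a) / len(profile_a)
    ba = sum(max(resnik(b, a) for a in profile_a) for b in profile_b) / len(profile_b)
    return (ab + ba) / 2

print(best_match_average({"fever", "rash"}, {"fever", "seizure"}))  # -> 1.45
```

Ranking shared primary diagnoses then amounts to scoring a query patient's profile against every other patient's profile and sorting by similarity.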


2021
Author(s): Asghar Ahmadi, Michael Noetel, Melissa Schellekens, Philip David Parker, Devan Antczak, et al.

Many psychological treatments have been shown to be cost-effective and efficacious, as long as they are implemented faithfully. Assessing fidelity and providing feedback, however, is expensive and time-consuming. Machine learning has been used to assess treatment fidelity, but its reliability and generalisability are unclear. We collated and critiqued all implementations of machine learning for assessing the verbal behaviour of helping professionals, with particular emphasis on treatment fidelity for therapists. We searched nine electronic databases for automated approaches to coding verbal behaviour in therapy and similar contexts, and completed screening, extraction, and quality assessment in duplicate. Fifty-two studies met our inclusion criteria (65.3% in psychotherapy). Automated coding methods performed better than chance, and some showed near human-level performance; performance tended to be better with larger datasets, fewer codes, conceptually simple codes, and when predicting session-level ratings rather than utterance-level ones. Few studies adhered to best-practice machine learning guidelines. Machine learning demonstrated promising results, particularly where there are large annotated datasets and a modest number of concrete features to code. These methods are novel, cost-effective, scalable ways of assessing fidelity and providing therapists with individualised, prompt, and objective feedback.
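The "better than chance" and "near human-level" comparisons in such studies are usually quantified as chance-corrected agreement between the automated coder and human annotators. A minimal sketch of that evaluation step, with hypothetical utterance-level labels:

```python
# Sketch: evaluate an automated verbal-behaviour coder against human codes
# using Cohen's kappa (all labels below are hypothetical).
from sklearn.metrics import cohen_kappa_score

human_codes = ["reflection", "question", "question", "other", "reflection"]
model_codes = ["reflection", "question", "other",    "other", "reflection"]

kappa = cohen_kappa_score(human_codes, model_codes)
print(f"Cohen's kappa vs. human coder: {kappa:.2f}")
```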


Author(s): Millicent Webber, Rebecca Giblin, Yanfang Ding, François Petitjean-Hèche, et al.

Introduction. We investigated patterns in digital audiobook and e-book circulation through Australian libraries to identify and analyse trends in audiobook publishing and reading.
Method. In partnership with four Australian library services, we collated a dataset of 555,618 audiobook checkouts and 3,475,188 e-book checkouts, representing all OverDrive checkouts through these services from 2006 until July 2017.
Analysis. We examined the availability and popularity of audiobook and e-book titles over time. We used bibliographic metadata and manual and automated coding to examine major publishers, the sex and nationality of authors, and popular titles and genres.
Results. Audiobooks and e-books have experienced substantial growth since 2006. Major publishers, including the Big Five, Amazon, and Bolinda, have historically been less important in audiobook publishing than in print or e-book markets, with numerous specialist audio publishers and producers prominent in the field. Audiobooks and e-books show disparities in the sex of their authors. Crime, science fiction, and fantasy are the most popular audiobook genres.
Conclusion. Library checkout data confirm audiobook publishing's recent volatility. Libraries are the keepers of valuable information about new media forms like audiobooks, and collaboration between libraries, publishers, and researchers directly supports understanding of this important new space of cultural production and consumption.
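The trend analysis described here is essentially a group-and-count over checkout records. A minimal pandas sketch under assumed column names (the study's actual schema is not given):

```python
# Sketch: yearly audiobook vs e-book checkout trends from a checkout log.
# Column names and rows below are assumptions, not the study's actual data.
import pandas as pd

checkouts = pd.DataFrame({
    "date":   pd.to_datetime(["2015-03-01", "2016-07-12", "2016-09-30", "2017-01-15"]),
    "format": ["ebook", "audiobook", "ebook", "audiobook"],
    "genre":  ["crime", "fantasy", "romance", "crime"],
})

# Checkouts per format per year: growth over time by format
by_year = (checkouts.groupby([checkouts["date"].dt.year, "format"])
                    .size().unstack(fill_value=0))
print(by_year)

# Most popular audiobook genres
top_audio_genres = checkouts[checkouts["format"] == "audiobook"]["genre"].value_counts()
print(top_audio_genres)
```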


Author(s): Abdelahad Chraibi, David Delerue, Julien Taillard, Ismat Chaib Draa, Régis Beuscart, et al.

The International Statistical Classification of Diseases and Related Health Problems (ICD) is one of the most widely used classification systems for diagnoses and procedures; it is used to assign diagnosis codes to the Electronic Health Records (EHRs) associated with a patient's stay. The aim of this paper is to propose an automated coding system to assist physicians in the assignment of ICD codes to EHRs. For this purpose, we created a pipeline of Natural Language Processing (NLP) and Deep Learning (DL) models able to extract the useful information from French medical texts and to perform classification. In the evaluation phase, our approach was able to predict 346 diagnosis codes from heterogeneous medical units with an average accuracy of 83%. Our results were validated by physicians of the Medical Information Department (MID) in charge of coding hospital stays.
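Stripped to its essentials, this task is supervised text classification from note text to ICD codes. A heavily simplified sketch (the authors' actual NLP/DL pipeline for French notes is more involved; the notes and codes below are placeholders):

```python
# Minimal sketch of mapping note text to ICD-10 diagnosis codes as a
# supervised text classification problem (placeholder data; the paper's
# pipeline combines NLP preprocessing with deep learning models).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

notes = [
    "patient admitted with acute myocardial infarction",
    "type 2 diabetes mellitus, poorly controlled",
    "community acquired pneumonia, right lower lobe",
    "chest pain, troponin elevated, STEMI confirmed",
]
icd_codes = ["I21", "E11", "J18", "I21"]  # placeholder ICD-10 codes

pipeline = make_pipeline(TfidfVectorizer(), LinearSVC())
pipeline.fit(notes, icd_codes)
print(pipeline.predict(["elevated troponin with ST elevation"]))  # predicts a code
```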


2021
Author(s): Manon Schutte, Glynis Bogaard, Erik Mac Giolla, Lara Warmelink, Bennett Kleinberg, et al.

Purpose: Truthful statements are theorized to be richer in perceptual and contextual detail than deceptive statements. The level of detail can be coded by humans or computers, with human coding argued to be superior. Direct comparisons of human and automated coding, however, are rare.
Methods: We applied automatic identification of details with the Linguistic Inquiry and Word Count (LIWC) software to truthful and deceptive statements from four datasets that had been manually coded for details.
Results: We noted that the common way of scoring manual and LIWC coding hampers a direct comparison because they rely on different metrics: count and proportion scores, respectively. Lie-truth differences varied substantially across metric and dataset (LIWC: -0.09 ≤ Cohen's d ≤ 0.89; manual: 0.03 ≤ Cohen's d ≤ 0.80). When set to the same metric, neither method seemed to outperform the other. Using count scores, both LIWC and manual coding indicated that truthful statements about past events contain more perceptual and contextual details than deceptive statements. Across the four datasets, we also observed considerable variation in manual coding.
Conclusions: Human coding does not necessarily outperform LIWC coding of perceptual and contextual details in discriminating lies from truths. Our findings call for systematic comparison of human and automated verbal lie detection approaches on the same data, and we reiterate the need for better data-sharing practices to help accomplish that aim.
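The metric issue the authors highlight (counts versus proportions of total words) and the Cohen's d effect-size comparison can be sketched as follows; all detail counts and statement lengths below are hypothetical:

```python
# Sketch: count-vs-proportion scoring for detail coding, plus the Cohen's d
# lie-truth effect size (all numbers are hypothetical).
import numpy as np

def cohens_d(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    pooled_sd = np.sqrt(((len(x) - 1) * x.var(ddof=1) +
                         (len(y) - 1) * y.var(ddof=1)) / (len(x) + len(y) - 2))
    return (x.mean() - y.mean()) / pooled_sd

# Perceptual-detail hits and statement lengths (in words) per statement
truth_hits, truth_words = np.array([12, 9, 15, 11]), np.array([180, 150, 210, 170])
lie_hits, lie_words = np.array([7, 8, 6, 9]), np.array([160, 175, 140, 180])

# Manual coding is typically scored as raw counts ...
print("count metric d:", cohens_d(truth_hits, lie_hits))
# ... while LIWC reports proportions of total words, a different metric
print("proportion metric d:", cohens_d(truth_hits / truth_words, lie_hits / lie_words))
```

Putting both methods on the same metric before comparing them, as done here, is exactly the correction the paper argues for.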

