Data-Driven Lexical Normalization for Medical Social Media

2019 ◽  
Vol 3 (3) ◽  
pp. 60 ◽  
Author(s):  
Dirkson ◽  
Verberne ◽  
Sarker ◽  
Kraaij

In the medical domain, user-generated social media text is increasingly used as a valuable complementary knowledge source to scientific medical literature. The extraction of this knowledge is complicated by colloquial language use and misspellings. However, lexical normalization of such data has not been addressed effectively. This paper presents a data-driven lexical normalization pipeline with a novel spelling correction module for medical social media. Our method significantly outperforms state-of-the-art spelling correction methods and can detect mistakes with an F1 of 0.63 despite extreme imbalance in the data. We also present the first corpus for spelling mistake detection and correction in a medical patient forum.
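The abstract reports F1 rather than accuracy because misspelled tokens are rare. The sketch below, on made-up toy labels (not the paper's data), illustrates why: a baseline that flags nothing scores high accuracy but zero F1. The function names and the numbers are illustrative assumptions only.

```python
def accuracy(y_true, y_pred):
    """Fraction of tokens whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the positive class (1 = misspelled)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (hypothetical): 1000 tokens, only 10 of them misspelled.
y_true = [1] * 10 + [0] * 990

# Baseline that never flags a mistake.
baseline = [0] * 1000

# Hypothetical detector: finds 6 of the 10 mistakes, raises 4 false alarms.
detector = [1] * 6 + [0] * 4 + [1] * 4 + [0] * 986

print("baseline accuracy:", accuracy(y_true, baseline))
print("baseline P/R/F1:  ", precision_recall_f1(y_true, baseline))
print("detector accuracy:", accuracy(y_true, detector))
print("detector P/R/F1:  ", precision_recall_f1(y_true, detector))
```

On this toy set the two systems differ by only a fraction of a percentage point in accuracy, while F1 separates them clearly.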

Author(s):  
Gauri Jain ◽  
Manisha Sharma ◽  
Basant Agarwal

This article describes how spam detection in social media text is becoming increasingly important because of the exponential increase in spam volume over the network. The task is challenging, especially for texts limited to a small number of characters. Effective spam detection requires a larger number of effective features to be learned. In the current article, the use of a deep learning technique known as a convolutional neural network (CNN) is proposed for spam detection, with an added semantic layer on top of it. The resulting model is known as a semantic convolutional neural network (SCNN). The semantic layer is built by training randomly initialized word vectors with Word2vec to obtain semantically enriched word embeddings. WordNet and ConceptNet are used to find a word similar to a given word when that word is missing from the Word2vec vocabulary. The architecture is evaluated on two corpora: the SMS Spam dataset (UCI repository) and a Twitter dataset (tweets scraped from public live tweets). The authors' approach outperforms state-of-the-art results, with 98.65% accuracy on the SMS Spam dataset and 94.40% accuracy on the Twitter dataset.
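The fallback for words missing from the embedding vocabulary can be sketched as follows. This is a minimal illustration, not the authors' implementation: the plain dictionaries `vectors` and `synonyms` are hypothetical stand-ins for a trained Word2vec model and for WordNet/ConceptNet lookups, and the two-dimensional vectors are invented.

```python
def embed_with_fallback(word, vectors, synonyms):
    """Return the embedding for `word`.

    If `word` has no embedding, return the embedding of the first
    similar word that does have one. Return None if nothing is found.
    """
    if word in vectors:
        return vectors[word]
    for candidate in synonyms.get(word, []):
        if candidate in vectors:
            return vectors[candidate]
    return None


# Hypothetical embedding table (stand-in for a Word2vec vocabulary).
vectors = {
    "free": [0.9, 0.1],
    "prize": [0.8, 0.3],
}

# Hypothetical similar-word table (stand-in for WordNet/ConceptNet).
synonyms = {
    "complimentary": ["gratis", "free"],
    "award": ["prize"],
}

print(embed_with_fallback("free", vectors, synonyms))           # direct hit
print(embed_with_fallback("complimentary", vectors, synonyms))  # via "free"
print(embed_with_fallback("xyzzy", vectors, synonyms))          # no match
```

Returning None for unresolved words is one design choice among several; a zero vector or a shared unknown-word vector would be equally plausible, and the abstract does not say which the authors use.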


Author(s):  
Santosh Kumar Bharti ◽  
Sathya Babu Korra

Posting sarcastic messages on social media such as Twitter, Facebook, and WhatsApp has become a new trend for avoiding direct negativity. Detecting this indirect negativity in social media text has become an important task because it influences every business organization. In the presence of sarcasm, detecting the actual sentiment of these texts has become a most challenging task. An automated system is required that is capable of identifying the actual sentiment of a given text in the presence of sarcasm. In this chapter, we propose an automated system for sarcasm detection in social media text using six algorithms that are capable of analyzing the various types of sarcasm that occur in Twitter data. These algorithms use lexical, pragmatic, hyperbolic, and contextual features of text to identify sarcasm. For the contextual features, we mainly focus on the situational, topical, temporal, and historical context of the text. The experimental results of the proposed approach are compared with state-of-the-art techniques.


2014 ◽  
Author(s):  
Sandeep Soni ◽  
Tanushree Mitra ◽  
Eric Gilbert ◽  
Jacob Eisenstein

Author(s):  
Yogesh K. Dwivedi ◽  
Elvira Ismagilova ◽  
Nripendra P. Rana ◽  
Ramakrishnan Raman

Social media plays an important part in the digital transformation of businesses. This research provides a comprehensive analysis of the use of social media by business-to-business (B2B) companies. The current study focuses on a number of aspects of social media, such as the effect of social media, social media tools, social media use, adoption of social media use and its barriers, social media strategies, and measuring the effectiveness of the use of social media. This research provides a valuable synthesis of the relevant literature on social media in the B2B context by analysing, performing weight analysis on, and discussing the key findings from existing research on social media. The findings of this study can be used as an informative framework on social media by both academics and practitioners.


Energies ◽  
2021 ◽  
Vol 14 (9) ◽  
pp. 2371
Author(s):  
Matthieu Dubarry ◽  
David Beck

The development of data-driven methods for Li-ion battery diagnosis and prognosis is a growing field of research for the battery community. A big limitation is usually the size of the training datasets, which are typically not fully representative of the real usage of the cells. Synthetic datasets were proposed to circumvent this issue. This publication provides improved datasets for three major battery chemistries: LiFePO4, Nickel Aluminum Cobalt Oxide, and Nickel Manganese Cobalt Oxide 811. These datasets can be used for statistical or deep learning methods. This work also provides a detailed statistical analysis of the datasets. Accurate diagnosis and early prognosis comparable with the state of the art, while providing physical interpretability, were demonstrated by using the combined information of three learnable parameters.

