A bootstrapping approach to social media quantification

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ashlynn R. Daughton ◽  
Michael J. Paul

Abstract: This work considers the use of classifiers in a downstream aggregation task estimating class proportions, such as estimating the percentage of reviews for a movie with positive sentiment. We derive the bias and variance of the class proportion estimator when taking classification error into account to determine how to best trade off different error types when tuning a classifier for these tasks. Additionally, we propose a method for constructing confidence intervals that correctly adjusts for classification error when estimating these statistics. We conduct experiments on four document classification tasks comparing our methods to prior approaches across classifier thresholds, sample sizes, and label distributions. Prior approaches have focused on providing the most accurate point estimate while this work focuses on the creation of correct confidence intervals that appropriately account for classifier error. Compared to the prior approaches, our methods provide lower error and more accurate confidence intervals.
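To make the idea of adjusting a class proportion for classifier error concrete, here is a minimal sketch. It is not the authors' estimator: it assumes the well-known "adjusted classify and count" correction and a plain percentile bootstrap, and the function names (`adjusted_proportion`, `bootstrap_ci`) and the split into a labeled validation set and an unlabeled target set are illustrative assumptions.

```python
import random


def adjusted_proportion(pred_pos_rate, tpr, fpr):
    """Correct an observed positive-prediction rate for classifier error.

    Assumes the standard adjusted classify-and-count formula:
    p = (observed rate - FPR) / (TPR - FPR), clipped to [0, 1].
    """
    if tpr <= fpr:
        raise ValueError("the classifier must satisfy tpr > fpr")
    estimate = (pred_pos_rate - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, estimate))


def bootstrap_ci(val_labels, val_preds, target_preds,
                 n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the adjusted proportion.

    Resamples both the labeled validation set (to re-estimate TPR/FPR)
    and the unlabeled target predictions (hypothetical setup).
    """
    rng = random.Random(seed)
    val = list(zip(val_labels, val_preds))
    estimates = []
    for _ in range(n_boot):
        v = [rng.choice(val) for _ in val]
        t = [rng.choice(target_preds) for _ in target_preds]
        pos = [p for y, p in v if y == 1]
        neg = [p for y, p in v if y == 0]
        if not pos or not neg:
            continue  # resample lacks one class; error rates undefined
        tpr = sum(pos) / len(pos)
        fpr = sum(neg) / len(neg)
        if tpr <= fpr:
            continue  # degenerate resample; correction undefined
        estimates.append(adjusted_proportion(sum(t) / len(t), tpr, fpr))
    estimates.sort()
    lo = estimates[int((alpha / 2) * (len(estimates) - 1))]
    hi = estimates[int((1 - alpha / 2) * (len(estimates) - 1))]
    return lo, hi
```

Resampling the validation set as well as the target set is what lets the interval reflect uncertainty in the error rates themselves, not only sampling noise in the predictions.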

2021 ◽  
Vol 9 (3) ◽  
pp. 232596712199005
Author(s):  
Jonathan S. Yu ◽  
James B. Carr ◽  
Jacob Thomas ◽  
Julianna Kostas ◽  
Zhaorui Wang ◽  
...  

Background: Social media posts regarding ulnar collateral ligament (UCL) injuries and reconstruction surgeries have increased in recent years. Purpose: To analyze posts shared on Instagram and Twitter referencing UCL injuries and reconstruction surgeries to evaluate public perception and any trends in perception over the past 3 years. Study Design: Cross-sectional study. Methods: A search of a 3-year period (August 2016 to August 2019) of public Instagram and Twitter posts was performed. We searched for >22 hashtags and search terms, including #TommyJohn, #TommyJohnSurgery, and #tornUCL. A categorical classification system was used to assess the sentiment, media format, perspective, timing, accuracy, and general content of each post. Post popularity was measured by number of likes and comments. Results: A total of 3119 Instagram posts and 267 Twitter posts were included in the analysis. Of the 3119 Instagram posts analyzed, 34% were from patients, and 28% were from providers. Of the 267 Twitter posts analyzed, 42% were from patients, and 16% were from providers. Although the majority of social media posts were of a positive sentiment, over the past 3 years, there was a major surge in negative sentiment posts (97% increase) versus positive sentiment posts (9% increase). Patients were more likely to focus their posts on rehabilitation, return to play, and activities of daily living. Providers tended to focus their posts on education, rehabilitation, and injury prevention. Patient posts declined over the past 3 years (–28%), whereas provider posts increased substantially (110%). Of posts shared by health care providers, 4% contained inaccurate or misleading information. Conclusion: The majority of patients who post about their UCL injury and reconstruction on social media have a positive sentiment when discussing their procedure. However, negative sentiment posts have increased significantly over the past 3 years. Patient content revolves around rehabilitation and return to play. Although patient posts have declined over the past 3 years, provider posts have increased substantially with an emphasis on education.


NeuroImage ◽  
2021 ◽  
pp. 118786
Author(s):  
Gang Chen ◽  
Daniel S. Pine ◽  
Melissa A. Brotman ◽  
Ashley R. Smith ◽  
Robert W. Cox ◽  
...  

SISTEMASI ◽  
2022 ◽  
Vol 11 (1) ◽  
pp. 197
Author(s):  
Okta Fanny ◽  
Heri Suroyo

From the research conducted, it can be concluded that sentiment analysis can be used to gauge public sentiment, in particular that of Twitter users, toward the omnibus law. The sentiment analysis shows neutral sentiment with the largest share at 55%, followed by positive sentiment at 35% and negative sentiment at 10%. On the pro hashtags, the Naïve Bayes Classifier achieved an average accuracy of 92.1%, an average precision of 94.8%, and an average recall of 90.7%. On the counter hashtags, it achieved an average accuracy of 98.3%, an average precision of 97.6%, and an average recall of 98.7%. In the text cloud analysis of the combined pro and counter hashtags, the dominant term is Omnibus Law, which indicates that all the scraped hashtags do discuss the main topic, the Omnibus Law.
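For readers unfamiliar with the three reported scores, the sketch below shows how accuracy, precision, and recall follow from confusion-matrix counts. The function name and the counts in the usage note are hypothetical; they are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives for one class treated as "positive".
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted
    # or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall
```

With illustrative counts such as `classification_metrics(90, 5, 10, 95)`, precision is computed over the 95 predicted positives and recall over the 100 actual positives, which is why the two scores can differ even when accuracy is high.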


Author(s):  
Yi Song ◽  
Xuesong Lu ◽  
Sadegh Nobari ◽  
Stéphane Bressan ◽  
Panagiotis Karras

One is either on Facebook or not. Of course, this assessment is controversial and its rationale arguable. It is nevertheless not far, for many, from the reason behind joining social media and publishing and sharing details of their professional and private lives. Not only the personal details that may be revealed, but also the structure of the networks are sources of invaluable information for any organization wanting to understand and learn about social groups, their dynamics and members. These organizations may or may not be benevolent. It is important to devise, design and evaluate solutions that guarantee some privacy. One approach that reconciles the different stakeholders’ requirements is the publication of a modified graph. The perturbation is hoped to be sufficient to protect members’ privacy while maintaining sufficient utility for analysts wanting to study social media as a whole. In this paper, the authors try to empirically quantify the inevitable trade-off between utility and privacy. They do so for two state-of-the-art graph anonymization algorithms that protect against most structural attacks, the k-automorphism algorithm and the k-degree anonymity algorithm. The authors measure several metrics for a series of real graphs from various social media before and after their anonymization under various settings.
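As a small illustration of the k-degree anonymity property mentioned above, the sketch below checks whether a graph already satisfies it: every degree value that occurs must be shared by at least k nodes. This only tests the property; it is not the anonymization algorithm evaluated in the paper. The edge-list representation and function names are assumptions, and nodes with no edges are ignored because they do not appear in an edge list.

```python
from collections import Counter


def degree_sequence(edges):
    """Map each node to its degree, given undirected edges as (u, v) pairs."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


def is_k_degree_anonymous(edges, k):
    """True if every degree value present is shared by at least k nodes."""
    degrees = degree_sequence(edges)
    nodes_per_degree = Counter(degrees.values())
    return all(count >= k for count in nodes_per_degree.values())
```

A 4-node cycle passes for k up to 4 because all nodes have degree 2, while a 3-node path fails for k = 2 because its middle node is the only one with degree 2.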


2020 ◽  
Vol 79 (11) ◽  
pp. 1432-1437 ◽  
Author(s):  
Chanakya Sharma ◽  
Samuel Whittle ◽  
Pari Delir Haghighi ◽  
Frada Burstein ◽  
Roee Sa'adon ◽  
...  

Objectives: We hypothesise that patients have a positive sentiment regarding biological/targeted synthetic disease modifying anti-rheumatic drugs (b/tsDMARDs) and a negative sentiment towards conventional synthetic agents (csDMARDs). We analysed discussions on social media platforms regarding DMARDs to understand the collective sentiment expressed towards these medications. Methods: Treato analytics were used to download all available posts on social media about DMARDs in the context of rheumatoid arthritis. Strict filters ensured that user generated content was downloaded. The sentiment (positive or negative) expressed in these posts was analysed for each DMARD using sentiment analysis. We also analysed the reason(s) for this sentiment for each DMARD, looking specifically at efficacy and side effects. Results: Computer algorithms analysed millions of social media posts and included 54 742 posts about DMARDs. We found that both classes had an overall positive sentiment. The ratio of positive to negative posts was higher for b/tsDMARDs (1.210) than for csDMARDs (1.048). Efficacy was the most commonly mentioned reason in posts with a positive sentiment and lack of efficacy was the most commonly mentioned reason for a negative sentiment. These were followed by the presence/absence of side effects in negative or positive posts, respectively. Conclusions: Public opinion on social media is generally positive about DMARDs. Lack of efficacy followed by side effects were the most common themes in posts with a negative sentiment. There are clear reasons why a DMARD generates a positive or negative sentiment. As sentiment analysis technology becomes more refined, targeted studies could be done to analyse these reasons and allow clinicians to tailor DMARDs to match patient needs.


2020 ◽  
Vol 10 (18) ◽  
pp. 6229
Author(s):  
Juho Bai ◽  
Inwook Shim ◽  
Seog Park

A patent document has different content in each paragraph and is also very long. Moreover, patent documents are classified hierarchically with multiple labels. Many works have employed deep neural architectures to classify patent documents. Traditional document classification methods have not represented the characteristics of entire patent documents well because they usually require a fixed input length. To address this issue, we propose a neural network-based document classification method for patent documents built on a novel multi-stage feature extraction network (MEXN), which comprises a paragraph encoder and a summarizer for all paragraphs. MEXN analyzes whole documents hierarchically and provides multi-label outputs. Furthermore, MEXN incurs only a marginal increase in computational cost. We demonstrate that the proposed method outperforms current state-of-the-art models in patent document classification tasks with multi-label classification experiments on USPD datasets.


2016 ◽  
Vol 27 (5) ◽  
pp. 1559-1574 ◽  
Author(s):  
Andrew Carkeet ◽  
Yee Teng Goh

Bland and Altman described approximate methods in 1986 and 1999 for calculating confidence limits for their 95% limits of agreement, approximations which assume large subject numbers. In this paper, these approximations are compared with exact confidence intervals calculated using two-sided tolerance intervals for a normal distribution. The approximations are compared in terms of the tolerance factors themselves but also in terms of the exact confidence limits and the exact limits of agreement coverage corresponding to the approximate confidence interval methods. Using similar methods, the 50th percentile of the tolerance interval is compared with the k values of 1.96 and 2, which Bland and Altman used to define limits of agreement (i.e. mean difference ± 1.96Sd and mean difference ± 2Sd). For limits of agreement outer confidence intervals, Bland and Altman’s approximations are too permissive for sample sizes <40 (1999 approximation) and <76 (1986 approximation). For inner confidence limits the approximations are poorer, being permissive for sample sizes of <490 (1986 approximation) and all practical sample sizes (1999 approximation). Exact confidence intervals for 95% limits of agreement, based on two-sided tolerance factors, can be calculated easily from tables and should be used in preference to the approximate methods, especially for small sample sizes.
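The limits of agreement discussed above can be sketched as follows. This is an illustration only: it computes the limits as mean difference ± k·Sd and, as an assumption about the 1986 approximation, a large-sample standard error of each limit of sqrt(3·Sd²/n). It does not implement the exact tolerance-interval method the paper recommends, and the function name is hypothetical.

```python
import math
import statistics


def limits_of_agreement(a, b, k=1.96):
    """Return (lower, upper, se_limit) for paired measurements a and b.

    lower/upper are mean difference -/+ k * Sd of the differences.
    se_limit is a large-sample approximation to the standard error of
    each limit (assumed form: sqrt(3 * Sd^2 / n)); it is unreliable
    for small n.
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    lower = mean_d - k * sd_d
    upper = mean_d + k * sd_d
    se_limit = math.sqrt(3 * sd_d ** 2 / n)
    return lower, upper, se_limit
```

Because the standard error shrinks only with the square root of n, intervals built from it are widest exactly where the abstract reports the approximations to be least reliable, at small sample sizes.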

