A bootstrapping approach to social media quantification

Abstract: This work considers the use of classifiers in a downstream aggregation task that estimates class proportions, such as the percentage of a movie's reviews that express positive sentiment. We derive the bias and variance of the class proportion estimator when classification error is taken into account, and use them to determine how best to trade off different error types when tuning a classifier for these tasks. Additionally, we propose a method for constructing confidence intervals that correctly adjusts for classification error when estimating these statistics. We conduct experiments on four document classification tasks, comparing our methods to prior approaches across classifier thresholds, sample sizes, and label distributions. Prior approaches have focused on providing the most accurate point estimate, whereas this work focuses on constructing correct confidence intervals that appropriately account for classifier error. Compared to the prior approaches, our methods provide lower error and more accurate confidence intervals.
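
The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a binary classifier, uses the widely known "adjusted count" correction (observed positive rate corrected by true and false positive rates estimated on a labeled validation set), and builds a percentile bootstrap interval by resampling both the validation set and the target predictions. The function names `adjusted_proportion` and `bootstrap_ci` and all parameters are hypothetical.

```python
import random


def adjusted_proportion(preds, val_labels, val_preds):
    """Adjusted-count estimate of the positive-class proportion.

    preds: 0/1 classifier outputs on the unlabeled target sample.
    val_labels, val_preds: true labels and classifier outputs on a
    labeled validation sample (assumed representative of the classifier's errors).
    """
    pos = [p for y, p in zip(val_labels, val_preds) if y == 1]
    neg = [p for y, p in zip(val_labels, val_preds) if y == 0]
    q = sum(preds) / len(preds)  # raw classify-and-count rate
    if not pos or not neg:
        return q  # error rates cannot be estimated; fall back to the raw rate
    tpr = sum(pos) / len(pos)
    fpr = sum(neg) / len(neg)
    if tpr <= fpr:
        return q  # correction is undefined for a non-informative classifier
    # Solve q = p * tpr + (1 - p) * fpr for p, then clip to [0, 1].
    return min(1.0, max(0.0, (q - fpr) / (tpr - fpr)))


def bootstrap_ci(preds, val_labels, val_preds, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval that resamples both data sources,
    so uncertainty in the estimated error rates is propagated."""
    rng = random.Random(seed)
    val = list(zip(val_labels, val_preds))
    estimates = []
    for _ in range(n_boot):
        val_s = rng.choices(val, k=len(val))
        preds_s = rng.choices(preds, k=len(preds))
        labels_s = [y for y, _p in val_s]
        vpreds_s = [p for _y, p in val_s]
        estimates.append(adjusted_proportion(preds_s, labels_s, vpreds_s))
    estimates.sort()
    lo = estimates[int((alpha / 2) * n_boot)]
    hi = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi
```

Resampling the validation set as well as the target sample is the design choice that matters here: an interval built only from the target predictions would treat the estimated error rates as exact and would tend to be too narrow.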

Trends in Patient, Physician, and Public Perception of Ulnar Collateral Ligament Reconstruction Using Social Media Analytics

Orthopaedic Journal of Sports Medicine ◽

10.1177/2325967121990052 ◽

2021 ◽

Vol 9 (3) ◽

pp. 232596712199005

Author(s):

Jonathan S. Yu ◽

James B. Carr ◽

Jacob Thomas ◽

Julianna Kostas ◽

Zhaorui Wang ◽

...

Keyword(s):

Social Media ◽

Public Perception ◽

Ulnar Collateral Ligament ◽

Return To Play ◽

Social Media Analytics ◽

Care Providers ◽

Collateral Ligament ◽

The Past ◽

Positive Sentiment ◽

Negative Sentiment

Background: Social media posts regarding ulnar collateral ligament (UCL) injuries and reconstruction surgeries have increased in recent years. Purpose: To analyze posts shared on Instagram and Twitter referencing UCL injuries and reconstruction surgeries to evaluate public perception and any trends in perception over the past 3 years. Study Design: Cross-sectional study. Methods: A search of public Instagram and Twitter posts over a 3-year period (August 2016 to August 2019) was performed. We searched for >22 hashtags and search terms, including #TommyJohn, #TommyJohnSurgery, and #tornUCL. A categorical classification system was used to assess the sentiment, media format, perspective, timing, accuracy, and general content of each post. Post popularity was measured by number of likes and comments. Results: A total of 3119 Instagram posts and 267 Twitter posts were included in the analysis. Of the 3119 Instagram posts analyzed, 34% were from patients, and 28% were from providers. Of the 267 Twitter posts analyzed, 42% were from patients, and 16% were from providers. Although the majority of social media posts were of a positive sentiment, over the past 3 years, there was a major surge in negative sentiment posts (97% increase) versus positive sentiment posts (9% increase). Patients were more likely to focus their posts on rehabilitation, return to play, and activities of daily living. Providers tended to focus their posts on education, rehabilitation, and injury prevention. Patient posts declined over the past 3 years (–28%), whereas provider posts increased substantially (110%). Of posts shared by health care providers, 4% contained inaccurate or misleading information. Conclusion: The majority of patients who post about their UCL injury and reconstruction on social media have a positive sentiment when discussing their procedure. However, negative sentiment posts have increased significantly over the past 3 years. Patient content revolves around rehabilitation and return to play. Although patient posts have declined over the past 3 years, provider posts have increased substantially with an emphasis on education.

Sample sizes for constructing confidence intervals and testing hypotheses

Controlled Clinical Trials ◽

10.1016/0197-2456(89)90162-1 ◽

1989 ◽

Vol 10 (3) ◽

pp. 345

Author(s):

David R. Bristol

Keyword(s):

Confidence Intervals ◽

Sample Sizes ◽

Testing Hypotheses

Hyperbolic trade-off: the importance of balancing trial and subject sample sizes in neuroimaging

NeuroImage ◽

10.1016/j.neuroimage.2021.118786 ◽

2021 ◽

pp. 118786

Author(s):

Gang Chen ◽

Daniel S. Pine ◽

Melissa A. Brotman ◽

Ashley R. Smith ◽

Robert W. Cox ◽

...

Keyword(s):

Sample Sizes ◽

Trade Off

Analysis of Social Media Users Sentiments against Omnibus Law Based on Hashtags on Twitter

SISTEMASI ◽

10.32520/stmsi.v11i1.1685 ◽

2022 ◽

Vol 11 (1) ◽

pp. 197

Author(s):

Okta Fanny ◽

Heri Suroyo

Keyword(s):

Social Media ◽

Sentiment Analysis ◽

Main Topic ◽

Accuracy Score ◽

Test Results ◽

Bayes Classifier ◽

The Public ◽

Average Accuracy ◽

Positive Sentiment ◽

Negative Sentiment

From the research conducted, it can be concluded that sentiment analysis can be used to gauge public sentiment, in particular that of Twitter users, toward the Omnibus Law. The sentiment analysis found neutral sentiment to be the largest share at 55%, followed by positive sentiment at 35% and negative sentiment at 10%. On the pro-hashtag data, the Naïve Bayes Classifier produced classification test results with an average accuracy of 92.1%, an average precision of 94.8%, and an average recall of 90.7%. On the con-hashtag data, it achieved an average accuracy of 98.3%, an average precision of 97.6%, and an average recall of 98.7%. In a text cloud analysis of the combined pro and con hashtags, the dominant term was "Omnibus Law", which indicates that all of the scraped hashtags do discuss the main topic, the Omnibus Law.
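
For readers unfamiliar with the three reported scores, the sketch below shows how accuracy, precision, and recall are conventionally computed from binary confusion-matrix counts. The function name `classification_metrics` and the counts in the usage line are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    tp/fp/fn/tn: true positives, false positives, false negatives,
    true negatives for the class treated as "positive".
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: of the items predicted positive, how many were correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the truly positive items, how many were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Illustrative counts only (not from the paper).
accuracy, precision, recall = classification_metrics(tp=90, fp=5, fn=10, tn=95)
```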

Case Study 2: The Trade-Off between Reproducibility and Privacy in the Use of Social Media Data to Study Political Behavior

The Practice of Reproducible Research ◽

10.1525/9780520967779-011 ◽

2019 ◽

pp. 103-108

Keyword(s):

Social Media ◽

Political Behavior ◽

Social Media Data ◽

Trade Off ◽

Use Of Social Media ◽

Media Data

On the Privacy and Utility of Anonymized Social Networks

International Journal of Adaptive Resilient and Autonomic Systems ◽

10.4018/jaras.2013040101 ◽

2013 ◽

Vol 4 (2) ◽

pp. 1-34 ◽

Cited By ~ 3

Author(s):

Yi Song ◽

Xuesong Lu ◽

Sadegh Nobari ◽

Stéphane Bressan ◽

Panagiotis Karras

Keyword(s):

Social Networks ◽

Social Media ◽

State Of The Art ◽

Social Groups ◽

Trade Off ◽

Private Lives ◽

The Social ◽

Before And After ◽

Do So

One is either on Facebook or not. Of course, this assessment is controversial and its rationale arguable. It is nevertheless not far, for many, from the reason behind joining social media and publishing and sharing details of their professional and private lives. Not only the personal details that may be revealed, but also the structure of the networks, are sources of invaluable information for any organization wanting to understand and learn about social groups, their dynamics and members. These organizations may or may not be benevolent. It is important to devise, design and evaluate solutions that guarantee some privacy. One approach that reconciles the different stakeholders’ requirements is the publication of a modified graph. The perturbation is hoped to be sufficient to protect members’ privacy while maintaining sufficient utility for analysts wanting to study the social media as a whole. In this paper, the authors try to empirically quantify the inevitable trade-off between utility and privacy. They do so for two state-of-the-art graph anonymization algorithms that protect against most structural attacks, the k-automorphism algorithm and the k-degree anonymity algorithm. The authors measure several metrics for a series of real graphs from various social media before and after their anonymization under various settings.

Mining social media data to investigate patient perceptions regarding DMARD pharmacotherapy for rheumatoid arthritis

Annals of the Rheumatic Diseases ◽

10.1136/annrheumdis-2020-217333 ◽

2020 ◽

Vol 79 (11) ◽

pp. 1432-1437 ◽

Cited By ~ 1

Author(s):

Chanakya Sharma ◽

Samuel Whittle ◽

Pari Delir Haghighi ◽

Frada Burstein ◽

Roee Sa'adon ◽

...

Keyword(s):

Rheumatoid Arthritis ◽

Social Media ◽

Side Effects ◽

Sentiment Analysis ◽

Computer Algorithms ◽

Social Media Platforms ◽

Synthetic Agents ◽

Positive Sentiment ◽

Negative Sentiment ◽

Media Data

Objectives: We hypothesise that patients have a positive sentiment regarding biological/targeted synthetic disease modifying anti-rheumatic drugs (b/tsDMARDs) and a negative sentiment towards conventional synthetic agents (csDMARDs). We analysed discussions on social media platforms regarding DMARDs to understand the collective sentiment expressed towards these medications. Methods: Treato analytics were used to download all available posts on social media about DMARDs in the context of rheumatoid arthritis. Strict filters ensured that user generated content was downloaded. The sentiment (positive or negative) expressed in these posts was analysed for each DMARD using sentiment analysis. We also analysed the reason(s) for this sentiment for each DMARD, looking specifically at efficacy and side effects. Results: Computer algorithms analysed millions of social media posts and included 54 742 posts about DMARDs. We found that both classes had an overall positive sentiment. The ratio of positive to negative posts was higher for b/tsDMARDs (1.210) than for csDMARDs (1.048). Efficacy was the most commonly mentioned reason in posts with a positive sentiment and lack of efficacy was the most commonly mentioned reason for a negative sentiment. These were followed by the presence/absence of side effects in negative or positive posts, respectively. Conclusions: Public opinion on social media is generally positive about DMARDs. Lack of efficacy followed by side effects were the most common themes in posts with a negative sentiment. There are clear reasons why a DMARD generates a positive or negative sentiment. As the sentiment analysis technology becomes more refined, targeted studies could be done to analyse these reasons and allow clinicians to tailor DMARDs to match patient needs.

MEXN: Multi-Stage Extraction Network for Patent Document Classification

Applied Sciences ◽

10.3390/app10186229 ◽

2020 ◽

Vol 10 (18) ◽

pp. 6229

Author(s):

Juho Bai ◽

Inwook Shim ◽

Seog Park

Keyword(s):

State Of The Art ◽

Document Classification ◽

Patent Document ◽

Current State ◽

Multi Stage ◽

Computing Performance ◽

Classification Tasks ◽

Patent Documents ◽

Fixed Input ◽

Different Content

A patent document has different content in each paragraph, and the document as a whole is very long. Moreover, patent documents are classified hierarchically with multiple labels. Many works have employed deep neural architectures to classify patent documents. Traditional document classification methods do not represent the characteristics of the entire patent document well because they usually require a fixed input length. To address this issue, we propose a neural network-based document classification method for patent documents built on a novel multi-stage feature extraction network (MEXN), which comprises a paragraph encoder and a summarizer for all paragraphs. MEXN analyzes whole documents hierarchically and provides multi-label outputs. Furthermore, MEXN does so with only a marginal increase in computing cost. We demonstrate that the proposed method outperforms current state-of-the-art models in patent document classification tasks with multi-label classification experiments for USPD datasets.

Confidence and coverage for Bland–Altman limits of agreement and their approximate confidence intervals

Statistical Methods in Medical Research ◽

10.1177/0962280216665419 ◽

2016 ◽

Vol 27 (5) ◽

pp. 1559-1574 ◽

Cited By ~ 22

Author(s):

Andrew Carkeet ◽

Yee Teng Goh

Keyword(s):

Confidence Intervals ◽

Small Sample ◽

Interval Methods ◽

Tolerance Interval ◽

Confidence Limits ◽

Sample Sizes ◽

Limits Of Agreement ◽

Approximate Methods ◽

Exact Confidence Intervals ◽

Tolerance Factors

Bland and Altman described approximate methods in 1986 and 1999 for calculating confidence limits for their 95% limits of agreement, approximations which assume large subject numbers. In this paper, these approximations are compared with exact confidence intervals calculated using two-sided tolerance intervals for a normal distribution. The approximations are compared in terms of the tolerance factors themselves but also in terms of the exact confidence limits and the exact limits of agreement coverage corresponding to the approximate confidence interval methods. Using similar methods, the 50th percentile of the tolerance interval is compared with the k values of 1.96 and 2, which Bland and Altman used to define limits of agreement (i.e. [Formula: see text]+/− 1.96Sd and [Formula: see text]+/− 2Sd). For limits of agreement outer confidence intervals, Bland and Altman’s approximations are too permissive for sample sizes <40 (1999 approximation) and <76 (1986 approximation). For inner confidence limits the approximations are poorer, being permissive for sample sizes of <490 (1986 approximation) and all practical sample sizes (1999 approximation). Exact confidence intervals for 95% limits of agreement, based on two-sided tolerance factors, can be calculated easily from tables and should be used in preference to the approximate methods, especially for small sample sizes.
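
As a rough illustration of the quantities being compared, the sketch below computes 95% limits of agreement as the mean paired difference plus or minus 1.96 standard deviations of the differences, together with the commonly cited large-sample standard error of each limit, sqrt(3 s^2 / n), which is our reading of the 1986 approximation. It does not implement the exact tolerance-factor method the paper recommends, and the function name `limits_of_agreement` is hypothetical.

```python
import math
import statistics


def limits_of_agreement(a, b, k=1.96):
    """Limits of agreement for paired measurements a and b.

    Returns (mean_diff, sd_diff, lower, upper, approx_se), where approx_se is
    the large-sample standard error of each limit, sqrt(3 * s**2 / n).
    That approximation is an assumption here and is unreliable for small n.
    """
    if len(a) != len(b):
        raise ValueError("a and b must be paired (same length)")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    lower = mean_diff - k * sd_diff
    upper = mean_diff + k * sd_diff
    approx_se = math.sqrt(3 * sd_diff ** 2 / n)
    return mean_diff, sd_diff, lower, upper, approx_se
```

Passing `k=2` instead of the default reproduces the other multiplier mentioned in the abstract.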

The use of incidence counts for estimation of aphid populations. 2. Confidence intervals from fixed sample sizes

Netherlands Journal of Plant Pathology ◽

10.1007/bf01974305 ◽

1985 ◽

Vol 91 (2) ◽

pp. 100-104 ◽

Cited By ~ 5

Author(s):

S. A. Ward ◽

R. Rabbinge ◽

W. P. Mantel

Keyword(s):

Confidence Intervals ◽

Sample Sizes ◽

Fixed Sample ◽

Aphid Populations
