Exploring the Transition to Fatherhood: Feasibility Study Using Social Media and Machine Learning (Preprint)

2018 ◽  
Author(s):  
Samantha J Teague ◽  
Adrian BR Shatte

BACKGROUND Fathers’ experiences across the transition to parenthood are underreported in the literature. Social media offers the potential to capture fathers’ experiences in real time and at scale while also removing the barriers that fathers typically face in participating in research and clinical care. OBJECTIVE This study aimed to assess the feasibility of using social media data to map the discussion topics of fathers across the fatherhood transition. METHODS Discussion threads from two Web-based parenting communities, r/Daddit and r/PreDaddit from the social media platform Reddit, were collected over a 2-week period, resulting in 1980 discussion threads contributed to by 5853 unique users. An unsupervised machine learning algorithm was then implemented to group discussion threads into topics within each community and across a combined collection of all discussion threads. RESULTS Results demonstrated that men use Web-based communities to share the joys and challenges of the fatherhood experience. Minimal overlap in discussions was found between the 2 communities, indicating that distinct conversations are held on each forum. A range of social support techniques was demonstrated, with conversations characterized by encouragement, humor, and experience-based advice. CONCLUSIONS This study demonstrates that rich data on fathers’ experiences can be sourced from social media and analyzed rapidly using automated techniques, providing an additional tool for researchers exploring fatherhood.

Author(s):  
Jānis Kapenieks

INTRODUCTION Opinion analysis has recently been a prominent topic in big data research and in the business world. Social media has become a key source of opinions, generating a large amount of data every day for further analysis. In the big data age, classification of unstructured data is one of the key tools for fast and reliable content analysis. I expect significant growth in the demand for content classification services in the near future. Many online text classification tools are available, but they provide limited functionality, such as automated text classification into predefined categories and sentiment analysis based on a pre-trained machine learning algorithm. They do not provide data mining support or an interface for training machine learning algorithms. Only a limited number of tools provide the whole set of functions required for text classification, that is, all the steps from data mining to building a machine learning algorithm and applying it to a data stream from a social network source. My goal is to create a tool able to generate a classified text stream directly from social media, with a user-friendly set-up interface. METHODS AND MATERIALS The text classification tool will have a core-based modular structure (each module providing certain functionality) so that the system can be scaled in terms of technology and functionality. The tool will be built on open source libraries and programming languages running on a Linux-based server. The tool will be based on three key components: frontend, backend, and data storage. The backend will use Python and Node.js with the machine learning and text filtering libraries TensorFlow and Keras; data storage will use MySQL 5.7/8; and the frontend will be based on web technologies built using PHP and JavaScript.
EXPECTED RESULTS The expected result of my work is a web-based text classification tool for opinion analysis using data streams from social media. The tool will provide a user-friendly interface for data collection, algorithm selection, and machine learning algorithm setup and training. Multiple text classification algorithms will be available: Linear SVM, Random Forest, Multinomial Naive Bayes, Bernoulli Naive Bayes, Ridge Regression, Perceptron, Passive Aggressive Classifier, and a deep machine learning algorithm. System users will be able to compare the algorithms by accuracy and identify the most effective one for their text classification task. The architecture of the text classification tool will be based on a frontend interface and backend services. The frontend interface will provide all the tools with which the system user interacts. This includes setting up data collection streams from multiple social networks and allocating them to pre-specified channels based on keywords. Data from each channel can be classified and assigned to a pre-defined cluster. The tool will provide a training interface for machine learning algorithms. This text classification tool is currently in active development for a client, with testing and implementation planned for April 2019.
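The comparison of algorithms by accuracy described above can be sketched as follows. This is a minimal illustration, not the tool's implementation: the two classifiers are hypothetical stand-ins written as plain functions, and the held-out samples are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

# A classifier is modelled here as any function from text to a label.
Classifier = Callable[[str], str]
Sample = Tuple[str, str]  # (text, true label)


def accuracy(classifier: Classifier, samples: List[Sample]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if not samples:
        raise ValueError("at least one labelled sample is required")
    correct = sum(1 for text, label in samples if classifier(text) == label)
    return correct / len(samples)


def rank_by_accuracy(
    classifiers: Dict[str, Classifier], samples: List[Sample]
) -> List[Tuple[str, float]]:
    """Return (name, accuracy) pairs, most accurate first."""
    scores = [(name, accuracy(clf, samples)) for name, clf in classifiers.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical stand-ins for trained models (assumptions for illustration).
def keyword_rule(text: str) -> str:
    return "positive" if "good" in text.lower() else "negative"


def always_negative(text: str) -> str:
    return "negative"


# Invented held-out data, not taken from the abstract.
HELD_OUT: List[Sample] = [
    ("Good service", "positive"),
    ("Really good value", "positive"),
    ("Slow and rude", "negative"),
    ("Not worth it", "negative"),
]

ranking = rank_by_accuracy(
    {"keyword_rule": keyword_rule, "always_negative": always_negative}, HELD_OUT
)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In practice each entry would wrap a trained model, and the samples would come from data withheld from training.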


In today’s world, social media is one of the most important tools for communication, helping people to interact with each other and share their thoughts, knowledge, or any other information. Some of the most popular social media services are Facebook, Twitter, WhatsApp, and WeChat. Because social media has a large impact on people’s daily lives, it can also be used as a source of fake information or misinformation. It is therefore important that any information presented on social media be evaluated for its genuineness and originality, in terms of the probability that it is correct and reliable enough to trust. In this work we identify the features that can help predict whether a given tweet is a rumor or information. Two machine learning algorithms, Decision Tree and Support Vector Machine, are run using the WEKA tool for the classification.


2020 ◽  
Vol 10 (1) ◽  
pp. 1-12
Author(s):  
Noura A. AlSomaikhi ◽  
Zakarya A. Alzamil

Microblogging platforms, such as Twitter, have become popular interaction media that are widely used for daily purposes such as communication and knowledge sharing. Understanding the behaviors and interests of these platforms' users has become a challenge, and addressing it can help in areas such as recommendation and filtering. In this article, an approach is proposed for classifying Twitter users with respect to their interests based on their Arabic tweets. A Multinomial Naïve Bayes machine learning algorithm is used for this classification. The proposed approach has been developed as a web-based software system that is integrated with Twitter through the Twitter API. An experimental study on Arabic tweets was conducted with the proposed system as a case study.
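To make the classification step concrete, here is a minimal from-scratch sketch of a multinomial Naïve Bayes classifier with Laplace smoothing. It is not the authors' system: the class name, the interest labels, and the English placeholder tokens are assumptions chosen for illustration, and a real pipeline would first tokenize and normalize Arabic text.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Optional


class MultinomialNaiveBayes:
    """Multinomial Naive Bayes over token counts with Laplace smoothing."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha  # smoothing strength (1.0 = add-one smoothing)
        self.class_doc_counts: Counter = Counter()
        self.token_counts: Dict[str, Counter] = defaultdict(Counter)
        self.vocabulary: set = set()

    def fit(self, documents: Iterable[List[str]], labels: Iterable[str]):
        for tokens, label in zip(documents, labels):
            self.class_doc_counts[label] += 1
            self.token_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, tokens: List[str]) -> Optional[str]:
        total_docs = sum(self.class_doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_doc_counts.items():
            # Log prior plus summed log likelihoods of the observed tokens.
            score = math.log(doc_count / total_docs)
            class_total = sum(self.token_counts[label].values())
            for token in tokens:
                if token not in self.vocabulary:
                    continue  # tokens never seen in training are ignored
                count = self.token_counts[label][token]
                score += math.log(
                    (count + self.alpha) / (class_total + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training data: each document is the token list of one user's tweets.
train_docs = [
    ["match", "goal", "team"],
    ["team", "league", "goal"],
    ["phone", "software", "update"],
    ["software", "code", "phone"],
]
train_labels = ["sports", "sports", "technology", "technology"]

model = MultinomialNaiveBayes().fit(train_docs, train_labels)
print(model.predict(["goal", "team", "tonight"]))
print(model.predict(["new", "software", "update"]))
```

Working in log space avoids numeric underflow when many token probabilities are multiplied together.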


2020 ◽  
Vol 17 (7) ◽  
pp. 2869-2875
Author(s):  
Sajay Thomas Samuel ◽  
Booma Poolan Marikannan

Machine learning can help people perform complex tasks and solve problems, because it learns patterns from historical data and makes predictions based on that data. This research addresses movie reviews on social media, specifically Twitter: it gathers tweets containing movie reviews and displays a rating based on the sentiment of the tweets. Twitter is an online social media website where people from all walks of life communicate by tweeting short updates within a limit of 280 characters. Twitter continues to grow as a business and has become one of the biggest platforms for communication and instant messaging. Because of its large number of users, voluminous amounts of data are available that can be used to gain more in-depth information and insights and to extract sentiment by analysing the tweets. Today, many applications use sentiment analysis in various fields, for example to gain insights about a particular brand or product. Performing sentiment analysis in the traditional ways can be time consuming and very complex. The aim of this research is to investigate the domain of sentiment analysis and to incorporate a machine learning algorithm into a system that can retrieve and display the ratings of a particular movie. The machine learning algorithms used are the Naïve Bayes classifier and SVM. The algorithm with the better accuracy will be chosen for the implementation phase.


2018 ◽  
Vol 1 (2) ◽  
pp. 24-32
Author(s):  
Lamiaa Abd Habeeb

In this paper, we design a system that extracts citizens' opinions about the Iraqi government and Iraqi politicians by analyzing their comments on Facebook (a social media network). Since the data is random and contains noise, we cleaned the text and built a stemmer to stem the words as much as possible. Cleaning and stemming reduced the vocabulary from 28968 to 17083 words, which in turn reduced memory size from 382858 bytes to 197102 bytes. Generally, there are two approaches to extracting users' opinions: the lexicon-based approach and the machine learning approach. In our work, the machine learning approach is applied with three machine learning algorithms: Naïve Bayes, K-Nearest Neighbor, and the AdaBoost ensemble algorithm. For Naïve Bayes, we apply two models, Bernoulli and Multinomial. We found that Naïve Bayes with the Multinomial model gives the highest accuracy.
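The reported savings are easier to compare as relative reductions. The short calculation below uses only the four figures given in the abstract; the helper name is an assumption for illustration.

```python
def relative_reduction(before: int, after: int) -> float:
    """Fraction by which a quantity shrank, e.g. 0.25 for a 25% reduction."""
    if before <= 0:
        raise ValueError("'before' must be positive")
    return (before - after) / before


# Figures taken from the abstract.
vocab_reduction = relative_reduction(28968, 17083)
memory_reduction = relative_reduction(382858, 197102)

print(f"vocabulary: {vocab_reduction:.1%}")  # vocabulary: 41.0%
print(f"memory:     {memory_reduction:.1%}")  # memory:     48.5%
```

Cleaning and stemming therefore removed roughly two fifths of the vocabulary and nearly half of the memory footprint.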


Author(s):  
Sushant Keni ◽  
Priyanka Jadhav ◽  
Mayur Patil ◽  
Prof. Sonal Chaudhari

We evaluate the feasibility of using Facebook data to enhance the effectiveness of a recruitment system, especially for résumé verification, and of recognizing personality using social network analysis methods. In industry, an employee’s personality is very important in the workplace: it helps the company grow and deliver better service to clients. Currently, résumé verification relies on trusted third parties who perform background verification. A report based on this verification is sent to the hiring company, which then decides whether or not to keep the employee. This manual process usually takes a lot of time and generally does not reveal the candidate’s disposition toward society, that is, how he or she behaves in society and whether he or she posts something inappropriate on social media; in simple words, his or her personality. Social media is nowadays a huge platform on which users spend a great deal of time, on sites such as Facebook and LinkedIn, in activities such as posting a page, commenting, liking posts, uploading certifications, and adding friends. We are going to design a system that verifies the genuineness of a user by scraping or exploring data from Facebook, LinkedIn, or both. We explore a person’s posts and classify them as technology related, violence related, and so on. The comments made on the posts, how the person reacts, and the language used in handling a query will be parsed and classified using an SVM machine learning algorithm trained on a previous dataset. Finally, we will show this information to the company so that it can make its own decision based on the result.


2018 ◽  
pp. 1-11 ◽  
Author(s):  
Julian C. Hong ◽  
Donna Niedzwiecki ◽  
Manisha Palta ◽  
Jessica D. Tenenbaum

Purpose Patients undergoing radiotherapy (RT) or chemoradiotherapy (CRT) may require emergency department evaluation or hospitalization. Early identification may direct preventative supportive care, improving outcomes and reducing health care costs. We developed and evaluated a machine learning (ML) approach to predict these events. Methods A total of 8,134 outpatient courses of RT and CRT from a single institution from 2013 to 2016 were identified. Extensive pretreatment data were programmatically extracted and processed from the electronic health record (EHR). Training and internal validation cohorts were randomly generated (3:1 ratio). Gradient tree boosting (GTB), random forest, support vector machine, and least absolute shrinkage and selection operator logistic regression approaches were trained and internally validated based on area under receiver operating characteristic (AUROC) curve. The most predictive ML approach was also evaluated using only disease- and treatment-related factors to assess predictive gain of extensive EHR data. Results All methods had high predictive accuracy, particularly GTB (validation AUROC, 0.798). Extensive EHR data beyond disease and treatment information improved accuracy (delta AUROC, 0.056). A Youden-based cutoff corresponded to validation sensitivity of 81.0% (175 of 216 courses with events) and specificity of 67.3% (1,218 of 1,811 courses without events). Interpretability is an important advantage of GTB. Variable importance identified top predictive factors, including treatment (planned RT and systemic therapy), pretreatment encounters (emergency department visits and admissions in the year before treatment), vital signs (weight loss and pain score in the year before treatment), and laboratory values (albumin level at weeks before treatment). Conclusion ML predicts emergency visits and hospitalization during cancer therapy. Incorporating predictions into clinical care algorithms may help direct personalized supportive care, improve quality of care, and reduce costs. A prospective trial investigating ML-assisted direction of increased clinical assessments during RT is planned.
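The reported sensitivity and specificity can be recomputed from the counts in the abstract, and a Youden-based cutoff can be illustrated on toy data. In this sketch the function names and the toy scores and labels are assumptions; only the four counts (175, 216, 1,218, 1,811) come from the abstract.

```python
from typing import List, Tuple


def sensitivity(true_positives: int, positives: int) -> float:
    return true_positives / positives


def specificity(true_negatives: int, negatives: int) -> float:
    return true_negatives / negatives


def youden_j(sens: float, spec: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sens + spec - 1


def best_cutoff(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, J) maximizing J, predicting 1 when score >= threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    results = []
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
        j = youden_j(sensitivity(tp, positives), specificity(tn, negatives))
        results.append((threshold, j))
    return max(results, key=lambda item: item[1])


# Counts reported in the abstract.
sens = sensitivity(175, 216)
spec = specificity(1218, 1811)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} J={youden_j(sens, spec):.3f}")

# Toy predicted risks and outcomes (invented for illustration only).
toy_scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 1, 0, 0, 1, 1, 1]
print(best_cutoff(toy_scores, toy_labels))
```

Maximizing J weights sensitivity and specificity equally; a clinical deployment might prefer a cutoff that favors one over the other.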


2021 ◽  
Vol 22 (1) ◽  
pp. 78-92
Author(s):  
GA Buntoro ◽  
R Arifin ◽  
GN Syaifuddiin ◽  
A Selamat ◽  
O Krejcar ◽  
...  

In 2019, citizens of Indonesia participated in the democratic process of electing a new president, vice president, and various legislative candidates for the country. The 2019 Indonesian presidential election was very tense in terms of the candidates' campaigns in cyberspace, especially on social media sites such as Facebook, Twitter, Instagram, Google+, Tumblr, and LinkedIn. The Indonesian people used social media platforms to express their positive, neutral, and negative opinions on the respective presidential candidates. Social media users campaigned for their chosen candidates, from regents, governors, and legislative positions up to the presidency, via the Internet and online media. Therefore, the aim of this paper is to conduct sentiment analysis on the candidates in the 2019 Indonesian presidential election based on Twitter datasets. The study used datasets of opinions expressed by the Indonesian people on Twitter with the hashtags (#) containing "Jokowi and Prabowo." We pre-processed the data using comment selection, data cleansing, text parsing, sentence normalization, and tokenization of the Indonesian-language text, then determined the class attributes, and finally classified the Twitter posts with the hashtags (#) using a Naïve Bayes Classifier (NBC) and a Support Vector Machine (SVM) to achieve the highest possible accuracy. The study provides benefits in terms of helping the community to research opinions on Twitter that contain positive, neutral, or negative sentiments. Sentiment analysis on the candidates in the 2019 Indonesian presidential election on Twitter using non-conventional processes resulted in cost, time, and effort savings. This research showed that the combination of the SVM machine learning algorithm and alphabetic tokenization produced the highest accuracy value of 79.02%. 
The lowest accuracy value in this study, 44.94%, was obtained with the combination of the NBC machine learning algorithm and N-gram tokenization.
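The two tokenization schemes compared in this abstract can be sketched as below. The abstract does not define them, so this rests on assumptions: "alphabetic tokenization" is read as splitting on any non-letter character, and "N-gram tokenization" as building word n-grams from those tokens. The function names and the example tweet are hypothetical.

```python
import re
from typing import List


def alphabetic_tokens(text: str) -> List[str]:
    """Lowercase the text and keep only runs of letters as tokens."""
    # [^\W\d_] matches any Unicode letter (word characters minus digits and "_").
    return re.findall(r"[^\W\d_]+", text.lower())


def word_ngrams(tokens: List[str], n: int) -> List[str]:
    """Join every run of n consecutive tokens into one n-gram feature."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Invented example tweet, not taken from the study's dataset.
tweet = "Debat capres 2019: #Jokowi vs #Prabowo!"

tokens = alphabetic_tokens(tweet)
print(tokens)
print(word_ngrams(tokens, 2))
```

N-gram features capture short phrases but multiply the number of distinct features, which can hurt a classifier trained on a small dataset.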

