Performance analysis of a keyword search system

Author(s):  
Mustafa Abdalrassual Jassim

Data mining is the process of discovering patterns in a data set, and keyword search is among the most effective ways to discover information in documents. In some settings, however, searching for a keyword is not enough, and restricting that keyword becomes necessary. On social media, for example, abusive language is increasing. Many existing systems only detect an inappropriate word and do not restrict it. This paper proposes a keyword search method for social media that not only finds inappropriate words but also restricts them from being published.
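The detect-then-restrict idea described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the blocklist, the function names (`find_blocked`, `can_publish`, `mask_blocked`), and the whole-word matching rule are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical blocklist; the paper's actual word list is not given in the abstract.
BLOCKED_WORDS = {"idiot", "stupid"}


def find_blocked(text, blocked=BLOCKED_WORDS):
    """Detection step: return the blocked words found in text (whole words, case-insensitive)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sorted(set(tokens) & set(blocked))


def can_publish(text, blocked=BLOCKED_WORDS):
    """Restriction step: a post may be published only if no blocked word is found."""
    return not find_blocked(text, blocked)


def mask_blocked(text, blocked=BLOCKED_WORDS):
    """Alternative restriction: replace each blocked word with asterisks of the same length."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, sorted(blocked))) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: "*" * len(m.group(0)), text)
```

Whether a real system should reject the post outright (`can_publish`) or mask the word (`mask_blocked`) is a design choice the abstract does not settle; both are shown for comparison.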

2018 ◽  
Vol 37 (2) ◽  
pp. 87-102 ◽  
Author(s):  
Li Zhao ◽  
Chao Min

With the advent of modern cognitive computing technologies, fashion informatics researchers contribute to the academic and professional discussion about how a large-scale data set is able to reshape the fashion industry. Data-mining-based social network analysis is a promising area of fashion informatics to investigate relations and information flow among fashion units. By adopting this pragmatic approach, we provide dynamic network visualizations of the case of Paris Fashion Week. Three time periods were researched to monitor the formulation and mobilization of social media users’ discussions of the event. Initial textual data on social media were crawled, converted, calculated, and visualized by Python and Gephi. The most influential nodes (hashtags) that function as junctions and the distinct hashtag communities were identified and represented visually as graphs. The relations between the contextual clusters and the role of junctions in linking these clusters were investigated and interpreted.
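A hashtag network of the kind described above can be approximated with the Python standard library before exporting anything to Gephi. The sketch below assumes that two hashtags are linked when they appear in the same post and that "influence" is measured by weighted degree; both are assumptions for illustration, since the abstract does not state the exact edge definition or centrality measure. The sample posts are invented.

```python
import re
from collections import Counter
from itertools import combinations


def hashtag_edges(posts):
    """Count how often each pair of hashtags appears together in one post."""
    edges = Counter()
    for post in posts:
        tags = sorted({t.lower() for t in re.findall(r"#(\w+)", post)})
        for a, b in combinations(tags, 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum the co-occurrence weights on every edge touching each hashtag."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


# Invented example posts, not data from the study.
posts = [
    "Front row at #PFW #Chanel",
    "#PFW street style #Dior",
    "New #Chanel bag #PFW",
]
edges = hashtag_edges(posts)
degree = weighted_degree(edges)
print(degree.most_common(1))  # [('pfw', 3)]
```

In this toy input, `pfw` acts as the junction because it is the only hashtag connected to both of the others.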


10.2196/16962 ◽  
2020 ◽  
Vol 22 (7) ◽  
pp. e16962
Author(s):  
Hejing Liu ◽  
Qiudan Li ◽  
Yongcheng Zhan ◽  
Zhu Zhang ◽  
Daniel D Zeng ◽  
...  

Background: Stopping the epidemic of e-cigarette use among youth has become the common goal of both regulatory authorities and health departments. JUUL is currently the most popular e-cigarette brand on the market. Young people usually obtain and exchange information about JUUL with the help of social media platforms. Along with the rising prevalence of JUUL, posts about underage JUUL buying and selling have appeared on social media platforms such as Reddit, which sharply increase the risk of minors being exposed to JUUL.

Objective: This study aims to analyze Reddit messages about JUUL buying and selling among the users of the UnderageJuul subreddit, and to further summarize the characteristics of those messages. The findings and insights can contribute to a better understanding of the patterns of underage JUUL use, and help public health officials provide timely education and guidance to minors who have intentions of accessing JUUL.

Methods: We used a novel cross-subreddit method to analyze the Reddit messages on 2 subreddits. From July 9, 2017, to January 7, 2018, we collected data from the UnderageJuul subreddit, which was created for underage JUUL use discussion. The data set included 716 threads, 2935 comments, and 844 Reddit users (ie, Redditors). We collected our second data set, comprising 23,840 threads and 162,106 comments posted between July 9, 2017, and January 8, 2019, from the JUUL subreddit. We conducted analyses including the following: (1) annotation of users with buying/selling intention, (2) posting patterns discovery and topic comparison, and (3) posting activeness observation of discovered Redditors. Term frequency–inverse document frequency and regular expression-enhanced keyword search methods were applied during the content analysis to extract the posting patterns. The public posting records of the discovered users on the JUUL subreddit during the year after the UnderageJuul subreddit was shut down were analyzed to determine whether they were still active and interested in obtaining JUUL.

Results: Our study revealed the following:
(1) Among the 716 threads on the UnderageJuul subreddit, there were 214 threads related to JUUL sale and 168 threads related to JUUL purchase, which accounted for 53.5% (382/714) of threads.
(2) Among the 844 Redditors of the UnderageJuul subreddit, 23.82% (201/844) of users were annotated with buying intention, and 21.10% (178/844) of users were annotated with selling intention. There were 34 users with buying/selling intention that self-reported as being <21 years old.
(3) The most common key phrases used in selling threads were “WTS,” “want to sell,” “for sale,” and “selling” (154/214, 72.0%). The most common key phrases used in buying threads were “look for/get JUUL/pods” (58/168, 34.5%) and “WTB” (53/168, 31.5%).
(4) The most important concern that UnderageJuul Redditors had in obtaining JUULs was the price (311/1306, 23.81%), followed by the delivery service (68/1306, 5.21%).
(5) The most popular flavors among the users with buying/selling intention were mango, cucumber, and mint. The flavor preferences remained consistent on both subreddits. Adverse symptoms related to the mango flavor were reported by 3 users on the JUUL subreddit.
(6) In total, 24.4% (49/201) of users wanted to buy JUULs and 46.6% (83/178) of users wanted to sell JUULs, including 11 self-reported underage users, who also participated in the discussions on the JUUL subreddit.
(7) Within one year of the UnderageJuul subreddit shutting down, there were 40 users who continued to post 186 threads on the JUUL subreddit, including 10 threads indicating buying/selling willingness that were posted shortly after the UnderageJuul subreddit was closed.

Conclusions: There were overlapping users active in the JUUL and UnderageJuul subreddits. The buying/selling-related content appeared in multiple venues with certain posting patterns from July 9, 2017, to January 7, 2018. Such content might lead to a high risk of health problems for minors, such as nicotine addiction. Based on these findings, this study provided some insights and suggestions that might contribute to the decision-making processes of regulators and public health officials.
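The two text-analysis techniques named in the methods, term frequency–inverse document frequency (TF-IDF) and regular expression-enhanced keyword search, can be illustrated with a small standard-library sketch. The regular expressions below are built from the key phrases quoted in the results ("WTS", "want to sell", "for sale", "selling", "WTB", "look for/get JUUL/pods"), but their exact form, the function names, and the plain `tf * log(N / df)` weighting are assumptions and not the authors' implementation.

```python
import math
import re
from collections import Counter

# Illustrative patterns only; the study's actual expressions are not published in the abstract.
SELL = re.compile(r"\b(?:wts|want to sell|for sale|selling)\b", re.IGNORECASE)
BUY = re.compile(r"\b(?:wtb|(?:look(?:ing)? for|get)\s+(?:juul|pods?))\b", re.IGNORECASE)


def label_thread(title):
    """Label a thread title as 'sell', 'buy', or 'other'; 'sell' wins if both patterns match."""
    if SELL.search(title):
        return "sell"
    if BUY.search(title):
        return "buy"
    return "other"


def tf_idf(docs):
    """Return one {term: score} dict per document, using tf * log(N / df)."""
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        scores.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return scores
```

With this weighting, a term that occurs in every document scores zero, so distinguishing tokens such as "wts" or "wtb" rise to the top while shared words such as "pods" drop out.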


Data mining is one of the most successful domains in research. It describes past data and predicts future outcomes for analysis. Of the several techniques used in data mining, classification is one of the main ones and is based on machine learning. In classification, a data set is sorted into a predefined set of groups or classes. Classification methods draw on mathematical techniques such as decision trees, linear regression, neural networks and statistics. The classification problem is to identify, using a training data set, which of a set of categories a new observation belongs to. This paper analyses data taken from social media and uses a classification algorithm to make a comparative study of social advertisement using Python.
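To make the classification problem concrete, here is a minimal nearest-centroid classifier in plain Python. It is a deliberately simple stand-in and not one of the methods named in the abstract (decision trees, neural networks, and so on); the feature layout, class labels, and training values are invented for illustration.

```python
import math
from collections import defaultdict


def fit_centroids(samples, labels):
    """Training: compute the mean feature vector (centroid) of each class."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        for i, value in enumerate(x):
            sums[y][i] += value
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Prediction: assign a new observation to the class with the nearest centroid."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))


# Invented training data: [likes, shares] for an advertisement post.
samples = [[1, 0], [2, 1], [9, 8], [10, 9]]
labels = ["ignored", "ignored", "engaging", "engaging"]
centroids = fit_centroids(samples, labels)
print(predict(centroids, [8, 7]))  # engaging
```

The same train-then-predict shape applies to the techniques the abstract lists; only the model that sits between the training data and the prediction changes.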



Data mining and prediction systems have been a center of attention since information retrieval came into existence. Most IT companies spend considerable resources on such analysis and systems to improve their performance and generate more revenue, depending on the nature of their work. The Online News Feed Prediction System aims to provide an analysis and comparison of various prediction techniques using different methods of implementation. The UCI repository contains a collection of databases pertaining to different topics. News Popularity in Multiple Social Media is one such dataset; it contains information about news topics from different sources, sentiment analysis of the title and headline, the topic they relate to, the publishing date, and popularity scores on various social media platforms. Python, R and Weka have been used on this data set to implement data preprocessing, visualization and prediction techniques such as Random Forest, Decision Tree and SVM. In addition, there is a dataset that records the score every twenty minutes for the chosen social media platforms. Analysis of these platforms helps in developing a system to reach a wider audience. News agencies can use this system to increase their profit and visibility. This paper aims to show how these results can be obtained.


2020 ◽  
Vol 6 (4) ◽  
pp. 205630512098105
Author(s):  
Victoria Yantseva

This study undertakes a systematic analysis of media discourse on migration in Sweden from 2012 to 2019. Using a novel data set consisting of mainstream newspapers, Twitter and forum data, the study answers two questions: What do Swedish media actually talk about when they talk about “migration”? And how do they talk about it? Using a combination of computational text analysis tools, I analyze a shift in the media discourse seen as one of the outcomes of the European refugee crisis in 2015 and try to understand the role of social media in this process. The results of the study indicate that messages on social media generally had negative tonality and suggest that some of the media frames can be attributed to a migration-hostile discourse. At the same time, the analysis of framing and sentiment dynamics provides little evidence for the discourse shift and any long-term effects of the European refugee crisis on the Swedish media discourse. Rather, one can hypothesize that the role of the crisis should be viewed in a broader political and historical context.


2015 ◽  
Vol 3 (3) ◽  
pp. 5
Author(s):  
Dr. Neha Sharma

Language, being a potent vehicle for transmitting cultural values, norms and beliefs, remains a central factor in determining the status of any nation. India is a multilingual country that tends to encourage people to use English at the national and international level. English in India owes its presence to the British, but its subsequent rise is not fully attributable to them. It has now become a language of wider communication, spoken by a large number of people all over the world. It is influenced by many factors, such as class, society, and developments in science and technology. However, the major influence on the English language is, and has been, the media.

