Online Newspaper Clustering in Aceh using the Agglomerative Hierarchical Clustering Method

Author(s):  
Rizal Tjut Adek ◽  
Rozzy Kesuma Dinata ◽  
Ananda Ditha

Rapid progress in information technology, especially the internet, has produced an enormous amount of information. Because publishing an article on a website is easy, the number of news pages has exploded, which can confuse readers. The diversity and growing number of news articles make it increasingly difficult for internet users to find news among the large volumes of news data on online newspaper sites in Aceh. Grouping text documents is therefore needed to organize the news in Aceh's online newspapers by the content of the articles. This study groups online news in Aceh using the Agglomerative Hierarchical Clustering method. News is grouped with a bottom-up strategy that starts by placing each object in its own cluster and then merges clusters into larger ones based on the similarity of keywords in each news item; the resulting clusters are then compared and assigned to news categories. The research design was carried out in a structured manner, using data flow diagrams to form the research framework. The study used online news text data taken from 10 online news websites in Aceh from July 2016 to March 2017, with 1000 randomly generated documents. News data was crawled with a PHP script that takes only the text of the news items on each website. News is grouped into the categories religion, politics, law, sports, tourism, education, culture, economy, and technology. In this study, the Agglomerative Hierarchical Clustering method achieved an average grouping accuracy of 89.84%.
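The bottom-up strategy described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the linkage criterion, so the cosine similarity over raw keyword counts and the average linkage used here are assumptions, as are the function names and the four toy documents.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two keyword-count Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_linkage(c1, c2, vectors):
    """Mean pairwise similarity between the members of two clusters (assumed linkage)."""
    total = sum(cosine_similarity(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative_clustering(docs, n_clusters):
    """Start with one cluster per document, then repeatedly merge the most similar pair."""
    vectors = [Counter(d.lower().split()) for d in docs]
    clusters = [[i] for i in range(len(docs))]
    while len(clusters) > max(n_clusters, 1):
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = average_linkage(clusters[x], clusters[y], vectors)
                if best is None or sim > best[0]:
                    best = (sim, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]  # merge cluster y into cluster x
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical toy documents, not taken from the study's data.
docs = [
    "football match goal team",
    "team wins football league",
    "election vote parliament party",
    "party wins election vote",
]
clusters = agglomerative_clustering(docs, 2)
print(clusters)
```

In this toy run the two sports items and the two politics items end up in separate clusters. A real pipeline would likely add stop-word removal and term weighting before computing similarities, which the abstract does not detail.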

2010 ◽  
Vol 20 (6) ◽  
pp. 771-777 ◽  
Author(s):  
Hauke Riesch

With the newspapers’ recent move to online reporting, traditional norms and practices of news reporting have changed to accommodate the new realities of online news writing. In particular, online news is much more fluid and prone to change in content than the traditional hard-copy newspapers – online newspaper articles often change over the course of the following days or even weeks as they respond to criticisms and new information becoming available. This poses a problem for social scientists who analyse newspaper coverage of science, health and risk topics, because it is no longer clear who has read and written what version, and what impact they potentially had on the national debates on these topics. In this note I want to briefly flag up this problem through two recent examples of UK national science stories and discuss the potential implications for PUS media research.


2021 ◽  
Author(s):  
Khanh Quoc Tran ◽  
Phap Ngoc Trinh ◽  
Khoa Nguyen-Anh Tran ◽  
An Tran-Hoai Le ◽  
Luan Van Ha ◽  
...  

In this paper, we build a new dataset, UIT-ViON (Vietnamese Online Newspaper), collected from well-known Vietnamese-language online newspapers. We collect, process, and create the dataset, then experiment with different machine learning models. In particular, we propose an open-domain, large-scale, and high-quality dataset consisting of 260,000 textual data points annotated with multiple labels for evaluating Vietnamese short text classification. In addition, we present an approach using transformer-based learning (PhoBERT) for Vietnamese short text classification on the dataset, which outperforms traditional machine learning (Naive Bayes and Logistic Regression) and deep learning (Text-CNN and LSTM). The proposed approach achieves an F1-score of 80.62%. This is a positive result and a premise for developing an automatic news classification system. The study aims to save significant time, cost, and human resources and to make it easier for readers to find news on topics that interest them. In future work, we will propose solutions to improve the quality of the dataset and the performance of the classification models.
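For readers unfamiliar with the F1-score reported above, the following sketch shows how it is computed from precision and recall. The abstract does not say which averaging the authors used, so the macro average here is an assumption, as are the function names and the toy labels; the example also simplifies to one label per item.

```python
def f1_per_label(y_true, y_pred, label):
    """F1 for one label: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_per_label(y_true, y_pred, label) for label in labels) / len(labels)


# Hypothetical gold and predicted labels, not taken from UIT-ViON.
y_true = ["sports", "sports", "politics", "politics"]
y_pred = ["sports", "politics", "politics", "politics"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```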


Author(s):  
Noemi Festic ◽  
Moritz Büchi ◽  
Michael Latzer

Testing communication theories requires a valid empirical basis, yet especially for usage time measures, retrospective self-reports have been shown to be biased. This study draws on a unique data set of 923 Swiss internet users who had their internet use tracked for at least 30 days on mobile and desktop devices and took part in a survey covering internet usage as well as person-level background variables. The analysis focuses on active usage time overall and on the major services Google Search, YouTube, WhatsApp, Instagram, Facebook, and the online newspaper 20 Minuten. The results showed that overall internet usage time was lower for older and higher-educated users based on both the tracking and survey data, and the reported usage time was consistently higher than the tracked usage time. The tracking data further revealed that internet users in all social groups spent the majority of their time online on a mobile device. The number of users of the major services varied mainly between age groups. These differences were less pronounced when it came to the time users spent engaging with these services. Over the course of a day, the major services varied in their frequency of use: for example, messaging peaked before noon and in the late afternoon, whereas online news use was comparably constant at a lower level.


Időjárás ◽  
2020 ◽  
Vol 124 (4) ◽  
pp. 499-519
Author(s):  
Golub Ćulafić ◽  
Tatjana Popov ◽  
Slobodan Gnjato ◽  
Davorin Bajić ◽  
Goran Trbić ◽  
...  

The paper analyses spatial and temporal patterns of precipitation over Montenegro. Data on mean monthly precipitation during the period 1961–2015 from 17 meteorological stations were used for the analysis. Four regions with different spatial precipitation regimes were identified by using principal component analysis and the agglomerative hierarchical clustering method. A downward tendency in annual precipitation prevails over Montenegro. The most prominent reduction occurred in the summer season. In contrast, precipitation increased during autumn. However, the majority of estimated trend values were low and statistically insignificant.

