A content-based technique for linking dual language news articles in an archive

2020 ◽  
pp. 016555152093761 ◽  
Author(s):  
Muzammil Khan ◽  
Arif Ur Rahman ◽  
Arshad Ahmad ◽  
Sarwar Shah Khan

Retrieving a specific news article from a vast multilingual archive, whether against a user query or based on similarity among news articles, is a challenging task. The task becomes even more complicated when the archive contains articles in a low-resourced and morphologically complex language like Urdu alongside English news articles. The article proposes a content-based (lexical) similarity measure, the Common Ratio Measure for Dual Language (CRMDL), for linking digital news articles published in various online news sources. The similarity measure links Urdu-to-English news articles during the preservation process using an Urdu-to-English lexicon. A literature review showed that an Urdu-to-English lexicon did not exist; therefore, the first task was to build a lexicon from multiple sources. CRMDL is evaluated rigorously on data sets of varying sizes to assess its effectiveness. The experimental results show that the proposed measure is feasible and effective for similarity computation between Urdu and English news articles, obtaining on average 50% precision and 67% recall. The performance can be improved further by addressing the limitations summarised in the study.
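The idea of lexicon-based linking can be illustrated with a minimal sketch. This is not the CRMDL formula, which the abstract does not give. It is an assumed overlap ratio: Urdu tokens are mapped to English through a lexicon, and the number of shared terms is divided by the size of the smaller term set. The function names, the romanized placeholder tokens and the toy lexicon are all hypothetical.

```python
def translate_tokens(urdu_tokens, lexicon):
    """Map Urdu tokens to English via the lexicon; tokens missing from it are dropped."""
    return {lexicon[t] for t in urdu_tokens if t in lexicon}


def common_ratio(urdu_tokens, english_tokens, lexicon):
    """Assumed overlap ratio: shared terms divided by the size of the smaller term set."""
    translated = translate_tokens(urdu_tokens, lexicon)
    english = {t.lower() for t in english_tokens}
    if not translated or not english:
        return 0.0
    common = translated & english
    return len(common) / min(len(translated), len(english))


# Toy lexicon with romanized placeholder entries (illustrative only).
toy_lexicon = {"hukumat": "government", "wazir": "minister"}
urdu_article = ["hukumat", "wazir", "naya"]          # "naya" is not in the lexicon
english_article = ["Government", "announces", "budget"]

score = common_ratio(urdu_article, english_article, toy_lexicon)
print(score)
```

In a sketch like this, two articles would be linked when the score exceeds a chosen threshold. Lexicon coverage directly limits the score, since untranslated tokens cannot contribute to the overlap.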

2021 ◽  
pp. 000276422110216
Author(s):  
Jasmine Lorenzini ◽  
Hanspeter Kriesi ◽  
Peter Makarov ◽  
Bruno Wüest

Protest event analysis (PEA) is a key method for studying social movements, allowing researchers to systematically analyze protest events over time and space. However, the manual coding of protest events is time-consuming and resource intensive. Recent advances in automated approaches offer opportunities to code multiple sources and create large data sets that span many countries and years. Too often, however, the procedures used are not discussed in detail, and researchers therefore have a limited capacity to assess the validity and reliability of the data. In addition, many researchers have highlighted biases associated with the study of protest events that are reported in the news. In this study, we ask how social scientists can build on electronic news databases and computational tools to create reliable PEA data that cover a large number of countries over a long period of time. We provide a detailed description of our semiautomated approach and offer an extensive discussion of potential biases associated with the study of protest events identified in international news sources.


2021 ◽  
Vol 13 (20) ◽  
pp. 11328
Author(s):  
Alfonso Vara-Miguel ◽  
Cristina Sánchez-Blanco ◽  
Charo Sádaba Chalezquer ◽  
Samuel Negredo

Digital news publishers strive to balance revenue streams in their business models: as standard advertising declines, alternatives for sustaining digital journalism arise in the forms of sponsored content, user donations and payments—one-off purchases, subscriptions or memberships, public or private grants, electronic commerce, events and consulting. An exhaustive study found 2874 active online news publications in Spain, and it observed the adoption of such models in early 2021. Advertising remains the most popular source of income for digital news operations (85.8%) and most sites rely on just one or two revenue streams (74.5%). We compare the cases in our census by their origin (digital-native or non-native), geography (local/regional or national/global) and topic scope (generalist or specialized). We find that traditional, national and specialized online media have a broader and more innovative revenue mix than digital-native, regional or local and general-interest news outlets. The comprehensiveness of this pioneering study sheds light for the first time on the risk that the lack of diversification and innovation in funding sources may imperil the financial sustainability of some online news operations in Spain, mostly those with a smaller scope and no backing from a traditional business, according to the results we present here.


2021 ◽  
Author(s):  
Jennifer Lee

The birth of the World Wide Web has made it more convenient and cheaper to produce and transfer information to the receiver. Many online news sites provide information for free, and the Internet and social media have made it possible to self-publish and engage with the media. New media tools have made it easier to produce a variety of online blogs, magazines, digital papers and content aggregators. In the wake of the information era, journalism has developed into niche news sites, producing different types of news writing. By analyzing news accounts of the same event, this Major Research Paper compares how news language, content and structure differ between traditional and alternative online news sites. The study reveals that alternative news sources tend to report their news in a more subjective manner, deviating from the goal of being objective, a fundamental element in traditional journalism. Analysis of how information is structured in the news articles also reveals that alternative news sites deviate from traditional forms of the inverted pyramid style (Kovach and Rosenstiel, 2007, p. 82), reporting in a narrative, chronological fashion.


2021 ◽  
Author(s):  
Subhayan Mukerjee

How do people in the world's largest democracy consume online news? This article reports findings from the analysis of a novel empirical dataset tracking the web-browsing behavior of more than 50,000 Indian internet users over 45 months. In doing so, it seeks to understand the digital news consumption landscape of a crucial, but understudied context and appraise the prominence and longitudinal trends of the audience share of different types of news sources in the online Indian space. It finds that while digital-born media have not contested the hegemony of legacy media, regional vernacular media have suffered significant declines in their audience shares. The article proposes the concept of audience mobility, using it to identify qualitatively distinct dynamics in how vernacular audiences in India have migrated to national vis-à-vis international outlets. The findings are discussed in light of contemporary changes in Indian society that is characterized by increasing digitization and literacy.


CCIT Journal ◽  
2012 ◽  
Vol 5 (2) ◽  
pp. 168-185
Author(s):  
Agustoni Agustoni ◽  
Fitri Maya Sari

With the rapid development of the Internet, more and more sites and blogs are emerging that provide a wide range of online news articles. Before an article can be published, the reporter first sends it to an editor to be sorted. Sorting news by type is relatively easy for humans, but automating this separation with computers brings its own problems, even for shorter stories. Text mining is one approach that is expected to solve these problems. With text mining, words that represent the content of a news article can be extracted, and its category is then determined based on the frequency of the words it contains. The stages of the study are: (i) development of a keyword vector database, and (ii) sorting of news sources based on the database from step (i). This paper is expected to help electronic editorial systems sort or identify the category of a news article without the need for an editor, saving time and cost for Internet-based electronic news services.
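The two stages described above can be sketched as follows. This is a minimal illustration, not the authors' system: the keyword database, the category names and the scoring rule (summing the frequencies of matching keywords) are assumptions made for the example.

```python
import re
from collections import Counter

# Stage (i): a toy keyword database, one keyword set per category (hypothetical entries).
KEYWORD_DB = {
    "sports": {"match", "goal", "team"},
    "economy": {"market", "bank", "inflation"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def categorize(text, keyword_db=KEYWORD_DB):
    """Stage (ii): score each category by keyword frequency and return the best one.

    Returns None when no keyword from any category occurs in the text.
    """
    counts = Counter(tokenize(text))
    scores = {cat: sum(counts[w] for w in words) for cat, words in keyword_db.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(categorize("The team scored a late goal to win the match"))
print(categorize("The central bank warned that inflation may unsettle the market"))
```

A real keyword database would be far larger and would typically weight terms rather than count them equally. The sketch only shows how word frequency can drive the category decision.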


2021 ◽  
pp. 089443932110325
Author(s):  
Jeong-woo Jang

News shared on social media presents multiple layers of sources, from reputable news organizations to individual users who share news on social media. A web-based experiment investigated (a) whether the influence of a primary news source (news organization) on viewers decreases as it becomes less proximate with the presence of a more immediate source (individual user who shared news), and (b) if so, how the evaluations of both sources, along with a varying degree of issue relevance, affect viewers’ agreement with news position. Participants read one news article either shared on Facebook by a well-known celebrity or directly posted onto a news website. The perceived credibility of news organizations predicted viewers’ agreement with the news position, but only when the news was presented on a news web page so that the news organization was shown as the proximate source. When multiple sources were displayed, the influence of news organization credibility disappeared when the given news lacked personal relevance.


Author(s):  
Meghan Lynch ◽  
Irena Knezevic ◽  
Kennedy Laborde Ryan

To date, most qualitative knowledge about individual eating patterns and the food environment has been derived from traditional data collection methods, such as interviews, focus groups, and observations. However, there currently exists a large source of nutrition-related data in social media discussions that have the potential to provide opportunities to improve dietetic research and practice. Qualitative social media discussion analysis offers a new tool for dietetic researchers and practitioners to gather insights into how the public discusses various nutrition-related topics. We first consider how social media discussion data come with significant advantages including low-cost access to timely ways to gather insights from the public, while also cautioning that social media data have limitations (e.g., difficulty verifying demographic information). We then outline 3 types of social media discussion platforms in particular: (i) online news article comment sections, (ii) food and nutrition blogs, and (iii) discussion forums. We discuss how each different type of social media offers unique insights and provide a specific example from our own research using each platform. We contend that social media discussions can contribute positively to dietetic research and practice.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Hossein Ahmadvand ◽  
Fouzhan Foroutan ◽  
Mahmood Fathy

Data variety is one of the most important features of Big Data. Data variety is the result of aggregating data from multiple sources and of the uneven distribution of data. This feature of Big Data causes high variation in the consumption of processing resources such as CPU consumption. This issue has been overlooked in previous works. To overcome this problem, in the present work we use Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation. To this end, we consider two types of deadlines as our constraint. Before applying the DVFS technique to computer nodes, we estimate the processing time and the frequency needed to meet the deadline. In the evaluation phase, we used a set of data sets and applications. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets. Based on the experimental results in this paper, DV-DVFS can achieve up to 15% improvement in energy consumption.
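The step of estimating the frequency needed to meet a deadline can be illustrated with a simplified sketch. It is not the DV-DVFS algorithm from the paper. It assumes that a task needs a fixed number of CPU cycles, so processing time is cycles divided by frequency, and that the node offers a discrete set of frequencies. The function name and all numbers are hypothetical.

```python
def min_frequency_for_deadline(cycles, deadline_s, available_hz):
    """Return the lowest available frequency (Hz) that finishes `cycles` within the deadline.

    Assumes processing time = cycles / frequency. Returns None if even the
    highest frequency misses the deadline.
    """
    for freq in sorted(available_hz):
        if cycles / freq <= deadline_s:
            return freq
    return None


# Hypothetical node with three frequency levels and a task of 3e9 cycles.
levels = [1.0e9, 1.6e9, 2.4e9]
chosen = min_frequency_for_deadline(cycles=3.0e9, deadline_s=2.0, available_hz=levels)
print(chosen)
```

Choosing the lowest frequency that still meets the deadline is the usual rationale for DVFS, since dynamic power generally drops as voltage and frequency are lowered. Real workloads with memory or I/O stalls would not scale this cleanly with frequency.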

