Big data experiments with the archived Web: Methodological reflections on studying the development of a nation's Web

Author(s):  
Niels Brügger ◽  
Janne Nielsen ◽  
Ditte Laursen

This article outlines how the 'digital geography' of a nation can be studied, that is, the online presence of a nation. The entire Danish Web domain and its development from 2006 to 2015 are used as a case, based on the holdings in the Danish national Web archive. The following research questions guide the investigation: What has the Danish Web domain looked like in the past, and how has it developed in the period 2006-2015? Methodologically, we investigate to what extent one can delimit 'a nation' on the Web, and what characterizes the archived Web as a historical source for academic studies, as well as the general characteristics of our specific data source. Analytically, the article introduces a design for how this type of big data analysis of an entire national Web domain can be performed. Our findings show some of the ways in which a nation's digital landscape can be mapped, i.e., by size, content types and hyperlinks. On a broader canvas, this study demonstrates that with hardware and software as well as human competencies from different disciplines it is possible to perform large-scale historical studies of one of the biggest media sources of today, the World Wide Web.

Epidemiologia ◽  
2021 ◽  
Vol 2 (3) ◽  
pp. 315-324
Author(s):  
Juan M. Banda ◽  
Ramya Tekumalla ◽  
Guanyu Wang ◽  
Jingyuan Yu ◽  
Tuo Liu ◽  
...  

As the COVID-19 pandemic continues to spread worldwide, an unprecedented amount of open data is being generated for medical, genetics, and epidemiological research. The unparalleled rate at which many research groups around the world are releasing data and publications on the ongoing pandemic is allowing other scientists to learn from local experiences and data generated on the front lines of the COVID-19 pandemic. However, there is a need to integrate additional data sources that map and measure the role of social dynamics of such a unique worldwide event in biomedical, biological, and epidemiological analyses. For this purpose, we present a large-scale curated dataset of over 1.12 billion tweets, growing daily, related to COVID-19 chatter generated from 1 January 2020 to 27 June 2021 at the time of writing. This dataset provides a freely available additional data source for researchers worldwide to conduct a wide and diverse range of research projects, such as epidemiological analyses, studies of emotional and mental responses to social distancing measures, the identification of sources of misinformation, and stratified measurement of sentiment towards the pandemic in near real time, among many others.


2021 ◽  
Vol 49 (3) ◽  
pp. 3-11
Author(s):  
A. V. Sokov

In 2021, the Shirshov Institute of Oceanology celebrated its 75th anniversary. The Shirshov Institute is the largest and oldest research center for the study of seas and oceans in Russia. The Institute's past and present include many significant discoveries and developments for world oceanology, highly complex expeditions, and large-scale international projects. I am sure that our future as a Center for the Study of the World Ocean will be no less rich and bright.


APRIA Journal ◽  
2021 ◽  
Vol 3 (2) ◽  
pp. 35-50
Author(s):  
Marijke Goeting

During the past decade, computers have broken through the barrier of human time. Today, computers can process data in milli-, micro- and even nanoseconds and can (inter)act autonomously in time frames that exceed our capacity to perceive and respond. This produces a fundamental problem – a gap between human time and the time of computers – and raises important questions: how do big data and fast computation affect our experience and understanding of time? If a computer is able to deal with the world faster than we can, are we doomed to live forever in the past, however near the present? Or are we dealing with a technological extension of the present, and how might we be able to understand and experience this? By analysing theory and works of art, this text examines how to deal with the shock produced by microtemporal technologies.


Author(s):  
Joaquin Vanschoren ◽  
Ugo Vespier ◽  
Shengfa Miao ◽  
Marvin Meeng ◽  
Ricardo Cachucho ◽  
...  

Sensors are increasingly being used to monitor the world around us. They measure movements of structures such as bridges, windmills, and plane wings, human vital signs, atmospheric conditions, and fluctuations in power and water networks. In many cases, this results in large networks with different types of sensors, generating impressive amounts of data. As the volume and complexity of data increase, their effective use becomes more challenging, and novel solutions are needed on both a technical and a scientific level. Founded on several real-world applications, this chapter discusses the challenges involved in large-scale sensor data analysis and describes practical solutions to address them. Due to the sheer size of the data and the large amount of computation involved, these are clearly “Big Data” applications.


Author(s):  
Cheng Meng ◽  
Ye Wang ◽  
Xinlian Zhang ◽  
Abhyuday Mandal ◽  
Wenxuan Zhong ◽  
...  

With advances in technology over the past decade, the amount of data generated and recorded has grown enormously in virtually all fields of industry and science. This extraordinary amount of data provides unprecedented opportunities for data-driven decision-making and knowledge discovery. However, the task of analyzing such large-scale datasets poses significant challenges and calls for innovative statistical methods specifically designed for greater speed and higher efficiency. In this chapter, we review currently available methods for big data, with a focus on subsampling methods that use statistical leveraging and on divide-and-conquer methods.


Author(s):  
Khadija Ateya Almohsen ◽  
Huda Kadhim Al-Jobori

The increasing usage of e-commerce websites has led to the emergence of Recommender Systems (RSs), which aim to personalize web content for each user. One of the successful RS techniques is Collaborative Filtering (CF), which makes recommendations for users based on what other like-minded users have preferred. However, as the world enters the Big Data era, CF has faced challenges such as scalability, sparsity and cold start. Thus, new approaches that overcome the existing problems have been studied, such as Singular Value Decomposition (SVD). This chapter surveys the literature on RSs, reviews the current state of RSs and the main concerns surrounding them due to Big Data, thoroughly investigates SVD, and provides an implementation of it using Apache Hadoop and Spark. This is intended to validate the applicability of existing contributions to the field of SVD-based RSs, as well as the effectiveness of Hadoop and Spark in developing large-scale systems. The results proved the scalability of SVD-based RSs and their applicability to Big Data.


1973 ◽  
Vol 30 (12) ◽  
pp. 2172-2177
Author(s):  
P. C. George

Small-scale fisheries have traditionally been the backbone of the fishing industry all over the world. Although large-scale mechanized fishing has come into the limelight recently, even such countries as have developed substantial capability in this direction still have a large fleet of small boats in coastal areas. The landings of this sector of the industry are always substantial, and in many countries they still dominate the picture. In India, small-scale fisheries landed almost 1.15 million tons in 1971. This figure has been increasing as motor-powered small craft have increased in numbers, although 70% of marine fish is still caught from nonpowered boats. Measures taken to increase fishing capacity, landings, and net fishermen’s income over the past 10 years include various kinds of loans and subsidies for the purchase of boats, motors, and nets; assistance for the construction of ponds in inland areas; organization of cooperatives; training programs for fishermen and supporting personnel, especially motor repairmen (with the cooperation of Norway); and gear and vessel research including pilot-scale demonstrations with new types of vessels and equipment.


2017 ◽  
Vol 13 (1) ◽  
Author(s):  
Rod Oram

Humankind has been searching for millennia for ways to govern itself at large scale and over great distances. Overwhelmingly, the dominant solution had been the creation of empires, defined as multi-ethnic or multinational states with political and/or military dominion over populations who are culturally and ethnically distinct from the ruling imperial ethnic group and its culture. In the modern Westphalian era of the past several centuries, a hybrid system of governance around the world emerged, comprising the nation state (in Europe and the Americas) and international empires (across Africa, Asia and Oceania).


2018 ◽  
Vol 33 (1) ◽  
pp. 73-88 ◽  
Author(s):  
Ester Pollack ◽  
Sigurd Allern

Transparency International’s yearly Corruption Perceptions Index ranks Scandinavia as one of the least corrupt regions in the world. However, during the past decades, large Scandinavian corporations in the telecommunications, oil and defence industries have – in their struggle for business contracts in other countries – been involved in several large-scale bribery scandals. There has also been a growing range of corruption cases in the Swedish and Norwegian public sectors. In many of these cases, investigative journalists have played a crucial role in the disclosure of corruption, sometimes cooperating across media organisations and countries, demonstrating the importance of journalism as a public good for democracy. In this article, we explore, discuss and analyse the work of and methods used by investigative journalists in revealing large-scale corruption related to the expansion of Nordic telecom companies in Uzbekistan.


2016 ◽  
Vol 5 (1) ◽  
pp. 115
Author(s):  
Maharani Widya Putri ◽  
Erwin Oktoma ◽  
Roni Nursyamsu

This descriptive qualitative study analyzed figurative language in English stand-up comedy. Its purposes were to identify the types of figurative language and to describe the functions of figurative language found in a selected video of a stand-up comedy show. The data source was one selected video of a Russell Peters stand-up comedy show, in which Russell Peters’s speech containing figurative language was observed. The data were collected through a content analysis technique by collecting the verbal language used by Russell Peters. The first research question was analyzed using McArthur’s (1992) theory, supported by Crystal’s (1994) theory, to find out the types of figurative language found in English stand-up comedy. The second research question, about the functions of figurative language found in English stand-up comedy, was analyzed using Chunqi’s (2014) theory, supported by Kokemuller’s (2001) and Turner’s (2016) theories. The analysis found that irony was the most dominant figurative language used by Russell Peters in “Russell Peters Comedy Now! Uncensored”, at 29.94%. This was because the topics used by Russell Peters in that show concerned ethnicity (Canadians, white people, black people, brown people and Asians), social issues (child beating) and culture (the accents, lifestyles and habits of various ethnic groups around the world). Irony and hyperbole were used predominantly in the performance to entertain the audience. The functions of the eleven types of figurative language used by Russell Peters were then summarized: to amuse people in comedic situations, to expand meaning, to explain abstract emotions, and to make sentences interesting and add creative touches. Keywords: Figurative Language, Stand-Up Comedy, English Stand-Up Comedy

