Going Back in Time to Find What Existed on the Web and How much has been Preserved: How much of Palestinian Web has been Archived?

2021 ◽  
Author(s):  
Thaer Sammar ◽  
Hadi Khalilia

The web is an important resource for publishing and sharing content. Its main characteristic is volatility: content is added, updated, and deleted all the time. Therefore, many national and international institutes have started crawling and archiving web content. National institutes focus mainly on archiving the web related to their country's heritage; for example, the National Library of the Netherlands focuses on archiving websites that are of value to Dutch heritage. However, some countries have not yet taken action to archive their web, which will result in lost content and a gap in knowledge. In this research, we shed light on the Palestinian web, and specifically on how much of it has been archived. First, we create a list of Palestinian hosts that were on the web. To do so, we queried the Google index, exploiting the time range filter in order to obtain hosts over time. We collected 98 hosts on average per 5-year interval from 1990 to 2019. We also obtained 188 Palestinian hosts from the DMOZ directory. Second, we investigate the coverage of the collected hosts in the Internet Archive and the Common Crawl. We found that the coverage of Google hosts in the Internet Archive ranges from 0% to 89% from the oldest to the newest time interval. The coverage of DMOZ hosts was 96%. The coverage of Google hosts in the Common Crawl ranged from 57.1% to 74.3%, while the coverage of DMOZ hosts in the Common Crawl was on average 25% across all crawls. We found that even when a host is covered in the Internet Archive and the Common Crawl, its lifespan and number of archived versions are low.
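As a rough illustration of the coverage measure described above, the following sketch computes the percentage of distinct hosts that have at least one archived capture. It is not the authors' implementation: the function name `coverage`, the `is_archived` predicate, and the sample host names are all hypothetical, and the lookup against an actual archive is deliberately left abstract.

```python
from typing import Callable, Iterable


def coverage(hosts: Iterable[str], is_archived: Callable[[str], bool]) -> float:
    """Return the percentage of distinct hosts for which is_archived is true.

    `is_archived` is a hypothetical predicate; in practice it would wrap a
    lookup against an archive's index, which is not modeled here.
    """
    # Normalize so the same host is not counted twice because of case or spacing.
    unique = {h.strip().lower() for h in hosts if h.strip()}
    if not unique:
        return 0.0
    covered = sum(1 for h in unique if is_archived(h))
    return 100.0 * covered / len(unique)


# Hypothetical sample data, for illustration only.
sample_hosts = ["example.ps", "news.example.ps", "Example.ps", "shop.example.ps"]
archived = {"example.ps", "news.example.ps"}

sample_coverage = coverage(sample_hosts, lambda h: h in archived)
print(f"{sample_coverage:.1f}%")
```

Deduplicating before counting matters here because a host list gathered from several search queries may repeat the same host, which would otherwise skew the percentage.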

Leonardo ◽  
1999 ◽  
Vol 32 (5) ◽  
pp. 353-358 ◽  
Author(s):  
Noah Wardrip-Fruin

We look to media as memory, and a place to memorialize, when we have lost. Hypermedia pioneers such as Ted Nelson and Vannevar Bush envisioned the ultimate media within the ultimate archive—with each element in continual flux, and with constant new addition. Dynamism without loss. Instead we have the Web, where “Not Found” is a daily message. Projects such as the Internet Archive and Afterlife dream of fixing this uncomfortable impermanence. Marketeers promise that agents (indentured information servants that may be the humans of About.com or the software of “Ask Jeeves”) will make the Web comfortable through filtering—hiding the impermanence and overwhelming profluence that the Web's dynamism produces. The Impermanence Agent—a programmatic, esthetic, and critical project created by the author, Brion Moss, a.c. chapman, and Duane Whitehurst— operates differently. It begins as a storytelling agent, telling stories of impermanence, stories of preservation, memorial stories. It monitors each user's Web browsing, and starts customizing its storytelling by weaving in images and texts that the user has pulled from the Web. In time, the original stories are lost. New stories, collaboratively created, have taken their place.


Author(s):  
Huibert Crijns ◽  
Anna Rademakers

The Memory of the Netherlands programme was created by the Koninklijke Bibliotheek (the National Library of the Netherlands) in cooperation with the Dutch Ministry of Education, Culture and Science to present a major national database of images of cultural heritage objects on the Internet. The article describes the background to the project, the collections that it contains, and the partnerships with other institutions that have been forged. Particular issues highlighted by the programme have been open access and copyright, contextualizing the content of the database, and the challenges of evolving Information Technology and the standardization of metadata. Memory of the Netherlands has been successful in creating a freely accessible database of more than 400,000 digital objects from 70 different cultural heritage institutions. However, the Dutch government, which financed the programme for its first ten years, has decided to end its funding and the Koninklijke Bibliotheek now has to consider how best to ensure the continued existence and accessibility of the project.


2020 ◽  
Author(s):  
Brewster Kahle

In our current era of disinformation, ready access to trustworthy sources is critical. “Fake news,” sophisticated disinformation campaigns, and propaganda distort the common reality, polarize communities, and threaten open democratic systems. What citizens, journalists, and policymakers need is a canonical source of trusted information. For millions, that trusted source resides in the books and journals housed in libraries, curated and vetted by librarians. Yet today, as we turn inevitably to our screens for information, if a book isn’t digital, it is as if it doesn’t exist. To address this gap, the Internet Archive is actively working with the world’s great libraries to digitize their collections and to make them available to users via controlled digital lending, a process whereby libraries can loan digital copies of the print books on their shelves. By bringing millions of missing books and academic literature online, libraries can empower journalists, researchers, and Wikipedia editors to cite the best sources directly in their work, grounding readers in the vetted, published record, and extending the investment that libraries have made in their print collections.


2021 ◽  
Author(s):  
Thomas Michael Malone

The purpose of this thesis is to challenge the common notion that the internet has had a detrimental impact on the music industry, and on musicians’ ability to generate a viable income while still producing good music. Note that the following arguments do not automatically extend to the effect that the web has had on books, patents, films, journalism or any other medium. This is because these different disciplines have individual characteristics that make them respond differently to the same socio-economic pressures. However, by the same token, it does not necessarily follow that the conclusions reached here are inapplicable to other activities: perhaps what is true for music in the following pages is also true for e.g. photography. Furthermore, I am not advocating for a free-for-all internet where behemoths like Google, Amazon and Facebook get to do whatever they wish. Although I am intrigued by such matters, the constraints of both time and space allow me to focus only on the subject that I am most familiar with and passionate about. Finally, because I am painting a broad picture that encompasses many intellectual disciplines, many of which I am not an expert in, this work is to be judged more on the consistency of the overall argument than on the minutiae of its individual parts.


Author(s):  
Chandrasekar Ravi ◽  
Praveensankar Manimaran

Since the advent of the web, the number of users who rely on the internet for everyday purposes has increased tremendously. Most commonly, they want to access their data whenever and wherever they wish, and many companies have started providing such services to ordinary users. These companies store huge volumes of data in data centers, so protecting the integrity of that data is their main responsibility. Blockchain is one of the trending solutions that offers users storage immutability. This chapter starts with the workings of blockchain and smart contracts and their advantages and disadvantages, and then goes on to explain how blockchain can be integrated into the Internet of Things (IoT). The chapter ends with a fully distributed, blockchain-based architecture for access management, based on a proof of concept.


Author(s):  
Keith Sherringham ◽  
Bhuvan Unhelkar

The Internet wave that swept through business is likely to be seen as a ripple in a pond compared to the changes that are predicted from the adoption of mobility into business. Irrespective of industry sector, the mobile enablement (wrapping business around mobility) of business is expected to bring many opportunities and rewards; and like the Web enablement (wrapping business around the Internet) of business, a few challenges as well. Across all business areas, mobile business will need to support a mobile workforce, the operation of call (service) centres, and transaction processing and collaboration of virtual teams. Mobile business will also impact product offerings, the management of consumer choice and the focusing of communications with a sticky message. Mobile business will drive changes in management, revisions of business operations and the alignment of Information Communication Technology (ICT). This chapter discusses some of the common but important strategic elements to the successful mobile enablement of business.


Author(s):  
Barbara Sierman ◽  
Kees Teszelszky

In 2007, the Koninklijke Bibliotheek, the Dutch National Library (KB-NL), started the project ‘webarchiving’ based on a selection of Dutch websites. The initial selection of 1000 websites has currently grown into over 12,000 selected websites, crawled at different intervals. Although due to legal restrictions the current use is limited to the KB-NL reading room, it is important that the KB-NL includes the requirements of the (future) users in its approach to creating a web collection. With respect to the long-term preservation of the collection, we also need to incorporate the requirements for long-term archiving in our approach, as described in the Open Archival Information System (OAIS) Model ISO 14721: 2012. This article describes the results of a research project on webarchiving and the web collection of archived sites in the KB-NL, investigating the following questions. What is webarchiving in the Netherlands? What are the selection criteria of KB-NL and how are these related to what can be found on the Dutch web by the contemporary user? What is the influence of the choice of tools we use to harvest the final archived website? Do we know enough of the value of the web collection and the potential usage of it by researchers and how can we improve this value? This article will describe the outcomes of the research, the conclusions and advice that can be drawn from it and it is hoped will inspire broader discussions about the essence of creating web collections for long-term preservation as part of cultural heritage.


Author(s):  
Sandra Folie

In the perception of literary scholars, the investigation of genre histories is still closely linked to ‘offline’ archival work. However, the Internet has been publicly accessible since 1991, and over the last thirty years, numerous new literary genres have emerged. They have often been proclaimed, defined, spread, marketed, criticized, and even pronounced dead online. By now, a great deal of this digital material is said to have disappeared. What many scholars do not consider, however, is that parts of the web are archived, for example by the Internet Archive and Wikipedia, which make their archives publicly available via the Wayback Machine and the history page respectively. This makes it possible to track early online definitions of contemporary genres and their development. In this paper, I will use the chick lit genre, which emerged in the second half of the 1990s, as a case study to show the benefits of including web archives in the reconstruction of contemporary genre histories. An analysis of both the first extensive and long-running fan websites, which are now offline but well-documented in the Internet Archive, and the history page of the Wikipedia article on chick lit will challenge some of the narratives that have long dominated chick lit research.


2003 ◽  
Vol 64 (4) ◽  
pp. 300-317 ◽  
Author(s):  
Mary F. Casserly ◽  
James E. Bird

Five hundred citations to Internet resources from articles published in library and information science journals in 1999 and 2000 were profiled and searched on the Web. The majority contained partial bibliographic information and no date viewed. Most URLs pointed to content pages with “edu” or “org” domains and did not include a tilde. More than half (56.4%) were permanent, 81.4 percent were available on the Web, and searching the Internet Archive increased the availability rate to 89.2 percent. Content, domain, and directory depth were associated with availability. Few of the journals provided instruction on citing digital resources. Eight suggestions for improving scholarly communication citation conventions are presented.



