English/Arabic Bilingual Dictionary Construction using Parallel Texts from the Internet Archive

Author(s):  
M.A. Fattah ◽  
Fuji Ren ◽  
K. Shingo ◽  
A. Atlam


Author(s):  
Emily Kalah Gade ◽  
Sarah Dreier ◽  
John Wilkerson ◽  
Anne Washington

Abstract: The Internet Archive curated a 90-terabyte sub-collection of captures from the US government's public website domain (‘.gov’). Such archives provide largely untapped resources for measuring attributes, behaviors and outcomes relevant to political science research. This study leverages this archive to measure a novel dimension of federal legislators' religiosity: their proportional use of religious rhetoric on official congressional websites (2006–2012). This scalable, time-variant measure improves upon more costly, time-invariant conventional approaches to measuring legislator attributes. The authors demonstrate the validity of this method for measuring legislators' public-facing religiosity and discuss the contributions and limitations of using archived Internet data for scientific analysis. This research makes three applied methodological contributions: (1) it develops a new measure for legislator religiosity, (2) it models an improved, more comprehensive approach to analyzing congressional communications and (3) it demonstrates the unprecedented potential that archived Internet data offer to researchers seeking to develop meaningful, cost-effective approaches to analyzing political phenomena.


Leonardo ◽  
1999 ◽  
Vol 32 (5) ◽  
pp. 353-358 ◽  
Author(s):  
Noah Wardrip-Fruin

We look to media as memory, and a place to memorialize, when we have lost. Hypermedia pioneers such as Ted Nelson and Vannevar Bush envisioned the ultimate media within the ultimate archive, with each element in continual flux, and with constant new addition. Dynamism without loss. Instead we have the Web, where “Not Found” is a daily message. Projects such as the Internet Archive and Afterlife dream of fixing this uncomfortable impermanence. Marketeers promise that agents (indentured information servants that may be the humans of About.com or the software of “Ask Jeeves”) will make the Web comfortable through filtering, hiding the impermanence and overwhelming profluence that the Web's dynamism produces. The Impermanence Agent (a programmatic, esthetic, and critical project created by the author, Brion Moss, a.c. chapman, and Duane Whitehurst) operates differently. It begins as a storytelling agent, telling stories of impermanence, stories of preservation, memorial stories. It monitors each user's Web browsing, and starts customizing its storytelling by weaving in images and texts that the user has pulled from the Web. In time, the original stories are lost. New stories, collaboratively created, have taken their place.


2005 ◽  
Vol 34 (1) ◽  
Author(s):  
Arjan van Dijk

In December 2004, Google Inc. announced its plans to digitize millions of books from prestigious libraries such as Harvard, Stanford, and the New York Public Library. Most of the books are in the public domain and will be available for free on the Internet. The Google initiative is one among many, including the American Memory Project of the Library of Congress, the San Francisco-based Internet Archive, and Gallica, the digital library of the Bibliothèque nationale de France. All of these programs offer free access to good-quality digital materials. Another common feature is that they are heavily funded.


2020 ◽  
Vol 81 (7) ◽  
p. 359 ◽  
Author(s):  
Shawnda Hines

Federal funding for libraries; Publishers sue the Internet Archive (IA)

