Mining open source text documents for intelligence gathering

Author(s):  
Hsin-Chang Yang ◽  
Chung-Hong Lee
2016 ◽  
Author(s):  
John Andersson ◽  
Sebastian Berlin ◽  
André Costa ◽  
Harald Berthelsen ◽  
Hanna Lindgren ◽  
...  

2013 ◽  
Vol 9 (2) ◽  
pp. e1002854 ◽  
Author(s):  
Hamish Cunningham ◽  
Valentin Tablan ◽  
Angus Roberts ◽  
Kalina Bontcheva

2009 ◽  
Vol 28 (3) ◽  
pp. 143 ◽  
Author(s):  
Przemyslaw Skibiński ◽  
Jakub Swacha

In this paper we investigate the possibility of improving the efficiency of data compression, and thus reducing storage requirements, for seven widely used text document formats. We propose an open-source text compression software library, featuring an advanced word-substitution scheme with static and semidynamic word dictionaries. The empirical results show an average storage space reduction as high as 78 percent compared to uncompressed documents, and as high as 30 percent compared to documents compressed with the free compression software gzip.


2016 ◽  
Vol 49 (2) ◽  
pp. 538-547 ◽  
Author(s):  
Morteza Dehghani ◽  
Kate M. Johnson ◽  
Justin Garten ◽  
Reihane Boghrati ◽  
Joe Hoover ◽  
...  

2019 ◽  
pp. 208-213
Author(s):  
Kathy Peiss

The collecting missions made an imprint on the postwar world of books and information. The OSS and military efforts to acquire open-source intelligence propelled advances in library and information science already underway. A number of those involved in wartime acquisitions became pioneers in this field. The program of acquisition offered a prototype for open-source intelligence gathering after the war. These missions also contributed to a growing orientation among American libraries toward internationalism, in which collecting foreign holdings was deemed essential to American power. For the most part, however, the collections themselves attracted little notice. With the Holocaust awareness of the late twentieth century, the acquisition of looted Jewish books was investigated by the Justice Department and President Bill Clinton’s Commission on Holocaust Assets. Looted and displaced books remain part of the unfinished business of World War II.


2015 ◽  
Author(s):  
Morteza Dehghani ◽  
Kate M. Johnson ◽  
Justin Garten ◽  
Vijayan Balasubramanian ◽  
Anurag Singh ◽  
...  

Author(s):  
Liam R. E. Quin

This paper describes modifications made to an open source text retrieval package to make it XML-aware, and contrasts this lexical approach, in which XML documents are treated primarily as sequences of characters rather than trees, with the W3C XQuery 1.0 and XPath 2.0 Full-Text facility. Specific usage scenarios are considered, including World Wide Web publication and the searching and analysis of text corpora for research purposes.

