A framework for technology-assisted sensitivity review

2019 ◽  
Vol 53 (1) ◽  
pp. 42-43
Author(s):  
Graham McDonald

More than a hundred countries implement freedom of information laws. In the UK, the Freedom of Information Act 2000 [1] (FOIA) states that the government's documents must be made freely available, or opened, to the public. Moreover, all central UK government departments' documents that have a historic value must be transferred to The National Archives (TNA) within twenty years of the document's creation. However, government documents can contain sensitive information, such as personal information or information that would likely damage international relations if it was opened. Therefore, all government documents that are to be publicly archived must be sensitivity reviewed to identify and redact the sensitive information. However, the lack of structure in digital document collections and the volume of digital documents that are to be sensitivity reviewed mean that the traditional manual sensitivity review process is not practical for digital sensitivity review. In this thesis, we argue that sensitivity classification can be deployed to assist government departments and human reviewers to sensitivity review born-digital government documents. However, classifying sensitive information is a complex task, since sensitivity is context-dependent and can require a human to judge the likely effect of releasing the information into the public domain. Moreover, sensitivity is not necessarily topic-oriented, i.e., it is usually dependent on a combination of what is being said and about whom. Through a thorough empirical evaluation, we show that a text classification approach is effective for sensitivity classification and can be improved by identifying the vocabulary, syntactic and semantic document features that are reliable indicators of sensitive or nonsensitive text [2].
Furthermore, we propose to reduce the number of documents that have to be reviewed to learn an effective sensitivity classifier through an active learning strategy in which a sensitivity reviewer redacts any sensitive text in a document as they review it, to construct a representation of the sensitivities in a collection [3]. With this in mind, we propose a novel framework for technology-assisted sensitivity review that can prioritise the most appropriate documents to be reviewed at specific stages of the sensitivity review process. Furthermore, our framework can provide the reviewers with useful information to assist them in making their reviewing decisions. We conduct two user studies to evaluate the effectiveness of our proposed framework for assisting with two distinct digital sensitivity review scenarios, or user models. Firstly, in the limited review user model, which addresses a scenario in which there are insufficient reviewing resources available to sensitivity review all of the documents in a collection, we show that our proposed framework can increase the number of documents that can be reviewed and released to the public with the available reviewing resources [4]. Secondly, in the exhaustive review user model, which addresses a scenario in which all of the documents in a collection will be manually sensitivity reviewed, we show that providing the reviewers with useful information about the documents that contain sensitive information can increase the reviewers' accuracy, reviewing speed and agreement [5]. This is the first thesis to investigate automatically classifying FOIA sensitive information to assist digital sensitivity review. The central contributions are our proposed framework for technology-assisted sensitivity review and our sensitivity classification approaches. 
Our contributions are validated using a collection of government documents that are sensitivity reviewed by expert sensitivity reviewers to identify two FOIA sensitivities, namely international relations and personal information. Our results demonstrate that our proposed framework is a viable technology for assisting digital sensitivity review. Supervisors: Prof. Iadh Ounis (University of Glasgow), Dr. Craig Macdonald (University of Glasgow). Available from: http://theses.gla.ac.uk/41076

2020 ◽  
pp. 073889422093032
Author(s):  
Matthew J Connelly ◽  
Raymond Hicks ◽  
Robert Jervis ◽  
Arthur Spirling ◽  
Clara H Suong

We introduce the Freedom of Information Archive (FOIArchive) Database, a collection of over 3 million documents about state diplomacy. Substantively, our database focusses on the USA and provides opportunities to analyze previously classified (or publicly unavailable) corpora of internal government documents which include the raw—often full—text of those documents. We also provide within-country diplomatic records for the USA, UK, and Brazil. The full span of the data is 1620–2013, but it is mainly from the twentieth century. Our database allows scholars to view text and associated statistics online and to download and view customized datasets via an application programming interface. We provide extensive metadata about the documents, including the countries and persons they mention, and their topics and classification levels. The metadata includes information we extracted with domain-specific, customized natural language processing tools. To demonstrate the potential of this data, we use it to design and validate a new index for “country importance” in the context of US foreign policy priorities.


2021 ◽  
Vol 11 (18) ◽  
pp. 8506
Author(s):  
Mercedes Rodriguez-Garcia ◽  
Antonio Balderas ◽  
Juan Manuel Dodero

Virtual learning environments contain valuable data about students that can be correlated and analyzed to optimize learning. Modern learning environments based on data mashups that collect and integrate data from multiple sources are relevant for learning analytics systems because they provide insights into students’ learning. However, data sets involved in mashups may contain personal information of a sensitive nature that raises legitimate privacy concerns. Typical privacy preservation methods are based on preemptive approaches that limit the published data in a mashup based on access control and authentication schemes. Such limitations may reduce the analytical utility of the data exposed to gain students’ learning insights. In order to reconcile utility and privacy preservation of published data, this research proposes a new data mashup protocol capable of merging and k-anonymizing data sets in cloud-based learning environments without jeopardizing the analytical utility of the information. The implementation of the protocol is based on linked data so that data sets involved in the mashups are semantically described, thereby enabling their combination with relevant educational data sources. The k-anonymized data sets returned by the protocol still retain essential information for supporting general data exploration and statistical analysis tasks. The analytical and empirical evaluation shows that the proposed protocol prevents individuals’ sensitive information from being re-identified.
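
As a rough illustration of the k-anonymity property mentioned in the abstract above, the following minimal sketch checks whether a data set is k-anonymous, i.e. whether every combination of quasi-identifier values is shared by at least k records. It is not the authors' mashup protocol; the function name `is_k_anonymous`, the field names and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k records (the k-anonymity property)."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical student records; the field names are illustrative only.
students = [
    {"age_band": "18-20", "course": "CS", "grade": 7.5},
    {"age_band": "18-20", "course": "CS", "grade": 6.0},
    {"age_band": "21-23", "course": "Maths", "grade": 8.1},
    {"age_band": "21-23", "course": "Maths", "grade": 5.4},
]

# Each (age_band, course) combination is shared by exactly two records.
print(is_k_anonymous(students, ["age_band", "course"], k=2))
print(is_k_anonymous(students, ["age_band", "course"], k=3))
```

In this sketch the sensitive attribute (`grade`) is left untouched; only the quasi-identifiers are grouped. Producing a k-anonymous data set in practice typically also involves generalizing or suppressing quasi-identifier values, which is not shown here.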


2016 ◽  
Vol 44 (3) ◽  
pp. 28
Author(s):  
GODORT Preservation Working Group

The GODORT Preservation Working Group urges the Government Documents Round Table (GODORT) to promote a national conversation about the value of preserving historic Government publications in multiple formats in order to serve a diverse public and to publicize the need for Government publications librarians to help the public access those publications. GODORT should urge ALA to ask the US Congress to appropriate funds for preservation of Federal Depository Library Program government publications. This money should be used for direct support of depository libraries who want to preserve their paper and digital government publications.


2017 ◽  
Vol 11 (1) ◽  
pp. 48-54
Author(s):  
Suhail Amin Tarafdar ◽  
Michael Fay

Data is frequently handled by GPs during their day-to-day work. This includes not only clinical data where patient information is handled, but also organisational data. Clinicians must be aware of the regulations that govern information handling. This article will discuss the Data Protection Act 1998, which governs personal information held on patient records. It will clarify the eight data protection principles and how they apply in practice. Thereafter, the article will discuss the Freedom of Information Act 2000, which gives the public rights to access certain data held by surgeries. The article will highlight important exemptions and grounds for refusing access to data.


Author(s):  
Manish Gupta

Pharming is emerging as a major new Internet security threat. Pharming has overtaken “phishing” as the most dangerous Internet scam tactic, according to the latest Internet Security Intelligence Briefing (Veri-Sign, 2005). Pharming attacks exploit the design and implementation flaws in DNS services and the way Internet addresses are resolved to Internet protocol (IP) addresses. There are an estimated 7.5 million external DNS servers on the public Internet (MF-Survey, 2006). Pharming attacks manipulate components of the domain and host naming systems to redirect Internet users to a fake site, where they end up entering personal and sensitive information. Financial services’ sites are often the targets of these attacks, in which criminals try to acquire personal information in order to access bank accounts, steal identities, or commit other kinds of fraud. The use of faked Web sites makes pharming sound similar to e-mail phishing scams, but pharming is more insidious, since users are redirected to a false site without any participation or knowledge on their part. Pharming is technically harder to accomplish than phishing, but also sneakier because it can be done without any active mistake on the part of the victim (Violino, 2005). The greatest security threat lies in the fact that a successful pharming attack leaves no information on the user’s computer to indicate that anything is wrong.


1994 ◽  
Vol 21 (1) ◽  
pp. 255-273 ◽  
Author(s):  
Onker N. Basu

In accounting research, the role of organizational leaders has been underrepresented. The limited research dealing with leadership issues has focused on the impact of leadership on micro activities such as performance evaluation, budget satisfaction, and audit team performance. The impact of leadership on the structure of accounting and audit systems and organizations has been ignored. This paper focuses on the impact that past Comptrollers General have had on the working and structure of one federal audit agency, the United States General Accounting Office (GAO). In addition, it also focuses on the influence of the two most recent Comptrollers General on one important audit related activity, i.e., the audit report review process. Using qualitative field research methods, this paper documents how the organizational leadership impacts its long-term audit practices and thereby influences auditing, especially in the public sector.


Author(s):  
Samyak Sadanand Shravasti

Abstract: Phishing occurs when people's personal information is stolen via email, phone, or text communications. Smishing is a type of theft of sensitive information in which the Short Message Service (SMS) is used for cyber-attacks. People are more likely to give personal information such as account details and passwords when they receive SMS messages. This data could be used to steal money or personal information from a person or a company. As a result, Smishing is a critical issue to consider. The proposed model uses Artificial Intelligence to analyse an SMS and detect Smishing. Finally, we evaluate and analyse our proposed model to show its efficacy. Keywords: Phishing, Smishing, Artificial Intelligence, LSTM, RNN


2012 ◽  
Vol 9 (4) ◽  
pp. 378-393 ◽  
Author(s):  
Alice Marwick

People create profiles on social network sites and Twitter accounts against the background of an audience. This paper argues that closely examining content created by others and looking at one’s own content through other people’s eyes, a common part of social media use, should be framed as social surveillance. While social surveillance is distinguished from traditional surveillance along three axes (power, hierarchy, and reciprocity), its effects and behavior modification are common to traditional surveillance. Drawing on ethnographic studies of United States populations, I look at social surveillance, how it is practiced, and its impact on people who engage in it. I use Foucault’s concept of capillaries of power to demonstrate that social surveillance assumes the power differentials evident in everyday interactions rather than the hierarchical power relationships assumed in much of the surveillance literature. Social media involves a collapse of social contexts and social roles, complicating boundary work but facilitating social surveillance. Individuals strategically reveal, disclose and conceal personal information to create connections with others and tend social boundaries. These processes are normal parts of day-to-day life in communities that are highly connected through social media.


2007 ◽  
Vol 9 (2) ◽  
Author(s):  
P. L. Wessels ◽  
L. P. Steenkamp

One of the critical issues in managing information within an organization is to ensure that proper controls exist and are applied in allowing people access to information. Passwords are used extensively as the main control mechanism to identify users wanting access to systems, applications, data files, network servers or personal information. In this article, the issues involved in selecting and using passwords are discussed and the current practices employed by users in creating and storing passwords to gain access to sensitive information are assessed. The results of this survey conclude that information managers cannot rely only on users to employ proper password control in order to protect sensitive information. 

