Towards automated preprocessing of bulk data in digital forensic investigations using hash functions

2015 ◽  
Vol 57 (6) ◽  
Author(s):  
Harald Baier

Abstract: Handling bulk data (e.g. several terabytes of data) is an issue in contemporary digital forensics. Separating relevant data structures from irrelevant ones resembles finding a needle in a haystack. This article presents and assesses automatic hash-based techniques that preprocess the input data with the goal of facilitating the investigator's job. We discuss concepts such as blacklisting and whitelisting based on cryptographic hash functions and approximate matching, respectively. For two established process models, a lab investigation and an on-site investigation, we describe how to use these techniques jointly to obtain an automatic pointer to the needle.

2019 ◽  
Author(s):  
Vitor Hugo Moia ◽  
Frank Breitinger ◽  
Marco Aurélio Henriques

Finding similarity in digital forensics investigations can be assisted by Approximate Matching (AM) functions. These algorithms create small, compact representations of objects (similar to hashes) which can be compared to identify similarity. However, results are often biased by common blocks: data structures found in many different files regardless of content. In this paper, we evaluate the precision and recall of AM functions when common blocks are removed. In detail, we analyze how the similarity score changes and how this impacts different investigation scenarios. Results show that many irrelevant matches can be filtered out and that a new interpretation of the score allows better similarity detection.
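The effect of common-block removal can be illustrated with a toy approximate-matching scheme: fixed-size chunk hashing scored with Jaccard similarity. Real AM functions (e.g., ssdeep, sdhash, mrsh-v2) extract features very differently; this sketch only mimics the scoring idea:

```python
import hashlib

def chunk_hashes(data: bytes, size: int = 64) -> set:
    """Hash fixed-size chunks of the input. Real AM tools use
    content-defined features; this is a simplified stand-in."""
    return {hashlib.sha1(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)}

def similarity(a: bytes, b: bytes, common: set = frozenset()) -> float:
    """Jaccard similarity over chunk hashes, after discarding
    'common blocks' (chunks seen across many unrelated files)."""
    ha = chunk_hashes(a) - common
    hb = chunk_hashes(b) - common
    if not ha or not hb:
        return 0.0
    return len(ha & hb) / len(ha | hb)
```

Subtracting the common-block set before scoring removes a shared header's contribution, so two files that overlap only in boilerplate no longer produce a misleading match.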


2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Hasan Binjuraid ◽  
Mazura Mat Din

With the advancement of computer technologies, cybercrime has advanced too. In today's world, the technical knowledge required to attack a computer is lower than ever, thanks to advanced tools that do most of the work. Digital forensic investigations are crucial to solving this type of crime, and they must be done professionally. The computer registry plays a big part in a digital forensic investigation: it can reveal artifacts left behind by cybercrimes, the dates of the crimes on the computer system, and the user active at the time of the crime. In this research, interpretation of these artifacts is the main focus, with committees and jurors as the intended audience for the interpretations of the registries. Two types of cases are subject to investigation in this research. The first is the use of BitTorrent clients to download illegal or copyrighted content; three clients are chosen for this digital forensic investigation: uTorrent, Vuze and BitComet. The second is theft using USB storage devices, of which there are three types (Mass Storage Class, Picture Transfer Protocol and Media Transfer Protocol), each leaving different artifacts behind during insertion and removal. A web-based dashboard will be developed to help interpret the artifacts found in the registry of the computer system. A categorization process for each cybercrime case will be conducted to evaluate the severity of the case depending on the artifacts found during the digital forensics investigation. The research methodology consists of three phases. The first phase is information gathering, including literature review, requirements gathering and dataset gathering. The second phase is performing the digital forensics analysis, including planning, identification and reconnaissance. The last phase comprises result analysis and discussion.
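The severity-categorization step could, for example, weight the recovered registry artifacts and aggregate them into a coarse label for the dashboard. The artifact names, weights and thresholds below are hypothetical placeholders, not the scheme proposed in the research:

```python
# Illustrative only: artifact names and weights are hypothetical,
# not taken from the paper's actual categorization scheme.
SEVERITY_WEIGHTS = {
    "torrent_client_installed": 1,   # e.g. uTorrent/Vuze/BitComet keys present
    "torrent_download_history": 3,   # .torrent paths recovered from NTUSER.DAT
    "usb_msc_inserted": 1,           # Mass Storage Class device in USBSTOR
    "usb_last_removal": 2,           # removal timestamp near incident window
    "file_copy_trace": 3,            # shellbag/LNK evidence of copied files
}

def categorize(artifacts: list) -> str:
    """Aggregate artifact weights into a coarse severity label."""
    score = sum(SEVERITY_WEIGHTS.get(a, 0) for a in artifacts)
    if score >= 5:
        return "high"
    if score >= 2:
        return "medium"
    return "low"
```

A dashboard could then sort cases by this label, surfacing the registry evidence in an order a non-technical committee or juror can follow.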


2020 ◽  
Vol 21 (2) ◽  
pp. 257-290
Author(s):  
R.I. Ferguson ◽  
Karen Renaud ◽  
Sara Wilford ◽  
Alastair Irons

Purpose: Cyber-enabled crimes are on the increase, and law enforcement has had to expand many of its detection activities into the digital domain. As such, the field of digital forensics has become far more sophisticated over the years and is now able to uncover ever more evidence that can be used to support the prosecution of cyber criminals in a court of law. Governments, too, have embraced the ability to track suspicious individuals in the online world. Forensics investigators are driven to gather data exhaustively, being under pressure to provide law enforcement with sufficient evidence to secure a conviction. Yet there are concerns about the ethics and justice of untrammeled investigations on a number of levels. On an organizational level, unconstrained investigations could interfere with, and damage, the organization's right to control the disclosure of its intellectual capital. On an individual level, those being investigated could easily have their legal privacy rights violated by forensics investigations. On a societal level, there might be a sense of injustice at the perceived inequality of current practice in this domain. This paper argues the need for a practical, ethically grounded approach to digital forensic investigations, one that acknowledges and respects the privacy rights of individuals and the intellectual capital disclosure rights of organizations, as well as the needs of law enforcement. The paper derives a set of ethical guidelines and then maps these onto a forensics investigation framework. The framework is subjected to expert review in two stages, and refined after each stage. The paper concludes by proposing the refined, ethically grounded digital forensics investigation framework. The treatise is primarily UK based, but the concepts presented here have international relevance and applicability.
Design/methodology/approach: In this paper, the lens of justice theory is used to explore the tension between the needs of digital forensic investigations into cybercrimes on the one hand and, on the other, individuals' rights to privacy and organizations' rights to control intellectual capital disclosure.
Findings: The investigation revealed a potential inequality between the practices of digital forensics investigators and the rights of other stakeholders. That being so, the need for a more ethically informed approach to digital forensics investigations, as a remedy, is highlighted, and a framework is proposed to provide this.
Research limitations/implications: The proposed ethically informed framework for guiding digital forensics investigations suggests a way of re-establishing the equality of the stakeholders in this arena and ensuring that the potential for a sense of injustice is reduced.
Originality/value: Justice theory is used to highlight the difficulties in squaring the circle between the rights and expectations of all stakeholders in the digital forensics arena. The outcome is the forensics investigation guideline, PRECEpt (Privacy-Respecting EthiCal framEwork), which provides the basis for re-aligning the balance between the requirements and expectations of digital forensic investigators on the one hand and individual and organizational expectations and rights on the other.


Data ◽  
2021 ◽  
Vol 6 (8) ◽  
pp. 87
Author(s):  
Sara Ferreira ◽  
Mário Antunes ◽  
Manuel E. Correia

Deepfake and manipulated digital photos and videos are increasingly used in a myriad of cybercrimes. Ransomware, the dissemination of fake news, and digital kidnapping-related crimes are the most recurrent, with tampered multimedia content as the primary disseminating vehicle. Digital forensic analysis tools are widely used in criminal investigations to automate the identification of digital evidence in seized electronic equipment. The number of files to be processed and the complexity of the crimes under analysis have highlighted the need to employ efficient digital forensics techniques grounded on state-of-the-art technologies. Machine Learning (ML) researchers have been challenged to apply techniques and methods to improve the automatic detection of manipulated multimedia content. However, such methods have not yet been massively incorporated into digital forensic tools, mostly due to the lack of realistic and well-structured datasets of photos and videos. The diversity and richness of the datasets are crucial to benchmark ML models and to evaluate their appropriateness for real-world digital forensics applications. An example is the development of third-party modules for the widely used Autopsy digital forensic application. This paper presents a dataset obtained by extracting a set of simple features from genuine and manipulated photos and videos, which are part of state-of-the-art existing datasets. The resulting dataset is balanced, and each entry comprises a label and a vector of numeric values corresponding to the features extracted through a Discrete Fourier Transform (DFT). The dataset is available in a GitHub repository, and the total number of photos and video frames is 40,588 and 12,400, respectively.
The dataset was validated and benchmarked with deep learning Convolutional Neural Network (CNN) and Support Vector Machine (SVM) methods; however, a plethora of other existing methods can be applied. Generally, the results show a better F1-score for CNN than for SVM, for both photo and video processing. CNN achieved an F1-score of 0.9968 and 0.8415 for photos and videos, respectively. For SVM, the results obtained with 5-fold cross-validation are 0.9953 and 0.7955, respectively, for photos and videos. A set of methods written in Python is available to researchers, namely to preprocess and extract the features from the original photo and video files and to build the training and testing sets. Additional methods are also available to convert the original PKL files into CSV and TXT, which gives ML researchers more flexibility to use the dataset with existing ML frameworks and tools.
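One plausible reading of "features extracted through a DFT" is a radially averaged log-magnitude spectrum, a feature commonly used in manipulated-image detection; the paper's exact feature definition may differ, and the function below is only a sketch:

```python
import numpy as np

def dft_features(gray: np.ndarray, n_features: int = 50) -> np.ndarray:
    """Radially averaged log-magnitude spectrum of a grayscale image.
    Illustrative sketch: not necessarily the paper's feature definition."""
    f = np.fft.fftshift(np.fft.fft2(gray))
    mag = np.log1p(np.abs(f))
    h, w = mag.shape
    yy, xx = np.indices(mag.shape)
    # Integer distance of every pixel from the spectrum's centre.
    r = np.hypot(yy - h // 2, xx - w // 2).astype(int)
    counts = np.bincount(r.ravel())
    # Average magnitude over rings of equal radius, then resample
    # to a fixed-length vector.
    radial = np.bincount(r.ravel(), weights=mag.ravel()) / np.maximum(counts, 1)
    idx = np.linspace(0, len(radial) - 1, n_features).astype(int)
    return radial[idx]
```

Each image thus collapses to a fixed-length numeric vector, which matches the kind of entry (label plus feature vector) the dataset stores.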


2019 ◽  
Vol 11 (7) ◽  
pp. 162 ◽  
Author(s):  
Nikolaos Serketzis ◽  
Vasilios Katos ◽  
Christos Ilioudis ◽  
Dimitrios Baltatzis ◽  
Georgios Pangalos

The increasing complexity of information technology and the proliferation of heterogeneous security devices that produce ever greater volumes of data, coupled with an ever-changing threat landscape, have an adverse impact on the efficiency of information security controls, digital forensics, and incident response approaches. Cyber Threat Intelligence (CTI) and forensic preparedness are the two parts of the so-called managed security services that defenders can employ to repel, mitigate or investigate security incidents. Despite their success, no known effort has combined these two approaches to enhance Digital Forensic Readiness (DFR) and thus decrease the time and cost of incident response and investigation. This paper builds upon and extends a DFR model that utilises actionable CTI to improve the maturity levels of DFR. The effectiveness and applicability of this model are evaluated through a series of experiments that employ malware-related network data simulating real-world attack scenarios. To this end, the model manages to identify the root causes of information security incidents with high accuracy (90.73%), precision (96.17%) and recall (93.61%), while significantly decreasing the volume of data digital forensic investigators need to examine. The contribution of this paper is twofold. First, it indicates that CTI can be employed by digital forensics processes. Second, it demonstrates and evaluates an efficient mechanism that enhances operational DFR.
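The data-volume reduction such a model achieves can be pictured as an IOC-matching filter over captured network records. The record fields and IOC sets below are illustrative assumptions, not the paper's actual model:

```python
# Hypothetical IOC-matching step: field names and the IOC format are
# illustrative, not taken from the paper's DFR model.
def filter_by_cti(flows: list, ioc_ips: set, ioc_domains: set) -> list:
    """Keep only flows touching a known indicator of compromise, so the
    investigator examines a fraction of the captured traffic."""
    return [f for f in flows
            if f.get("dst_ip") in ioc_ips or f.get("domain") in ioc_domains]
```

Only flows touching a known indicator survive, so investigators start from a much smaller, intelligence-prioritised subset of the evidence.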

