Information Assurance in a Distributed Forensic Cluster: Processing Data for Forensics in a Distributed System

  • Nicholas Pringle

    Student thesis: Doctoral Thesis


    The first decade of the 21st century has been described as a “Golden Age” in the development of digital forensics. Criminals naively used the new technology, not realising they were leaving easy pickings for the investigators on their trail. The evidence was mostly obvious, the software straightforward. Most importantly, the scale of the task was manageable. A “case” was more often than not one suspect, one investigator, one computer, one hard disk, one piece of analysis software, and one report for one authority. The Golden Age is over. Investigations are becoming increasingly multi-jurisdictional, with multiple items containing evidence from multiple suspects and ever-increasing quantities of data. Investigators are struggling to keep pace with the changes and are possibly losing the battle.

    Several solutions have been proposed to regain the upper hand, among them what is collectively known as distributed processing running on clusters of PCs. Processing data in a forensically sound manner acceptable to the courts requires special measures when handling evidence. Existing systems in this area, such as Hadoop and HTCondor, are designed for use in cases where users do not have to justify their actions to a legal authority. Appropriate procedures to attain a suitable “Chain of Evidence” have been developed as new forms of digital evidence have been identified, acquired, processed and presented in court. In these procedures, the computer system used for analysis has been treated as a single point in the chain of evidence, but in a distributed system there could be hundreds of hosts connected via local and wide area networks. Currently no acceptable methods can assure the “Chain of Evidence” in these ‘new’ distributed architectures.

    Within this thesis, we present a solution to this problem. FCluster and FClusterfs are the result of a design research methodology that addresses the problem by setting design criteria, proposing a design, building it, and then evaluating it against a number of metrics identified in the background research.

    We find that, to be a complete solution, FCluster has to extend from the acquisition of evidence through ingestion and distribution to processing. To overcome the latency problems common to distributed systems, we introduce a technique we call Jigsaw imaging and, with it, the prioritisation of data acquisition. FCluster is implemented as middleware, in a manner similar to Hadoop and HTCondor.
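    The abstract does not publish FCluster's code, so the following is only an illustrative sketch of what prioritised, block-wise acquisition of the kind described could look like. The block size, the priority scheme, and the helper names (`prioritised_blocks`, `jigsaw_image`) are all assumptions, not the thesis's actual implementation; the point is simply that blocks are read out of linear order, with high-value regions first, and that each "piece" is hashed on arrival so it can be verified independently before full acquisition completes.

    ```python
    import hashlib
    import heapq

    BLOCK_SIZE = 4096  # assumed fixed block size for this sketch

    def prioritised_blocks(total_blocks, high_priority_ranges):
        """Yield block numbers in priority order rather than linearly.

        high_priority_ranges is a list of (lo, hi) half-open block ranges
        (e.g. filesystem metadata or user directories) to acquire first.
        """
        heap = []
        for blk in range(total_blocks):
            # Lower number = higher priority; ties fall back to block order.
            priority = 0 if any(lo <= blk < hi
                                for lo, hi in high_priority_ranges) else 1
            heapq.heappush(heap, (priority, blk))
        while heap:
            yield heapq.heappop(heap)[1]

    def jigsaw_image(device, total_blocks, high_priority_ranges):
        """Read blocks out of order, hashing each piece as it is acquired,
        so downstream processing can start on verified high-value pieces."""
        pieces = {}
        for blk in prioritised_blocks(total_blocks, high_priority_ranges):
            device.seek(blk * BLOCK_SIZE)
            data = device.read(BLOCK_SIZE)
            pieces[blk] = (hashlib.sha256(data).hexdigest(), data)
        return pieces
    ```

    A real implementation would stream pieces to cluster nodes as they are read; this sketch only captures the ordering and per-piece hashing.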

    This dissertation makes an original contribution to knowledge in the field of digital forensics by developing a technique that ensures the integrity of data as it passes from acquisition source, to storage and on to processing within a distributed computer architecture.
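    One common way to make that integrity property concrete, sketched here purely as a hypothetical illustration (the record fields and function names are invented, not FClusterfs's format), is a hash-chained custody log: each record embeds the hash of its predecessor, so altering any earlier step invalidates every later link.

    ```python
    import hashlib
    import json

    GENESIS = "0" * 64  # sentinel "previous hash" for the first record

    def add_record(chain, host, action, data_hash):
        """Append a custody record linking back to the previous record."""
        prev = chain[-1]["record_hash"] if chain else GENESIS
        body = {"host": host, "action": action,
                "data_hash": data_hash, "prev_hash": prev}
        body["record_hash"] = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        chain.append(body)
        return chain

    def verify_chain(chain):
        """Recompute every hash; any tampering breaks the chain."""
        prev = GENESIS
        for record in chain:
            body = {k: v for k, v in record.items() if k != "record_hash"}
            if body["prev_hash"] != prev:
                return False
            if hashlib.sha256(json.dumps(body, sort_keys=True).encode()
                              ).hexdigest() != record["record_hash"]:
                return False
            prev = record["record_hash"]
        return True
    ```

    In a distributed setting each host would append a record as data passes through it, and any node (or a court-appointed examiner) could later re-verify the whole chain.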
    Date of Award: 11 Aug 2015
    Original language: English
    Supervisors: Mikhaila Burgess (Supervisor) & Andrew Blyth (Supervisor)
