cfreds-2017-winreg



Overview

The Windows registry is a system-defined database in which applications and system components store and retrieve configuration data. The Windows operating system provides APIs to retrieve, modify, or delete registry items such as Keys, Values and Data.

From a forensic point of view, the registry is one of the primary targets of Windows forensics. It contains not only the configuration of the operating system and user applications, but also artifacts that help identify users’ behaviors and reconstruct past events. Although Windows registry analysis techniques are already widely used in the forensic community, there has been little scientific evaluation of the forensic tools that parse and interpret Windows registry internals. To address this, the NIST/CFTT project aims to enhance the reliability of Windows registry-related forensics by establishing methodologies for conformance and quality testing.

To achieve this aim, the NIST/CFReDS project developed a reference Windows registry dataset (cfreds-2017-winreg). The dataset comprises:

  • ☑ User-generated registry files experimentally created based on the registry hive file format.
  • ☑ System-generated registry files extracted from modern Windows OSes from Vista to 10.
  • ☑ Ground truth data for all reference registry files.

A technical report with detailed information on the development process and ground truth is available.
   (Document v1.10, last updated on May 17, 2018)
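The user-generated files in this dataset are built directly against the registry hive file format, so it can help to see what the start of a hive looks like. The following is a minimal sketch, not the project's own tooling: it reads the first 512 bytes of a hive (the "regf" base block) and reports a few fields. The field offsets and the checksum rule follow publicly available community descriptions of the format and are assumptions here, as are all function and key names.

```python
import struct
from datetime import datetime, timedelta, timezone

REGF_SIGNATURE = b"regf"
HEADER_SIZE = 512  # this sketch only looks at the first 512 bytes of the base block


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a Windows FILETIME (100 ns ticks since 1601-01-01 UTC)."""
    epoch = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(microseconds=filetime // 10)


def xor32_checksum(data: bytes) -> int:
    """XOR the first 508 bytes as 127 little-endian 32-bit words."""
    checksum = 0
    for (word,) in struct.iter_unpack("<I", data[:508]):
        checksum ^= word
    # Assumed special cases from community format notes:
    # all-ones and zero results are never stored as-is.
    if checksum == 0xFFFFFFFF:
        return 0xFFFFFFFE
    if checksum == 0:
        return 1
    return checksum


def parse_base_block(data: bytes) -> dict:
    """Parse selected fields from the start of a registry hive file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 512 bytes of hive data")
    if data[:4] != REGF_SIGNATURE:
        raise ValueError("missing 'regf' signature")

    # Fields at offsets 4..43, all little-endian.
    (primary_seq, secondary_seq, last_written,
     major, minor, file_type, file_format,
     root_cell_offset, hive_bins_size) = struct.unpack_from("<IIQIIIIII", data, 4)
    (stored_checksum,) = struct.unpack_from("<I", data, 508)

    return {
        "primary_sequence": primary_seq,
        "secondary_sequence": secondary_seq,
        # Differing sequence numbers suggest an unfinished write.
        "is_dirty": primary_seq != secondary_seq,
        "last_written": filetime_to_datetime(last_written),
        "major_version": major,
        "minor_version": minor,
        "file_type": file_type,
        "file_format": file_format,
        "root_cell_offset": root_cell_offset,
        "hive_bins_size": hive_bins_size,
        "checksum_ok": stored_checksum == xor32_checksum(data),
    }
```

To try it on one of the reference hives, read the first 512 bytes of the file in binary mode and pass them to `parse_base_block`. A tool under test should report the same header values as the ground truth data for that hive.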



Dataset

User-generated Windows Registry Data

These data were created experimentally based on the specification of the Windows registry file format.

Code | Link (hash) | Size    | Hive Count | Generation Method
NR   | ugrd-nr.7z  | 5.86 MB | 29         | [Windows] RegEdit (.REG); [Linux] Python script with hivex
NRD  | ugrd-nrd.7z | 81.5 KB | 51         | [Windows] RegEdit (.REG & manual); [Windows] PowerShell script; [Linux] Python script with hivex
CR   | ugrd-cr.7z  | 14.5 KB | 14         | [Windows] Python script
MR   | ugrd-mr.7z  | 37.0 KB | 52         | [Windows] Python script
* UGRD stands for user-generated reference data.
† A new data item, '[nr]-08: naming convention', was added to the 'NR' category (last updated on June 4, 2018).


System-generated Windows Registry Data

These data were generated by Windows systems running a scenario that mimics user behaviors.

Windows | Link (hash)   | Size   | Hive Count | Event Count | Note
Vista   | sgrd-vista.7z | 581 MB | 120        | 347         | 4 volume shadow copies
7       | sgrd-7.7z     | 529 MB | 91         | 409         | 4 volume shadow copies
8       | sgrd-8.7z     | 671 MB | 113        | 416         | 3 volume shadow copies
8.1     | sgrd-81.7z    | 756 MB | 113        | 433         | 3 volume shadow copies
10      | sgrd-10.7z    | 795 MB | 104        | 467         | 3 volume shadow copies
10RS1   | sgrd-10rs1.7z | 734 MB | 104        | 467         | 3 volume shadow copies
* SGRD stands for system-generated reference data.
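
Each archive in the tables above is listed with a hash, so a download can be checked before use. The following is a minimal sketch of such a check. The hash algorithm is an assumption (SHA-256 by default, since this page does not state which type is published), and the function names are illustrative.

```python
import hashlib
import hmac


def file_digest(path, algorithm="sha256", chunk_size=1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published_hash(path, expected_hex, algorithm="sha256") -> bool:
    """Compare a file's digest with a published hex string, ignoring case and whitespace."""
    actual = file_digest(path, algorithm)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

For example, `matches_published_hash("sgrd-10.7z", published_value)` returns `True` only if the archive is intact, where `published_value` is the hex string copied from the download page. Pass a different `algorithm` name (for instance `"md5"` or `"sha1"`) if the published hash is of another type.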


Supplements

Assistance Tools for Automated Data Development

☆ Source code is available here.

Category                    | Description
ugrd-assistance-tools-win   | .REG, Batch and PowerShell scripts for 'NR' and 'NRD' using Windows APIs; Python scripts for 'CR' and 'MR'
ugrd-assistance-tools-linux | Python scripts for 'NR' and 'NRD' using hivex
sgrd-vmpop-scenario         | A Python script for virtual machine population and data extraction; an example implementation using pyvmpop for 'cfreds-2017-winreg'