NIST is developing Computer Forensic Reference Data Sets (CFReDS) for digital evidence. These reference data sets provide an investigator with documented sets of simulated digital evidence for examination. Because CFReDS would have documented contents, such as target search strings seeded in known locations, investigators could compare the results of searches for the target strings with the known placement of the strings. Investigators could use CFReDS in several ways, including validating the software tools used in their investigations, checking out equipment, training investigators, and proficiency testing of investigators as part of laboratory accreditation. The CFReDS site is a repository of images. Some images are produced by NIST, often from the Computer Forensics Tool Testing (CFTT) project, and some are contributed by other organizations. The National Institute of Justice funded this work in part through an interagency agreement with the NIST Office of Law Enforcement Standards.
In addition to test images, the CFReDS site
contains resources to aid in
creating your own test images. These creation aids will be
in the form of interesting data files, useful software tools and
procedures for specific tasks.
IMPORTANT NOTE: This web site is under development and may change or be
reorganized at any time.
There are several uses envisioned for the data
sets, but we also expect that there will be unforeseen
applications. The four most obvious applications are testing
forensic tools, establishing that lab equipment is functioning
properly, testing proficiency in specific skills and
training laboratory staff. Each type of data set has slightly
different requirements. Most data sets can be used for more than
one function. For example, the Russian Tea Room data set can be used
to evaluate how a tool searches or displays Unicode text. This set
can also be used as a skill test for an examiner to demonstrate
proficiency in working with Unicode text or as a training exercise.
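As a rough illustration of why Unicode searching deserves its own test data, the sketch below shows that one search term becomes a different byte pattern in each encoding, so a tool that searches for only one pattern can miss the others. The term `чай` and the synthetic buffer are assumptions made for this example; they are not taken from the Russian Tea Room data set.

```python
# Hypothetical search term, chosen only for illustration.
TERM = "чай"  # Russian for "tea"

ENCODINGS = ("utf-8", "utf-16-be", "utf-16-le")


def encoded_patterns(term):
    """Map each encoding name to the byte pattern the term takes in it."""
    return {name: term.encode(name) for name in ENCODINGS}


def find_all(haystack, needle):
    """Return every byte offset at which needle occurs in haystack."""
    offsets = []
    start = haystack.find(needle)
    while start != -1:
        offsets.append(start)
        start = haystack.find(needle, start + 1)
    return offsets


def search_encodings(data, term):
    """Search data for term in each encoding; return {encoding: offsets}."""
    return {name: find_all(data, pattern)
            for name, pattern in encoded_patterns(term).items()}


if __name__ == "__main__":
    # Synthetic buffer: the term stored once as UTF-16 big-endian
    # and once as UTF-8, separated by zero padding.
    sample = (b"\x00" * 16 + TERM.encode("utf-16-be")
              + b"\x00" * 10 + TERM.encode("utf-8"))
    for name, hits in search_encodings(sample, TERM).items():
        print(name, hits)
```

A real tool test would run the tool under test against the published image and compare its hits with the data set documentation; this sketch only demonstrates the encoding issue on an in-memory buffer.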
Data sets for tool testing need to be
completely documented. The user of the data set needs to know
exactly what is in the data set and where it is located. These
data sets should also provide a specification for a set of explicit
tests. However, the user should have sufficient documentation to
develop and execute other test cases if necessary or desirable.
These data sets could be part of a realistic investigation
scenario, but it is easier to control expected results if each
data set is focused on a particular type of tool function.
Examples of focused function areas are string searching, deleted
file recovery and email extraction.
There will tend to be many small test images,
each focused on a particular feature for the tool function being
tested.
Data sets for equipment checkout need to focus on issues in
acquisition, access and restoration of data. These data sets might
need to have a strong procedural component.
Data sets for training would be primarily
investigation-scenario-based tests to give a realistic flavor to the
data set. These would be similar to the data sets for proficiency
testing, but generally available.
The degree of documentation required for a data
set varies depending on the use of the data set. For example, a
data set for testing string searching requires absolute disk
addresses for strings located in unallocated space, but an
investigation scenario data set may only need to say that the file
at C:\mystuff\social-security-numbers.txt contains the information
to be found.
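To make the role of documented addresses concrete, here is a minimal sketch of how a tester might compare the offsets a search tool reports against the offsets listed in a data set's documentation. The function name, the dictionary layout, and the offsets in the usage example are all hypothetical; CFReDS does not prescribe this format.

```python
def compare_hits(documented, reported):
    """Compare documented string offsets with the offsets a tool reported.

    Both arguments map a target string to an iterable of absolute byte
    offsets. Returns, per target, which documented offsets were found,
    which were missed, and which reported offsets were not documented.
    """
    results = {}
    for target in sorted(set(documented) | set(reported)):
        expected = set(documented.get(target, ()))
        actual = set(reported.get(target, ()))
        results[target] = {
            "found": sorted(expected & actual),
            "missed": sorted(expected - actual),
            "unexpected": sorted(actual - expected),
        }
    return results


if __name__ == "__main__":
    # Hypothetical offsets, for illustration only.
    documented = {"alpha": [512, 4096], "beta": [10240]}
    reported = {"alpha": [512], "beta": [10240, 20480]}
    for target, outcome in compare_hits(documented, reported).items():
        print(target, outcome)
```

A missed offset suggests the tool failed to find a seeded string, while an unexpected offset is either a false hit or an occurrence the documentation does not list, which is why complete documentation matters for tool testing.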
Several data set distribution schemes were
considered. Using actual hard disk drives was ruled out as too
costly and impractical. We will need to balance several factors,
including realism, cost, and practicality.
(NOTE: THESE DATA SETS ARE NOT FOR FEDERATED TESTING)
These are prototype data sets for public comment (JLYLE@NIST.GOV).
Some test sets are multi-skill holistic cases, e.g., the Hacking Case, while other test
sets are focused on specific skills, e.g., non-English text
searching in the Russian Tea Room case.
Data Set | Description |
Hacking Case | Any names in the image are fictional and do not refer to real people. |
Data Leakage Case | Large, complex image involving intellectual property theft |
Registry Forensics | Data set for testing MS Windows Registry extraction tools |
Drone Images | Images from 60 drones and associated controllers, connected mobile devices and computers |
Russian Tea Room | Unicode string search in Russian or English (big-endian) |
asb image, dd, E01 | Unicode string search in Russian (UTF-8) |
Create a reference drive | Create a drive with known hash values. The creation process also verifies that the computer hardware and the drive are working as expected. |
Basic Mac image | Mac file systems (Mac OS Extended (Journaled), Mac OS Extended, Mac OS Standard & Unix) |
Rhino Hunt | Look for images (of a rhinoceros) in a disk image file and network traces. |
Memory Images | Live memory capture images |
DCFL | DCFL Control image |
Mobile Device Images | Chip-off/JTAG binary images |
Container Files | String searching on container and nested container files |
Deleted File Recovery | Metadata-based deleted file recovery images |
File Carving | Basic file carving images |
File Carving CFTT Images | Images used for CFTT file carving test reports |
Data Set | Description |
String Search Test Data for use with Federated Testing 4.0 and later. |
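The "Create a reference drive" entry above relies on known hash values. As a hedged sketch of what checking an image against a documented hash might look like, the code below hashes a file in chunks and compares the result with an expected hex digest. The function names, the default SHA-256 algorithm, and the chunk size are assumptions for this example, not part of any CFReDS procedure.

```python
import hashlib


def hash_image(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so large images need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as image:
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_documented_hash(path, documented_hex, algorithm="sha256"):
    """Return True if the file's hash equals the documented hex digest."""
    return hash_image(path, algorithm) == documented_hex.strip().lower()
```

A mismatch by itself does not say whether the drive, the hardware, or the copy process is at fault; it only signals that the contents differ from what was documented.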
Last updated: October 7, 2019
Technical comments: cftt@nist.gov