The time may come when you need to examine a hard drive either because you need to recover some lost data or because you suspect an employee of a violation. Whether you intend to use your findings in court, for employee discipline, or just for your own information, using a good forensics tool and recognized forensics techniques will help you both recover your data and preserve its value as evidence. A tool I recommend is WinHex, made by X-Ways Software Technology AG of Germany.

What's WinHex?
WinHex is an advanced hex editor that includes powerful data recovery and analysis utilities. At $139 for a specialist license (required to enable advanced features), it is far cheaper than hard-core forensics applications such as EnCase ($2,495), yet offers many of the same features. This is not to minimize the value of EnCase, which has the advantages of a simple GUI, ability to examine RAID disks, and acceptance by law enforcement and the courts. However, for your purposes, you will not likely need such an expensive solution. For more information about WinHex, see the Daily Drill Down “WinHex: A powerful data recovery and forensics tool.”

Intro to digital evidence
To recover data effectively when you need it and to protect the validity of your investigation, follow these guidelines, which were developed by computer forensics investigators. Note that these guidelines apply to evidence that won't be lost when a machine is powered off.
  • Acquire the suspicious drive as soon as possible. The sooner you "freeze" the evidence in time, the more likely it is that you will be able to retrieve suspicious data.
  • Don't overstep your legal bounds. You probably have the right to seize an employee's work computer, but if you have any doubts, consult a lawyer.
  • Once you've acquired the disk, prevent any operating system activity that could change data or overwrite it. You especially want to protect information stored in free clusters, swap file space, or slack space (the unused area between the end of a file and the end of its last cluster, which can still hold data left over from a previous write operation).
  • Never boot from, start software from, or save anything to the suspect drive. Start all software from a separate disk and save all copies and analysis to another disk or media.
  • Verify the original disk with a hash. For additional verification, you can use a professional timestamp service.
  • Make an exact, sector-by-sector copy (clone) of the suspect disk.
  • Verify the clone with a hash.
  • Analyze and collect data. Log what you do. Work with the cloned disk noninvasively. You want to use programs (such as WinHex) that will explore the disk and make copies without changing file creation dates and other file access values.
  • You can also hash on a file level—matching values will prove that the sources and retrieved copies are the same.
  • Print hard copies as needed.
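
To see why the slack space mentioned in the guidelines above is worth protecting, it helps to work out how much of it a single file can leave behind. The following is a minimal sketch in Python, not a WinHex feature; the function name and the 4,096-byte default cluster size are my own assumptions for illustration.

```python
def slack_bytes(file_size, cluster_size=4096):
    """Return the bytes left unused in a file's last cluster.

    cluster_size is an assumed example value; real volumes vary.
    """
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("sizes must be non-negative / positive")
    # Number of whole clusters the file occupies (ceiling division).
    clusters = -(-file_size // cluster_size)
    return clusters * cluster_size - file_size


# A 5,000-byte file needs two 4,096-byte clusters (8,192 bytes),
# so the remainder of the second cluster is slack.
print(slack_bytes(5000))
```

Any bytes in that remainder that were written by an earlier file stay on the disk until something overwrites them, which is why an ordinary file copy misses them and a sector-by-sector clone does not.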

Removable drives: A forensics challenge
There are many solutions available for quickly attaching external hard drives to a computer. Drive adapters abound, from ribbon cable adapters, to cheap USB 2.0 enclosures on eBay, to more expensive IEEE 1394 (FireWire) boxes and removable drive bays.

From a forensics standpoint, the problem with external drives is that Windows OSs always write some data to an external drive when it's mounted; drives can't be mounted read-only. For example, upon first recognizing a drive, Windows creates a recycle bin if one isn't present, and makes other changes depending upon the OS and whether the disk is formatted with a FAT or NTFS file system. Without special drive-copying hardware that forces drives to be read-only, such as Guidance Software Inc.'s FastBloc LE or Logicube's USB Write-PROtect, your suspect data won't be pristine.

Assuming that for most of your less-formal analyses you can live with a slight amount of change to the suspect data, here are some tips for minimizing data loss in an external drive:
  • Before mounting the drives, disable all programs that write to drives, such as the Norton Protected Recycle Bin.
  • Unmount the disks by deleting the drive letter assignment in the Windows 2000 Computer Management tool (you may need to reboot afterwards). At that point, the disks won't show in Windows Explorer, but you will still be able to access the disks physically in WinHex (WinHex will access them through the BIOS).

Creating hashes
Hash values are the equivalent of data fingerprints. They are useful for showing that two or more sets of data are identical, and for demonstrating that the data has changed. The chances of two different digital files having the same hash value are considered less than the chances of two people having identical DNA test results.

The hash algorithms commonly used in forensics work are MD5 (128-bit) and SHA-1 (160-bit). WinHex can calculate both, as well as these additional file verification values: checksums (8, 16, 32, and 64 bit), cyclic redundancy checks (called CRCs, 16 and 32 bit), a 256-bit version of SHA, and a method called PSCHF (256-bit).
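
If you want to see these values side by side outside WinHex, Python's standard library can compute several of them. This is only a sketch: the sample text is arbitrary, the 8-bit checksum is assumed to be a simple additive sum (WinHex's exact method may differ), and PSCHF is omitted because it has no standard-library equivalent.

```python
import hashlib
import zlib

sample = b"The quick brown fox jumps over the lazy dog"

# Cryptographic hashes: digest_size is in bytes, so multiply by 8 for bits.
for name in ("md5", "sha1", "sha256"):
    h = hashlib.new(name, sample)
    print(f"{name:7} {h.digest_size * 8:3}-bit  {h.hexdigest().upper()}")

# Simpler verification values, shown for contrast.
crc32 = zlib.crc32(sample)        # 32-bit cyclic redundancy check
checksum8 = sum(sample) % 256     # assumed 8-bit additive checksum
print(f"crc32    32-bit  {crc32:08X}")
print(f"sum8      8-bit  {checksum8:02X}")
```

The longer the value, the more distinct fingerprints are possible, which is one reason the 128-bit and larger hashes are preferred over short checksums for evidence work.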

Hashes are more secure than checksums because software exists to allow crackers to modify executable files and change insignificant bits in such a way as to produce the same checksum as the original. But those who study such things have concluded, as Warren G. Kruse II and Jay G. Heiser write in Computer Forensics, that it is "computationally infeasible at this time to counterfeit a robust hash algorithm such as MD5 and SHA."
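
The weakness of checksums is easy to demonstrate. In the sketch below, which assumes a plain additive 8-bit checksum and uses made-up sample data, simply reordering a few bytes leaves the checksum unchanged while the MD5 value changes.

```python
import hashlib


def checksum8(data):
    """Assumed 8-bit additive checksum: sum of all bytes, modulo 256."""
    return sum(data) % 256


original = b"PAY 100 TO ALICE"
tampered = b"PAY 001 TO ALICE"   # same bytes, different order

# The additive checksum cannot see a reordering of bytes.
print(checksum8(original) == checksum8(tampered))

# The MD5 hash depends on every byte and its position.
print(hashlib.md5(original).hexdigest() == hashlib.md5(tampered).hexdigest())
```

The first comparison is true and the second is false, so a checksum alone would have passed the altered data as genuine.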

Hashing files
As a computer examiner, you'll often create hash values for separate files as well as entire disks. The easiest way to learn hashing is to work on a simple document. Start WinHex and open a Word doc by selecting File | Open and browsing to the document.

Calculate an MD5 hash value by selecting Tools | Calculate Hash. From the pop-up window, choose MD5 and click OK. Note its value. The value returned for my test doc was:
62661D6194B818DD67B1A48A7803A2AB

Using Windows Explorer, copy your file. Now open the copy in WinHex and calculate its hash value. You'll see that because the copy is identical, its fingerprint is also identical.

Here's where the results get interesting. Close the files in WinHex. Now open the copy in Word and make one small change: Add one space to the end. Save and close the file. Open it in WinHex and calculate its hash value. You'll see that even with this minuscule change, the new hash is completely different, showing that the file has been modified.

Once again, close the file in WinHex and reopen it in Word. Now remove the space at the end, save, close, and recalculate its hash value. Note that even though the file was supposedly returned to its original state, the hash value is completely different. Here are the hashes that were created with the examples listed:
  • Original file: 62661D6194B818DD67B1A48A7803A2AB
  • Copy: 62661D6194B818DD67B1A48A7803A2AB
  • Copy with space inserted in Word: A37857059E82908CA7B6B18DFEEF68D4
  • Copy with space deleted in Word: 3172284159CA1CD109BCA45D8C298EED

This result is curious. If you were careful, the only culprit could be Word itself. Word makes hidden changes to the data, such as recording a new revision number and date, when saving files. Even had you used the Undo command, the file would not be returned to its original value.

Note, however, that if you use WinHex's hex editor to add a space and then undo the change, you will get the expected result: the hash will again be identical to the original. The significance of this fact is that sometimes hackers store confidential data in unused areas of data files. If you have an original file cached elsewhere, comparing its hash to a copy on the suspect disk (even one with forged creation dates and access dates and times) can reveal if it has been tampered with.
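
The byte-level behavior described here can be reproduced with a few lines of Python. This sketch works on raw bytes in memory rather than on a Word document, and the sample text and function names are invented for illustration.

```python
import hashlib


def md5_hex(data):
    """Return the MD5 value of some bytes as uppercase hex."""
    return hashlib.md5(data).hexdigest().upper()


def is_tampered(known_good_hash, suspect_bytes):
    """Compare a suspect copy against the hash of a trusted original."""
    return md5_hex(suspect_bytes) != known_good_hash


original = b"Memo: budget figures attached."
with_space = original + b" "     # one byte appended
restored = with_space[:-1]       # the same byte removed again

known_good = md5_hex(original)
print(is_tampered(known_good, with_space))   # changed bytes, new hash
print(is_tampered(known_good, restored))     # identical bytes, same hash
```

Because only the bytes themselves feed the hash, file dates and names play no part in the comparison; a copy with forged timestamps still matches or fails on its content alone.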

Another lesson is that it's better to examine your recovered files with a viewer such as Quick View Plus rather than the program used to create the file, to prevent an inadvertent change to the evidence.

Hashing disks
The only difference between calculating disk hashes and calculating file hashes is the time involved. On my 500 MHz laptop connected to a USB 2.0 drive enclosure, the time to run the MD5 algorithm on the entire drive was about 1.5 hours.

First, unmount the disk. Go to Start | Control Panel and double-click Administrative Tools. From the Administrative Tools window, double-click Computer Management. Then click the Disk Management folder. In a moment, you'll see a list of active drives, as shown in Figure A.

Figure A
The Windows 2000 Computer Management snap-in displays drives and their status.


Right-click the drive you wish to deactivate and select Change Drive Letter And Path. In the Change Drive Letter dialog box, click Remove. Click Yes to dismiss the warning. The drive will no longer appear in Windows Explorer and will be unmounted.

Next, open the disk in WinHex by choosing Tools | Disk Editor. Double-click the suspect drive in the Edit Disk window, as shown in Figure B. You want to access it physically, which will bypass Windows 2000 and access it through the BIOS.

Figure B
Hard disk 1 is the target of this investigation.


Select Tools | Calculate Hash. Choose the MD5 algorithm from the drop-down list. A progress bar shows the status of the calculation. When the calculation is complete, copy and paste the value in a log file for the target disk, which can be any type of file you like—text, Excel spreadsheet, Word doc. Do not save your log file to the disk you are investigating!
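
For readers curious about what a whole-disk hash involves, here is a rough Python sketch of the same idea: read the data in chunks, report progress, and format a log entry. It is not how WinHex works internally. The function names and the log format are my own, and the demonstration uses an in-memory stand-in; hashing a real drive would mean passing a raw device opened read-only, which depends on the operating system and is not shown.

```python
import hashlib
import io
from datetime import datetime, timezone


def hash_stream(stream, total_bytes, chunk_size=1024 * 1024, report=print):
    """Hash a binary stream chunk by chunk, reporting percent complete."""
    digest = hashlib.md5()
    done = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        done += len(chunk)
        if total_bytes:
            report(f"{done * 100 // total_bytes}% hashed")
    return digest.hexdigest().upper()


def log_line(label, hash_value):
    """Format one timestamped entry for the examination log (assumed layout)."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return f"{stamp}  {label}  MD5  {hash_value}"


# In-memory stand-in for a disk: 3 MB of zero bytes.
fake_disk = io.BytesIO(bytes(3 * 1024 * 1024))
value = hash_stream(fake_disk, total_bytes=3 * 1024 * 1024)
print(log_line("Hard disk 1", value))
```

Reading in fixed-size chunks keeps memory use constant no matter how large the disk is, which is why the only practical difference from hashing a file is the time it takes.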

Creating disk clones
Cloning a disk requires that the target drive be the same size as the source. This means cloning to another hard disk of the same size or to a partition created to be the same size.

A clone, unlike a general Windows backup, is an identical, sector-by-sector copy of the disk. It includes the data in swap files, slack space, and currently unallocated clusters (free space).

To ensure that hash values will match in case any data is left over in the target disk's reserve sectors, prepare the disk by initializing this space. To do so, access the target disk logically (the disk's file tables are only available when accessed through the operating system). Then select Tools | Disk Tools | Initialize Free Space. You can also run Initialize MFT Records if the disk is formatted with the NTFS file system.

To make sure there is enough space on the target, open both disks in WinHex, and use the Details tab to compare them, as shown in Figure C.

Figure C
The Total Capacity value of the Details window allows you to verify that both disks are the same size.


If you haven't calculated the hash value of the source disk, do so now. Next, select Tools | Disk Tools | Clone Disk. Set the source and destination drives in the Clone Disk dialog box, as shown in Figure D. In this case, I am cloning Physical Disk 2 (3.8-GB disk) to Physical Disk 1 (3.8 GB) (both are laptop drives installed in USB 2.0 enclosures). Set the starting sector to 0 to equal the starting sector of the source disk.

Figure D
The start sector of the destination drive should be set equal to the start sector of the source disk.


You will be warned that the integrity of Hard Disk 1 may be severely damaged during the process. Take the chance and click OK. You'll also be warned that no undo operations will be available. Click OK. A progress bar notes the status of the operation.

After it's finished, calculate a hash for the destination disk and log the values. If all was successful, the fingerprint for both disks should match, as shown here.
Source: 78989F3DB1F03285376C59163603C355
Destination: 78989F3DB1F03285376C59163603C355
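
The clone-then-verify sequence can be summarized in a short Python sketch. It copies one stream to another starting at sector 0 and then hashes each side independently. The 512-byte sector size, the function names, and the in-memory "disks" are assumptions for illustration only; this is not a substitute for WinHex's Clone Disk command.

```python
import hashlib
import io

SECTOR = 512  # assumed sector size in bytes


def clone(source, destination, sectors_per_read=2048):
    """Copy every byte of source to destination, both starting at sector 0."""
    source.seek(0)
    destination.seek(0)
    while True:
        block = source.read(SECTOR * sectors_per_read)
        if not block:
            break
        destination.write(block)


def md5_of(stream, chunk_size=1024 * 1024):
    """Hash a stream from its first byte to its last."""
    stream.seek(0)
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest().upper()


# In-memory stand-ins for the source and destination disks.
source_disk = io.BytesIO(bytes(range(256)) * 64)
destination_disk = io.BytesIO()

source_hash = md5_of(source_disk)      # hash the source first and log it
clone(source_disk, destination_disk)
destination_hash = md5_of(destination_disk)
print("Source:     ", source_hash)
print("Destination:", destination_hash)
```

Hashing the destination in a separate pass, rather than trusting the bytes as they were written, is what makes the matching values meaningful.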

Having established that both the source and cloned disks are identical, take the source disk offline and lock it up. You are now ready to begin analyzing the copy.

Using WinHex Backup to clone a disk
An alternative to cloning a disk is to use WinHex Backup. This tool also creates a sector-by-sector copy of the source. You should create a backup instead of a clone if you lack a spare drive or a drive with enough space, or if you want to create a clone on read-only media, such as CD-R.

Backup writes the contents of a disk to archive files. The utility has the ability to split the files. By setting WinHex Backup to create volumes of 650 MB each, you can burn these volumes to CD-Rs using your own burning software and hardware. Later, you can restore the backup to a disk, which will create a clone of the source. Or you can analyze each CD-R individually within WinHex. The advantage of this latter method is that once written and closed, the media is read-only and prevents accidental or deliberate tampering with the evidence.

Open the source disk using Tools | Disk Editor. Remember that to keep Windows OSs from writing to the source disk, you need to unmount the drive as shown above, and access it physically. Make sure there is enough room on a different hard drive to temporarily store each volume. You'll need at least 650 MB of free space unless you configure WinHex to split the backup into smaller-size volumes.

Choose File | Make Disk Backup. In the Make Disk Backup dialog shown in Figure E, set a destination folder and name for the first file, or accept the default and let WinHex assign the name automatically. The backup will begin with file 000.whx and increment accordingly. Add a description of the backup if you wish. The sectors to be backed up are listed automatically. You shouldn't find it necessary to change these values.

Figure E
Set backup options in the Make Disk Backup dialog.


Check the Split Backup Into Archives checkbox. You can accept the default value of 650 MB or change the value. For burning onto CDs, 650 MB is the maximum size you should use.

The next two options slow down the backup process, but are worth the extra time. Check the Calculate MD5 checkbox. If MD5 doesn't appear, choose it by clicking the box on the right, which will open a drop-down list. This option verifies that your backup archive is identical to the source disk. I also suggest compressing the backup. WinHex compressed backups have an average space savings of 51 percent. During my test, I needed only three CD-Rs to store a 3.8-GB disk. Note that when later analyzing a compressed backup volume, you'll need enough disk space so that WinHex can expand the compressed volume to a temporary file. If your information is sensitive, you can also choose to encrypt the backup.
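
The disc count follows from simple arithmetic, sketched below in Python. The function is my own, and it treats 3.8 GB as 3,800 MB and applies the 51 percent average savings figure; actual compression depends on the data, so use it only as an estimate.

```python
import math


def discs_needed(disk_mb, savings=0.51, disc_mb=650):
    """Estimate how many discs a backup needs after compression."""
    compressed_mb = disk_mb * (1 - savings)
    return math.ceil(compressed_mb / disc_mb)


# 3,800 MB at 51 percent savings is roughly 1,862 MB,
# which needs three 650-MB discs.
print(discs_needed(3800))

# The same disk with no compression would need six.
print(discs_needed(3800, savings=0))
```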

Click OK to proceed, and OK to confirm. A progress bar advances as the backup is made. WinHex pauses when each volume is complete until you tell it to write the next volume. During this time, you can burn the file to a CD and then delete the file to free up room for the next volume in the set.
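
Conceptually, a split backup is just the source read in fixed-size pieces while a single hash runs across all of them. The Python sketch below illustrates that idea only. It borrows the 000.whx numbering for its volume names but does not produce real WinHex archives, and the function and callback names are hypothetical.

```python
import hashlib
import io


def backup_in_volumes(stream, volume_size, write_volume):
    """Split a stream into numbered volumes and return the MD5 of all data.

    write_volume(name, data) is called once per volume. Each volume is read
    into memory whole, which is fine for a sketch but not for 650-MB pieces.
    """
    digest = hashlib.md5()
    index = 0
    while True:
        data = stream.read(volume_size)
        if not data:
            break
        digest.update(data)
        write_volume(f"{index:03d}.whx", data)
        index += 1
    return digest.hexdigest().upper()


# Demonstration: a 2,500-byte "disk" split into 1,000-byte volumes.
volumes = {}


def keep(name, data):
    volumes[name] = data


source = io.BytesIO(b"x" * 2500)
backup_hash = backup_in_volumes(source, 1000, keep)
print(sorted(volumes), backup_hash)
```

Because the hash covers the whole stream rather than each piece, it can later be compared with the hash of the source disk or of a restored clone.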

When the backup is complete and the hash values establish that both the source and backup are identical, take the source disk offline and lock it up. From now on, you will work with the backup. Remember to add the hash value to your log and save it.

"Restoring" a backup to a clone
By burning a backup onto CD-Rs, you've essentially "frozen" the data you're investigating. You will be able to work on a copy from now on and protect both the original evidence and the copy from changes. Your hash value is evidence that the two sets are identical.

To create a clone from a backup, you'll need a disk at least as large as the original. In my case, I had a spare 3.8-GB drive equipped with a USB enclosure, and I used it to create the clone. I initialized the unallocated space, to make sure any leftover sectors were erased.

Place the CD-R containing the first backup volume in the drive. Start WinHex. Choose File | Load Backup, and open the first backup volume (000.whx if you used the default name).

In the Restoration dialog box shown in Figure F, the defaults should be acceptable in most cases. Make doubly sure that you set the correct destination disk before proceeding, as Restore will overwrite the sectors of the destination drive.

Figure F
Verify that your destination drive is correct. There's no Undo for a restore operation.


If you have enough space on your working hard drive and you want some extra safety, click the Write To Disk Only On Save Request radio button. Click OK. The Confirmation message gives you a chance to back out, informing you how many sectors will be overwritten. Click OK if you're sure.

One additional confirmation message gives you a last chance to back out. Click OK. WinHex will begin writing sectors. When finished, it will ask for the next file in the set. After Restore is complete, you will want to run an MD5 hash on the clone to verify that it is identical to the copy on the media.
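
One detail is worth illustrating: if the destination is larger than the original, a hash over the entire destination includes the extra sectors and cannot match, so the comparison has to cover only the restored range. The Python sketch below shows the idea with in-memory data. The function names are hypothetical, and how WinHex itself handles a larger destination is not something this sketch claims to reproduce.

```python
import hashlib
import io


def restore(volumes, destination):
    """Write the volumes back to back from offset 0; return bytes written."""
    destination.seek(0)
    written = 0
    for data in volumes:
        destination.write(data)
        written += len(data)
    return written


def md5_of_prefix(stream, length, chunk_size=1024 * 1024):
    """Hash only the first `length` bytes of a stream."""
    stream.seek(0)
    digest = hashlib.md5()
    remaining = length
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            break
        digest.update(chunk)
        remaining -= len(chunk)
    return digest.hexdigest().upper()


# Three backup volumes restored to a slightly larger, zero-filled destination.
volumes = [b"A" * 1000, b"B" * 1000, b"C" * 500]
backup_hash = hashlib.md5(b"".join(volumes)).hexdigest().upper()

destination = io.BytesIO(bytes(3000))
restored_length = restore(volumes, destination)

print(md5_of_prefix(destination, restored_length) == backup_hash)
print(md5_of_prefix(destination, 3000) == backup_hash)
```

The first comparison succeeds and the second fails, even though every restored byte is correct.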

Lock down data with WinHex
By using WinHex and key data protection techniques, you are able to create exact copies of suspicious data that are proven to be identical to the original. At this point, whatever your analysis reveals ought to be acceptable evidence in any forum. However, if you have any doubts about the procedure, consult with a lawyer and contact law enforcement. Many police jurisdictions have their own forensics labs and are willing to help with an investigation.