________________________________________________________________ 
Dupeless (VERSION 1.0)
Copyright (c) 1998 Ziff-Davis Publishing Company
Written by Neil J. Rubenking
First Published in PC Magazine, US Edition, April 7, 1998.
________________________________________________________________ 

About Dupeless:
Dupeless ferrets out duplicate files and helps you remove them from your hard disk. Comparison options let you define what constitutes duplicate files. For example, duplicates can have the same name but different contents, or the same contents but different names. Some same-named files are different versions of a dynamic link library or ActiveX control, and these duplicates can sometimes cause system problems. Other types of duplicates are simply space wasters: identical in size and contents, but stored under different names.

Usage:
Dupeless is supplied as a self-contained INSTALL program. Run this program to install Dupeless and all its files in the directory of your choice. It will add Dupeless, its help file, and an uninstall utility to your Start menu. Dupeless runs under Windows 95, Windows NT 4, and Windows NT 3.51. Note that the Recycle Bin feature was added to Windows NT starting with Version 4. If you are running an earlier version of Windows NT, the option to move unwanted duplicates to the Recycle Bin will be disabled.

In order to find duplicates among all the files on your system, Dupeless needs to maintain information about all of those files at the same time. That could be a lot of information, so Dupeless uses temporary storage on disk rather than attempting to keep it all in memory. These temporary files are created in the same folder as Dupeless itself, and can total several megabytes in size. Thus you should install Dupeless on a drive that has sufficient free space.

Dupeless Options:
You don't need to run Dupeless every day, but every month or so you'll want to clean up your hard disk and recover any space wasted on duplicates. You'll want to start every session by setting or reviewing Dupeless's options. The Options dialog lets you specify what criteria Dupeless should use in deciding whether files are duplicates. It also lets you set the deletion method, choose which drives to scan, optionally set a starting directory for each drive, and set the colors that will be used to distinguish sets of duplicates in the output display. The Comparison and Drives options take effect on the next scan; the Deletion and Colors options take effect immediately.
Dupeless lets you define what constitutes "duplicates" in a number of ways, including:

* different versions of the same file (same name)
* copies of the same file (same name and size)
* identical copies of the same file (same name, size, and contents)
* unexpected identical files (same size and contents)

The Comparison options group box is where you make your selections.
Either file name or file size must be used in all comparisons. If neither of these comparison options is checked, the OK button and the date/time check box will be disabled.
You can only do a date/time comparison if you are also doing a file name or file size comparison. With the date/time box checked, two files won't be considered identical unless they have the same date/time stamp. In most cases, you should leave this check box blank. It is offered as a refinement in case the other choices produce too many apparent duplicates.

If you check the Compare file Names option, you also have the option to compare the files' version numbers. When you check the "display version info" option, Dupeless will display the version numbers of duplicate 32-bit DLL and EXE files, if available.
If you check the Compare file Sizes option, you also have the option to compare the files' contents. When you check the "compare contents" option, Dupeless will report only files with identical contents as duplicates. Files need not have the same name to be identified as identical in content.
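
The way the comparison options combine can be sketched in Python. This is only an illustration of the rules described above, not Dupeless's actual code: the function names are hypothetical, file names are assumed to match without regard to case, and a SHA-256 digest stands in for whatever contents comparison Dupeless really performs.

```python
import hashlib
import os
from collections import defaultdict


def content_digest(path, chunk_size=65536):
    """Return a SHA-256 digest of the file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths, use_name=True, use_size=False, use_contents=False):
    """Group paths that match on every selected criterion."""
    # Mirrors the dialog rules: name or size is required, and the
    # contents option is only offered together with the size option.
    if not (use_name or use_size):
        raise ValueError("compare by file name, file size, or both")
    if use_contents and not use_size:
        raise ValueError("comparing contents requires comparing sizes")

    groups = defaultdict(list)
    for path in paths:
        key = []
        if use_name:
            # Assumption: names are matched case-insensitively.
            key.append(os.path.basename(path).lower())
        if use_size:
            key.append(os.path.getsize(path))
        groups[tuple(key)].append(path)

    # Only files that already share a name and/or size need to be read.
    if use_contents:
        refined = defaultdict(list)
        for key, members in groups.items():
            if len(members) > 1:
                for path in members:
                    refined[key + (content_digest(path),)].append(path)
        groups = refined

    # A group of one is not a duplicate.
    return [sorted(members) for members in groups.values() if len(members) > 1]
```

Calling find_duplicates(paths, use_name=False, use_size=True, use_contents=True) corresponds to the "unexpected identical files" case in the list above, since names are left out of the comparison.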

Once you've chosen your Comparison options, you need to tell Dupeless where to look. The Options dialog lists all fixed disks and removable disks on your system. A check box to the left of each lets you select which disks to scan. Dupeless does not list CD-ROMs, network drives, or RAM disks. By default, fixed disks are checked and removable disks are not.

To check or clear a box, double-click anywhere in the corresponding row, or navigate to the desired row with the arrow keys and press Space. The context menu that appears when you right-click on a row also contains options to check or clear that row's box. Highlight any drive row and press the Browse button to set a starting directory for searching that drive, or right-click the row and choose Browse from the context menu. The OK button will be disabled if you clear all the drive check boxes.

The Deletion options group lets you tell Dupeless what you mean by "delete checked files". In most cases, you should not choose permanent deletion. One of your applications may depend on finding a file in a particular location. Under Windows 95 or Windows NT 4, you can send the duplicates to the Recycle Bin. If you find you're still able to run all of your applications, then after a week or so you can feel free to empty the Recycle Bin.

If you'd rather archive the duplicates, or if you're running Windows NT 3.51, you should use the third option, Move to Dupe$$$$ folder. Instead of deleting a file, Dupeless will move it, along with its path information, into a folder named Dupe$$$$, which it will create in the root directory of the same drive. For example, "C:\windows\system\useless.dll" would be moved to "C:\Dupe$$$$\windows\system\useless.dll". You can use your favorite compression program to archive the duplicates, if you wish. As with the Recycle Bin option, if you find that after a week or so none of your applications show any signs of needing the removed duplicates, you can delete the Dupe$$$$ folders.

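The path mapping used by the Move to Dupe$$$$ folder option can be sketched in Python as follows. The function name is hypothetical and the code only computes the destination path; it is an illustration of the rule described above, not Dupeless's own implementation.

```python
from pathlib import PureWindowsPath

DUPE_FOLDER = "Dupe$$$$"


def dupe_destination(original):
    """Map a file's full path to its place under the drive's Dupe$$$$ folder."""
    path = PureWindowsPath(original)
    if not path.drive or not path.root:
        raise ValueError("expected an absolute path with a drive letter")
    # path.anchor is the drive plus root (for example "C:\"); every
    # folder below the root is preserved under the Dupe$$$$ folder.
    return PureWindowsPath(path.anchor, DUPE_FOLDER, *path.parts[1:])


print(dupe_destination(r"C:\windows\system\useless.dll"))
# prints C:\Dupe$$$$\windows\system\useless.dll
```

Because the original folder structure is kept, a file can be restored by moving it back to the same path with the Dupe$$$$ component removed.
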
Dupeless uses alternating stripes of color to distinguish different groups of duplicate files. By default, the stripes are yellow and cyan. However, the Options dialog lets you choose your own two favorite colors instead. Be sure to choose colors that are distinct from each other and contrast well with the text color.

Exclusions:
Using the Exclusions dialog, you can instruct Dupeless to ignore all files in and below a particular folder, all files that have a specific name, or all files with a particular extension. The first time you run Dupeless, it will insert its own folder and your system-defined TEMP folder into the list of folders to ignore. If you're running Windows 95, it will also add the SYSBCKUP folder, which is found in the Windows folder. You will probably want to add other particular folders to this list, such as the history and temporary folders used by your Internet browser. 

The list of filenames to ignore initially contains MSCREATE.DIR and ANTI-VIR.DAT. Many systems will have dozens or even hundreds of files matching one of these names. Depending on your system's particular collection of files, you may want to add other names to this list. You must add specific filenames; wildcards are not allowed. The list of extensions starts off containing .$$$ and .TMP, both commonly used for temporary files. It's quite common to find temporary files whose contents are identical to those of the corresponding permanent file. Again, wildcards are not allowed.

If you wish to try scanning without exclusions, you don't have to empty the three lists. Instead, clear the Enable Exclusions check box at the top of the dialog box. However, regardless of whether exclusions are enabled, Dupeless will always exclude any directory named Dupe$$$$ or Recycled found in the root directory of a drive. As mentioned above, the Dupe$$$$ folders are created by Dupeless itself to hold unwanted duplicate files.
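
The exclusion rules described in this section can be summarized in a short Python sketch. The function and parameter names are hypothetical, and case-insensitive matching is an assumption based on how Windows treats file names; the sketch is not taken from Dupeless itself.

```python
from pathlib import PureWindowsPath

# Root-level folders that are skipped even when exclusions are disabled.
ALWAYS_EXCLUDED_ROOT_FOLDERS = {"dupe$$$$", "recycled"}


def is_excluded(path, folders=(), names=(), extensions=(), enabled=True):
    """Return True if the scan should ignore the file at `path`."""
    p = PureWindowsPath(path)

    # parts[0] is the drive root, so parts[1] is a folder in the root
    # directory whenever the path has at least one more component.
    if len(p.parts) > 2 and p.parts[1].lower() in ALWAYS_EXCLUDED_ROOT_FOLDERS:
        return True

    # Clearing the Enable Exclusions box switches off the three lists only.
    if not enabled:
        return False

    # Specific file names and extensions; no wildcards.
    if p.name.lower() in {name.lower() for name in names}:
        return True
    if p.suffix.lower() in {ext.lower() for ext in extensions}:
        return True

    # Files in or below an excluded folder. PureWindowsPath compares
    # paths without regard to case.
    return any(PureWindowsPath(folder) in p.parents for folder in folders)
```

For example, is_excluded(r"C:\work\draft.TMP", extensions=[".$$$", ".tmp"]) returns True, while the same call with enabled=False returns False.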

Scanning, Reviewing, And Removing:
Once you've set the Options and Exclusions, you can start Dupeless scanning for duplicates. If you're not comparing file contents, the process is usually over in a few minutes. The file contents comparison can take quite some time, so you may want to start it just before you go to lunch. When Dupeless has finished scanning for duplicates, it will list all the sets of duplicate files in a grid. While scanning is in progress, the Abort Scan menu item and corresponding taskbar button are enabled. Use the Abort feature with care, as it will stop the scan, discard any partial results, and reset Dupeless to its pre-scan condition.

Regardless of which comparison options were chosen, Dupeless displays the name, path, size, and date/time stamp for each file. If you checked the Version option, a fifth column will display version information, if available, for any 32-bit EXE or DLL files in the grid. All files in a group of duplicates will be displayed with the same background color. By default, the colors alternate between pale blue and yellow, but you can choose different colors in the Options dialog.

Check the box at the far left of a row to indicate that you want this file removed. To check or clear a box, double-click anywhere in the corresponding row, or navigate to the desired row with the arrow keys and press Space. The context menu that appears when you right-click on a row contains options to check or clear that row's box, as well as several options that act on the entire group. The Check Group (Except One) option unchecks the selected row, but checks all other files in its group. The Uncheck Group option will clear all check boxes for the selected row's group. Finally, as some searches result in very large groups, the Count Group option will report the number of files in the selected row's group.
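
The three group commands on the context menu behave roughly like the Python sketch below. The row layout (a list of dictionaries with "group" and "checked" keys) and the function names are hypothetical, chosen only to illustrate the behavior described above.

```python
def group_indexes(rows, selected):
    """Indexes of every row in the same duplicate group as rows[selected]."""
    group = rows[selected]["group"]
    return [i for i, row in enumerate(rows) if row["group"] == group]


def check_group_except_one(rows, selected):
    """Uncheck the selected row and check every other row in its group."""
    for i in group_indexes(rows, selected):
        rows[i]["checked"] = i != selected


def uncheck_group(rows, selected):
    """Clear the check box of every row in the selected row's group."""
    for i in group_indexes(rows, selected):
        rows[i]["checked"] = False


def count_group(rows, selected):
    """Number of files in the selected row's group."""
    return len(group_indexes(rows, selected))
```

Check Group (Except One) is the quick way to keep a single copy: select the row you want to keep, and every other file in its group is marked for removal.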

When you've carefully reviewed the results and checked off the files to be removed, choose Delete checked files from the Action menu, or press the corresponding speed button. Depending on which deletion option you chose, Dupeless will permanently delete the checked files, move them to the Recycle Bin, or move them to a \Dupe$$$$ folder in the root directory of the same drive. If Dupeless successfully processes a file, it deletes the corresponding line from its list; otherwise, it leaves the line in place, check-mark and all. Some files may no longer be duplicates after the deletion process, and these are removed from the list as well. Finally, Dupeless adjusts the background colors for any remaining items so that the groups of duplicate files still alternate between the two background colors.
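
The list cleanup that follows a deletion can be sketched in Python. The row layout, the function name, and the color names are hypothetical; the sketch assumes that rows of the same group are adjacent in the list, as they are in the grid, and it is not Dupeless's actual code.

```python
from itertools import groupby


def refresh_after_delete(rows, removed, colors=("yellow", "cyan")):
    """Rebuild the list after files in `removed` were successfully processed.

    `rows` is a list of dicts with "path" and "group" keys, ordered so
    that members of a group are adjacent. Rows whose files could not be
    processed are simply absent from `removed` and therefore stay.
    """
    remaining = [row for row in rows if row["path"] not in removed]

    result = []
    stripe = 0
    for _, members in groupby(remaining, key=lambda row: row["group"]):
        members = list(members)
        if len(members) < 2:
            continue  # a lone survivor is no longer a duplicate
        for row in members:
            result.append({**row, "color": colors[stripe % 2]})
        stripe += 1  # the next surviving group gets the other color
    return result
```

Recomputing the stripes from scratch, rather than keeping each group's old color, is what keeps neighboring groups in different colors after a whole group has dropped out of the list.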

You may wish to print the results of a search, or save them for later reference. Select "Save list to file" from the Action menu, or press the corresponding speed button. This will store the grid's contents in a Rich Text Format (RTF) document named DUPELESS.RTF, in the same folder as the Dupeless program. Almost any modern word processor can import this type of file. The results are displayed in a tabular form, similar to the grid display within Dupeless itself. Groups of duplicates are separated by a solid horizontal line, and items checked for deletion are shown in strikeout text. If you print this document from within your word processor, the header on each printed page will indicate the date on which the search was performed as well as the criteria that were used in comparing files.

You'll probably be amazed at the number of duplicate files Dupeless finds the first time you use it. And even though the price of immense hard drives continues to drop, there's never a good reason to waste disk space.

Support for Dupeless:
Support for the free utilities offered by PC Magazine can be 
obtained electronically in the discussion area of PC 
Magazine's Web site. Go to the URL 
http://www.pcmag.com/discuss.htm/ and select the Utilities 
area. You can also access the Utilities discussion area from the 
utility's download page. The authors of current utilities 
generally monitor the discussion area every day. You may 
find an answer to your question simply by reading the 
messages previously posted. If the author is not available and 
you have a question that the sysops can't answer, the editor of 
the Utilities column, who also checks the area each day, will 
contact the author for you.

Neil Rubenking is the contributing technical editor of PC Magazine.
Sheryl Canter is the editor of the Utilities column and a 
contributing editor of PC Magazine.


