DRIVESPACE 3 DISASTER RECOVERY KIT

by Dean Trower
email:  deant@cs.monash.edu.au

This kit is freeware.  It may be freely distributed so long as all files are included unmodified.

==================================================================================================================================================
==================================================================================================================================================
==================================================================================================================================================

This ReadMe file first explains some things about how DriveSpace works, then what you should do to recover data from corrupted DriveSpace drives, and finally explains what the accompanying utilities do, and how to use them.

NOTE that this information pertains to DRIVESPACE 3.0 only, which is the version of DriveSpace that comes with Windows 95 OEM Service Pack 2, with the Windows 95 PlusPak!, and of course with Windows 98.  Earlier versions of DriveSpace WORKED DIFFERENTLY, so some of the information contained here may not apply, and the programs will almost certainly not work.





INTRO
=====
Back in October 97, my computer crashed, corrupting my hard drive in the process, and in particular the DriveSpace file in which I had a large number of important files that I hadn't bothered backing up anywhere.  It wasn't until June the next year that I finally managed to recover all my data, and in the intervening time I looked everywhere I could think of for help with the problem, because at the time I knew nothing about how DriveSpace worked.  I searched the Web for info (in vain).  I tried Microsoft Technical Support ($35 bought me only the advice that I should type "SCANDISK /MOUNT"... and as it turned out this was very, very bad advice.  If you haven't already done so, *********** DO NOT ATTEMPT TO EITHER MOUNT OR USE SCANDISK TO REPAIR A DAMAGED DRIVESPACE DRIVE ************, at least until you've read what I have to say below).  I sent my hard drive to 3 different data-recovery companies (including overseas), and inquired at several more... but most didn't do DriveSpace problems, one didn't have the time (though the person running that particular company DID give me some invaluable information that in the end helped me solve my problem), and one company said they would do it... for tens of thousands of dollars and no guarantees of success.
  Finally, I got fed up with looking for other people's help, and decided to try recovering the data myself, even though I didn't know anything about it.  To my surprise, it didn't take me very long to work out how to get at my data;  I wrote some programs to extract it manually, which got me back most of my disk data, albeit all scrambled up, and then I spent about a week trawling through approx. 400MB of scrambled clusters, piecing my files together by visually looking for parts of them using a hex editor.  Amazingly, I was able to recover EVERY file that I had lost (that I cared about).  It was lucky for me that most of my files were visually recognizable (i.e. MS Word files, program source code, etc.).
  I had intended to write up what I had discovered and post that together with my recovery programs to the appropriate newsgroups, but somehow I never got around to it... I didn't think there'd be too many people still using DriveSpace anyway (it's obsolete:  large hard drives are cheap these days, and any large drive ought to be formatted with FAT32, which precludes the use of DriveSpace).
  Then in February '99, someone who had found my previous newsgroup requests for help emailed me to ask if I had solved my problems, and as a result I decided that I really ought to write up what I knew, and post it for the benefit of others in similar situations.  This file is the result.  Of course, by that time I'd forgotten some of what I'd learned about DriveSpace, so the information here is not complete, and some of it might not be accurate (you have been warned!).  But for what it's worth, this file contains everything I remember.  The programs included are also just quick hacks, which I originally intended only for my own, once-off use.  They may be quick and dirty, and not well written, but they worked for me, and maybe they'll help you, so they're included here too.  They're written in Borland Pascal 7.0, and I've included both source code and executables.
  Note that the information here, and the accompanying programs, are not intended for use by computer novices.  You need not be an experienced programmer, but if you don't know what a FAT is, or how it stores data, or if you've never edited disk information directly using a Disk Editor, then this recovery kit is NOT for you - better find someone with more technical computer savvy, and get them to help you.

--------------------------------------------------------------------------------------------------------------------------------------------------
LEGAL DISCLAIMER:
No guarantees of any kind are associated with either the information in this file or with the accompanying programs.  Everything here is provided "as is", and I will not be held liable for any damage or loss of data resulting in any way from its use or misuse.  To the best of my knowledge the information is accurate and the programs work as described, but responsibility for the consequences of using any of this lies entirely with the user.
--------------------------------------------------------------------------------------------------------------------------------------------------

Having said that, if you are having a problem, please do feel free to email me for help - my email (current at the time of writing) is at the top of this ReadMe file.


Some of the information here came from the MS-DOS resource kit (mostly this just explained how to call the DriveSpace compression/decompression engine from within a DOS program, and that's what's done in the accompanying program "DCMPRESS"), and from the MSDN Visual Developer Studio '97 CD (look under Windows 95 DDK, "compression";  there's no section under the heading of "DriveSpace".  The information there is rather incomplete, as it is intended only as an update for the MS-DOS 6.x Programmer's Reference, which I didn't have).  Mainly, however, the information here came from my own experiments with DriveSpace.
  These experiments consisted of my creating a (small ...say 2MB) drivespace CVF file, copying it to create a second, identical drivespace drive, and then writing cluster data to one of the drives (as a mounted drive, not as a file), and then finding the differences between the two CVF files.  Using this technique, it did not take long (a day or three, and I was starting almost from scratch) for me to work out how data is recorded in the drivespace CVF, and what the important parts of the CVF were.  I wasn't able to discover everything about the CVF format, but I discovered enough about it to do data recovery.
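
The comparison step in these experiments can be sketched in a few lines of Python.  This is only an illustration (the kit itself is written in Borland Pascal); the function name "differing_sectors" is made up for this example, and it assumes both CVF copies are ordinary files that can be read while unmounted.

```python
SECTOR_SIZE = 512  # a CVF "sector" is a 512-byte block of the file


def differing_sectors(path_a, path_b, sector_size=SECTOR_SIZE):
    """Return the indices of the sectors at which two files differ.

    Index 0 is the first 512 bytes of the file.  If one file is longer
    than the other, the extra sectors are reported as differences.
    """
    diffs = []
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        index = 0
        while True:
            a = fa.read(sector_size)
            b = fb.read(sector_size)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                diffs.append(index)
            index += 1
    return diffs
```

Multiplying a reported index by 512 gives the byte offset to jump to in a hex editor.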









HERE'S WHAT I KNOW ABOUT THE CVF STRUCTURE:
===========================================

(1)  OK, the basics:  A drivespace drive is just a file on the host disk, called a "CVF" (Compressed Volume File), usually placed in the root directory and usually called something like "DRVSPACE.001".  On bootup, Drivespace looks for files of this name (and .000, .002, etc), and tries to mount them as drivespace compressed volumes.  (I think if the file is a .000, this means drivespace hides the host drive and replaces its drive letter, otherwise it adds the CVF as a new additional drive... I'm not sure though, and I think this can be overridden anyway).  It is important to realize that, apart from the special treatment Drivespace gives to these CVF files, THERE IS NOTHING SPECIAL ABOUT THEM.  You can safely copy or rename a CVF, you won't lose data (although DriveSpace won't find or mount renamed CVF's), and if you copy a CVF from one computer to another, you have effectively copied the entire drivespace drive - nothing else is needed.


(2)  Data written to the mounted drivespace drive actually gets written into the CVF file, usually in compressed form.  Compression is on a per-cluster basis;  that is, each CLUSTER of the mounted drive is compressed separately, and written to sectors of the CVF file.  The cluster size of drivespace volumes is 32KB, and a "sector" of the CVF file can just be taken to mean a 512 byte block, aligned on a 512 byte boundary within the file.  So, when a cluster is written to the mounted drive, it actually gets stored in anywhere from 1 to 64 sectors of the CVF.  If it can't be efficiently compressed, it is stored uncompressed over (up to) 64 sectors.  Within the CVF, each sector belongs to a unique drive cluster; that is, if (for example) a cluster's data compresses to 513 bytes, two sectors within the CVF are taken up storing it, even though only 1 byte of the second sector is used (the rest is just wasted space).  NOTE:  Data is only stored in compressed form if this means fewer sectors are needed to store it, otherwise it is written to the CVF uncompressed.
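
The sector arithmetic above can be checked with a small Python sketch.  The names "sectors_needed" and "stored_form" are invented for this illustration, and the rule coded here is only the one stated in this ReadMe (compress only when that saves at least one sector); it is not taken from DriveSpace itself.

```python
SECTOR_SIZE = 512
SECTORS_PER_CLUSTER = 64  # 32KB cluster / 512-byte sectors


def sectors_needed(num_bytes):
    """Number of whole 512-byte sectors needed to hold num_bytes (at least 1)."""
    return max(1, -(-num_bytes // SECTOR_SIZE))  # ceiling division


def stored_form(compressed_size, uncompressed_size):
    """Decide how a cluster would be stored, per the rule described above.

    uncompressed_size is the length of the cluster data up to its last
    non-zero sector (trailing all-zero sectors are not stored).
    Returns a (form, sector_count) pair.
    """
    raw = sectors_needed(uncompressed_size)
    packed = sectors_needed(compressed_size)
    if packed < raw:
        return ("compressed", packed)
    return ("uncompressed", raw)
```

For the 513-byte example in the text, sectors_needed(513) gives 2, so the cluster occupies two CVF sectors even though the second holds a single byte.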

(3)  Here are the parts of the CVF, in order (as best as I was able to determine, and as far as I remember):


Header  (at start of first sector of CVF file)
BitFat (follows header ???)
1st data area  (starts at 2nd or 3rd sector, I don't quite remember)
MDFAT   (somewhere around $80000 hex into the file, if I recall correctly (perhaps it was $60000?))
Mounted drive boot sector (uncompressed)
Mounted drive FAT (1 copy only, uncompressed)
Mounted drive root dir (usually uncompressed ???)
2nd data area
End sector


>>> The HEADER is some bytes of information right at the start of the file.  The first part of it looks just like a physical drive boot-sector entry, giving no. of sectors per cluster and so on, in exactly the same format.  Then there's extra bytes that pertain to drivespace files only, with information such as compression ratio, etc.... but I don't know what most of this stuff is (I tried playing with it but apart from altering the reported compression ratio, which has no real effect on anything except reported free space, everything I tried caused the CVF to fail to mount).  What I DO know is that for the same PC, OS, version of DriveSpace and host drive, the header is the same each time, regardless of the size of the drivespace volume you create.  This means that if the header info gets corrupted, you can just create a new drivespace volume, and then copy the first sector of the new CVF to the first sector of your corrupted one, and it'll be fixed.  (If the header changes depending on host drive, etc, I don't know.  I only had one host drive and system to play with).
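
A minimal Python sketch of the header transplant just described.  The function name "transplant_header" and the three file arguments are made up for this example, and the sketch assumes (as described above) that the header fits within the first 512-byte sector.  It deliberately writes to a COPY of the damaged CVF, never to the original.

```python
import shutil

SECTOR_SIZE = 512


def transplant_header(fresh_cvf, damaged_cvf, repaired_cvf):
    """Copy damaged_cvf to repaired_cvf, then overwrite the copy's first
    sector with the first sector of fresh_cvf.

    The damaged original is only read, never modified.
    """
    with open(fresh_cvf, "rb") as f:
        header = f.read(SECTOR_SIZE)
    if len(header) != SECTOR_SIZE:
        raise ValueError("fresh CVF is shorter than one sector")
    shutil.copyfile(damaged_cvf, repaired_cvf)
    with open(repaired_cvf, "r+b") as f:  # "r+b": update in place, no truncation
        f.write(header)
```

Whether a header taken from a different host drive or system would work is unknown, so the fresh CVF should be created on the same machine.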


>>> The BITFAT is a section of the CVF for which there is a single bit corresponding to each cluster of the mounted drive.  (I *think* that the bitfat follows the header, but I could be wrong, it might be after the mounted drive boot sector...).  I don't know what the bitfat is for, but the information stored in it seems to pertain only to the last disk operation performed - e.g. if you write a file to the drive, the bits in the bitfat corresponding to the file's clusters get set, AND ALL OTHER BITS ARE SET TO ZERO.  So it's probably only used internally by DriveSpace - it ISN'T USED FOR PERMANENT DATA STORAGE.  As far as data recovery goes, then, the contents of the bitfat are irrelevant.  YOU CAN SAFELY WIPE THE BITFAT, YOU WON'T LOSE ANY DATA.


>>> The MDFAT is the most important data structure within the CVF apart from the compressed disk data itself.  It is a bit like a standard hard drive FAT, except that its function is to record in which sectors within the CVF file each cluster of the mounted drive is stored.  It is arranged as a long list of 40-bit (5 byte) records, one for each cluster of the mounted drive, and it begins at a sector somewhere around offset $80000 into the CVF file (at least, it did on my system).  The first four (I think four?...) records (i.e. 20 bytes) are NOT used to map clusters, but thereafter each record corresponds to a cluster of the mounted drive - i.e. first record for first cluster, 2nd record for second cluster, and so on.  (The first four records are usually zero, but interestingly I tried deleting the drive's root directory from the CVF, and when I then went and saved a file to the drive, a new root directory was created and was pointed to by the 4th MDFAT record, so it's possible that the 1st four records were originally intended to point to the Boot Sector, 1st and 2nd FATS, and root dir respectively... but that's just my speculation.)
Each record contains the following information, formatted as follows:
--------------------------------------------------------------------------------------------------------------------------------------------------
bits 0-22:  Position of cluster information within the CVF file.  This is not a raw byte offset, rather it refers to the SECTOR at which the
(I think?)  cluster data begins - i.e. it is a byte offset divided by 512.  Also it does not start at the beginning of the file, but at the
            beginning of the 1st data area.  Thus "0" means the first sector in the 1st data area, "1" means the 2nd sector, and so on.
            The counting scheme is not affected by the fact that the MDFAT and other stuff divides the two data areas, so some values of this
            field never get used because they would point to the middle of the MDFAT or boot sector, etc.  In other words, as far as this count
            field is concerned, the whole CVF is just one contiguous bunch of sectors, starting with the first sector of the 1st data area.

bits 26-31: These six bits contain the (number of sectors-1) used to store the cluster.  A fully used uncompressed cluster takes up 64 sectors,
(I think?)  but with compression fewer sectors may be needed.  Also, since clusters are assumed to be full of zeros by default, it takes no
            storage space to record padding zeros at the end of a cluster, so even uncompressed clusters can sometimes be stored in less than 64
            sectors.

bits 32-37:  These six bits contain the (number of sectors-1) within the cluster that contain nonzero information.  In other words, this is the
(I think?)   approximate size of the buffer that will be needed to hold cluster information AFTER decompression.  If this is <63, the remaining
             sectors will appear on the mounted drive automatically filled with zeros.

4 other bits somewhere within the field:
  One bit is set to '1' if the cluster is allocated (belongs to a file) '0' otherwise.    (I think this is bit 39, the MSB)
  One bit is set to '1' if the cluster is stored in uncompressed form, '0' otherwise.     (I think this is bit 38)
  One bit is set to '1' if the cluster data is fragmented (i.e. not stored in one contiguous block), '0' otherwise.
  One bit is set to '1' if the cluster is "marked", '0' otherwise.  Clusters are marked when they have been recompressed by the "Compression Agent" utility, to prevent it attempting to recompress them again.  Marking serves no other purpose.

The 1 remaining bit within each field is unused as far as I know, and should be zero.

*** My description of these various flags and fields is accurate, as far as I know, but WHERE they are positioned within each 40 bit MDFAT record I only vaguely remember, and I could be wrong; so if you need to know for sure, do your own experiments to find out.

NB Remember that records are stored on disk least significant byte (LSB) first, so that when viewed in a hex editor, the first, leftmost byte represents bits 0-7, and so on.  The last byte (I think?) of an MDFAT entry is often "8x" or "Cx" (depending on if the data is compressed).

NB If the 'allocated' bit is zero, cluster data will still be visible on the mounted drive, but DriveSpace can erase it at any time to make room for new data - similarly to the contents of a deleted file.  I have noticed that some DOS 'unerase' utilities are not MDFAT aware, in that they check a newly undeleted file's FAT entries, but THEY DO NOT SET THE CORRESPONDING MDFAT ENTRIES BACK TO 'ALLOCATED'.  This means that your undeleted file might look fine, but then its contents might simply disappear sometime later when you write to the disk.  As this has never happened to me however, I conjecture that perhaps DriveSpace itself sets the MDFAT entries back to 'allocated' whenever it finds a file that is using them.
-------------------------------------------------------------------------------------------------------------------------------------------------
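
To make the record layout concrete, here is a Python sketch that decodes one 5-byte MDFAT record.  It follows the TENTATIVE bit positions given above (each marked "I think?"), so treat the field positions as assumptions to be confirmed by experiment.  The names "MdfatEntry", "decode_mdfat_entry" and "sector_to_byte_offset" are invented for this example, and the positions of the 'fragmented' and 'marked' bits are not decoded because they are not known.

```python
from collections import namedtuple

MdfatEntry = namedtuple(
    "MdfatEntry",
    "start_sector stored_sectors expanded_sectors uncompressed allocated raw",
)


def decode_mdfat_entry(record):
    """Decode a 5-byte MDFAT record (least significant byte first).

    Assumed layout, as remembered in this ReadMe:
      bits 0-22   start sector, counted from the 1st data area
      bits 26-31  (sectors used in the CVF) - 1
      bits 32-37  (sectors of the cluster holding non-zero data) - 1
      bit  38     1 = stored uncompressed
      bit  39     1 = allocated
    """
    if len(record) != 5:
        raise ValueError("an MDFAT record is exactly 5 bytes")
    value = int.from_bytes(record, "little")
    return MdfatEntry(
        start_sector=value & 0x7FFFFF,
        stored_sectors=((value >> 26) & 0x3F) + 1,
        expanded_sectors=((value >> 32) & 0x3F) + 1,
        uncompressed=bool((value >> 38) & 1),
        allocated=bool((value >> 39) & 1),
        raw=value,
    )


def sector_to_byte_offset(start_sector, first_data_area_offset):
    """Convert a start-sector field into a byte offset within the CVF file.

    first_data_area_offset is the byte offset of the 1st data area, which
    has to be found by experiment on your own system.
    """
    return first_data_area_offset + start_sector * 512
```

For example, the bytes 05 00 00 04 83 (as seen left to right in a hex editor) would decode under these assumptions to an allocated, compressed cluster that starts at sector 5 of the data area, occupies 2 sectors in the CVF, and expands to 4 sectors.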


>>> The MOUNTED DRIVE BOOT SECTOR, FAT, and ROOT DIR follow on immediately after the end of the MDFAT (I think), and are just uncompressed images of what they would be on the mounted drive (except there's only one FAT, or was on my system, anyway).  For some reason the Root directory is stored uncompressed, but if you remove it (zero out its entry in the CVF) it can sometimes be automatically re-created in compressed form.
  One important thing to know about these parts of the CVF and about the MDFAT as well, is that their position within the CVF doesn't change depending on the size of the CVF.  So, whether you create a DriveSpace volume of 1MB or 200MB, THE MDFAT ALWAYS BEGINS AT THE SAME POSITION OFFSET FROM THE BEGINNING OF THE FILE.  Indeed, even if you specify a DriveSpace volume of only 500k, the corresponding DRVSPACE.001 file will be over 1MB, because that's how big it has to be to put the MDFAT and the rest in the right place.


>>> The DATA areas contain the information from the clusters of the mounted drive.  Uncompressed data is just stored as-is in the CVF just as it would appear in a cluster of the mounted drive (except sectors filled with just '0' at the end of a cluster don't get stored).  Compressed data for a cluster begins with one of the 2-byte headers $4D4A or $5153 (the latter I think is for "ultrapack" compression, which is only done by the Compression Agent utility), followed by the compressed data.  NB once again note that in a hex editor the bytes appear LSB first, so you see sectors that begin with "4A 4D" or "53 51".  I don't know the format of the actual compressed data, but you don't need to know this to compress or decompress:  There are standard DOS interrupts and also Windows API routines to do these jobs, and they are documented under the heading of "MRCI" (Microsoft Realtime Compression Interface).  Check out any comprehensive reference on DOS interrupts for details, or look at the source code for the accompanying program "DCMPRESS.PAS".  Note that the decompression engine only gets loaded with Drivespace, so Drivespace MUST be loaded for these interrupt or API calls to work; i.e. you must have a MOUNTED drivespace volume on your system somewhere (if there are no drives to mount at boot-up, DriveSpace doesn't get loaded).
  Normally each cluster is stored in a contiguous set of (up to 64) sectors in the CVF.  However, if the DriveSpace drive has ever been really full, it is possible that a cluster was written to the drive, but that there were no unused areas of the CVF large enough to store all the cluster's data in one contiguous block.  In such a case, the cluster would be stored in the CVF in two or more pieces - it would be fragmented.  There's only one bit in a cluster's MDFAT entry that says whether it's fragmented or not, so information on where each fragment is, and how big it is, is stored at the start of the first fragment, before the data (or compression header).  The format for this "Fragment Header" is described below (it's accurate, I got it straight from the MSDN CD).  Usually there won't be any fragmented clusters anyway:  They only happen if the disk gets really, really full, and also, running Defrag eliminates (un-fragments) them (I think).  If a cluster is fragmented, the MDFAT entry for the number of sectors used to store the cluster contains the number of sectors used to store THE FIRST FRAGMENT ONLY (including fragment header).  For compressed fragmented clusters, the compressed data follows on immediately after the fragment header, but for uncompressed fragmented clusters, the actual data begins in the second sector, and the first sector contains only the fragment header, with the remainder of this sector being padded with the value 45544550 hexadecimal (that is, the repeating hexadecimal byte sequence 50 45 54 45).
-------------------------------------------------------------------------------------------------------------------------------------------------
The CVF entry for a fragmented cluster begins with the following fragment header, at the start of the first CVF sector used to store that cluster's data:

--- First 4 bytes store a count of (number of fragments - 1).
--- The remainder of the fragment header consists of a series of 4-byte records, one for each fragment (including the first).

--- The format of each such record is as follows:

bits 0-22:  Sector within the CVF where fragment begins.  This is interpreted the same way as the number in the first 23 bits of an MDFAT entry.
bits 23-25: RESERVED
bits 26-31: These six bits store (number of sectors in fragment - 1)

Note that each of these records is essentially formatted the same way as the lower 4 bytes of an MDFAT entry.
As before, remember that when using a disk editor, the order of bytes is LSB -> MSB, so the first or leftmost byte represents bits 0-7, etc.
-------------------------------------------------------------------------------------------------------------------------------------------------
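
The fragment header can be parsed along the following lines.  This Python sketch assumes the 4-byte count and the 4-byte records are stored least significant byte first, like everything else described here; the function name "parse_fragment_header" is made up for this example.

```python
import struct


def parse_fragment_header(sector_data):
    """Parse the fragment header at the start of a fragmented cluster.

    sector_data is the raw content of the first CVF sector of the cluster.
    Returns a list of (start_sector, sector_count) pairs, one per fragment,
    with start_sector counted from the 1st data area as in the MDFAT.
    """
    if len(sector_data) < 4:
        raise ValueError("need at least 4 bytes for the fragment count")
    (count_minus_one,) = struct.unpack_from("<I", sector_data, 0)
    count = count_minus_one + 1
    if 4 + 4 * count > len(sector_data):
        raise ValueError("fragment count does not fit in the data supplied")
    fragments = []
    for i in range(count):
        (rec,) = struct.unpack_from("<I", sector_data, 4 + 4 * i)
        start_sector = rec & 0x7FFFFF         # bits 0-22
        sector_count = ((rec >> 26) & 0x3F) + 1  # bits 26-31 hold (count - 1)
        fragments.append((start_sector, sector_count))
    return fragments
```

On corrupt data the count field may be garbage, which is why the sketch refuses any count that would run past the end of the buffer it was given.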


>>> The END SECTOR is just the last sector of the CVF file, and contains all zeros except for a two-byte code right at the start (or end? I forget).  Its sole purpose appears to be to signal the end of the file.










OK, I HAVE A CORRUPTED CVF, WHAT DO I DO TO GET MY DATA BACK?
=============================================================

Normally, if the CVF gets badly corrupted, the drive won't mount.  Alternatively, it might mount but appear to be empty or full of junk, or give errors when you try to read from it.  Note that just overwriting the data areas of the CVF WON'T CAUSE ANY SERIOUS PROBLEM:  You'll just lose whatever bit of data happens to get overwritten, but the mounted drive will still work normally otherwise (you may get read errors from it if compressed cluster data isn't valid).  Similarly, if the boot sector, FAT, or root dir gets corrupted, the drive will mount (maybe with an error message), and work just like a regular physical drive with the same problem.  So, the only kind of corruption for which specific knowledge of DriveSpace is required is if either the file header or the MDFAT gets corrupted (as I mentioned earlier, the contents of the BitFAT don't appear to matter much; corruption there would not stop you mounting the drive or accessing files as usual, as far as I know).

Fixing a corrupted CVF header is very easy:  Just create a new drivespace volume (it can be as small as you like, the size doesn't matter), and overwrite the first sector of your corrupt CVF with the first sector of the newly created one.  Then mount the drive, that's all.  (After checking your files are all there, you should probably run ScanDisk to fix up any less serious problems).

BUT STOP!!!   DON'T DO ANYTHING UNTIL YOU'VE READ THE REST!!!

If your problem is only that the header of your CVF has been overwritten, then running ScanDisk with the /MOUNT option should fix it (i.e. type SCANDISK DRVSPACE.00x /MOUNT on the command line, where x=0,1,2,... as necessary.  Note that this option does NOT show up on the help screen for Scandisk that you get when you type "ScanDisk /?", but it's there anyhow).  If however, the MDFAT is also corrupted, then Scandisk won't be able to fix it (obviously).  Scandisk /MOUNT will force the CVF to mount regardless, then zero-out any invalid MDFAT entries (as well as fixing or re-creating the mounted volume boot sector, FAT, and root dir, if they are corrupt).  This means that your data may still be in the CVF somewhere, but with no MDFAT entry pointing to it, it won't show up on the newly re-mounted drive - the drive may appear empty, or partially empty, wherever the MDFAT entries for the corresponding clusters have been zeroed out.  It may also contain some junk, wherever corrupt bytes "accidentally" look like valid MDFAT entries.  To get back data lost in this way, you have to simply trawl through the CVF (uncompressing compressed clusters as you go), searching for whatever it is you've lost.  That's what the accompanying programs are designed to do.
  Another potentially very serious problem is if your HOST drive gets corrupted, the FAT entries for the CVF might get erased or corrupted.  Now, CVF's are normally stored on the host drive in one or two contiguous blocks (at least on my system they were) and DriveSpace does not permit CVF's to get fragmented; so you shouldn't have too much of a problem putting the CVF back together, more-or-less.  However, if you are reconstructing the FAT entries for the CVF, it is easy to accidentally skip clusters or add extra clusters in at the start of the file or of the 2nd fragment.  This has the disastrous consequence of (1) moving the data within the CVF out of alignment from where the MDFAT says it should be, yielding a total mess (very likely, you won't be able to decompress data since you'll get the starting points wrong, and most clusters will just give "read error"s when you try to access them), and (2) even worse:  Shifting the MDFAT, boot record, mounted drive FAT, and root dir up or down a few clusters within the CVF will cause them to be inaccessible, or appear corrupted.  As a result, SCANDISK WILL OVERWRITE THEM!!!!!!!!!!!!!!!

>>> Even after the crash or whatever damaged your CVF, the MDFAT may be OK.  But if you've lost or gained a cluster or two at the start of the CVF in the process of repairing it, then running SCANDISK, or even just trying to mount the drive may destroy the MDFAT and make recovering your data much, much, MUCH harder.  I know, because that's what happened to me!

**************************************************************************************************************************************************
IF YOU HAVE A DRIVESPACE CVF THAT GETS DAMAGED IN SOME WAY, BACK IT UP (COPY THE .00x FILE) BEFORE ATTEMPTING TO MOUNT IT OR RUN SCANDISK ON IT!
**************************************************************************************************************************************************

So, if you've got a CVF that doesn't mount, or that mounts but doesn't contain the data you know should be there, here are the steps you should take:

(1)  Rename the CVF so that DriveSpace doesn't try to mount it.

(2)  If the problem involved damage to the host drive's FAT, repair it as you would for any file, but take particular care that you don't accidentally add or leave out clusters at the start of the CVF file (or 2nd fragment).  Remember that the start of the CVF should be a header that looks a lot like boot sector information.

(3)  Before attempting to mount the drive, use a hex editor to visually inspect the CVF and check that the MDFAT, boot sector, etc. are all in their correct locations.  To find out just where their correct locations are, create a new DriveSpace volume (small is OK, say 2MB), and create a second copy of the new DRVSPACE.00x file.  Save a short file (<512 bytes) in the new drive's root dir, and also use a disk editor (e.g. Norton DISKEDIT) to write some data (also <512 bytes) to the first cluster of the mounted drive (this is unnecessary if the file was already put in the first cluster, though).  After doing this, compare the mounted CVF with the copy you made before the operation (say using the MSDOS FC utility):  You should find the following differences:

i.   One or two bits in the BitFat will have changed.  You can ignore this, except to note where the BitFat is.

ii.  The root directory will have changed.  Look for your filename in the CVF, you'll find out where the root dir is supposed to be.

iii. The FAT will have changed.  It'll be before the root dir and after the boot sector, and since you know (or can find out from its directory
     entry) which cluster your file was saved to, you can work out where the start of the FAT is.

iv.  The boot sector will have changed, since it contains some bytes recording free space on the drive.  You should be able to recognize the boot
     sector data;  it'll look almost exactly like the boot sector data for your host drive.  Note that the boot sector has a few dozen bytes of
     information at the start, and then is just zeros mostly.  The free space bytes are not near the start, though.

v.   The actual data you wrote to your disk will show up somewhere in the CVF.  It'll be uncompressed, so you can search for it if you want, since
     clusters containing <512 bytes fit entirely into one sector, and no space would be saved by using compression.  Where it is isn't important.

vi.  Finally, and most importantly, the MDFAT will have changed.  There will now be a non-zero entry in it for each cluster written to the disk.
     The entry pointing to the data in the mounted drive's first cluster will be located at the very start of the MDFAT (or at least, 20 bytes
     from the start), so you can use it to find out where the MDFAT begins.


(4)  If your damaged CVF seems to have the MDFAT or other structures shifted a few clusters away from where they ought to be, then you've added or skipped clusters when you reconstructed the file, and you should go back and fix this.

(5)  If your MDFAT appears to be missing or completely overwritten or corrupt, go directly to step (10).

(6)  BACK UP YOUR DAMAGED CVF if at all possible, then try to mount it.  If you can't, try SCANDISK drvspace.00x /MOUNT.

(7)  Can you see your files?  Is the data OK?  If so, copy everything off the DriveSpace drive to another drive, delete the DriveSpace drive, back up all your important files regularly, and for god's sake don't use DriveSpace for anything at all, ever again. (Buy a bigger HDD and use FAT32 instead!).

(8)  If you can't see the files that should be there, or if they don't contain the right data, it might be because directories or the FAT got corrupted.  Check for this and attempt to fix it just as you would for a physical drive.  Use a disk editor to search for the *contents* of your files on the mounted drive.  If they are there somewhere, the problem is almost certainly the FAT not the MDFAT.  If none of this seems to be the case, or if you get disk errors or read errors when trying to read from the drive, then...

(9)  There may be corruption in the MDFAT.  Check it manually for the files or directories of interest, using the info I've provided.  An alternative cause is that an extra or a skipped cluster (inserted or removed AFTER the MDFAT position in the CVF) has moved the actual data which the MDFAT points to.  Look at what the various entries in the MDFAT point to.  For compressed clusters, the MDFAT should point to a sector beginning with "4A 4D" or "53 51" (exception: fragmented clusters).  For uncompressed data, the MDFAT very probably OUGHT to be pointing to the start of the uncompressed data (the exception is if two uncompressed clusters are stored one after the other in the CVF).  If these conditions aren't being met, see if they would be if everything was moved up or down a whole number of clusters.  If that fixes the problem, then insert or remove clusters from the CVF as necessary, and go back to step (7).

(10) If you still can't access the files you want to recover, then either they have really been overwritten (in which case they are gone, and there's absolutely nothing you can do about it), or else (more likely) they are still there, but the MDFAT entry pointing to them is gone.  If this is the case, there are two strategies you can use to recover them.  One is to attempt to reconstruct the MDFAT.  The other is to ignore the MDFAT and attempt to reconstruct your files directly.  Both of these require that you painstakingly trawl through the data in the CVF, looking for fragments of your important files, and working out which fragments belong to which files, and how they fit together.  Fortunately, the fragments are 32K in length (unless you've got fragmented clusters), so small files will usually be in one piece.  Also, if you've used DeFrag on the drive regularly, you may be lucky and many of your larger files may turn out to be in one piece as well.  The programs accompanying this ReadMe file are designed to help you perform this task.  Instructions on their use follow.
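
As a rough aid for steps (9) and (10), the following Python sketch lists every sector of a CVF copy that begins with one of the two compression signatures mentioned earlier ("4A 4D" or "53 51" as seen in a hex editor).  It is only an illustration and is not one of the kit's programs; the function name "find_compressed_sectors" and the labels it returns are made up here.  Expect some false hits, since uncompressed data can begin with the same two bytes by chance, and note that fragmented clusters start with a fragment header instead.

```python
SIGNATURES = {
    b"\x4A\x4D": "$4D4A (standard compression)",
    b"\x53\x51": "$5153 (possibly UltraPack)",
}


def find_compressed_sectors(cvf_path, sector_size=512):
    """Return (sector_index, label) for each sector that starts with a
    known compression signature.

    sector_index counts 512-byte blocks from the START OF THE FILE, not
    from the 1st data area, so subtract the data area's starting sector
    before comparing with MDFAT start-sector values.
    """
    hits = []
    with open(cvf_path, "rb") as f:
        index = 0
        while True:
            sector = f.read(sector_size)
            if not sector:
                break
            label = SIGNATURES.get(sector[:2])
            if label is not None:
                hits.append((index, label))
            index += 1
    return hits
```

If the MDFAT entries for compressed clusters consistently point a fixed number of sectors before or after the hits found this way, that suggests the CVF contents have been shifted by inserted or missing clusters.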







USING THE ACCOMPANYING PROGRAMS:
================================

The 4 programs included in this kit are intended to help you manually decompress the compressed data in a CVF, and then to recover fragmented files from the resulting decompressed but probably scrambled data. To do the recovery, you'll need (in addition to the CVF) a hard drive with enough space to hold the entire *decompressed* contents of the corrupted drive.  You'll also need to have DriveSpace installed and active in order for the "DCMPRESS" program to work (i.e. there has to be a mounted DriveSpace volume somewhere on your system - but NOT the damaged one; you should rename the damaged CVF so DriveSpace doesn't recognize it).  Plus you'll need a hex editor or viewer like Norton Disk Editor.  I also found the simple 4DOS "list" utility very useful, since it's hard to navigate through long files in DiskEdit (note that most general-purpose editors do NOT cope well with trying to open files 100's of MB long!!!).  Be warned, though, LIST does seem to have a few bugs.

IMPORTANT TIP FOR USERS OF DISKEDIT:
I've discovered that when searching through a long file, the version of DiskEdit I was using searched MUCH faster (maybe 10 times faster) if I started it with the name of the file as a command line parameter, rather than navigating to the file later on.
i.e. you should start diskedit by typing "DISKEDIT myfile" rather than just "DISKEDIT".

Note that to use DiskEdit (or any other disk editor) to WRITE to a disk, you have to be in MS-DOS mode (Not just a DOS window, you need to use the "Boot in MS-DOS mode" or "Command prompt only" option), and you have to lock the drive you want to write to first.  If it's drive C: you wish to write to, this is accomplished by typing "LOCK C:" at the DOS prompt.



(1) DCMPRESS
------------
The purpose of DCMPRESS is to decompress all the compressed data within a CVF whose MDFAT has been damaged, so that the data becomes accessible.

The usage for DCMPRESS is as follows:    DCMPRESS <corrupt CVF> <uncompressed data file> <cluster info file>

In order for DCMPRESS to work, DriveSpace must be loaded.  The easiest way to ensure this is to create a new, empty DriveSpace volume on one of the drives of your system.  DCMPRESS ought to work under Windows, but nevertheless I recommend running it in MS-DOS mode.  Note that DCMPRESS does not alter the CVF in any way, it simply creates 2 new files, with names and in locations of your choice, so you cannot cause further damage to your CVF by running DCMPRESS.  DCMPRESS works as follows:
  DCMPRESS simply reads through the entire CVF, looking for sectors that begin with the compression header "4A 4D" or "53 51", then it reads in these sectors and the sectors that immediately follow, up to either (i) 64 sectors or (ii) the next sector that begins with "4A 4D" or "53 51", whichever is less.  This group of sectors is then passed to the DriveSpace decompression engine, which decompresses them into a 32K cluster (padded at the end with zeros, if necessary), which is written to the uncompressed data file.  If the decompression engine returns an error, the number of the cluster (within the uncompressed data file, starting at zero) is written (in binary form) into the cluster info file.  Errors may occur because the data is corrupt, but another very common cause is that the data did not decompress to fill a whole cluster.  DCMPRESS makes no use whatever of MDFAT information, so it does not know how much space a given chunk of compressed data is supposed to fill.  DCMPRESS simply assumes that a whole cluster's worth will be filled.  But, if the original cluster had empty sectors at the end (which is often the case for the last cluster of a file - it won't be completely full), then the decompression engine will return an error, even though the data decompressed correctly.  So, the cluster info file can be treated as an approximate list of end-of-file clusters (as well as corrupt clusters).
  Note also that the uncompressed data file contains ONLY the data that was originally compressed within the CVF;  data that was stored in the CVF in uncompressed form will not be copied into this file.
  DCMPRESS is not foolproof:  Since it blindly looks for sectors beginning with "4A 4D" or "53 51", it will NOT decompress information from fragmented clusters (which begin with a fragment header, not "4A 4D", etc.).  Also, it may attempt to decompress clusters that do not contain compressed data, if they happen to begin with "4A 4D" or "53 51" (this is harmless, it'll just leave an empty error cluster in the uncompressed data file). It may also truncate data in a valid compressed cluster, if that cluster's compressed information happens to contain a sector that (coincidentally) begins with "4A 4D" or "53 51" mid-way through.
  When DCMPRESS is running, it will display a message after each MB of CVF that it has processed, to keep you apprised of its progress.
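
To make the scanning rule described above concrete, here is a minimal Python sketch of the grouping step only.  It is NOT the DCMPRESS source: the function name find_compressed_groups is made up for illustration, and 512-byte sectors are assumed.  It does no decompression (that needs the DriveSpace engine); it only lists where each candidate compressed cluster starts and how many sectors would be handed to the engine.

```python
SECTOR = 512                      # assumed sector size in bytes
MAX_SECTORS = 64                  # at most 64 sectors per compressed cluster
SIGNATURES = (b"\x4A\x4D", b"\x53\x51")

def find_compressed_groups(cvf_bytes):
    """Yield (byte_offset, sector_count) for each candidate compressed cluster."""
    total = len(cvf_bytes) // SECTOR

    def is_start(n):
        # True if sector n begins with "4A 4D" or "53 51"
        return cvf_bytes[n * SECTOR:n * SECTOR + 2] in SIGNATURES

    n = 0
    while n < total:
        if not is_start(n):
            n += 1
            continue
        end = n + 1
        # extend until 64 sectors are taken or the next signature sector appears
        while end < total and end - n < MAX_SECTORS and not is_start(end):
            end += 1
        yield n * SECTOR, end - n
        n = end
```

Because it only needs len() and slicing, the same function should also accept a memory-mapped file (Python's mmap module) instead of a bytes object, which matters for CVFs of 100's of MB.  Note that it shares the weaknesses listed above: a data sector that happens to begin with one of the signatures will cut a group short.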



(2) MAPRECVR
------------
Often, a file stored within the CVF will have some of its clusters compressed, and some uncompressed (particularly the last cluster, if it contains <512 bytes).  So if you find part of a file in the "uncompressed data file" produced by DCMPRESS, other parts of it may still be in the CVF in uncompressed form, near where the compressed data for that file came from.  To help you work out where, in the CVF, a particular cluster from the "uncompressed data file" originally came from, MAPRECVR produces a list relating each cluster in the "uncompressed data file" to the sector in the CVF which it came from.

The usage for MAPRECVR is as follows:    MAPRECVR <corrupt CVF> <output file> [<cluster info file>]

MAPRECVR does not require DriveSpace, and will happily run in a Windows DOS box.  Its only effect is to create an output file with a name and at a location of your choosing.  The output of MAPRECVR is a file containing a very long list, a section of which might look something like this:


     .
     .
     .
00003200  00028000
00003A00  00030000 *
00005200  00038000 *
     .
     .
     .

The numbers are byte offsets, in hexadecimal.  The numbers in the left hand column are the positions within the CVF file where cluster information was decompressed from, and the corresponding numbers in the right hand column are the positions in the "uncompressed data file" generated by DCMPRESS, where the cluster information was written to.  If the optional parameter [<cluster info file>] is provided to MAPRECVR, then the output file will contain an asterisk next to each list entry that generated a decompression error.  Often this just means that that cluster is the last cluster of a file.
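
If you'd rather look entries up by program than by eye, a list like the sample above can be read with a few lines of Python.  This is only a sketch: it assumes every entry is laid out exactly like the three sample lines (two hex numbers, then an optional asterisk), and the names parse_map and cvf_offset_for are hypothetical, not part of the kit.

```python
def parse_map(lines):
    """Return a list of (cvf_offset, data_offset, had_error) tuples."""
    entries = []
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue                      # blank or filler line
        try:
            cvf_off, data_off = int(parts[0], 16), int(parts[1], 16)
        except ValueError:
            continue                      # not a pair of hex numbers
        entries.append((cvf_off, data_off, "*" in parts[2:]))
    return entries

def cvf_offset_for(entries, data_offset, cluster=0x8000):
    """Find the CVF offset of the cluster that contains data_offset."""
    base = data_offset - data_offset % cluster   # start of the 32K cluster
    for cvf_off, data_off, _ in entries:
        if data_off == base:
            return cvf_off
    return None
```

For example, with the sample entries, an interesting string found at offset 30010 (hex) in the uncompressed data file belongs to the cluster that was decompressed from offset 3A00 (hex) in the CVF, so that is where to start looking for neighbouring uncompressed sectors.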
  Note that due to limitations in the way MAPRECVR was written, it cannot handle cluster info files of more than (approx) 640K in size, and will give an out-of-memory error in such cases.  As I noted earlier, these are quick-and-dirty hacker's tools, not polished applications!  If you have this problem, you can still use MAPRECVR without the optional cluster info file, or you can divide your cluster info file into two or more parts and use them one at a time:  Only the sections of the list corresponding to the part you use will get marked with an asterisk.  (The cluster info file is just a sorted list of cluster numbers, 4 bytes each.  It has no other structure).
  When MAPRECVR is running, it will display a message after each MB of CVF that it has processed, to keep you apprised of its progress.
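
Since the cluster info file is nothing but 4-byte cluster numbers, dividing it up is easy as long as you cut on a 4-byte boundary.  Here is a minimal Python sketch of that workaround.  The function names and the 512K default piece size are my own choices for illustration, and the byte order used for decoding (little-endian, unsigned) is an ASSUMPTION based on the tools being PC programs; the splitting itself does not depend on it.

```python
import struct

def read_cluster_info(data):
    """Decode raw cluster info bytes into a list of cluster numbers.

    Assumes little-endian unsigned 32-bit values (not confirmed by the kit).
    """
    count = len(data) // 4
    return list(struct.unpack("<%dI" % count, data[:count * 4]))

def split_cluster_info(data, max_bytes=512 * 1024):
    """Split raw cluster info bytes into pieces of at most max_bytes each,
    always cutting on a 4-byte boundary so no cluster number is torn in half."""
    step = max_bytes - max_bytes % 4
    if step <= 0:
        raise ValueError("max_bytes must be at least 4")
    return [data[i:i + step] for i in range(0, len(data), step)]
```

Each piece can then be written to its own file and passed to MAPRECVR (or CHKRECVR) in a separate run.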



(3) RECOVER and CHKRECVR
------------------------
Once you've uncompressed the data in your CVF, and generated a map file using MAPRECVR, you then face the long and thankless task of trawling through both the original CVF and the "uncompressed data file" produced by DCMPRESS, looking for fragments of your important files.  Usually there is just way too much information to look through visually in any detail, and you have to use your editor's "search" feature to look for things that you think your files are likely to contain.  Good things to search for are file headers (e.g. all MS WORD files begin with a characteristic sequence of bytes), as well as names or phrases that you remember your files included.  Files are also often marked with the name or username of the person who created them, so this is something else to look for.  You may also want to look for directory information, which might remind you which files you need to look for, and will also tell you what the file sizes are supposed to be (also the starting cluster of the file, which can be a rough guide to where in the CVF it was stored).
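
If your editor's search is too slow on files this size, a short script can do the same job.  The Python sketch below (not part of the kit; find_all is a made-up name) reads a file in 1 MB pieces and reports every offset where a byte pattern occurs, including matches that straddle two pieces.  As an example pattern, Word 6/95/97 documents are OLE compound files, which begin with the bytes D0 CF 11 E0 A1 B1 1A E1; other file types have their own headers.

```python
WORD_DOC_HEADER = bytes.fromhex("D0CF11E0A1B11AE1")   # OLE compound file signature

def find_all(f, pattern, chunk_size=1 << 20):
    """Yield the offset of every occurrence of pattern in binary file object f."""
    overlap = len(pattern) - 1
    base = 0                      # file offset of buf[0]
    buf = b""
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        pos = buf.find(pattern)
        while pos != -1:
            yield base + pos
            pos = buf.find(pattern, pos + 1)
        # keep only the tail that could begin a match spanning two chunks
        keep = buf[-overlap:] if overlap else b""
        base += len(buf) - len(keep)
        buf = keep
```

Open the uncompressed data file (or the CVF) with open(name, "rb") and pass the file object in.  In the uncompressed data file, a hit at an offset that is a multiple of 8000 (hex) is at the start of a cluster, which is where a file header ought to be.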
  If you had to recover a file fragment separately each time you found one, the job would take forever.  A much better idea is just to keep track of all the fragments you find, working out which goes with which and for which file, and then recover all your files in one go at the end.  That is what RECOVER and CHKRECVR are designed for.

The usage for these two programs is as follows:

CHKRECVR  <recovery info file> <output file> [<cluster info file>]
RECOVER  <corrupt CVF> <uncompressed data file> <recovery info file>

The "cluster info file" (as produced by DCMPRESS) is an optional argument to CHKRECVR.  Once again, a limitation of CHKRECVR is that it can run out of memory if the input files you pass it are too big.  As with MAPRECVR, you can work around this problem by dividing them into pieces.  Neither of these two programs does anything other than create new files on disk, neither requires DriveSpace, and they will both run happily in a Windows DOS box.
  The "recovery info file" is where you store the information you have gathered about the locations and sizes of the various fragments of your files.  For each file you wish to recover, it contains the name of the file (path may be included), optionally the size that the file ought to be, and then an ordered list of the fragments making up the file.  The format is as follows:


>>>  A semicolon and anything following it up to the end of the line is ignored, so semicolons may be used for comments.  ; like this!

>>>  Whitespace (spaces, tab characters, etc.) is ignored.

>>>  The entry for each file begins with a '#' character as the first character on the line.  This is optionally followed by an 8.3 format filename, which can include a path (NB long file names are not supported).  If the filename is omitted, the file will be called "UNKNOWN.xxx", where xxx is a number.  After this (on the same line), there may be a comma and a file length.  The file length (in bytes) can be a decimal number, or it can be given in hexadecimal with a '$' sign in front.  Here are a few examples:

--------------------------------------------------------------------------------------------------------------------------------------------------
# Myfile.doc, 56729          ; a word document, file length 56729 bytes

< fragment data >            ; this is where you put the list of file fragments you found for Myfile.doc

# ,$36aF                     ; an unknown file, length 0x36AF hexadecimal

< fragment data >

#                            ; an unknown file, of unknown length
< fragment data >            

# Mypict.gif                 ; a gif file, of unknown length 
< fragment data >
--------------------------------------------------------------------------------------------------------------------------------------------------

>>>  The fragment data consists of a list of fragments, over one or more lines, in the order that they should be put together to reconstruct the file.

>>>  Each fragment is described either by a single hexadecimal number (no preceding '$' this time), or by a range in the form (<start>-<finish>).

>>>  Characters 'a' to 'f' and 'A' to 'F' are equally valid in hexadecimal numbers. Leading zeros are ignored.

>>>  Multiple fragments on the same line must be separated by commas.

>>>  By default each number represents a byte offset within the "uncompressed data file", but if the first character on the line is a "U", then numbers on that line represent byte offsets within the CVF instead.

>>>  Each fragment must begin at the start of a cluster (uncompressed data file) or sector (CVF), and if it isn't at the start of a cluster/sector, the starting address is rounded down so that it is.

>>>  When a range is specified, it is INCLUSIVE, EXCEPT when the finish value is the first byte of a cluster/sector.  In other words, "1E400-1E619" represents a range of 512+26 = 538 bytes, but if "1E400-1E800" is a range to be taken from the CVF (i.e. if it is on a line that begins with the letter "U"), then it represents the two sectors beginning at 1E400 and 1E600 respectively, and does NOT include the sector starting at 1E800.  The same effect would be accomplished by writing "1E400-1E7FF".

>>>  If a range is not a whole number of clusters/sectors, then it is rounded up, unless it is the LAST fragment of a file.  So the line:

U 1E400-1E619, 1E800, 1F000          ; part of a sector omitted mid-file

would have the same effect as:

U 1E400-1EA00, 1F000


>>>  The last fragment of a file may be written as a range ending in the dot '.' character, like this:

  2838000-28B8000, 28E0000, 2D70000-.

If no file length was provided, then this will work the same as if the "-." wasn't there.  But if a file length was provided for the file, the size of the final fragment will be computed so that the file size is correct.
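
The rounding and inclusive/exclusive rules above are easy to get wrong by hand, so here is a minimal Python sketch of how I read them.  It is NOT the code RECOVER uses; resolve and final_fragment_size are hypothetical names, and the unit sizes (200 hex bytes per sector, 8000 hex bytes per cluster) are taken from the examples in this ReadMe.

```python
SECTOR, CLUSTER = 0x200, 0x8000

def resolve(fragment, in_cvf=False):
    """Return (start, end) byte offsets, end exclusive, for a NON-final fragment.

    in_cvf=True corresponds to a line beginning with "U" (sector units)."""
    unit = SECTOR if in_cvf else CLUSTER
    first, sep, last = fragment.strip().partition("-")
    start = int(first, 16) // unit * unit            # round start down
    if not sep:
        return start, start + unit                   # single number: one unit
    finish = int(last, 16)
    # a finish on a unit boundary is exclusive, anything else is inclusive
    end = finish if finish % unit == 0 else finish + 1
    end = -(-end // unit) * unit                     # round up to a whole unit
    return start, end

def final_fragment_size(file_length, earlier_fragments):
    """Bytes a closing "-." fragment must supply, given the (start, end)
    pairs of all earlier fragments and the stated file length."""
    used = sum(end - start for start, end in earlier_fragments)
    if file_length <= used:
        raise ValueError("earlier fragments already cover the stated length")
    return file_length - used
```

With these rules, resolve("1E400-1E619", in_cvf=True) and resolve("1E400-1E7FF", in_cvf=True) both come out as the two sectors from 1E400 up to (but not including) 1E800.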

Here is an example of a valid entry for a single file:

--------------------------------------------------------------------------------------------------------------------------------------------------
# C:\MYDOCU~1\Myfile.doc, 99374629    ; A very long document!

  FC68000-FE90000, 521B0000, 121F8000-122C8000     ; clusters from the uncompressed data file
  3ADF0000, 99C8000-99E8000                        ; note that fragments may not be in correct order in the CVF or uncompressed data file!!!
                                                   ; blank lines are ignored
U 668200-698200                                    ; 6 clusters that never got compressed, stored in the CVF in uncompressed form

  15B98000-.                                       ; takes as large a fragment as necessary to make the file 99374629 bytes long altogether.

; comments can go anywhere!!

# Nextfile.xxx                        ; entry for the next file starts here
--------------------------------------------------------------------------------------------------------------------------------------------------
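
For anyone wanting to generate or sanity-check recovery info files with a script, the '#' lines shown in the examples can be picked apart as in the Python sketch below.  It follows the format rules as written in this ReadMe (comments after ';', whitespace ignored, optional name, optional length in decimal or '$' hex); parse_header is a made-up name and this is not how CHKRECVR itself is implemented.

```python
def parse_header(line):
    """Parse a '#' line into (filename or None, length in bytes or None)."""
    line = line.split(";", 1)[0]             # drop any comment
    if not line.startswith("#"):
        raise ValueError("a file entry must have '#' as its first character")
    body = "".join(line[1:].split())         # whitespace is ignored
    name, _, length = body.partition(",")
    size = None
    if length:
        # '$' prefix means hexadecimal, otherwise decimal
        size = int(length[1:], 16) if length.startswith("$") else int(length)
    return (name or None, size)
```

A missing name comes back as None; RECOVER would name such a file "UNKNOWN.xxx".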



The RECOVER program actually performs the recovery operation, and creates the files that you specify in the recovery info file.  The CHKRECVR program can be used first, to check that your recovery info file is OK.  It performs the following checks:

>>>  It checks that the syntax of your input file is OK.
>>>  It checks that no two files have the same name.
>>>  It checks that no two files overlap (i.e. have clusters or sectors in common).
>>>  When the file size for a file is provided, it checks that it contains the appropriate number of clusters for its specified size, and if the
     final fragment is a range, it checks that it works out to exactly the right size.
>>>  If the optional "cluster info file" parameter is used, then it also checks that no file contains a cluster with a decompression error in the
     middle of the file (but at the end is OK, as discussed above).
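
As an illustration of the overlap check, here is one way it could be done, sketched in Python.  This is an assumption about the technique (sort the fragments, then sweep through them once), not a description of CHKRECVR's actual code, and find_overlaps is a hypothetical name.  Each fragment is given as (source, start, end) with end exclusive, where source says whether the offsets refer to the CVF or to the uncompressed data file.

```python
def find_overlaps(files):
    """files maps a file name to a list of (source, start, end) fragments.

    Returns (name_a, name_b, source, offset) for each place where a fragment
    of one file starts inside a fragment belonging to a different file."""
    spans = sorted((src, start, end, name)
                   for name, frags in files.items()
                   for src, start, end in frags)
    clashes = []
    prev = None            # span reaching furthest so far within this source
    for span in spans:
        src, start, end, name = span
        if prev and prev[0] == src and start < prev[2]:
            if prev[3] != name:
                clashes.append((prev[3], name, src, start))
            if end > prev[2]:
                prev = span
        else:
            prev = span
    return clashes
```

Offsets in the CVF and offsets in the uncompressed data file are kept apart by the source field, so the same number appearing on a "U" line and on an ordinary line is not reported as a clash.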

CHKRECVR outputs a list of the errors that it finds to the output file you specify.  This can then be viewed with any text editor. Note that you may ignore these errors if you so choose;  RECOVER will run regardless.


==================================================================================================================================================
==================================================================================================================================================
==================================================================================================================================================

That's about all I know about DriveSpace, and all I can tell you about recovering data from damaged CVF's.

I hope it helps, and good luck getting your data back!!!

==================================================================================================================================================
==================================================================================================================================================
==================================================================================================================================================

If this kit has helped you, you may express your appreciation by recompensing me for the time it took to create it, by sending me a monetary donation appropriate to the use you got out of it.

No amount is too large or too small, and cash ($US or $Australian) is preferred over cheques, if possible (appropriately wrapped & concealed for postage, of course).

Send your donation to:


(before August '99:)

DEAN TROWER
Flat 3, 9-11 Browns Rd
CLAYTON, VIC
AUSTRALIA 3168



(after August '99:)

DEAN TROWER

672A Orrong Rd.
TOORAK, VIC
AUSTRALIA 3142


==================================================================================================================================================
==================================================================================================================================================
==================================================================================================================================================