Comparison of `tar' and `cpio'
==============================
*(This message will disappear once this node is revised.)*
Here is a summary of differences between `tar' and `cpio'. The
accuracy of the following information has not been verified. The
following people contributed to this section, mainly through a survey
conducted in 1991. The remainder of this section does not otherwise
try to relate topics to people.
Bent Bertelsen dmdata@login.dkuug.dk
David Hoopes talgras!david
Guy Harris guy@auspex.com
Kai Petzke wpp@marie.physik.tu-berlin.de
Kristen Nielsen dmdata@login.dkuug.dk
Leslie Mikesell les@chinet.chi.il.us
FIXME: Reorganize the following material
`tar' handles symbolic links in the form in which it comes in BSD;
`cpio' doesn't handle symbolic links in the form in which it comes in
System V prior to SVR4, and some vendors may have added symlinks to
their system without enhancing `cpio' to know about them. Others may
have enhanced it in a way other than the way I did it at Sun, and which
was adopted by AT&T (and which is, I think, also present in the `cpio'
that Berkeley picked up from AT&T and put into a later BSD release--I
think I gave them my changes).
(SVR4 does some funny stuff with `tar'; basically, its `cpio' can
handle `tar' format input, and write it on output, and it probably
handles symbolic links. They may not have bothered doing anything to
enhance `tar' as a result.)
`cpio' handles special files; traditional `tar' doesn't.
`tar' comes with V7, System III, System V, and BSD source; `cpio'
comes only with System III, System V, and later BSD (4.3-tahoe and
later).
`tar''s way of handling multiple hard links to a file can handle
file systems that support 32-bit i-numbers (e.g., the BSD file system);
`cpio''s way requires you to play some games. In its "binary" format,
i-numbers are only 16 bits, and in its "portable ASCII" format, they're
18 bits, so `cpio' would have to play games with the "file system ID"
field of the header to make sure that the file system ID/i-number pairs
of different files were always different. I don't know which `cpio's,
if any, play those games. Those that don't might get confused, think
two files are the same file when they're not, and make hard links
between them.
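The collision risk described above can be sketched in a few lines of
Python. This is only an illustration: it assumes a hypothetical `cpio'
that simply truncates the i-number to the width of the header field
(16 bits for the binary format, 18 bits for the six-octal-digit
portable ASCII field). Real implementations may behave differently,
and the name `truncated_ino' is made up for this example.

```python
def truncated_ino(ino: int, bits: int) -> int:
    """Keep only the low `bits` bits of an i-number (assumed behaviour)."""
    return ino & ((1 << bits) - 1)

# Two distinct files on a file system with 32-bit i-numbers.
ino_a = 5
ino_b = 5 + (1 << 18)  # differs from ino_a only above bit 17

# After truncation the two i-numbers can no longer be told apart.
binary_clash = truncated_ino(ino_a, 16) == truncated_ino(ino_b, 16)
ascii_clash = truncated_ino(ino_a, 18) == truncated_ino(ino_b, 18)
print(binary_clash, ascii_clash)
```

If both files also carry the same file system ID, an extractor that
matches on the ID/i-number pair would treat them as links to one file.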
`tar''s way of handling multiple hard links to a file places only one
copy of the file's contents on the tape, but the name attached to that
copy is the *only* one you can use to retrieve the file; `cpio''s way
puts one copy for every link, but you can retrieve it using any of the
names.
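As a rough illustration of the `tar' side of this, the sketch below
uses Python's standard `tarfile' module to write an in-memory archive
holding one regular file and one hard-link entry that points at it.
The file names `a.txt' and `b.txt' are invented for the example, and
the sketch shows only how the entries are stored, not how any
particular `tar' program behaves on extraction.

```python
import io
import tarfile

data = b"payload"
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
    # First name: a regular file entry that carries the data.
    first = tarfile.TarInfo("a.txt")
    first.size = len(data)
    tf.addfile(first, io.BytesIO(data))

    # Second name: a hard-link entry that only refers back to a.txt.
    link = tarfile.TarInfo("b.txt")
    link.type = tarfile.LNKTYPE
    link.linkname = "a.txt"
    tf.addfile(link)

# Read the archive back and inspect the two entries.
buf.seek(0)
with tarfile.open(fileobj=buf) as tf:
    members = {m.name: m for m in tf.getmembers()}

print(members["a.txt"].isreg(), members["a.txt"].size)
print(members["b.txt"].islnk(), members["b.txt"].linkname, members["b.txt"].size)
```

The link entry records the name of the first copy and carries no data
of its own, which is why the data is reachable only through that name.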
>What type of checksum (if any) is used, and how is this
calculated?
See the attached manual pages for `tar' and `cpio' format. `tar'
uses a checksum which is the sum of all the bytes in the `tar' header
for a file; `cpio' uses no checksum.
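A minimal Python sketch of that checksum follows. It relies on one
detail not spelled out above: while the sum is taken, the eight-byte
checksum field itself (header offsets 148 to 155) is counted as ASCII
spaces. The helper name `tar_header_checksum' is made up here, and the
header is built with Python's standard `tarfile' module purely so there
is something to check against.

```python
import tarfile

def tar_header_checksum(header: bytes) -> int:
    """Sum the 512 header bytes as unsigned values, counting the
    8-byte checksum field (offsets 148-155) as ASCII spaces."""
    if len(header) != 512:
        raise ValueError("a tar header is exactly 512 bytes")
    blanked = header[:148] + b" " * 8 + header[156:]
    return sum(blanked)

# Build a header with the standard library and compare checksums.
info = tarfile.TarInfo(name="hello.txt")
info.size = 0
header = info.tobuf(format=tarfile.USTAR_FORMAT)

stored = int(header[148:154], 8)  # six octal digits written by tarfile
computed = tar_header_checksum(header)
print(stored == computed)
```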
>If anyone knows why `cpio' was made when `tar' was present at
the unix scene,
It wasn't. `cpio' first showed up in PWB/UNIX 1.0; no
generally-available version of UNIX had `tar' at the time. I don't
know whether any version that was generally available *within AT&T* had
`tar', or, if so, whether the people within AT&T who did `cpio' knew
about it.
On restore, if the tape is corrupted at some point, `tar' will stop
at that point, while `cpio' will skip over it and try to restore the
rest of the files.
The main difference is just in the command syntax and header format.
`tar' is a little more tape-oriented in that everything is blocked
to start on a block boundary.
>Are there any differences between the two in their ability to
recover crashed archives? (Is there any chance of recovering
crashed archives at all?)
Theoretically it should be easier under `tar', since the blocking
lets you find a header with some variation of `dd skip=NN'. However,
modern `cpio's and their variants have an option to search for the
next file header after an error, with a reasonable chance of
re-syncing. In practice, lots of tape driver software won't let you
continue past a media error, which should be the only reason for
getting out of sync unless a file changed size while you were writing
the archive.
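The idea of finding the next `tar' header by stepping through block
boundaries can be sketched in Python as below. This is a simplified
illustration, not how any particular `tar' recovers: it treats a
512-byte block as a header candidate when its stored checksum matches
the sum of its bytes (checksum field counted as spaces). The helpers
`looks_like_header' and `find_headers' and the file names are invented
for the example.

```python
import io
import tarfile

BLOCK = 512

def looks_like_header(block: bytes) -> bool:
    """True if the stored checksum matches the block's byte sum.
    All-zero blocks (end-of-archive padding) are rejected."""
    if len(block) != BLOCK or not any(block):
        return False
    try:
        stored = int(block[148:156].strip(b" \0") or b"x", 8)
    except ValueError:
        return False
    return stored == sum(block[:148]) + 8 * 32 + sum(block[156:])

def find_headers(archive: bytes) -> list:
    """Offsets of every block-aligned candidate header."""
    return [off for off in range(0, len(archive), BLOCK)
            if looks_like_header(archive[off:off + BLOCK])]

# Build a two-member archive in memory, then damage the first header.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
    for name in ("one.txt", "two.txt"):
        payload = name.encode() * 100  # 700 bytes, i.e. two data blocks
        ti = tarfile.TarInfo(name)
        ti.size = len(payload)
        tf.addfile(ti, io.BytesIO(payload))

raw = bytearray(buf.getvalue())
raw[0:BLOCK] = b"\xff" * BLOCK  # simulate a damaged first header
offsets = find_headers(bytes(raw))
print(offsets)
```

Dividing a reported offset by 512 gives the block count one would pass
to something like `dd skip=NN' with a 512-byte block size.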
>If anyone knows why `cpio' was made when `tar' was present at
the unix scene, please tell me about this too.
Probably because it is more media efficient (by not blocking
everything and using only the space needed for the headers where `tar'
always uses 512 bytes per file header) and it knows how to archive
special files.
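The space argument can be made concrete with a little arithmetic,
sketched here in Python. The figures are assumptions not stated above:
a 76-byte fixed header for the portable ASCII `cpio' format, followed
by the NUL-terminated file name and unpadded data, and for `tar' one
512-byte header plus data padded to a multiple of 512 bytes.
End-of-archive markers and tape record blocking are ignored, and the
function names are made up for the example.

```python
TAR_BLOCK = 512
CPIO_ODC_HEADER = 76  # assumed fixed part of the portable ASCII header

def tar_bytes(size: int) -> int:
    """One header block plus file data padded to a block multiple."""
    data_blocks = -(-size // TAR_BLOCK)  # ceiling division
    return TAR_BLOCK + data_blocks * TAR_BLOCK

def cpio_odc_bytes(name: str, size: int) -> int:
    """Fixed header, NUL-terminated name, unpadded data."""
    return CPIO_ODC_HEADER + len(name.encode()) + 1 + size

# A 10-byte file called notes.txt under each format.
print(tar_bytes(10), cpio_odc_bytes("notes.txt", 10))
```

Under these assumptions the 10-byte file occupies 1024 bytes in `tar'
and 96 bytes in `cpio', so the gap is widest for many small files.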
You might want to look at the freely available alternatives. The
major ones are `afio', GNU `tar', and `pax', each of which has its
own extensions with some backwards compatibility.
Sparse files were `tar'red as sparse files (which you can easily
test, because the resulting archive gets smaller, and GNU `cpio' can no
longer read it).