Colour Computer Disk-To-Tape Transfer Utility
Version 1.1

by Jean-François Morin
Copyright 1998
=============================================


This package is freeware, though copyright remains with the author,
Jean-François Morin (jfmorin@altavista.net). Free distribution of this
package is encouraged, though it may not be altered or sold for profit.


Table of contents
-----------------
1. Introduction
2. User's guide
3. File syntax description
   3.1 Basic (not BASIC!) knowledge
   3.2 Syntax used for machine language programs on disk
   3.3 Syntax used for tokenised BASIC programs on disk
   3.4 Syntax used for ASCII disk files
   3.5 Syntax used for cassette files
4. Conclusion



1. Introduction
---------------

This program has been devised to help cassette-based Tandy Colour Computer
(CoCo) and CoCo emulator users to use disk-based files found on the Internet
on their system. To keep it accessible to any type of file, it takes as input
a standard PC file, which it copies into a virtual cassette (.CAS) file
according to the standard settings used by the CoCo.

Why start from a standard PC file?

- Files may be readily extracted from any virtual disk image (.DSK file) or
  real disk according to the disk format:

  - An RSDOS file may be transferred to a PC file using Jeff Vavasour's PORT
    utility provided with both CoCo2 and CoCo3 emulators.

  - An OS-9 file may be transferred to a PC file using a utility such as
    Carey Bloodworth's OS-9 <-> PC file copier, which is available on the
    ftp://os9archive.rtsi.com site. Transferring OS-9 files to tape may be
    useful for ASCII documents created under OS-9 that need to be viewed
    and/or edited by a cassette-based user.

- It allows any standard PC document to be transferred to a CoCo cassette
  without PORTing it to a CoCo disk image first, which would be a needless
  step given its PC-based origin.

- It makes the program operation independent of the original virtual disk
  size (35, 40, or 80 tracks) or format (RSDOS or OS-9), if any.

Like any virtual cassette file produced with either CASIN or an emulator, the
files produced by DiskTape are fully transferable to real cassettes with Jeff
Vavasour's CASOUT utility.

Section 2 briefly shows how to use DiskTape. For the more investigative
reader, section 3 details the syntax of disk and cassette files as used by
CoCo systems.



2. User's guide
---------------

The program execution syntax (from the MS-DOS prompt) is the following:

disktape [source-file] [dest-file]

where source-file is the file to be transferred to a virtual cassette file,
and dest-file is the destination cassette file. If only one parameter is
specified, it is assumed to be the source file, and DiskTape prompts the user
for the destination file. If no parameters are specified, the user will be
prompted for both files.

DiskTape then reads the input file and tries to determine its type. The
following cases may occur:

- DiskTape determines the file to be a tokenised BASIC program (file type =
  0). Such a file should be binary (file mode = 0).

- DiskTape determines the file to be an ML program (file type = 2). Such a
  file should be binary too (file mode = 0).

- DiskTape is unable to determine the file type. The following cases are
  possible:

  - The file is an ASCII BASIC program (file type = 0, file mode = 255).
  - The file is a data file (file type = 1, file mode = 255).
  - You are trying to translate a non-CoCo file.

Beware! Some non-CoCo files may happen to begin with a $00 or a $FF byte (see
section 3), and are thus detected as ML or tokenised BASIC programs,
respectively.
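
As an illustration only, the detection described above can be sketched in
Python (not the language DiskTape is written in). The function name
guess_file_type is hypothetical, and the sketch assumes that only the first
byte is examined, as the warning above suggests; DiskTape's actual test may
check more than that.

```python
def guess_file_type(data: bytes):
    """Guess (file_type, file_mode) from the first byte of a file.

    Returns None when the type cannot be determined, in which case the
    file may be an ASCII BASIC program, a data file, or a non-CoCo file.
    """
    if not data:
        return None
    if data[0] == 0xFF:
        return (0, 0)      # tokenised BASIC program, binary mode
    if data[0] == 0x00:
        return (2, 0)      # ML program, binary mode
    return None            # undetermined: the user has to decide
```

A plain-text file such as b"10 PRINT" yields None here, which corresponds to
the third case in the list above.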

If the file is determined to be binary, its expected length is given in its
leading bytes (see section 3). One of the following cases then applies:

- The file length matches the expected length (see section 3 for details).

- The file is shorter than expected, so DiskTape immediately terminates the
  translated program there, using the current checksum. This protects the
  resulting cassette file from a loading failure due to a checksum-based I/O
  error. However, the program may function improperly.

- The file is longer than expected, so DiskTape asks the user to choose
  between two options:

  - to discard the remainder and terminate the file immediately, or
  - to translate the remainder without regard to the expected file length.

Since there is no file length information in an ASCII file, the expected file
length is automatically set to the real file length in that case.



3. File syntax description
--------------------------

I wrote this section as a reference for both the curious reader and the CoCo
developer desperately looking for details about those formats used by CoCo
disk and cassette I/O functions. Section 3.1 defines some terms necessary to
understand the following format descriptions. Sections 3.2 and 3.3 describe
the formats of ML and tokenised BASIC disk files, respectively. Section 3.4
mentions some facts about ASCII file handling. Finally, section 3.5 describes
the format of virtual cassette (.CAS) files.


3.1 Basic (not BASIC!) knowledge
--------------------------------

On disk and cassette files, addresses and file lengths are represented as
16-bit integers (spanning values 0 to 65535), thus as two bytes within the
resulting file:

- The Most Significant Byte (MSB) holds the first 8 bits of such a 16-bit
  number. These bits have place values 256 to 32768 within the 16-bit
  integer, so the MSB must be multiplied by 256 to get its real value (or
  contribution) to that integer.

- The Least Significant Byte (LSB) holds the last 8 bits of that number.
  These bits have place values 1 to 128 within the 16-bit integer.

The 16-bit integer value is thus given by MSB * 256 + LSB.

The MSB/LSB distinction (thus the order) is important because A * 256 + B is
equivalent to B * 256 + A if and only if A = B, the probability of which is
1/256! In cassette and disk files, 16-bit integers are always represented in
the (MSB, LSB) order. However, this order may be reversed in other types of
files (not necessarily CoCo emulator-based files), for example CoCo snapshot
(.PAK) files.
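
As a small worked example of the MSB * 256 + LSB rule, here is a Python
sketch (Python is used for illustration only; the helper name
word_from_msb_lsb is hypothetical). It also shows what happens when the two
bytes are read in the wrong order.

```python
def word_from_msb_lsb(msb: int, lsb: int) -> int:
    """Combine two bytes stored in (MSB, LSB) order into a 16-bit integer."""
    return msb * 256 + lsb


# The byte pair ($0E, $00) read in (MSB, LSB) order gives $0E00 = 3584.
value = word_from_msb_lsb(0x0E, 0x00)

# Python's built-in conversion gives the same result with byteorder "big",
# and a different one ($000E = 14) if the order is reversed.
same = int.from_bytes(bytes([0x0E, 0x00]), "big")
swapped = int.from_bytes(bytes([0x0E, 0x00]), "little")
```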

ML is a common abbreviation for Machine Language.


3.2 Syntax used for machine language programs on disk
-----------------------------------------------------

- Byte 0 = program type ($00 for ML)
- Byte 1 = program code size (MSB)
- Byte 2 = program code size (LSB)
- Byte 3 = start address (MSB)
- Byte 4 = start address (LSB)
- Bytes 5 to L+4 = program code of length corresponding to
  L = (Byte 1 * 256 + Byte 2)
- Bytes L+5 to L+7 = end-of-program token ($FF, $00, $00)
- Byte L+8 = execute address (MSB)
- Byte L+9 = execute address (LSB)

The total file length is thus L+10, where L represents the program code
length. L DOES NOT include the end-of-program token.
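
The layout above can be read back with a few lines of Python. This is a
minimal sketch for illustration, not DiskTape's own code: the function name
parse_ml_file and the returned dictionary keys are hypothetical, and the
sketch assumes a single code segment laid out exactly as listed above.

```python
def parse_ml_file(data: bytes) -> dict:
    """Split an ML disk file into its fields, following the layout above."""
    if len(data) < 5 or data[0] != 0x00:
        raise ValueError("not an ML disk file (byte 0 must be $00)")
    size = int.from_bytes(data[1:3], "big")        # bytes 1-2 (MSB, LSB)
    start = int.from_bytes(data[3:5], "big")       # bytes 3-4 (MSB, LSB)
    code = data[5:5 + size]                        # bytes 5 to L+4
    trailer = data[5 + size:5 + size + 5]          # bytes L+5 to L+9
    # The file is complete only if all L code bytes and the 5 trailing
    # bytes are present and the end-of-program token is ($FF, $00, $00).
    complete = (len(code) == size and len(trailer) == 5
                and trailer[:3] == b"\xff\x00\x00")
    exec_addr = int.from_bytes(trailer[3:5], "big") if complete else None
    return {
        "size": size,
        "start": start,
        "code": code,
        "exec": exec_addr,
        "expected_length": size + 10,
        "actual_length": len(data),
    }
```

Comparing expected_length with actual_length gives the three situations
described in section 2 (matching, shorter, or longer than expected).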


3.3 Syntax used for tokenised BASIC programs on disk
----------------------------------------------------

- Byte 0 = program type ($FF for BASIC)
- Byte 1 = program code size (MSB)
- Byte 2 = program code size (LSB)
- Bytes 3 to L+2 = program code of length corresponding to
  L = (Byte 1 * 256 + Byte 2 - 2)
  The program code is made of program lines of the following form:
  - A 16-bit (MSB, LSB) pointer to the next line. This pointer is likely to be
    an offset of $0C00 on cassette-based systems, or of $0E00 on disk-based
    ones. However, these pointers are updated, when the program is loaded,
    according to the current starting address for BASIC programs. To convince
    yourself of this, try different PCLEAR statements before loading the
    program...
  - A 16-bit integer (MSB, LSB) representing the program line number
  - The tokenised BASIC code for the current line
  - An end-of-line byte ($00)
- Bytes L+3 and L+4 = end-of-program token ($00, $00)

The size stored in bytes 1 and 2 is thus L+2, where L represents the program
code length. It DOES include the end-of-program token. This is the reason for
the -2 in the computation of L. Counting the three leading bytes, the total
file length is L+5.

Mind this difference between ML and BASIC files when handling/editing binary
CoCo files!
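
To make the line structure concrete, here is a short Python sketch that
walks the program lines of a tokenised BASIC disk file. It is an
illustration only: the function name basic_lines is hypothetical, the size
field in bytes 1 and 2 is deliberately ignored, and the sketch assumes that
the end-of-program token can be recognised as a next-line pointer equal to
zero. The tokens themselves are not decoded.

```python
def basic_lines(data: bytes) -> list:
    """Return a list of (line_number, tokenised_code) pairs."""
    if not data or data[0] != 0xFF:
        raise ValueError("not a tokenised BASIC file (byte 0 must be $FF)")
    lines = []
    pos = 3                                   # program code starts at byte 3
    while pos + 2 <= len(data):
        next_ptr = int.from_bytes(data[pos:pos + 2], "big")
        if next_ptr == 0:                     # end-of-program token ($00, $00)
            break
        line_no = int.from_bytes(data[pos + 2:pos + 4], "big")
        # Search for the end-of-line byte after the 4 leading bytes, since
        # the pointer and line number may themselves contain $00.
        end = data.find(b"\x00", pos + 4)
        if end == -1:                         # truncated file: stop here
            break
        lines.append((line_no, data[pos + 4:end]))
        pos = end + 1
    return lines
```

The pointer values are read but not used to find the next line, because they
depend on the address at which the program was saved.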


3.4 Syntax used for ASCII disk files
------------------------------------

There are no bytes in addition to the file content itself. There is thus no
specific ASCII file syntax, and the file length simply corresponds to the
real file length.

However, there may be a formatting difference regarding line endings.
Indeed, three conventions exist:

- CR ($0D) only: used by the CoCo and the Macintosh
- LF ($0A) only: used by Unix systems
- CR + LF ($0D, $0A): used by MS-DOS and Windows systems

Nevertheless, transferring an ASCII file using the LF or CR+LF standard does
not seem to hinder its functioning on a CoCo. I transferred a PC-formatted
(CR+LF) ASCII BASIC program with DiskTape, and it loaded and worked properly
on the CoCo.
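
Should a file nevertheless need to follow the CoCo convention, the line
endings can be converted before the transfer. The following Python sketch
shows one way to do it; the function name to_coco_line_endings is
hypothetical, and this step is not something DiskTape is stated to perform.

```python
def to_coco_line_endings(text: bytes) -> bytes:
    """Convert CR+LF and LF line endings to CR only."""
    # Replace CR+LF first so that its LF is not converted a second time.
    return text.replace(b"\r\n", b"\r").replace(b"\n", b"\r")
```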

If you experience any problem, please notify me accordingly, so that I can
fix it promptly.


3.5 Syntax used for cassette files
----------------------------------

I found a reference about the .CAS file format in a bit.listserv.coco
newsgroup message some time ago. It was quite instructive, although it
contained some erroneous information about block checksum computing. I found
the right computation method by trial and error. The resulting format is the
following:

- A header of 128 bytes ($55).
  Headers may be longer without problem, but it may be risky to reduce their
  length below 128 bytes, which is the default length used by CoCo systems.
- The title block, whose data is 15 bytes long:
  - The block header ($3C, $00, $0F)
  - B01-B08: 8 bytes for the filename. This length is fixed, and empty
    characters are filled by spaces.
  - B09: File type byte ($00 = BASIC, $01 = Data, $02 = ML)
  - B10: File mode byte ($00 = binary, $FF = ASCII)
  - B11: Gap flag ($00 = continuous, $FF = with gaps). By default, the CoCo
    uses a continuous flow for binary files and a flow with gaps for ASCII
    files. The DiskTape utility keeps this standard.
  - B12-B13: Execution address (MSB, LSB), a 16-bit integer (see section 3.1)
    representing the program execution address. It is set to zero for files
    other than ML programs.
  - B14-B15: Start address (MSB, LSB), a 16-bit integer (see section 3.1)
    representing the program start address. It is set to zero for files other
    than ML programs.
  - Checksum = [15 + Sum(B01 to B15)] mod 256 (Since the sum is an 8-bit
    integer, the "mod 256" operation is automatic.)
- A header of 128 bytes ($55).

- Then we have the data blocks, which have the following format:
  - The block header ($55, $55, $3C, $01)
  - The block length L (one byte), where 0 <= L <= 255. L may be equal to
    zero: such a block is valid, but is simply ignored when loaded by a CoCo
    system (I tested it!).
  - The block data (L bytes)
  - Checksum = [1 + L + Sum(L data bytes)] mod 256
  - If the gap setting is on, a header of 128 bytes ($55).

- When all data blocks have been written, the following block terminates the
  file:
  - The block header ($55, $55, $3C, $FF)
  - The block length ($00)
  - Checksum ($FF)
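
The whole layout can be assembled in a few lines of Python. This is a
minimal sketch written for illustration, not DiskTape's own code: all
function and constant names are hypothetical. It relies on the observation
that the three checksums listed above all fit one pattern, namely (block
type + block length + sum of the data bytes) mod 256, and it assumes that
data is cut into blocks of at most 255 bytes.

```python
LEADER = bytes([0x55]) * 128          # the 128-byte header of $55
EOF_BLOCK = bytes([0x55, 0x55, 0x3C, 0xFF, 0x00, 0xFF])


def checksum(block_type: int, data: bytes) -> int:
    """(block type + block length + sum of data bytes) mod 256."""
    return (block_type + len(data) + sum(data)) % 256


def title_block(name: str, file_type: int, ascii_mode: bool,
                exec_addr: int = 0, start_addr: int = 0) -> bytes:
    """Block header, 15 data bytes (B01-B15), then the checksum."""
    flag = 0xFF if ascii_mode else 0x00
    data = name.encode("ascii")[:8].ljust(8, b" ")   # B01-B08, space-padded
    data += bytes([file_type, flag, flag])           # B09, B10, B11 (gap flag)
    data += exec_addr.to_bytes(2, "big")             # B12-B13 (MSB, LSB)
    data += start_addr.to_bytes(2, "big")            # B14-B15 (MSB, LSB)
    return bytes([0x3C, 0x00, 0x0F]) + data + bytes([checksum(0x00, data)])


def data_blocks(payload: bytes, gaps: bool) -> bytes:
    """Cut the payload into data blocks of at most 255 bytes each."""
    out = bytearray()
    for i in range(0, len(payload), 255):
        chunk = payload[i:i + 255]
        out += bytes([0x55, 0x55, 0x3C, 0x01, len(chunk)]) + chunk
        out.append(checksum(0x01, chunk))
        if gaps:
            out += LEADER                            # gap after each block
    return bytes(out)


def build_cas(name: str, file_type: int, ascii_mode: bool, payload: bytes,
              exec_addr: int = 0, start_addr: int = 0) -> bytes:
    """Header, title block, header, data blocks, terminating block."""
    # Gaps follow the default described above: on for ASCII, off for binary.
    return (LEADER
            + title_block(name, file_type, ascii_mode, exec_addr, start_addr)
            + LEADER
            + data_blocks(payload, gaps=ascii_mode)
            + EOF_BLOCK)
```

For example, build_cas("HELLO", 1, True, b"HI\r") produces the image of an
ASCII data file named HELLO; writing the returned bytes to a file opened in
binary mode gives the .CAS file.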



4. Conclusion
-------------

I spent some time fine-tuning and improving this program before releasing it.
However, I may have overlooked some small, subtle bugs in my code. If you
happen to discover a bug, please notify me and carefully describe the
situation in which it happens. Also, please note that any comment or
suggestion towards improving this utility will be welcome.

Jean-François Morin
jfmorin@altavista.net
