





                    The CS-Libraries

                     A Database Kit



                       ComBits
                    P.O. Box 3303
                    2280 GH Rijswijk
                    The Netherlands






    Copyright (c) 1994-1996 by ComBits, the Netherlands.
        All Rights Reserved.




                          1    Contents




1    Contents. . . . . . . . . . . . . . . . . . . . . . . . .

2    Preface . . . . . . . . . . . . . . . . . . . . . . . . .
    2.1 Contacting ComBits . . . . . . . . . . . . . . . . . .
    2.2 Legal Matters. . . . . . . . . . . . . . . . . . . . .
        2.2.1 Disclaimer . . . . . . . . . . . . . . . . . . .
        2.2.2 Royalties and runtime limitations. . . . . . . .
        2.2.3 Trademarks . . . . . . . . . . . . . . . . . . .

3 Introduction . . . . . . . . . . . . . . . . . . . . . . . .

4 Overview . . . . . . . . . . . . . . . . . . . . . . . . . .

5 Debugging. . . . . . . . . . . . . . . . . . . . . . . . . .

6 Runtime Libraries. . . . . . . . . . . . . . . . . . . . . .

7 Compiler Options . . . . . . . . . . . . . . . . . . . . . .
    7.1 Hardware . . . . . . . . . . . . . . . . . . . . . . .
    7.2 Floating point . . . . . . . . . . . . . . . . . . . .
    7.3 Watcom . . . . . . . . . . . . . . . . . . . . . . . .
    7.4 Borland. . . . . . . . . . . . . . . . . . . . . . . .
    7.5 Visual C++ . . . . . . . . . . . . . . . . . . . . . .

8 Standard Types & Definitions . . . . . . . . . . . . . . . .

9 Runtime Errors and Messages. . . . . . . . . . . . . . . . .
    9.1 What to expect?. . . . . . . . . . . . . . . . . . . .
    9.2 Changing the way messages are displayed. . . . . . . .
    9.3 Message related functions. . . . . . . . . . . . . . .
    9.4 Database class messages. . . . . . . . . . . . . . . .

10 Temporary files . . . . . . . . . . . . . . . . . . . . . .

11 Buffering . . . . . . . . . . . . . . . . . . . . . . . . .

12 PAGE-Class. . . . . . . . . . . . . . . . . . . . . . . . .
    12.1 Introduction. . . . . . . . . . . . . . . . . . . . .
    12.2 Storing data in the header-page . . . . . . . . . . .

13 Lock files. . . . . . . . . . . . . . . . . . . . . . . . .
    13.1 Name of a lock file . . . . . . . . . . . . . . . . .
    13.2 Controlling lock files. . . . . . . . . . . . . . . .

14 Read-Only databases . . . . . . . . . . . . . . . . . . . .
    14.1 Class member functions. . . . . . . . . . . . . . . .

15 TBASE-class . . . . . . . . . . . . . . . . . . . . . . . .
    15.1 Introduction. . . . . . . . . . . . . . . . . . . . .
    15.2 Using TBASE . . . . . . . . . . . . . . . . . . . . .
    15.3 Creating a Database . . . . . . . . . . . . . . . . .
    15.4 Opening . . . . . . . . . . . . . . . . . . . . . . .
    15.5 Closing . . . . . . . . . . . . . . . . . . . . . . .
    15.6 Appending Records . . . . . . . . . . . . . . . . . .
    15.7 Deleting Records. . . . . . . . . . . . . . . . . . .
    15.8 Page Utilization. . . . . . . . . . . . . . . . . . .
    15.9 Locating Records. . . . . . . . . . . . . . . . . . .
    15.10 Functions in alphabetical order. . . . . . . . . . .

16 BTREE-class . . . . . . . . . . . . . . . . . . . . . . . .
    16.1 Introduction. . . . . . . . . . . . . . . . . . . . .
    16.2 BTREEx Classes. . . . . . . . . . . . . . . . . . . .
    16.3 Multiple Keys . . . . . . . . . . . . . . . . . . . .
    16.4 Current Pointer . . . . . . . . . . . . . . . . . . .
    16.5 Using Btrees. . . . . . . . . . . . . . . . . . . . .
        16.5.1 Creating. . . . . . . . . . . . . . . . . . . .
        16.5.2 Opening . . . . . . . . . . . . . . . . . . . .
        16.5.3 Inserting . . . . . . . . . . . . . . . . . . .
        16.5.4 Searching . . . . . . . . . . . . . . . . . . .
        16.5.5 Current . . . . . . . . . . . . . . . . . . . .
        16.5.6 Deleting. . . . . . . . . . . . . . . . . . . .
        16.5.7 Closing . . . . . . . . . . . . . . . . . . . .
    16.6 Functions in alphabetical order.. . . . . . . . . . .

17 CSDBGEN . . . . . . . . . . . . . . . . . . . . . . . . . .
    17.1 Introduction. . . . . . . . . . . . . . . . . . . . .
    17.2 Overview. . . . . . . . . . . . . . . . . . . . . . .
    17.3 Features. . . . . . . . . . . . . . . . . . . . . . .
    17.4 Limitations . . . . . . . . . . . . . . . . . . . . .
    17.5 Definition file . . . . . . . . . . . . . . . . . . .
    17.6 Tokenizing. . . . . . . . . . . . . . . . . . . . . .
        17.6.1 How does it work? . . . . . . . . . . . . . . .
    17.7 When is a substring indexed?. . . . . . . . . . . . .
    17.8 Compound indexes. . . . . . . . . . . . . . . . . . .
        17.8.1 A simple example. . . . . . . . . . . . . . . .
        17.8.2 A more complex example. . . . . . . . . . . . .
        17.8.3 Compound & Tokenizing Indexes . . . . . . . . .
        17.8.4 Locating an Entry . . . . . . . . . . . . . . .
    17.9 Export to dBASE . . . . . . . . . . . . . . . . . . .
    17.10 Importing from dBASE . . . . . . . . . . . . . . . .
    17.11 Exporting/Importing to/from ASCII. . . . . . . . . .
    17.12 Starting a new database. . . . . . . . . . . . . . .
    17.13 Opening a database . . . . . . . . . . . . . . . . .
    17.14 Current Record . . . . . . . . . . . . . . . . . . .
    17.15 Accessing fields . . . . . . . . . . . . . . . . . .
    17.16 DATE fields. . . . . . . . . . . . . . . . . . . . .
    17.17 Changing the record layout.. . . . . . . . . . . . .
    17.18 Member functions in alphabetical order . . . . . . .
    17.19 Warning. . . . . . . . . . . . . . . . . . . . . . .
    17.20 A Large Example. . . . . . . . . . . . . . . . . . .

18 VRAM  . . . . . . . . . . . . . . . . . . . . . . . . . . .
    18.1 Introduction. . . . . . . . . . . . . . . . . . . . .
    18.2 Creating. . . . . . . . . . . . . . . . . . . . . . .
    18.3 Opening & Closing . . . . . . . . . . . . . . . . . .
    18.4 VRAM Pointers . . . . . . . . . . . . . . . . . . . .
    18.5 Fragmentation . . . . . . . . . . . . . . . . . . . .
    18.6 Root. . . . . . . . . . . . . . . . . . . . . . . . .
    18.7 Functions in Alphabetical order.. . . . . . . . . . .

19 VBASE . . . . . . . . . . . . . . . . . . . . . . . . . . .
    19.1 Introduction. . . . . . . . . . . . . . . . . . . . .
    19.2 Using VBASE.. . . . . . . . . . . . . . . . . . . . .
    19.3 Relocating records. . . . . . . . . . . . . . . . . .
    19.4 Limitations.. . . . . . . . . . . . . . . . . . . . .
    19.5 Functions in alphabetical order.. . . . . . . . . . .

20 VBAXE . . . . . . . . . . . . . . . . . . . . . . . . . . .
    20.1 Introduction. . . . . . . . . . . . . . . . . . . . .
    20.2 Working.. . . . . . . . . . . . . . . . . . . . . . .
    20.3 Files . . . . . . . . . . . . . . . . . . . . . . . .
    20.4 Prototypes. . . . . . . . . . . . . . . . . . . . . .

21 OLAY. . . . . . . . . . . . . . . . . . . . . . . . . . . .
    21.1 Introduction & Overview . . . . . . . . . . . . . . .
    21.2 Buffering . . . . . . . . . . . . . . . . . . . . . .
    21.3 Performance . . . . . . . . . . . . . . . . . . . . .
    21.4 Core Functions. . . . . . . . . . . . . . . . . . . .
        21.4.1 Creating. . . . . . . . . . . . . . . . . . . .
        21.4.2 Opening . . . . . . . . . . . . . . . . . . . .
        21.4.3 Reading and Writing . . . . . . . . . . . . . .
        21.4.4 Insert & Delete . . . . . . . . . . . . . . . .
        21.4.5 Filesize & bottom . . . . . . . . . . . . . . .
        21.4.6 Closing . . . . . . . . . . . . . . . . . . . .
    21.5 Additional functions. . . . . . . . . . . . . . . . .
    21.6 Import & Export . . . . . . . . . . . . . . . . . . .
    21.7 Sequential functions. . . . . . . . . . . . . . . . .
        21.7.1 Sequential functions in alphabetical order. . .
        21.7.2 Miscellaneous functions . . . . . . . . . . . .

22 DLAY. . . . . . . . . . . . . . . . . . . . . . . . . . . .
    22.1 Performance . . . . . . . . . . . . . . . . . . . . .
    22.2 Member functions. . . . . . . . . . . . . . . . . . .

23 IBASE . . . . . . . . . . . . . . . . . . . . . . . . . . .
    23.1 Introduction. . . . . . . . . . . . . . . . . . . . .
    23.2 Using IBASE . . . . . . . . . . . . . . . . . . . . .
    23.3 Using IBASE . . . . . . . . . . . . . . . . . . . . .
        23.3.1 Creating. . . . . . . . . . . . . . . . . . . .
        23.3.2 Opening . . . . . . . . . . . . . . . . . . . .
        23.3.3 Appending Records . . . . . . . . . . . . . . .
        23.3.4 Reading . . . . . . . . . . . . . . . . . . . .
        23.3.5 Writing . . . . . . . . . . . . . . . . . . . .
        23.3.6 Inserting . . . . . . . . . . . . . . . . . . .
        23.3.7 Deleting. . . . . . . . . . . . . . . . . . . .
        23.3.8 Closing . . . . . . . . . . . . . . . . . . . .
        23.3.9 Miscellaneous functions . . . . . . . . . . . .

24 CSDIR . . . . . . . . . . . . . . . . . . . . . . . . . . .

25 CSINFO. . . . . . . . . . . . . . . . . . . . . . . . . . .

26 CSERROR . . . . . . . . . . . . . . . . . . . . . . . . . .

27 CS4DBASE. . . . . . . . . . . . . . . . . . . . . . . . . .
    27.1 Introduction. . . . . . . . . . . . . . . . . . . . .
    27.2 Converting. . . . . . . . . . . . . . . . . . . . . .
    27.3 Example . . . . . . . . . . . . . . . . . . . . . . .
    27.4 Importing large databases . . . . . . . . . . . . . .

28 CSTOOLS . . . . . . . . . . . . . . . . . . . . . . . . . .
    28.1 Introduction  . . . . . . . . . . . . . . . . . . . .

29 CSKEYS. . . . . . . . . . . . . . . . . . . . . . . . . . .
    29.1 CSKEYS.exe. . . . . . . . . . . . . . . . . . . . . .

30 DATE. . . . . . . . . . . . . . . . . . . . . . . . . . . .
    30.1 Example . . . . . . . . . . . . . . . . . . . . . . .
    30.2 Initialising. . . . . . . . . . . . . . . . . . . . .
    30.3 Converting Strings. . . . . . . . . . . . . . . . . .
    30.4 Obtaining date info . . . . . . . . . . . . . . . . .
    30.5 Comparing dates . . . . . . . . . . . . . . . . . . .
    30.6 Arithmetic. . . . . . . . . . . . . . . . . . . . . .
    30.7 Miscellaneous . . . . . . . . . . . . . . . . . . . .

31 HEAP. . . . . . . . . . . . . . . . . . . . . . . . . . . .
    31.1 Purpose . . . . . . . . . . . . . . . . . . . . . . .
    31.2 When to use it? . . . . . . . . . . . . . . . . . . .
    31.3 Using HEAP. . . . . . . . . . . . . . . . . . . . . .
    31.4 Functions in alphabetical order.. . . . . . . . . . .

32 Alloc-Logging . . . . . . . . . . . . . . . . . . . . . . .
    32.1 Introduction. . . . . . . . . . . . . . . . . . . . .
    32.2 Replacements. . . . . . . . . . . . . . . . . . . . .
    32.3 Logging . . . . . . . . . . . . . . . . . . . . . . .
    32.4 Memory Leaks. . . . . . . . . . . . . . . . . . . . .

33 csSTR . . . . . . . . . . . . . . . . . . . . . . . . . . .




                          2    Preface


Nowadays even the simplest of applications seem to need some
sort of database. Despite this, C++, and consequently most
compilers, offer very little support for it, if any. Roughly speaking, you
can choose between the very basic file IO defined by the ANSI
standard and the other extreme: one of the truly colossal
DataBase Management Systems (DBMS).

Neither option is very appealing. 'Basic file IO' is so basic that it will
take a year to come up with a database application. And despite its simple
design it is still not all that easy to use. Newcomers in particular seem to
struggle with it.
On the other hand, using a DBMS is often even less fun. From our own
experience we recall having written a 400 KB application but having to
ship a 15 MB package because of the large X database used.
Apart from that, DBMSs have their roots firmly in the late sixties and
early seventies. They were designed with a mainframe in mind, and are
indeed equipped with all the flexibility and user-friendliness which has
made the mainframe a dying species.

With this library we believe we are offering a third option. One that is
easy to use, powerful and still produces small, fast stand-alone
executables.




2.1 Contacting ComBits

You can reach us, preferably, by E-mail.
The address is: CSLIB@ComBits.nl.

If you don't have E-mail access you can reach us by traditional mail:
        COMBITS
        P.O. Box 3303
        2280 GH Rijswijk
        The Netherlands

Or FAX: +31703960172
Voice:  +31703932300

Please remember, it is GMT +1:00 over here!


2.2 Legal Matters

2.2.1 Disclaimer

EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDER AND/OR OTHER PARTIES PROVIDE THIS SOFTWARE "AS
IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR
IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY
AND PERFORMANCE OF THE SOFTWARE IS WITH YOU.  SHOULD
THE SOFTWARE PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION. IN NO
EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO
IN WRITING WILL THE COPYRIGHT HOLDER BE LIABLE TO YOU
FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL
OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR
INABILITY TO USE THE SOFTWARE (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR
LOSSES SUSTAINED BY YOU OR THIRD PARTIES), EVEN IF SUCH
HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.

2.2.2 Royalties and runtime limitations

The CS-Libraries can be used in a commercial software product without
any royalties as long as the number of copies sold annually does not
exceed 10.000.

No part of these libraries may be used in a product which is in any way a
competitor of the CS-Libraries.


2.2.3 Trademarks

IBM and OS/2 are registered trademarks of International Business
Machines Corporation.
MS-DOS and Windows are registered trademarks of Microsoft
Corporation.
Borland C/C++ and dBASE are registered trademarks of Borland
International Inc.
WATCOM is a trademark of WATCOM International Corp.














                                Part


                                One

























 The next part of the documentation presents an introduction and
             discusses topics of general interest.





                          3 Introduction


Both for historical and practical reasons this library is presented as two
distinct, independent sections. The database part is called CSDB,
which is short for Combits Software DataBases; the other section, CSA,
contains general-purpose and portability functions.

This library concentrates on supplying a C++ software developer with
the means to quickly implement database and database-like
functionality in his/her applications. This is done by supplying a set of
easy-to-use C++ classes. There are classes for fixed length records,
variable length records, indexes and so forth.

In addition, a program generator, CSDBGEN, is available, which uses
the database classes in this library as building blocks for other, more
complex and powerful, databases.

There are also classes in CSDB which operate in the grey area between
'file-extensions' and 'databases'. Because these classes tackle the
limitations of standard file IO on a very low level, they are very flexible
and can therefore be of great value when the approach of the traditional
database proves too rigid.

This library does not pretend to compete with the multi-user Database
Management systems. Its purpose is to supply a rich set of tools to
overcome the disk-storage problems occurring in single-user
applications.


                            4 Overview


This chapter will give a quick overview of the contents of the package.

The CSDB section contains (mainly) the following classes:
    TBASE:      A class for reading and writing fixed length records.
    BTREE:      A B+ tree to be used as an index.
    VRAM:       A 'database' organized as a heap.
    VBASE:      A database class for variable length records.
    VBAXE:      As VBASE but for very large databases.
    OLAY:       A file system which can insert and delete!
    DLAY:       As OLAY but for large files.
    IBASE:      Fixed length records but with the ability to insert or
                delete a record anywhere in the database.

The CSA section contains general-purpose functions and classes.
Among others, the following classes are included:
    csDATE:     To store and manipulate dates.
    csTIME:     To represent the time.
    csSTR:      Strings.
    CSDIR:      To traverse a directory.
    QALLOC:     Quick and dirty way to do dynamic memory
                allocations.
    HEAP:       To efficiently allocate many small blocks from the
                heap.

Command line utilities (DOS, NT, OS/2, Linux):

    CSDIR:      Lists the CS-databases in a directory.
    CSINFO:     Displays information about a CS-database.
    CSDBGEN:    Important program generator.
    CSERROR:    Utility to convert the error file to C++ source.
    CSKEYS:     Displays the return value of the cskey() function.
    CSMALLOC:   Tests the allocation log for memory leaks.
    CS4DBASE:   Conversion utility to read dBASE files.




                            5 Debugging


Two versions of each library exist, one to be used during debugging
and one intended for normal 'production'. The difference is that
the debug version performs many more tests, and so reports many more
errors.

The idea is to use the debug version during development and recompile
with the production version when ready. The debug version is identical to
the production version, but with additional tests. The 'working code' is
100% identical. This means there are no subtle differences between the
two versions.

The production version, however, can be substantially faster, up to two
times, depending on the circumstances.

To give an example:
    The TBASE class has a function to read a record from the database.
    For this function to operate properly, the class/database needs to be
    'opened'. In the debug version this is tested with every call to the
    read function. In the production version it is never tested.
    Note that in a decently written and tested application this error
    should not occur. 'Forgetting' to call the open function should be
    corrected in the debugging phase and if the open function fails it
    should be trapped well before the call to the read function.

There are many more errors like this. Errors that should not occur when
the application is tested and (almost) ready but which can easily emerge
in the development stage.
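
The idea of 'identical working code plus extra tests' can be sketched as
follows. This is only an illustration: DemoTable and DEMO_DEBUG are made-up
names, not part of the CS-Libraries, and the real libraries select the
behaviour by linking a different library rather than by a macro in your
own source.

```cpp
#include <cstdio>

// DEMO_DEBUG plays the role of "linked with the debug library".
// Set it to 0 to model the production library.
#define DEMO_DEBUG 1

// Hypothetical stand-in for a database class; this is NOT the real TBASE.
class DemoTable
{
public:
    DemoTable() : is_open_(false), value_(123) {}

    void open() { is_open_ = true; }

    // Copies the stored value into *out.
    // Everything after the #endif is the 'working code' and is the
    // same in both builds; only the test in front of it differs.
    bool read(int *out)
    {
#if DEMO_DEBUG
        if (!is_open_)
        {
            std::fputs("DemoTable::read() called before open()\n", stderr);
            return false;
        }
#endif
        *out = value_;
        return true;
    }

private:
    bool is_open_;
    int  value_;
};
```

With DEMO_DEBUG set to 1, a read() on a table that was never opened is
reported and refused; with 0 the test disappears and the call is simply
executed.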




                        6 Runtime Libraries


There are many libraries included in the package but only one has to
be linked in at any given time. They do have to be linked in,
because they are 'static libraries', not 'dynamic libraries'.


The name of a library is built up according to the following rules:
    - The first two characters are always 'cs'
    - The third character indicates the compiler:
        B:  The Borland C++ compiler
        W:  The Watcom C++ compiler
        V:  The Visual C++ (Microsoft) compiler
        G:  The GNU C compiler
    - The fourth character indicates the platform:
        D:  Dos
        W:  Windows 16 bit
        N:  NT, Windows95, Win32s
        O:  OS/2
        L:  Linux
    - The fifth character indicates the memory model:
        M:  Medium model (Dos, 16 bits Windows)
        C:  Compact model (Dos, 16 bits Windows)
        L:  Large model (Dos, 16 bits Windows)
        H:  Huge model (Dos, 16 bits Windows)
        F:  Flat model (NT, OS/2, Linux)
    - The sixth character indicates debug or production version:
        P:  Production
        D:  Debugging
    - The extension is always: .lib



Example:

   csBWCD.lib

This is the 16 bits Windows library for Borland, using the Compact memory
model, in the Debug version.
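
As an illustration, the naming rules can be turned into a small decoder.
The sketch below is not part of the package; all names starting with
demo_ or Demo are invented here, and the letter 'M' is taken to mean the
Medium memory model. It checks only the form of the name, not whether
the combination of compiler and platform actually exists.

```cpp
#include <cstring>
#include <string>

// Decoded parts of a library name such as "csBWCD.lib".
struct DemoLibName
{
    std::string compiler;
    std::string platform;
    std::string model;
    std::string version;
};

// Returns the description belonging to 'code', or "" if the code is unknown.
// 'codes' lists the valid letters, 'names' the descriptions in the same order.
static std::string demo_pick(char code, const char *codes,
                             const char *const *names)
{
    const char *hit = std::strchr(codes, code);
    if (code == '\0' || hit == nullptr)
        return "";
    return names[hit - codes];
}

// Decodes a name like "csBWCD.lib".
// Returns false if the name does not follow the naming rules.
bool demo_decode_lib_name(const std::string &name, DemoLibName &out)
{
    static const char *const compilers[] =
        { "Borland", "Watcom", "Visual C++", "GNU" };
    static const char *const platforms[] =
        { "DOS", "Windows 16 bit", "NT/Windows95/Win32s", "OS/2", "Linux" };
    static const char *const models[] =
        { "Medium", "Compact", "Large", "Huge", "Flat" };
    static const char *const versions[] =
        { "Production", "Debug" };

    // "cs" + four code letters + ".lib" = 10 characters.
    if (name.size() != 10 ||
        name.compare(0, 2, "cs") != 0 ||
        name.compare(6, 4, ".lib") != 0)
        return false;

    out.compiler = demo_pick(name[2], "BWVG",  compilers);
    out.platform = demo_pick(name[3], "DWNOL", platforms);
    out.model    = demo_pick(name[4], "MCLHF", models);
    out.version  = demo_pick(name[5], "PD",    versions);

    return !out.compiler.empty() && !out.platform.empty() &&
           !out.model.empty()    && !out.version.empty();
}
```

Calling demo_decode_lib_name("csBWCD.lib", n) fills n with Borland,
Windows 16 bit, Compact and Debug, which matches the example above.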


Of course, not all combinations exist. Not all platforms are supported
by all compilers and vice versa.
    -   Linux is only supported by the GNU compiler.
    -   The GNU compiler is only supported under Linux.
    -   OS/2 is only supported by the Borland- and Watcom compilers.
    -   The Compact, Large and Huge memory models are only
        available under DOS and 16 bits Windows.
    -   The flat memory model is only available under NT,OS/2 and
        Linux.



                        7 Compiler Options


With the ever growing number of compiler options it becomes
increasingly hard to keep everybody happy. As a rule we take the
compiler defaults, even if we know they aren't the fastest.

This chapter shows which options were used while compiling the
CS Libraries. Some of the options are really 'options', but a few others
are crucial.
In particular the 'alignment' option can spell disaster. Some
structures in the CSDB header files are alignment dependent
and if the appropriate alignment option isn't used, the resulting
executable will simply crash while no compiler or linker error will ever
be generated!
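
Why alignment matters can be shown with a small sketch. The two
structures below are invented for this illustration and do not come from
the CSDB header files. The sketch uses '#pragma pack', which most current
compilers accept, instead of the command-line options listed in this
chapter.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record header, NOT an actual CSDB structure.
// With default alignment the compiler may insert padding after 'tag'.
struct DemoHeaderDefault
{
    std::uint8_t  tag;
    std::uint32_t record_count;
};

// The same fields with 1 byte packing: no padding is inserted.
#pragma pack(push, 1)
struct DemoHeaderPacked
{
    std::uint8_t  tag;
    std::uint32_t record_count;
};
#pragma pack(pop)

// Byte offset of 'record_count' in each layout.
inline std::size_t demo_offset_default()
{
    return offsetof(DemoHeaderDefault, record_count);
}

inline std::size_t demo_offset_packed()
{
    return offsetof(DemoHeaderPacked, record_count);
}
```

The packed structure occupies 5 bytes with 'record_count' at offset 1.
The default layout is usually larger, with the field at a different
offset. Code compiled with one layout that reads data written with the
other silently picks up the wrong bytes, which is why neither the
compiler nor the linker can warn about it.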


7.1 Hardware

Because the debug version isn't compiled for speed anyway, we
grasped the opportunity to assume 'worst case' hardware. By doing so, it
can be used to support out-dated hardware like the ancient 8086
processor. Because we have this escape route, the production version
can make reasonable assumptions about the available CPU. As a rule
the DOS libraries are compiled for a 80286 CPU, 16 bits Windows
libraries for a 80386 CPU and the 32 bits libraries for a 80486 CPU.


7.2 Floating point

There is almost no floating point arithmetic used in the CS-Libraries.
However, a few functions use julian-date routines to e.g. calculate the
number of days between two dates. If this is the case, this documentation
will clearly mention it.
Because floating point is used so rarely, we make very conservative
assumptions about the available hardware support.


7.3 Watcom

Next the options used with the Watcom C++ compiler.

Debug version, 32 bits:
    -3r -zld -otexn -zp1 -fpi -fp2 -bt=nt

Production version, 32 bits:
    -5r -zld -otexn -zp1 -fpi -fp2 -s -bt=nt

Debug version, DOS:
    -0 -zld -otexn -zp1 -fpi  -bt=dos

Production version, DOS:
    -2 -zld -otexn -zp1 -fpi -s -bt=dos

Debug version, 16 bits Windows:
    -3 -zw -zld -otexn -zp1 -fpi -bt=windows

Production version, 16 bits Windows:
    -3 -zw -zld -otexn -zp1 -fpi  -s -bt=windows


Explanation:
    -0      8086 instructions.
    -2      80286 instructions
    -3      80386 instructions.
    -3r     Register calling, assuming 80386 or better.
    -5r     Register calling, optimized for pentium but runs on 386 or
            better.
    -zld    Suppress generation of library file names.
    -otexn  Use all optimizations, except 'no pointer aliasing'.
    -zp1    1 byte packing of structures. (default) CRUCIAL!
    -fpi    Support both emulation and 80x87, depending on how you
            link.
    -fp2    Use 80287 instructions.
    -s      Don't check stack overflow.


7.4 Borland

Next the options used with the Borland C++ compiler.

Debug version, 32 bits:
    -3 -N -G -O2 -k -f

Production version, 32 bits:
    -4 -G -N- -O2 -k- -ff

Debug version, DOS:
    -1- -N -f  -O2 -k -Ff

Production version, DOS:
    -2 -N- -f -O2 -G -k- -Ff

Debug version, 16 bits Windows:
    -3 -N -f -O2 -G -k -Ff -WSE

Production version, 16 bits Windows:
    -3 -N- -f -ff -O2 -G -k- -Ff -WSE


Explanation:
    -1-     8086 instructions.
    -2      80286 instructions
    -3      80386 instructions.
    -4      80486 instructions.
    -f      Emulate floating point.
    -ff     Fast floating point.
    -O2     Fastest code.
    -G      Optimize for speed.
    -Ff     Automatic far data.
    -k      Standard stack frame.
    -k-     No standard stack frame.
    -N      Check stack overflow.
    -N-     Don't check stack overflow.

Defaults:   Signed characters.


7.5 Visual C++

Next the options used with the Visual C++ compiler.

Debug version, 32 bits:
    -Og -Oi -Ot -Oy -Ob1 -Gf -Gy -GB -DWIN32

Production version, 32 bits:
    -Gs -Og -Oi -Ot -Oy -Ob2 -Gf -Gy -GB -DWIN32

Debug version, DOS:
    -f- -G0 -Ot -Ob1 -On -Oc -Oe -Og -Ol -Oo -Gf -Gy -D_DOS -Ge

Production version, DOS:
    -f- -Gs -G3 -Ot -Ob2 -OV9 -On -Oc -Oe -Og -Ol -Oo -Gf -Gy
    -D_DOS

Debug version, 16 bits Windows:
    -f- -Oc -Oe -Og -Oi -Ol -On -Oo -Ot -Ob1 -G3 -Gf -Gy -Gw -Ge

Production version, 16 bits Windows:
    -f- -Oc -Oe -Og -Oi -Ol -On -Oo -Ot -Ob2 -OV9 -G3 -Gf -Gy -Gw -Gs


Explanation:
    -Oc     Common subexpression optimization.
    -Oe     Enable register allocation.
    -Ol     Loop optimization.
    -On     Disable 'unsafe' optimizations.
    -Oo     Post code optimization.
    -Og     Global optimizations.
    -Oi     Enable intrinsic functions.
    -Ot     Favor code speed.
    -Oy     Enable frame pointer omission.
    -Ob1    Expand only '__inline' functions.
    -Ob2    Expand inline 'any suitable' function.
    -OV9    Expand even 'large' inline functions.
    -Gf     String pooling.
    -Gy     Separate functions.
    -GB     'Blended' CPU.
    -Gs     No stack checking.
    -Ge     Enable stack checking.
    -f-     Select optimizing compiler.
    -G0     8086 instructions
    -G2     80286 instructions
    -G3     80386 instructions.


                  8 Standard Types & Definitions

This chapter describes some types and definitions used throughout
this library. They are platform independent and therefore portable.

Defined in cstools.h:

    FALSE       0
    TRUE        1

Types defined in cstypes.h:

    S8:         Signed 8 bit
    U8:         Unsigned 8 bit
    S16:        Signed 16 bit
    U16:        Unsigned 16 bit
    S32:        Signed 32 bit
    U32:        Unsigned 32 bit
    csCHAR:     Signed character

These definitions are used extensively in the function prototypes.

However, on many occasions (particularly function returns) the range of
the variable is not (all that) important. In these cases ints are used
because that's normally the fastest.


The accompanying max & min values are also defined:


    S8_MIN:     -128
    S8_MAX:      127
    U8_MAX:      255
    S16_MIN:    -32768
    S16_MAX:     32767
    U16_MAX:     65535
    S32_MIN:    -2147483648
    S32_MAX:     2147483647
    U32_MAX:     4294967295
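
The relation between the types and their limits can be checked at compile
time. The sketch below does not reproduce cstypes.h; the Demo typedefs
and DEMO_ constants are stand-ins built on the standard <cstdint> types,
and static_assert requires a C++11 compiler.

```cpp
#include <cstdint>
#include <limits>

// Stand-ins for the types in cstypes.h (the real header is not shown here).
typedef std::int8_t   DemoS8;
typedef std::uint8_t  DemoU8;
typedef std::int16_t  DemoS16;
typedef std::uint16_t DemoU16;
typedef std::int32_t  DemoS32;
typedef std::uint32_t DemoU32;

// The literal 2147483648 does not fit in a 32 bit signed integer,
// so the minimum is written as an expression.
const DemoS32 DEMO_S32_MIN = -2147483647 - 1;
const DemoS32 DEMO_S32_MAX =  2147483647;
const DemoU32 DEMO_U32_MAX =  4294967295u;

// Each listed limit must agree with what the compiler reports.
static_assert(std::numeric_limits<DemoS8>::min()  == -128,   "S8_MIN");
static_assert(std::numeric_limits<DemoS8>::max()  ==  127,   "S8_MAX");
static_assert(std::numeric_limits<DemoU8>::max()  ==  255,   "U8_MAX");
static_assert(std::numeric_limits<DemoS16>::min() == -32768, "S16_MIN");
static_assert(std::numeric_limits<DemoS16>::max() ==  32767, "S16_MAX");
static_assert(std::numeric_limits<DemoU16>::max() ==  65535, "U16_MAX");
static_assert(std::numeric_limits<DemoS32>::min() == DEMO_S32_MIN, "S32_MIN");
static_assert(std::numeric_limits<DemoS32>::max() == DEMO_S32_MAX, "S32_MAX");
static_assert(std::numeric_limits<DemoU32>::max() == DEMO_U32_MAX, "U32_MAX");
```

If one of the typedefs had the wrong width on some platform, the
corresponding static_assert would stop the compilation instead of
letting the error surface at runtime.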



                   9 Runtime Errors and Messages


As a rule, errors are signaled by a return value of FALSE, rather than
by a visible message. In this way control over runtime messages is
placed in the hands of the developer using our libraries.
On the other hand, we try to strike a balance between ease-of-use and
control. Having to check the return value of each and every function to
trap highly unlikely errors rapidly becomes very cumbersome.


9.1 What to expect?

The main rule is to expect a return value of TRUE on success and a
value of FALSE in case of failure.
However, sometimes a message is displayed and there are even cases
where the exit() function is called.


We distinguish a few categories of errors:
    - Extremely Improbable Error
        Normally meaning hardware failure. Because it's so unlikely, you
        will be tempted to 'forget' testing the return value because 'it will
        never go wrong'.
        A visible message is generated.
    - Misuse of Functions
        Calling functions in the wrong order, calling functions with wrong
        parameters, etc.. Because this type of error should only occur in
        the earliest development stage, during debugging, we think it's
        justified to display a message or even exit the program.
        It is typical for the debug versions of the libraries to test for this
        type of error.
    - Irrecoverable Error
        The kind of error that occurs many function calls deep. In
        particular the dreaded 'out of memory' error (often) falls in this
        category. Only event-handling can deal with this type of error
        but we don't use it because of its considerable drawbacks.
        A message is displayed and the exit() function is called.


9.2 Changing the way messages are displayed

All the message functions eventually display their messages through a
call to csmess_disp(char *). Under DOS, this function writes the
message to the screen by using the standard puts() function. With
Windows (16 & 32 bits) a standard message box is called.

Fortunately, this function can easily be altered!
Before being displayed by the csmess_disp() function, every message is
converted into a single string. This makes changing the message
function very easy. Only a single function, which accepts a character
pointer, needs to be supplied.

The following function installs such a replacement:

void csmess_set_fun( void (* fun)(char *));


//  Example (Dos):

    #include <stdio.h>
    #include "csmess.h"

      void display(char *s)
      {
         //  This function is going to be used
         //  to display messages.
         printf("%s",s);
      }
      int main(void)
      {
          csmess_set_fun(display);
          //  From now on, all the messages are
          //  displayed by calling the 'display()' function.
          return 0;
      }



To restore the default, the next function can be used:

void csmess_reset_fun(void);


//  Example:

    #include "csmess.h"

      int main(void)
      {
          csmess_reset_fun();
          // Restores the default
          // display function.
          return 0;
      }


Function prototypes are in csmess.h.
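
To make the mechanism more tangible, here is a minimal model of a
replaceable display function. It is a sketch of the general technique
(a function pointer with a default), not the source of csmess; every name
starting with demo_ or Demo is invented for this illustration.

```cpp
#include <cstdio>
#include <string>

// Type of a display function: it receives the finished message string.
typedef void (*DemoMessFun)(char *);

// Default display function: write the message with puts().
static void demo_default_disp(char *s) { std::puts(s); }

// The function currently in use.
static DemoMessFun demo_current = demo_default_disp;

// Counterparts of csmess_set_fun() and csmess_reset_fun() in this model.
void demo_mess_set_fun(DemoMessFun fun) { demo_current = fun; }
void demo_mess_reset_fun()              { demo_current = demo_default_disp; }

// Every message is first converted into a single string and only then
// handed to whatever display function is installed.
void demo_message(const char *text, int number)
{
    char buffer[128];
    std::snprintf(buffer, sizeof buffer, "%s (%d)", text, number);
    demo_current(buffer);
}

// A replacement that collects messages in memory instead of printing them.
static std::string demo_log;
void demo_collect(char *s)
{
    demo_log += s;
    demo_log += '\n';
}
```

After demo_mess_set_fun(demo_collect), a call such as
demo_message("disk error", 7) no longer appears on the screen but is
appended to demo_log. Because the replacement only ever sees one
finished string, it needs no knowledge of how messages are composed.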


9.3 Message related functions

The entire mechanism of displaying messages can be switched on and
off with a few (global) functions. To avoid confusion: these are normal
C-type functions, not C++ class member functions.

Function prototypes are in csmess.h.

void csmess_off(void);
                With this function, messages can be suppressed.
                Whether you are using the standard message function
                or have replaced it with your own, after a call to this
                function no message will be displayed.

void csmess_on(void);
                To be used in conjunction with the csmess_off()
                function. After a call to 'csmess_on()' messages will be
                displayed again.

int csmess_onoff(void);
                Returns TRUE if message displaying is switched on,
                FALSE otherwise.

void csmess_onoff(int sw);
                If called with sw unequal to zero, messages will be
                displayed. Otherwise not.


//  Example (DOS):

#include "csmess.h"

      void work(void)
      {
          // Switch messages off.
          csmess_off();
          // Execute some critical code.
          // Switch messages back on.
          csmess_on();
      }

      int main(void)
      {
          work();
          return 0;
      }
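
Note that work() switches messages on at the end, even if they were off
when it was called. If the previous state has to be preserved, the two
csmess_onoff() functions can be combined into a small helper class. The
sketch below shows the idea with stand-in functions so that it compiles
on its own; demo_onoff() and DemoQuiet are invented names, and in a real
program the calls would go to csmess_onoff() instead.

```cpp
// Stand-ins for 'int csmess_onoff(void)' and 'void csmess_onoff(int sw)'.
static bool demo_messages_on = true;
int  demo_onoff()       { return demo_messages_on ? 1 : 0; }
void demo_onoff(int sw) { demo_messages_on = (sw != 0); }

// Switches messages off for as long as an instance exists and puts the
// switch back the way it was when the instance is destroyed.
class DemoQuiet
{
public:
    DemoQuiet() : saved_(demo_onoff()) { demo_onoff(0); }
    ~DemoQuiet() { demo_onoff(saved_); }

    DemoQuiet(const DemoQuiet &) = delete;
    DemoQuiet &operator=(const DemoQuiet &) = delete;

private:
    int saved_;   // state of the switch before it was turned off
};

// Returns the state of the switch while the critical code runs.
int demo_work()
{
    DemoQuiet quiet;        // messages are off from here on
    // ... critical code would go here ...
    return demo_onoff();    // evaluated while 'quiet' still exists
}                           // previous state is restored here
```

Because the destructor runs on every way out of demo_work(), the
switch is restored even when the critical code returns early.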




9.4 Database class messages

So far we have been discussing the global message functions. On top of
that, all database classes have a few message functions too.

If a function fails, it returns a value of FALSE, but it also sets the value
of a local error variable. Each class instance has its own error variable
and member functions to set and to read it.

U16 error_nr(void);
                This function returns the value of the error variable.
                After the function call, the value is reset to zero. All the
                database classes from the CSDB-Library have this
                member function.





//  Example (DOS):

    #include "iostream.h"
    #include "cstbase.h"

      int main(void)
      {

        TBASE tb;   // A database for fixed length records.
                // Documented in chapter 15.
        if(!tb.open("example.dbf"))
        {
            cout<<"Error nr: "<<tb.error_nr()<<endl;
            return 1;
        }

        return 0;
      }



void error_nr(U16 ErrNr);
                Sets the value of the error variable. All the database
                classes from the CSDB-Library have this member
                function. (Mainly intended for internal use.)


U16 display_error(void);
                Obtains the latest error by calling 'error_nr()'. This
                error number is then converted into a string by reading
                the 'error.err' message file. Finally, it's displayed by
                using the global message functions described in the
                first part of this chapter. If the obtained error number is
                zero, no message is displayed. The return value is the
                error number. On return, the value of the error variable
                will be reset to zero.
                All the database classes from the CSDB-Library have
                this member function.




//  Example (DOS):

    #include "iostream.h"
    #include "cstbase.h"

      int main(void)
      {
        TBASE tb; // A database for fixed length records.
        if(!tb.open("example.dbf"))
        {
            tb.display_error();
            return 1;
        }
        return 0;
      }


                        10 Temporary files


Temporary files are created through the use of the cstmpname()
function, discussed in the CSTOOLS chapter 28.
This means that the environment variables TMP and TEMP are checked to
determine which subdirectory has to be used. TMP is checked first and if
it doesn't exist, TEMP is checked.
Temporary files can be as large as the databases they belong to.
So, make sure the environment variables don't point to some small
RAM disk or an insufficiently large partition.
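
The order in which the variables are checked can be sketched as follows.
This is not the code of cstmpname(); 'temp_dir' and 'EnvLookup' are
names made up for this illustration. What the library does when neither
variable exists is not described here, so the sketch simply returns an
empty string in that case.

```cpp
#include <cstdlib>
#include <string>

// Type of a function that looks up an environment variable and returns
// a null pointer if the variable does not exist (like std::getenv).
typedef const char *(*EnvLookup)(const char *name);

// Returns the directory for temporary files: TMP first, then TEMP.
// An empty string means that neither variable exists.
std::string temp_dir(EnvLookup lookup)
{
    const char *dir = lookup("TMP");
    if (dir == 0) dir = lookup("TEMP");
    return dir ? std::string(dir) : std::string();
}

// Lookup function using the real environment.
const char *real_env(const char *name) { return std::getenv(name); }
```

A call like temp_dir(real_env) can be used at program start to report
which directory is going to be used, before any large temporary file is
created.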



























                               Part

                               Two









Part Two of the documentation starts with a few chapters that apply
to all database classes. Further on, it will explain how this library
     can be used to build traditional relational databases.
To do so, it uses a TBASE class to store records and a BTREE class
                          for indexes.
  A program generator, CSDBGEN, is discussed which 'automates'
 the process of building more complex databases out of TBASE and
                             BTREE.
These two classes can also be used separately. In particular the
BTREE class is very useful. It is easily extended and can be
tailored to a specific purpose by supplying one single function.
Simple databases with only one index can be built with just the
                           BTREE class.



                           11 Buffering


Except for RBASE, all database classes are built on top of a very
solid buffer system. This buffer system lets you control how much
memory can be used for buffering. All the database classes offer the
opportunity to specify this amount with the call to the open() function.
This implies that the buffer size can be specified for every class instance
individually!

Buffers are allocated on the fly, up to the maximum specified. The
advantage is that the maximum may never be reached, saving valuable
memory for other purposes.
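
The idea of allocating 'on the fly, up to a maximum' can be sketched as
follows. 'LazyPool' is a hypothetical class written for this
illustration; it is not the buffer system of the library and leaves out
everything a real buffer system needs, such as reusing buffers and
writing them back to disk.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical pool: hands out fixed-size buffers, allocating each one
// only when it is first requested, up to 'max_buffers'.
class LazyPool
{
    std::vector<char *> buffers;
    std::size_t buffer_size;
    std::size_t max_buffers;
    LazyPool(const LazyPool &);                 // Not copyable.
    LazyPool &operator=(const LazyPool &);
  public:
    LazyPool(std::size_t size, std::size_t max)
        : buffer_size(size), max_buffers(max) {}
    ~LazyPool()
    {
        for (std::size_t i = 0; i < buffers.size(); i++) delete[] buffers[i];
    }
    // Returns a new buffer, or 0 when the maximum is reached or the
    // heap is exhausted. A real buffer system would then reuse a buffer.
    char *get(void)
    {
        if (buffers.size() >= max_buffers) return 0;
        char *p = new (std::nothrow) char[buffer_size];
        if (p != 0) buffers.push_back(p);
        return p;
    }
    std::size_t allocated(void) const { return buffers.size(); }
};
```

As long as get() is not called, no buffer memory is used at all, whatever
the maximum is set to.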

The buffering-system itself will stop allocating memory when the
heap is exhausted, but if dynamic memory allocation is used
somewhere further on, your program may still terminate with a
message of the type 'out of memory'.

It should be noted that this is only a problem with the MS-DOS operating
system. Any other (real) operating system uses virtual memory which
makes sure that your program will work with any reasonable assumption
about the available amount of memory.

For performance reasons, however, it is better not to rely on virtual
memory. If you allocate more buffers than the OS is capable of holding in
physical memory, performance will drop dramatically. Buffers which are
paged out are far worse than no buffers at all!

                           12 PAGE-Class

12.1 Introduction

The PAGE-class constitutes a kind of 'foundation' for most of the
other classes in the CSDB library. It is derived from a class 'BUFF'
which takes care of the required buffering. (Described in chapter 11.)

The idea is to do disk IO in chunks of 2 Kb. This is close to the optimal
size for the average harddisk. These blocks are kept aligned with the
sectors of the harddisk, which improves speed considerably.
A harddisk always reads an entire sector, even if you only need, let's say,
10 bytes. Things become even worse if the 10 bytes you are requesting
just happen to cross a sector boundary. In that case the harddisk will
read 2 entire sectors. Assuming a sector is 1024 bytes, this means that
2*1024=2048 bytes are read just to obtain your 10 bytes!

To avoid this kind of inefficiency, the PAGE-class does its disk IO in
pages of 2048 bytes while making sure every page is aligned with the
harddisk sectors. This also means that the indispensable file-header has
to be at least one sector. To avoid complications, a file-header is used
which has the same size as a page, 2048 bytes by default.
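
The arithmetic behind this reasoning can be written down in a few lines.
The two functions below are only an illustration and are not part of the
library. They assume that the file starts on a sector boundary; the
sector size is a parameter, the text above uses 1024 bytes.

```cpp
// Number of sectors the disk has to read to deliver 'length' bytes
// starting at byte 'offset' of the file. 'length' must be at least 1.
unsigned long sectors_touched(unsigned long offset, unsigned long length,
                              unsigned long sector_size)
{
    unsigned long first = offset / sector_size;
    unsigned long last  = (offset + length - 1) / sector_size;
    return last - first + 1;
}

// Number of bytes actually transferred for such a request.
unsigned long bytes_transferred(unsigned long offset, unsigned long length,
                                unsigned long sector_size)
{
    return sectors_touched(offset, length, sector_size) * sector_size;
}
```

With 1024 byte sectors, 10 bytes at offset 1020 touch two sectors, so
2048 bytes are transferred. An aligned page of 2048 bytes at offset 2048
also touches exactly two sectors, while the same page at offset 2000
would touch three.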

It should be noted that this entire scheme is undone by using a
disk compression utility, like DoubleSpace, Stacker and the
like.

Therefore, if you are concerned about performance, it is better not to use
these utilities. Moreover, a disk compressor will slow down your
application considerably when several files are used heavily 'at the same
time'. This situation will almost inevitably arise with any serious
application which uses more than one database, or even a single
database with many indexes.


12.2 Storing data in the header-page

As explained above, the header page is quite large. This page is used to
store all kinds of important variables. However, there is still much space
left. Of the 2048 bytes, only about 170 are used.

An application using databases is likely to have some variables of its own
which need to be saved between close/open sequences. It seems the
remaining space in the header page is a convenient place to store such
data. This can save you an additional configuration file and all the error
trapping involved.

To aid in this, three functions are made public:
    int data_2_header(void * ptr,U16 length);
    int header_2_data(void * ptr,U16 length);
    U16 max_data_in_header(void);

U16 max_data_in_header(void);
                This function returns the maximum number of bytes
                which will still fit in the header page. This is simply the
                size of the header-page minus what is used to store
                the variables of the class.
                The class needs to be open.

int data_2_header(void * buffer,U16 length);
                Copies data from buffer 'buffer' to the empty space in
                the header page. The variable 'length' indicates the
                number of bytes to be copied. This figure is not stored
                anywhere. It is the programmers' responsibility to
                retrieve the right number of bytes later on.
                The class needs to be open. TRUE is returned on
                success, FALSE otherwise.

int header_2_data(void * buffer,U16 length);
                The counterpart of the previous function. This function
                copies 'length' number of bytes from the header page
                to 'buffer'.
                The class needs to be open. TRUE is returned on
                success, FALSE otherwise.
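
Because the length is not stored, a convenient approach is to keep all
the data in one structure and to use its size in both directions. The
sketch below shows this. 'FakeDb' is a stand-in which only imitates the
three header functions, so that the fragment compiles without the
library; its sizes are based on the figures 2048 and 170 mentioned
above, and its check on the length is an assumption. 'Settings' and the
two helper functions are hypothetical.

```cpp
#include <cstring>

typedef unsigned short U16;   // Assumption: U16 is a 16 bit unsigned type.

// Stand-in for an open database class. Only the header functions exist.
class FakeDb
{
    char free_space[2048 - 170];
  public:
    FakeDb() { std::memset(free_space, 0, sizeof(free_space)); }
    U16 max_data_in_header(void) { return (U16)sizeof(free_space); }
    int data_2_header(void *ptr, U16 length)
    {
        if (length > max_data_in_header()) return 0;    // FALSE
        std::memcpy(free_space, ptr, length);
        return 1;                                       // TRUE
    }
    int header_2_data(void *ptr, U16 length)
    {
        if (length > max_data_in_header()) return 0;
        std::memcpy(ptr, free_space, length);
        return 1;
    }
};

// Application data to keep between close/open sequences (hypothetical).
struct Settings
{
    long last_record;
    char user[16];
};

// The length is not stored, so both directions use sizeof(Settings).
int save_settings(FakeDb &db, Settings &s)
{
    if (sizeof(Settings) > db.max_data_in_header()) return 0;
    return db.data_2_header(&s, (U16)sizeof(Settings));
}

int load_settings(FakeDb &db, Settings &s)
{
    return db.header_2_data(&s, (U16)sizeof(Settings));
}
```

With a real database class only the type of the 'db' parameter changes.
If the layout of 'Settings' is altered later on, data saved by an older
version of the program can no longer be read back correctly.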


                           13 Lock files



To prevent multiple applications from accessing the same database,
lock files are created whenever a database is opened for writing. The
lock files are removed when the database is closed again. If a database
is opened in read-only mode, no lock file will be generated.

When an attempt is made to open a database while the lock file exists,
an error is generated.


13.1 Name of a lock file

The name of the lock file is derived from the database name by replacing
the first two characters of the extension by exclamation marks. Under
Linux, everything after the last dot is considered the 'extension'. If no
extension exists, an extension consisting of two exclamation marks will
be added to the database name to form the name of the lock file.



 Example:

    Database name               Lock file name
    -------------               --------------

    database.dbf                database.!!f
    db.strings.index            db.strings.!!dex
    data                        data.!!



csCHAR *lock_file_name(csCHAR *DBName,csCHAR *LockName);
                This global C function can be used to obtain the name
                of a lock file. 'DBName' is the name of the database,
                'LockName' must be a pointer to a buffer large
                enough to hold the name of the lock file. The function
                fills the 'LockName' buffer with the name of the lock file
                corresponding with database 'DBName'.
                The return value is the 'LockName' pointer.
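
The naming rule from the table above can be sketched as follows. This is
an illustration only, not the code of lock_file_name(); the function
'sketch_lock_name' is made up. It assumes that the name contains no
directory part, and that an extension of fewer than two characters is
padded to two exclamation marks. The text does not say what the library
does in the latter case.

```cpp
#include <string>

// Sketch of the naming rule; not the library's lock_file_name().
// Assumption: 'db_name' contains no directory part.
std::string sketch_lock_name(const std::string &db_name)
{
    std::string::size_type dot = db_name.rfind('.');
    if (dot == std::string::npos) return db_name + ".!!";  // No extension.

    std::string name = db_name;
    std::string::size_type ext = dot + 1;    // First extension character.
    // Replace at most two characters of the extension.
    for (std::string::size_type i = ext; i < ext + 2 && i < name.size(); i++)
        name[i] = '!';
    // Assumption: shorter extensions are padded to two exclamation marks.
    while (name.size() < ext + 2) name += '!';
    return name;
}
```

For the three names in the table this gives 'database.!!f',
'db.strings.!!dex' and 'data.!!'.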


13.2 Controlling lock files

Next is a set of member functions, common to every database class,
which can be used to control lock files.

void use_lock_file(int TrueOrFalse);
                When called with FALSE, the use of lock files is
                switched off. When called with TRUE, the use of lock
                files is switched on. The class default is to use locking.
int use_lock_file(void);
                Returns TRUE if the use of lock files is switched on,
                FALSE otherwise.
int lock_file_exist(csCHAR *DBNAME);
                This function accepts the name of a database file and
                returns TRUE if the corresponding lock file exists.
                Otherwise it returns FALSE.
                At least one class in this library uses two files to store
                its data. Therefore checking for lock files is a process
                which depends on the type of database used. This is
                why this is a class member function rather than a
                global C function.
int remove_lock_file(csCHAR *DBNAME);
                This function takes the name of a database, calculates
                the corresponding lock file name and tries to remove
                that. If the lock file doesn't exist or couldn't be
                removed, the function returns FALSE. Otherwise it
                returns TRUE.



// Example:
// Error checking omitted for conciseness.

    #include "iostream.h"
    #include "csvbaxe.h"

      void main(void)
      {
        VBAXE vb;

        if(vb.lock_file_exist("example.dbf"))
        {
            cout<<"The lock file exists"<<endl;
            vb.remove_lock_file("example.dbf");
            cout<<"The lock file is removed"<<endl;
        }

        vb.open("example.dbf");
            //
            // Use the database.
            //
        vb.close();

      }



                      14 Read-Only databases

It is possible to open a database in read-only mode. If done so, no
changes can be made and no time-stamps will be updated. On the
level of the operating system the database file is also opened 'read-only'
which makes it possible to use static databases on read-only
devices like CD-ROMs.

It is valid to open a database in read-only mode while it is already in use
by another program. That is, the existence of lock files will not result in a
runtime error.


Trying to add, delete or write records while in read-only mode is
considered an error and therefore should be avoided. Still, with
the current implementation, writing a record is simply ignored
while adding or deleting results in a runtime error. This behaviour may
change with future releases, please do not rely on it!


14.1 Class member functions

Next is a set of class member functions, common to all database classes,
which control the application of the 'read-only' mode.

int read_only(int TrueOrFalse);
                When called with TRUE, the database is opened in
                read-only mode. Otherwise the database is opened for
                writing. The function has to be called before the
                database is opened! It is an error to call read_only()
                while the database is already open.
                The function returns TRUE on success, FALSE
                otherwise.
int read_only(void);
                Same as read_only(TRUE);
int is_read_only(void);
                The function returns TRUE if the database is opened
                in read-only mode, FALSE otherwise.



//  Example:
// Error checking omitted for conciseness.

    #include "iostream.h"
    #include "csdlay.h"

      void main(void)
      {
        DLAY vb;                // The miracle class.

        char some_data[100];

        vb.define("example.dbf");   // Create database.

        vb.read_only();         // Before open().
        vb.open("example.dbf");     // Open database

        // Test for read-only mode and append some data.
        if(!vb.is_read_only()) vb.append(some_data,7);
        else                   cout<<"Read only, cannot append data."<<endl;

        vb.close();             // Close database.
      }




                          15 TBASE-class


15.1 Introduction

The TBASE class is intended as a simple, fast way to access records
on disk. It assumes a fixed record size and does its IO on a
record-by-record basis (as opposed to field-by-field).

This means:
    1) TBASE is unaware of something like 'fields'. The idea is to use a
        C structure as record and to do all the accessing of fields with
        the standard C operators. This approach is undoubtedly faster
        than supporting access on a field-by-field basis as done by
        dBASE.
    2) No indexes. TBASE just reads or writes records, nothing else.

NOTE:   From this it is clear that with the TBASE class alone no decent
        database application can be built. Therefore, a separate
        BTREE class is supplied which can be used as an index.


15.2 Using TBASE

The next small example gives an impression of how to use the class.

As can be seen from this example, there is no 'record pointer' as in
dBASE. The functions to read and write a record, simply take an
additional parameter indicating the record number.



// A very simple example.
// Error checking omitted for conciseness.

  # include "CSTBASE.H"

   void main(void)
   {
       typedef struct
       {
        char name[20];          // The field 'name'
        char street[40];        // The field 'street'
        long salary;            // The field 'salary'
        // All the other fields you may require.
       }record;                 // The record layout is now defined.

       TBASE db;
       record rec;

       db.open("demo.dbf",110);  // Assuming the file is already created.
                                 // Use 110 Kb for buffering.
       db.read_rec(9,&rec);      // Read record number 9 into
                                 // variable 'rec'.
       rec.salary=0;             // Change salary.
       db.write_rec(9,&rec);     // Write the record back to position 9.
                            // (Any other existing position is also possible.)
       db.close();          // Is also done automatically
                            // by the class destructor.
    }





15.3 Creating a Database

Before a database can be used it has to be 'created'. This is done
through a call to the 'define()' function. Of course this is needed only
once.
Because TBASE doesn't use fields, the function takes only two
parameters: the filename of the database, and the record size.

Syntax: int define(char * name,U16 reclen);


//   This example creates a database 'demo.dbf'.

    #include "iostream.h"
    #include "CSTBASE.H"
    void main(void)
    {
       typedef struct
       {
          char name[20];
          char street[40];
          char city[25];
       } record;

       TBASE db;
       if(!db.define("demo.dbf",sizeof(record)))
       {       // Return value FALSE: display the error.
           db.display_error();
       }

   }



15.4 Opening

Before a record can be read, the database has to be opened through a
call to the open() function.

This open() function also takes a parameter indicating the amount of
memory to be used for buffering. The memory for the buffers is NOT
allocated at the moment of the call to open(), but during the use of the
database. Memory is allocated when needed, up to this maximum.

As explained in chapter 11 about buffering, using up too much
memory for buffering is dangerous on an operating system
without virtual memory like MS-DOS.

Syntax: int open(char *name, S16 kb=32);


  // Example:
  // Opening the existing database 'demo.dbf' with 40 Kb for buffers.

   #include "CSTBASE.H"
   void main(void)
   {
       TBASE db;
       if(!db.open("demo.dbf",40)) db.display_error();
   }



15.5 Closing

Closing the database involves writing all the buffered data back to disk
and freeing all allocated memory. The close() function is intended for this
purpose. If the close() function is not explicitly called in the application,
the class destructor will call it.

Because there can be a long interval between the last time the database
is used and the moment where the destructor is reached, it still makes
sense to call the close() function 'by hand'.

Syntax: int close(void);


// Example:
// Error checking omitted for conciseness.

    #include "CSTBASE.H"
    void main(void)
    {
       TBASE db;

       db.open("demo.dbf",40);
       db.close();
    }



15.6 Appending Records

A special function is needed to add a record to a database: the
append_rec() function.
Note: write_rec() can only overwrite an already existing record.

Syntax: S32 append_rec(void *data);
                'data' is a pointer to a record.

Syntax: S32 append_rec(void);
                This function can be used to add a record to the
                database without instantly filling it with a record. For
                the time being, this record will contain 'garbage'.

// Example:
// Error checking omitted for conciseness.

    #include <string.h>     // For strcpy().
    #include "CSTBASE.H"
    void main(void)
    {
      typedef struct
      {
         char name[20];
         char street[40];
         char city[25];
      } record;

      TBASE db;
      record rec;

      db.define("demo.dbf",sizeof(record));   //Create new database

      db.open("demo.dbf",40);

      strcpy(rec.name,"J.Q. Querlis ");
      strcpy(rec.street,"Avenue 120");
      strcpy(rec.city,"Bombay");

      db.append_rec(&rec);            // The database now contains 1 record.
      db.close();
   }
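
The strcpy() calls in the example are safe only because the strings are
known to fit. With data entered by a user, a string which is longer than
the field would overwrite the next field of the record. The helper below
is a sketch of a safer way to fill a fixed-size field. 'set_field' is a
hypothetical function, not part of the library.

```cpp
#include <cstddef>
#include <cstring>

// Copies 'text' into a fixed-size character field. The text is
// truncated if it does not fit. The field is always zero-terminated and
// the unused remainder is filled with zeros.
template <std::size_t N>
void set_field(char (&field)[N], const char *text)
{
    std::strncpy(field, text, N - 1);   // Pads with zeros up to N-1.
    field[N - 1] = '\0';
}

struct record
{
    char name[20];
    char street[40];
    char city[25];
};
```

A call like set_field(rec.name,"J.Q. Querlis") then replaces the
strcpy(). The size of the field is taken from the array type, so it
cannot get out of step with the record layout.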



15.7 Deleting Records

Deleting a record cannot be accomplished instantaneously. A 'delete bit'
is used to distinguish deleted records.

Deleting a record by setting the 'delete bit' doesn't alter much. E.g. record
9 remains record 9 if you delete record 8.
The function 'is_delet()' has to be called to detect whether or not a record
is 'deleted'.

The 'pack()' function can be used to physically remove all the deleted
records from the file.

int is_delet(long r);
                Returns 0 if the record 'r' is not deleted, and 1
                otherwise.

void delet(long r);
                Sets the delete bit for record 'r'.

void undelet(long r );
                Resets the delete bit for record 'r'.

void pack(void);
                Removes all the records with the delete bit set from the
                database. This is done without the use of temporary
                files.
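
The difference between marking a record and physically removing it can
be illustrated with a small in-memory model. 'FakeTable' is a stand-in
written for this purpose, not the TBASE class. It stores a single
integer per record, and it assumes that the remaining records are
renumbered when the deleted ones are removed by pack().

```cpp
#include <cstddef>
#include <vector>

// In-memory stand-in for a table; records are numbered from 1.
class FakeTable
{
    struct Slot { int value; bool deleted; };
    std::vector<Slot> slots;
  public:
    long append_rec(int value)
    {
        Slot s = { value, false };
        slots.push_back(s);
        return (long)slots.size();
    }
    void delet(long r)    { slots[r - 1].deleted = true;  }
    void undelet(long r)  { slots[r - 1].deleted = false; }
    int  is_delet(long r) { return slots[r - 1].deleted ? 1 : 0; }
    int  read_rec(long r) { return slots[r - 1].value; }
    long numrec(void)     { return (long)slots.size(); }
    // Physically removes the marked records; later records move up.
    // This is done in place, without a second copy of the data.
    void pack(void)
    {
        std::size_t keep = 0;
        for (std::size_t i = 0; i < slots.size(); i++)
            if (!slots[i].deleted) slots[keep++] = slots[i];
        slots.resize(keep);
    }
};
```

After delet(8) in a table of nine records, record 9 is still record 9 and
numrec() still reports nine. Only after pack() are there eight records
left, and the former record 9 is then found at position 8.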


15.8 Page Utilization

Normally the TBASE class does its IO in pages of 2 Kb. It fits an integer
number of records on these pages. This approach can lead to a large
chunk of unused space on the pages, particularly if you are using large
records. On average the slack will be half a record.

Solution:
    This waste of disk space can be avoided by using pages which have
    the same size as the record. This means that the pages will no
    longer be aligned with the sectors of your harddisk!

The function to accomplish this is:  smallest_page().

The define() function of TBASE considers slacks up to 30% acceptable. If
it doesn't manage to find a page size which produces a slack of less than
30%, it calls smallest_page().
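
The slack for a given page size is easy to calculate. The functions
below are a sketch and not the code used by define(). They assume that
the entire page is available for records, and they treat a slack of
exactly 30% as acceptable. Which page sizes define() tries is not
described here, so the page size is simply a parameter.

```cpp
// Bytes left unused on a page of 'page_size' bytes holding as many
// whole records of 'reclen' bytes as fit.
// Assumption: the whole page is available for records.
unsigned long page_slack(unsigned long page_size, unsigned long reclen)
{
    unsigned long per_page = page_size / reclen;
    return page_size - per_page * reclen;
}

// TRUE if the slack stays within the 30% mentioned in the text.
int slack_acceptable(unsigned long page_size, unsigned long reclen)
{
    return page_slack(page_size, reclen) * 100 <= page_size * 30;
}
```

A record of 85 bytes fits 24 times on a page of 2048 bytes, leaving 8
bytes unused. A record of 1100 bytes fits only once and leaves 948
bytes, or about 46%, unused.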

void smallest_page(void);
                The function has to be called before the define()
                function. Because this changes the entire layout of the
                database file, this cannot be altered once the
                database is created.


// Example:
// Error checking omitted for conciseness.

    #include "CSTBASE.H"

    void main(void)
    {

       typedef struct
       {
          char name[20];
          char street[40];
          char city[25];
       } record;

       TBASE db;

       db.smallest_page();                 //Before the define!

       db.define("demo.dbf",sizeof(record));

   }



15.9 Locating Records

In the examples given so far, a record is first read into a local variable
and, after being altered, written back to disk. It seems there is room for
improvement here. After the record is copied into the local variable it is in
memory twice, once in the variable and again in the database buffers.

If you know what you are doing, some performance increase can be
gained from obtaining a pointer directly into the buffer system.
The 'locate_rec()' function does just that.

When you are working in the database buffers through a pointer, there is
no way the buffer system can tell if data is altered. Therefore, it's the
programmers' responsibility to indicate whether or not modifications are
going to take place.

char *locate_rec(long rec);
                This function returns a pointer to record 'rec'. It
                assumes that no alterations are going to take place.
                This means that the buffer is not written back to disk!

char *locate_rec_d(long rec);
                The additional '_d' stands for dirty buffer. The function
                returns a pointer to record 'rec'. It is assumed
                alterations ARE going to take place. The buffer is
                marked 'dirty' and is therefore written back to disk
                when memory is needed to store another page.


IMPORTANT!!
The locate functions return a pointer directly into the buffer
system. Nothing less and nothing more. Any member function
of the same class instance which MAY cause disk IO, can therefore alter
the contents of the buffers, making your pointer 'point' to an entirely
different record! When using these functions, it is highly advisable to do
all the reading or writing to the record before calling any other TBASE
member function.
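
The effect can be made visible with an extreme case: a buffer system
with room for exactly one record. 'OneSlotCache' below is a stand-in
written for this illustration only. The real buffer system holds many
pages, but the principle is the same as soon as a buffer is reused.

```cpp
#include <cstring>

struct record { char name[20]; int age; };

// Stand-in for a buffer system with room for exactly one record.
// The array 'disk' plays the role of the database file.
class OneSlotCache
{
    record disk[3];
    record slot;       // The only buffer.
    long   in_slot;    // Record currently in the buffer, 0 = none.
  public:
    OneSlotCache() : in_slot(0)
    {
        std::memset(disk, 0, sizeof(disk));
        std::memset(&slot, 0, sizeof(slot));
        for (int i = 0; i < 3; i++) disk[i].age = 10 * (i + 1);
    }
    // Returns a pointer into the buffer, loading record 'rec' (1..3)
    // from 'disk' when it is not the one currently buffered.
    record *locate_rec(long rec)
    {
        if (rec != in_slot) { slot = disk[rec - 1]; in_slot = rec; }
        return &slot;
    }
};
```

A pointer obtained for record 1 shows the data of record 3 as soon as
record 3 is located. A value which is still needed afterwards has to be
copied out of the buffer first.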





// Example of the locate_rec_d() function.
// Error checking omitted for conciseness.

    #include "CSTBASE.H"

    void main(void)
    {

       typedef struct
       {
          char name[20];
          int age;
       } record;

       TBASE db;
       record *rec;

       db.define("demo.dbf",sizeof(record));
       db.open("demo.dbf");                // Use default 32 Kb for buffers.

       for(int i=1;i<=12;i++) db.append_rec(); // Append 12 records.

       rec=(record *)db.locate_rec_d(7);       // Obtain pointer to record 7.
                                   // '_d' because we will 'write'.

       rec->age=34;                // That's all it takes to make an
                                   // alteration. No need for a
                                   // 'write' function.
       db.close();                 // Not strictly necessary.

   }



15.10 Functions in alphabetical order.

Function prototypes are in 'cstbase.h'.

S32 append_rec(void *data);
                Append a record to the database. The newly created
                record is filled with the data from buffer 'data'. The
                function returns the number of the new record (which is
                equal to the number of records in the database).
S32 append_rec(void);
                Same as the previous function, only this time the new
                record is filled with binary zeros.
int close(void);    Closes the database. If the database is already closed,
                    nothing happens. TRUE is returned on success,
                    FALSE otherwise.
int define(char *name,U16  reclen);
                Creates a TBASE file named 'name'. The parameter
                'reclen' indicates the size of the records. TRUE is
                returned on success, FALSE otherwise.
void delet(S32  rec);
                Sets the delete bit of record 'rec'.
int empty(void);    Removes all the records from the database.
                    Afterwards there will be zero records left, but the
                    database will still be open. TRUE is returned on
                    success, FALSE otherwise.
int is_delet(S32  rec);
                Returns the value of the delete bit of record 'rec'.
                TRUE means the delete bit is set, FALSE means the
                bit is not set.
U16 lengthrec(void);
                Returns the length of a record. Because TBASE works
                with fixed length records, this value is the same for all
                records. It's the same value used in the call to define().
char *locate_rec(S32  rec);
char *locate_rec_d(S32  rec);
                Functions to return a pointer to record 'rec' directly into
                the buffer system. Please read the paragraph 15.9
                about this topic before using these functions.
S32 numrec(void);
                Returns the number of records currently in the
                database. Whether or not a record is marked for
                deletion makes no difference.
int open(char *name,S16 kb=32);
                Opens the existing database 'name' while using 'kb' Kb
                of RAM for buffering. TRUE is returned on success,
                FALSE otherwise.
int open(void);     Returns TRUE if the database is open, FALSE
                    otherwise.
int pack(void); Removes all the records with the delete bit set from the
                database. This is done without the use of temporary
                files. TRUE is returned on success, FALSE otherwise.
void read_rec(S32 rec, void *buff);
                Copies the contents of record 'rec' into buffer 'buff'.
int save(void); Writes all buffered data back to disk. The database
                header block is also updated. The database remains
                open. TRUE is returned on success, FALSE otherwise.
void set_delet(S32  rec,int ToF);
                Changes the value of the delete bit of record 'rec'. If
                'ToF' is TRUE, the delete bit is set, otherwise it is
                reset.
void smallest_page(void);
                Set the page size to the smallest value possible. That
                is, a page will have the same size as a record. This
                means pages will no longer be aligned with the
                harddisk sectors. The function has to be called before
                the define() function. It changes the entire layout of the
                database file so it cannot be changed once the file is
                created.
void undelet(S32  rec);
                Resets the delete bit of record 'rec'.
void write_rec(S32 rec, void *buff);
                Overwrites the contents of record 'rec' with the data
                from buffer 'buff'. Record 'rec' needs to exist, the
                function cannot be used to append a record.


                          16 BTREE-class


16.1 Introduction

A btree is a system, developed several decades ago, to store data in
some predetermined order. The btree is capable of maintaining this
order even under insertions and deletions.
Btrees are capable of locating, inserting or deleting a specific record with
only a few disk-IOs, even when very large.

Of course there is a price to be paid for this: the disk space occupied by
the btree is not fully utilized. Worst case, only 50% of the space is used,
but on average 75% is used.

Btrees are convenient as indexes on a database. Of each record in the
main database, the key field and the corresponding record number are
stored together in the btree. Whenever a record with a certain key value
has to be located, the btree is capable of quickly (without much disk IO)
finding the required entry. Once this is done it is also clear which record
from the main database has to be read, because this record number is
stored together with the key value in the btree.

There are several 'flavours' of btrees. The one implemented in this library
is known as a 'btree+'.


16.2 BTREEx Classes

Unfortunately, it's a little cumbersome to use one BTREE class for every
type of data. Therefore, there are several minor variations on the main
BTREE class to account for the different variable types supported by C.
Each type has its own class.

Classes:
    BTREEb  For binary data.
    BTREEi  For integers.
    BTREEl  For longs.
    BTREEc  For characters.
    BTREEf  For floats.
    BTREEd  For doubles.
    BTREEa  For ASCII data. (Strings)

All these classes are derived from the BTREE class. Mainly, they only
differ in one function. This is the function needed to compare keys.

You can easily define a new BTREE class for a new type of variable. The
only thing to do is to define an int t_key(void *a,void *b) function which
returns:

        >0    if  a>b
         0    if  a==b
        <0    if  a<b

with 'a' and 'b' pointers to the new type of variables.


// Example:
// Say, you have your data stored in a C structure.

// Something like:

      typedef struct
      {
        char name[20];
        int  number;
      } record;

// Which you want to have sorted on the 'number' field.

class BTREEnew: public BTREE
{
    virtual int t_key(void *a,void *b)
    {
       return ((record *)a)->number-((record *)b)->number;
    }
    virtual int class_ID(void)  { return -100; }
 };


That's all!
The value '-100' in the class_ID() function is not all that important. Its
purpose is to give other library functions the opportunity to distinguish
between the classes. The value just has to be different from any value
the other classes have. This can be accomplished by choosing any
negative number.
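
One remark about the comparison itself. Returning the difference of the
two numbers is compact, but the subtraction can overflow when the values
are far apart, e.g. 30000 and -30000 with 16 bit integers. The result is
then undefined. The sketch below shows a comparison which avoids this.
'compare_number' is a free function made up for this illustration; its
body can be used as the body of t_key().

```cpp
struct record
{
    char name[20];
    int  number;
};

// Three-way comparison on the 'number' field that cannot overflow.
// Same contract as t_key(): >0 if a>b, 0 if a==b, <0 if a<b.
int compare_number(void *a, void *b)
{
    int x = ((record *)a)->number;
    int y = ((record *)b)->number;
    return (x > y) - (x < y);
}
```

The expression (x > y) - (x < y) only ever yields 1, 0 or -1, whatever
the values of the two numbers.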


16.3 Multiple Keys

Because of its nature you would expect a 'key' to appear only once in a
btree. However, on many occasions it turns out there is a need for storing
the same key more than once, but with a different data part. E.g. this
happens when you use a btree as an index on a database and in the
field you are indexing a certain value appears more than once. In that
case, you want to store the key value several times but with a different
data part, namely the record number in the database.

By default a key value can appear only once in the btree. If you try to
insert a second entry with the same key value, it will simply replace the
existing one. The option 'multiple_keys()' can be set to alter that.

Syntax: void multiple_keys(int  YesNo);
        void multiple_keys_YES(void);
        void multiple_keys_NO(void);


When the function 'multiple_keys()' is called with the argument set to
TRUE, the btree will store a key value more than once.

It is important to realize that the btree only keeps the key values
sorted, NOT the data values. This means that when you are
searching for a particular key/data value, the btree is capable of quickly
locating the required key but has to find the correct data value by
sequentially traversing all the inserted data belonging to that particular
key.

Btrees are intended to give quick access to key values by keeping them
sorted, this does not apply to data values. Expecting anything else is
misusing the btree. If you want quick access to a large number of
different data values, all belonging to the same key, you need a different
approach. The best thing to do is to construct a new btree and use a key
which is a concatenation of the original key and the original data part.
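
Such a concatenated key could look like the sketch below. The structure
'name_and_recno' and the function 'compare_composite' are hypothetical
names. The comparison follows the convention of the t_key() function
described in 16.2: it looks at the name first, and only for equal names
at the record number.

```cpp
#include <cstring>

// Hypothetical composite key: the original key followed by the original
// data part (here: a record number).
struct name_and_recno
{
    char name[20];
    long recno;
};

// Comparison in the style of t_key(): first on the name, and only for
// equal names on the record number.
int compare_composite(void *a, void *b)
{
    name_and_recno *x = (name_and_recno *)a;
    name_and_recno *y = (name_and_recno *)b;
    int c = std::strncmp(x->name, y->name, sizeof(x->name));
    if (c != 0) return c;
    return (x->recno > y->recno) - (x->recno < y->recno);
}
```

With a key like this, every entry is unique as long as no record number
is inserted twice. Locating one particular name/record number pair then
becomes an ordinary key search.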

Setting the multiple_keys option has to take place before the 'define'.


// Example
// Error checking omitted for conciseness.

#include "csbtree.h"

void main(void)
{

    typedef struct
    {
        char name[20];
        int  age;
    } record;

    BTREE bt;

    bt.multiple_keys_YES(); // Must be called before 'define'
    bt.define("btree.dat",sizeof(record),sizeof(long));

    // By now a btree 'btree.dat' is created in the current working
    // directory with the multiple-keys option switched on.

}




16.4 Current Pointer

Contrary to TBASE, the btree class does use a 'current' pointer. When
using btrees, the need to obtain the next (or previous) entry arises so
often that a current pointer is indispensable.

The btree class spends as little time as possible on maintaining this
current pointer. Therefore you should assume it is NOT set, unless you
have strong reasons to believe otherwise.

A limited set of functions can be used to set the current pointer. After
that, the 'next()' and the 'prev()' functions can be used to move to the
next or the previous entry, respectively.
When these functions 'fail', which can be noticed from their return value,
you should assume the current pointer is not set (any more).

The following functions can be used to set the current pointer:
- all the search functions.
    E.g. search_gt(), find() etc.
- all the min() and max() functions.
    E.g. max_key(), min() etc.
- the insert() function.

The current pointer can be moved back and forth with the next() and
prev() functions.

Once the current pointer is set, the 'current()' functions can be used to
obtain the key value and/or the data part.

Any other function can, and probably will, render the value
of the current pointer undefined!



// Example
// The next example displays the contents of a btree with
// 'strings' as key fields.
// It assumes that the btree 'demo.dbf' exists and that the
// key fields are less than 100 bytes.

// Error checking omitted for conciseness.

#include "iostream.h"
#include "csbtree.h"

void main(void)
{
     char  buffer[100];
     BTREEa bt;

     bt.open("demo.dbf",250);  // Does not set the current pointer.

     // Make the first entry the 'current'.
     // min() returns FALSE only if the btree is empty.
     if(bt.min())
     {
        do
        {
           bt.current_key(buffer);  // Read the 'current' key value.
           cout<<buffer<<endl;      // Display it.
        } while(bt.next());         // Move the current pointer 1 position.
     }

     bt.close();
}



16.5 Using Btrees

This paragraph groups the public member functions according to their
purpose.

16.5.1 Creating

    void multiple_keys_YES(void);
    void define(char *name,int key_length, int data_length);

16.5.2 Opening

    int open(char *name, int kb_buffer);

16.5.3 Inserting

    void insert(void *key,void *data);

16.5.4 Searching

    int   search(void *key,void *Data);
    int   search_gt(void *key,void *Key,void *Data);
    int   search_ge(void *key,void *Key,void *Data);
    int   search_lt(void *key,void *Key,void *Data);
    int   search_le(void *key,void *Key,void *Data);
    int   search_dat_..(void *key,void *Data);
    int   search_key_..(void *key,void *Key);
    int   find(void *key,void *data);
    int   find(void *key);
    int   max(void *Key,void *Data);
    int   min(void *Key,void *Data);

16.5.5 Current

    int   current(void *Key,void *Data);
    int   current_key(void *Key);
    int   current_dat(void *Data);
    int   tBOF(void);
    int   tEOF(void);

16.5.6 Deleting

    int   delet(void *delete_key);
    int   delet(void *delete_key,void *data_value);

16.5.7 Closing

    void  close(void);


16.6 Functions in alphabetical order

The function prototypes are in "CSBTREE.H".

void  close(void);
                Closes the btree after use. All buffers are flushed and
                all the allocated memory is freed. This function is also
                called by the class destructor if needed.
int   current(void *Key,void *Data);
int   current_key(void *Key);
int   current_dat(void *Data);
                Returns the key and/or data part of the entry the
                current pointer is pointing at.
                Parameters:
                Key     Pointer to the buffer to which the key value
                        has to be copied.
                Data    Pointer to the buffer to which the data part
                        has to be copied.
                If the current pointer has not been set, the functions
                return FALSE and no data is written into the buffers. If
                the current pointer is set, the functions return TRUE
                and the appropriate data is copied into the buffers.
void define(char *name, int key_length,int dat_length);
                This function is needed to initially create the file on
                disk. This is only needed once, before the first call to
                open().
                Parameters:
                name        The filename of the new BTREE.
                key_length  The number of bytes in the key. This
                            parameter is needed even when its value
                            is 'obvious' from the btree type.
                dat_length  The number of bytes in the data part.

    Example:

    If you want to use a btree as index on a string field
    you will need:
    - a btree of type ASCII: BTREEa.
    - a key_length equal to the length of the string field
      in the database record.
    - a data part which is capable of holding the number
      of the record in the database. This will normally be
      a 'long'.

    #include "cstbase.h"
    #include "csbtree.h"

    void main(void)
    {
        typedef struct
        {
           char   name[30];
           float   income;
        }record;

        TBASE  db;
        BTREEa index;

        db.define("demo.dbf",sizeof(record));
        index.define("demo.ndx",30,sizeof(long));
        // By now the database and its index are created.

    }



int   delet(void *delete_key);
                Searches for key 'delete_key' and, if present, removes
                it from the btree. If the option multiple_keys is set to
                'yes' there can be more than one entry with the
                specified key. Under those circumstances, all these
                entries will be removed. To remove a single entry
                when multiple keys are used, call the version of
                delet() that also takes the value of the data part.
                The function returns TRUE if something is deleted,
                FALSE otherwise.
int   delet(void *delete_key,void *data_value);
                The same as the previous delete function but this time
                the value of the data part is also specified. Only the
                entry which matches both the key and the data part is
                removed from the btree.
void  empty(void);
                Removes all the entries in the btree. The btree needs
                to be open. Upon return, the btree will contain zero
                keys and will still be open.
int   find(void *key,void *data);
int   find(void *key);
                These functions 'test' if a certain key value is present in
                the btree. When a btree is used with multiple keys, it
                can be necessary to specify the data part to uniquely
                identify an entry.
                TRUE is returned when the required entry is found,
                FALSE otherwise.
void  insert(void *key,void *data);
                Inserts a new entry in the btree. 'key' is a pointer to the
                key field and 'data' is a pointer to the data part.

// Example:
// Error checking omitted for conciseness.

    #include <string.h>     // For strcpy().
    #include "cstbase.h"
    #include "csbtree.h"

    void main(void)
    {

        typedef struct
        {
           char   name[30];
           float   income;
        }record;

        TBASE  db;
        BTREEa index;

        db.define("demo.dbf",sizeof(record));       //Creating.
        index.define("demo.ndx",30,sizeof(long));   //Creating.

        db.open("demo.dbf",30);         // Opening the database with 30 Kb buffers
        index.open("demo.ndx",80);      // Opening the index with 80 Kb buffers.

        record rec;

        strcpy(rec.name,"John Wayne");
        rec.income=7000;                // Filling the record.

        long recnr=db.append_rec(&rec);     // Insert in the database.
        index.insert(rec.name,&recnr);      // Update the index.

        db.close();                 // Close the database.
        index.close();              // Close the index.

    }


int   max(void *Key,void *Data);
int   max_dat(void *Data);
int   max_key(void *Key);
int   max(void);
                These functions return the last (highest) entry in the
                btree. The parameters 'Key' and 'Data'  have to be
                pointers to respectively a buffer for the key part and a
                buffer for the data part. These buffers will be filled with
                the appropriate data upon function return.
                The functions max_dat() and max_key() can be used
                when only one of the two items is required.
                TRUE is returned on success, FALSE otherwise.
int   min(void *Key,void *Data);
int   min_dat(void *Data);
int   min_key(void *Key);
int   min(void);
                These functions return the first (lowest) entry in the
                btree. The parameters 'Key' and 'Data' have to be
                pointers to respectively a buffer for the key part and a
                buffer for the data part. These buffers will be filled with
                the appropriate data upon function return.
                The functions min_dat() and min_key() can be used
                when just one of the two items is required.
                TRUE is returned on success, FALSE otherwise.
void  multiple_keys(int TrueOrFalse);
                With this function the use of multiple keys is controlled.
                multiple_keys(TRUE) will allow for multiple keys.
                multiple_keys(FALSE) will not allow for multiple keys.
                Important:
                This function has to be called before the define()
                function is invoked. It is not possible to alter the setting
                of the multiple key parameter later on.
                For more information about multiple keys please read
                paragraph 16.3.
int   multiple_keys_YES(void);
                Same as multiple_keys(TRUE);
int   multiple_keys_NO(void);
                Same as multiple_keys(FALSE);
int   multiple_keys(void);
                This function returns TRUE if multiple keys is set to
                'YES' and FALSE otherwise.
int   next(int n);
int   next_key(int n,void *Key);
int   next_dat(int n,void *Data);
int   next(int n,void *Key,void *Data);
                A set of functions to move the current pointer closer to
                the 'end' of the btree.
                Apart from that, they are similar to the prev() functions.
                For more information, please see the description of prev().
int     next(void);
                Same as next(1); but more efficient.
long  numkey(void);
                This function returns the number of different keys
                in the btree.
int   open(char *name, int kb_buff);
                Opens an existing btree 'name'. The parameter
                'kb_buff' indicates how many Kb of RAM are to be used
                for buffering.
                The function returns TRUE on success, FALSE
                otherwise.

// Example:

    #include "iostream.h"
    #include "csbtree.h"

    void main(void)
    {

       BTREEa index;
       if(!index.open("demo.ndx",100))
       {
        cout<<"Error"<<endl;
       }

    }

                This will open the btree 'demo.ndx' and will, at most,
                use 100 Kb of RAM for buffering.
                NOTE:   read chapter 11 about buffering before
                        using really large amounts of buffers.
void  pack(void);
                A function which optimizes disk usage. Due to many
                insertions and deletions it is possible for blocks with
                zero keys to emerge. There are no pointers to these
                blocks but they will still be part of the btree because
                they can not be removed unless they are the last block
                in the file. These blocks will be used as soon as the
                need for a new block arises.
                The pack function will remove these empty blocks and
                rearrange all the keys. This is done by writing all the
                data to a temporary file and reloading the btree.
int   prev(void);
int   prev(int n);
int   prev_key(int n,void *Key);
int   prev_dat(int n,void *Data);
int   prev(int n,void *Key,void *Data);
                A set of functions to move the current pointer closer to
                the 'beginning' of the btree.
                Parameters:
                n       The number of entries the current pointer
                        needs to be moved.
                Key     A pointer to the buffer which is going to be
                        filled with the value of the key field.
                Data    A pointer to the buffer which is going to be
                        filled with the value of the data part.
                If the current pointer is NOT moved, the key and data
                buffers will not be filled. Otherwise they will be filled
                with the values of the entry to which the current pointer
                is moved.
                The key field, the data field, or both can be obtained by
                selecting the appropriate function.
                The prev(void) function is a more efficient version of
                the prev(1) function.
                Important:
                The current pointer needs to be set first. Please read
                paragraph 16.4 about the 'current pointer' for more
                information.
                When one of the prev() functions is called while the
                current pointer is not set, the function will return 0. The
                current pointer can not be moved before the beginning
                of the btree, therefore the number of positions moved
                can differ from the number requested.
                The return value is the number of positions the current
                pointer has actually moved.
int   search(void *key,void *Data);
                Searches for 'key' and fills buffer 'Data' with the
                corresponding data part if 'key' is found.
                The function returns TRUE if found, FALSE otherwise.
int   search_gt(void *key,void *Key,void *Data);
int   search_ge(void *key,void *Key,void *Data);
int   search_lt(void *key,void *Key,void *Data);
int   search_le(void *key,void *Key,void *Data);
                The purpose of this set of functions is to search for a
                key value close to a given value. 'key' is the key value
                searched for while 'Key' is the key value actually found.
                'Data' is the data part belonging to 'Key'.
                The suffix '_xx' has the same meaning as the
                corresponding FORTRAN operators:
                gt: Greater Than        >
                ge: Greater Equal       >=
                lt: Less Than           <
                le: Less Equal          <=
                The functions return TRUE if such a key could be
                found, FALSE if not.

     Example:
     Assume the next table represents a btree.

         Entry    Key value      Data value

          1         Blue            123
          2         Green           45
          3         Red             678

       search_gt("Blue",Key,Data)
        Return value:   TRUE
        Key:            Green
        Data:           45

       search_ge("Blue",Key,Data)
        Return value:   TRUE
        Key:            Blue
        Data:           123

       search_lt("Blue",Key,Data)
        Return value:   FALSE
        Key:            undefined
        Data:           undefined

       search_ge("Orange",Key,Data)
        Return value:   TRUE
        Key:            Red
        Data:           678
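
For readers who know the standard C++ library, the four searches behave
much like lower_bound() and upper_bound() on a sorted container. The sketch
below mimics them on a std::map. It is an analogy only: the 'map_search_..'
functions are hypothetical free functions written for this illustration and
say nothing about how the btree class is implemented.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, int> Tree;   // stand-in for a btree

// Each function copies the entry found into 'Key' and 'Data' and returns
// true, or returns false and leaves 'Key' and 'Data' untouched.

bool map_search_ge(const Tree &t, const std::string &key,
                   std::string &Key, int &Data)
{
    Tree::const_iterator it = t.lower_bound(key);   // first entry >= key
    if (it == t.end()) return false;
    Key = it->first; Data = it->second;
    return true;
}

bool map_search_gt(const Tree &t, const std::string &key,
                   std::string &Key, int &Data)
{
    Tree::const_iterator it = t.upper_bound(key);   // first entry > key
    if (it == t.end()) return false;
    Key = it->first; Data = it->second;
    return true;
}

bool map_search_lt(const Tree &t, const std::string &key,
                   std::string &Key, int &Data)
{
    Tree::const_iterator it = t.lower_bound(key);   // first entry >= key
    if (it == t.begin()) return false;              // nothing smaller
    --it;                                           // last entry < key
    Key = it->first; Data = it->second;
    return true;
}

bool map_search_le(const Tree &t, const std::string &key,
                   std::string &Key, int &Data)
{
    Tree::const_iterator it = t.upper_bound(key);   // first entry > key
    if (it == t.begin()) return false;              // nothing <= key
    --it;                                           // last entry <= key
    Key = it->first; Data = it->second;
    return true;
}
```

Filled with the three entries of the table, these functions give the same
answers as listed above.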


int   search_dat_..(void *key,void *Data);
int   search_key_..(void *key,void *Key);
                The previously described functions return both the key
                value and the data value.
                In some cases this will be a waste of memory,
                therefore there are two similar sets of functions, which
                either return the found key value or the data part.
                search_dat_..() returns only the found data value.
                search_key_..() returns only the found key value.

    Example:

    With the same btree as in the previous example


       search_dat_gt("Blue",Data)
        Return value:   TRUE
        Data:           45

       search_key_ge("Blue",Key)
        Return value:   TRUE
        Key:            Blue


int   skip(int n);
int   skip(int n,void *Key,void *Data);
int   skip_key(int n,void *Key);
int   skip_dat(int n,void *Dat);
                A set of functions to move the current pointer. These
                functions are front ends for the next() and prev()
                functions. E.g. this is how the skip_key() function is
                implemented:


                int skip_key(int n,void *Key)
                    {  if(n>0) return  next_key( n,Key);
                       else    return -prev_key(-n,Key); }


                So, with these functions the argument 'n' may be
                positive or negative.
                With 'n' positive the current pointer is moved towards
                the 'end' of the btree. This is done by calling next(n).
                With 'n' negative the current pointer is moved towards
                the 'beginning' of the btree. This is done by calling
                -prev(-n).
                The return value can be either positive or negative
                depending on the value of 'n'. When 0 is returned, the
                current pointer has not been moved or was not set.
                See also the prev() functions for more information.
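
The return values can be illustrated with a small stand-alone model. The
'Cursor' structure below is hypothetical: it walks over a plain sorted
array instead of a btree, it assumes the current pointer is set, and it
simply stops at the first or last entry. It is only meant to show how the
sign and the size of the return value come about.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a current pointer over a sorted list of keys.
// 'pos' is assumed to be valid: 0 <= pos < keys->size().
struct Cursor
{
    const std::vector<std::string> *keys;
    int pos;

    // Moves at most n positions towards the end.
    // Returns the number of positions actually moved.
    int next(int n)
    {
        int last  = (int)keys->size() - 1;
        int moved = (pos + n > last) ? last - pos : n;
        pos += moved;
        return moved;
    }

    // Moves at most n positions towards the beginning.
    // Returns the number of positions actually moved.
    int prev(int n)
    {
        int moved = (n > pos) ? pos : n;
        pos -= moved;
        return moved;
    }

    // Positive n moves towards the end, negative n towards the beginning.
    int skip(int n)
    {
        if (n > 0) return  next( n);
        else       return -prev(-n);
    }
};
```

With three entries and the pointer on the first one, skip(5) returns 2 and
a following skip(-1) returns -1.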
int   tBOF(void);   Returns TRUE if 'current' is pointing to the first entry of
                    the btree, FALSE otherwise. Behaviour is undefined in
                    case 'current' is NOT set!
int   tEOF(void);   Returns TRUE if 'current' is pointing to the last entry of
                    the btree, FALSE otherwise. Behaviour is undefined in
                    case 'current' is NOT set!




                            17 CSDBGEN


This chapter describes the CSDBGEN program generator.


17.1 Introduction

So far we have discussed the TBASE class which is capable of
reading and writing records and the BTREE class which can be used
as an index.
CSDBGEN can aid in building an indexed database out of this. A
database which is capable of manipulating fields, maintaining indexes
and the like.

As input it takes a 'definition file' which describes the fields and the
indexes. From this, it produces the source for a new C++ class. Member
functions of this newly created class are used to access the fields.

CSDBGEN does not generate a user-interface.

There are several good reasons for using a program generator.
- The TBASE class can concentrate on manipulating records rather than
    fields. Because of that, it remains a universal and efficient way to
    read and write blocks of data.
- With this approach it is easier to deal with field types that are not
    supported by the C programming language, particularly dates.
- It is relatively easy for the program generator to convert to dBASE
    format because it has all the required knowledge at hand. Figuring
    out this conversion at runtime is a lot more complicated and will
    also make your executable larger because the knowledge to do the
    conversion is in the application instead of in the program generator.
- Or more in general: everything the program generator does can be left
    out of the application, making the executable smaller.
- Without a program generator, the differences between the field types
    have to be dealt with at runtime, perhaps even with every call
    accessing a field. Doing so will inevitably result in some sort of
    interpreter.


17.2 Overview

Using CSDBGEN starts off with creating a database definition file. This
file describes the layout of a record, the indexes, the name of the new
class and the name of the files.

When this file is created, CSDBGEN is called with this filename as a
parameter. In return it produces the source for a brand new C++ class.
This source is ready to compile, without the need for manual editing.

This new class has public member functions for reading a field, setting
the active index, exporting to ASCII, exporting to dBASE, reindexing,
packing and so on.
These member functions are very easy to use because they take very
few parameters, and often none. This is possible because a lot of the
information which is specific to the database is already in the source of
the functions.

The generated class controls one database and all the indexes that
come with it. If you need more than one database you have to repeat this
procedure for the other databases as well.

An elaborate example is at the end of this chapter.


17.3 Features

- Indexes with more than one reference to a record!
    This is an innovation! It creates the possibility to locate a record by
    searching for a substring of the field, rather than the entire field. This
    topic is discussed in more detail in paragraph 17.6 called
    'tokenizing'.
- Conversion to-and-from ASCII.
    This is convenient for backups, conversions to other systems and
    also for making changes in the record layout.
- Export to dBASE compatible file format.
    CSDBGEN generates a function (member of the new class) capable
    of writing the contents of the database to a file in the dBASE format.
- Import from dBASE.
    There is no direct way to import a dBASE file, but the command-line
    utility CS4DBASE can be used to convert a dBASE file into an ASCII
    file which can be read by the import() function.
- Can manage very large databases.
    The libraries were designed to avoid built-in limitations. As
    a result, databases of up to 2 billion records are theoretically feasible.
- No overhead.
    Due to the use of a program generator, the overhead involved in
    accessing fields is next to none.
- Large buffers.
    This system is capable of effectively using large amounts of RAM for
    buffering.
- Fast.
    The previous two points, together with the efficient BTREEs
    guarantee a very fast database.


17.4 Limitations

- No record locking.
- No multi-user support.
    This system was designed to be used in single user applications.
    For the time being, there is no support for network/shared databases.
    Perhaps there will be in the future but if so, it will take the form of a
    new series of classes.


17.5 Definition file

The information needed to generate the new class is obtained from a
'definition file'. To get started, CSDBGEN is capable of generating an
example.


Example:

    c:\borlandc>csdbgen /example>example.def

A command like this will generate an example definition file 'example.def'.




To get acquainted, let's look at what's in it.

Example database definition file:

    class:  NAM
    record: NAM_record
    file:   demo
    field:  name        s   30  Y
    field:  city        s   20
    field:  birthday    d       Y
    field:  salary      f



Explanation:

    line 1: class:  NAM
        The program generator generates a class, which of course has
        to have a name. In this example it is 'NAM'.

    line 2: record: NAM_record
        As explained before, we use a C structure as a record. The
        name of this structure is defined in the second line.
        In the generated header file the following (among others) will
        appear:


         #define NAME_LENGTH      30
         #define CITY_LENGTH      20

         typedef struct
         {
            char   _name[NAME_LENGTH+1];
            char   _city[CITY_LENGTH+1];
            long   __birthday;
            float  _salary;
         } NAM_record;


    line 3: file:   demo
        This line indicates the name of the files the database system will
        use. In this example three files will be used:
            - demo.dbf, the TBASE database file.
            - demo01.idx, the BTREE index file on field 'name'.
            - demo02.idx, the BTREE index file on field 'birthday'.


    lines 4-7:
        Field definitions.
        The syntax for a field definition is:
        field: <field_name> <field_type> [length] [format] [index]
        With:
            field_name, the name of the field.
            field_type, the type of the field which can be:
                i:  integer
                l:  long
                f:  float
                F:  double
                c:  character
                s:  string
                d:  date
            length, only for strings.
                Indicates the number of characters the field needs to
                be able to store. One additional byte is reserved to
                store the null terminator.
            format, only for date fields.
                Please, see the documentation on DATE fields.
                To give a quick example: MDY4 means, Month, Day
                and 4 positions for the Year.
            index,
                'Y' means a normal index.
                'T' means a 'tokenized' index.
                 Nothing means no index at all.
                See also the documentation on 'tokenizing' further on.
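
To make the syntax concrete, here is a rough sketch of how one 'field:'
line could be taken apart. It is not the parser used by CSDBGEN. The
structure 'FieldDef', the function 'parse_field' and the rule used to tell
the optional items apart (a number is the length, a single 'Y' or 'T' is
the index, anything else is the format) are assumptions made for this
illustration.

```cpp
#include <cctype>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical result of parsing one 'field:' line.
struct FieldDef
{
    std::string name;
    char        type;     // i, l, f, F, c, s or d
    int         length;   // 0 if not given
    std::string format;   // empty if not given
    char        index;    // 'Y', 'T' or 0 for no index
};

// Parses one line of the definition file.
// Returns false if the line is not a field definition.
bool parse_field(const std::string &line, FieldDef &def)
{
    std::istringstream in(line);
    std::string keyword, type, word;

    if (!(in >> keyword >> def.name >> type)) return false;
    if (keyword != "field:" || type.length() != 1) return false;

    def.type   = type[0];
    def.length = 0;
    def.format.clear();
    def.index  = 0;

    while (in >> word)                       // the optional items
    {
        if (word == "Y" || word == "T")
            def.index = word[0];
        else if (std::isdigit((unsigned char)word[0]))
            def.length = std::atoi(word.c_str());
        else
            def.format = word;
    }
    return true;
}
```

For the line 'field: name s 30 Y' this yields the name 'name', type 's',
length 30 and index 'Y'.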

In the example:

    field:  name    s   30  Y
        A field 'name' of type string. 31 Bytes are reserved, 30 for the
        characters and 1 for the null terminator. An index is maintained
        for this field because of the additional 'Y'.

    field:  city    s   20
        A field 'city' of type string with a length of 20 characters. There is
        NO index on this field.

    field:  birthday    d   Y
        A field 'birthday' of type DATE. An index is placed on this field.

    field:  salary  f
        A field 'salary' of type float, without an index.


17.6 Tokenizing

This is a new concept!

Let's explain things with an example.

Say you have a database with a record for the 'World Health
Organization'. Normally, the only way to locate this record would be by
entering a search string starting with 'World ....'. But now, with
tokenizing, this record can also be located by searching for 'Healt..' or
'Organization..'.
It's important to know that this is not implemented by traversing the entire
database from the first record to the last, as is done by some
toy-applications, but through maintaining an extensive index.


17.6.1 How does it work?

Traditionally, the index stores the entire field. In this example that would
mean 'World Health Organization' is stored in the index together with a
reference to the record number in the main database.
With tokenizing, the index will store 3 entries, namely 'World', 'Health'
and 'Organization'. This approach means that there will be a lot more
entries in the btree than there are records in TBASE.

In other words, an index is maintained on every suitable substring, rather
than the entire field.

To save disk space, the length of the key field in the BTREE is
only half of the field length in the main database.


17.7 When is a substring indexed?

First of all: tokenizing only applies to string fields. E.g. 'tokenizing' a
float field seems pointless.

Whether or not a substring is placed in the index is controlled by two
things:
- the way it is separated from the rest of the field.
- the length of the substring.

For every 'tokenized' field, CSDBGEN generates two symbolic constants:
one for the token delimiters and a second for the minimal token length.


// Example:
// Suppose you use tokenizing on a field 'NAME'.
// In the generated .cpp file two #defines will appear:


#define TD_NAME "\t(),- "  // Token delimiters for field 'name'.
#define TL_NAME 4          // Minimal Token length for field 'name'.



The tokenize function separates the field into substrings according to the
characters in the TD_????? string. Notice that the '.' is not a delimiter.
This is to prevent abbreviations from being split up.
A substring has to be at least TL_????? bytes long to appear in an index.
By default this is four bytes. That is not too long for most cases, but it
means that three-letter abbreviations like 'IRS' are not indexed.

Of course you can alter these two defines when needed.
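
The splitting rule can be sketched in a few lines. The function 'tokenize'
below is a hypothetical helper written for this illustration, not the
function generated by CSDBGEN. It only shows the effect of the delimiter
string and the minimal length, and it does not shorten the tokens.

```cpp
#include <string>
#include <vector>

#define TD_NAME "\t(),- "  // Token delimiters, as in the example above.
#define TL_NAME 4          // Minimal token length, as in the example above.

// Splits 'field' on the delimiter characters and keeps only the
// substrings which are at least 'min_length' bytes long.
std::vector<std::string> tokenize(const std::string &field,
                                  const char *delimiters,
                                  std::size_t min_length)
{
    std::vector<std::string> tokens;
    std::size_t start = field.find_first_not_of(delimiters);
    while (start != std::string::npos)
    {
        std::size_t end = field.find_first_of(delimiters, start);
        std::string token = field.substr(start, end - start);
        if (token.length() >= min_length)
            tokens.push_back(token);
        start = field.find_first_not_of(delimiters, end);
    }
    return tokens;
}
```

Called as tokenize("World Health Organization", TD_NAME, TL_NAME) it
returns the three tokens 'World', 'Health' and 'Organization'. A field like
'IRS (U.S. Treasury)' gives 'U.S.' and 'Treasury': 'IRS' is too short, and
'U.S.' stays in one piece because the '.' is not a delimiter.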


Example definition file:

  class:    NAM
  record:   NAM_record
  file:     demo
  field:    name        s   30  T
  field:    city        s   20
  field:    birthday    d
  field:    salary      f


Notice the 'T' behind the 'name' field. This is short for 'tokenize'. The
generated class will apply tokenizing on the 'name' field.


17.8 Compound indexes

On several occasions there is a need for an index on a concatenation of
fields, rather than on one single field. E.g. the combination of a string
field and a date field is very popular.

Example:
    Suppose you are building a database for sport events. Some events
    like 'Heavy Weight Champion Match' may appear more than once.
    Therefore, you want the events listed first by alphabet, and second
    by date. In this way all the heavy weight champion matches appear
    together and in order of their scheduled date.

CSDBGEN can generate indexes on a concatenation of fields. For that, it
requires an 'index' line in the database definition file.

17.8.1 A simple example


    class:  NAM
    record: NAM_record
    file:   demo
    field:  name            s   30  Y
    field:  city            s   20
    field:  birthday        d       Y
    index:  nabi a:name+a:birthday


Note the last line. This will generate the index nabi, which is an index on
the concatenation of the name field and the birthday. The a: indicates
'ascending'. In case you need 'descending', use d:.

There is no need to limit the indexes to a concatenation of just two fields.
Many more fields can be used, and the same field may even appear
more than once. Also, there is no reason why a field which already has a
'normal' index shouldn't be used in a compound index.
E.g. in the example, the name field is already indexed because of the 'Y'
behind its definition, and it is also used in the nabi index.


17.8.2 A more complex example


    class:  NAM
    record: NAM_record
    file:   demo
    field:  name            s   30  T
    field:  city            s   20
    field:  birthday        d   MDY2
    index:  nabi    d:name+a:birthday
    index:  binaci  a:birthday+d:name+a:city


This will generate a NAM class with a tokenized index on the name field
and two compound indexes, nabi and binaci.
The nabi index sorts the name descending and the birthday ascending,
binaci sorts first ascending on the birthday, secondly descending on the
name and last ascending on city.
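
The mixed sort order of nabi can be expressed as a comparison function.
The sketch below is only a model of the ordering. The 'Entry' structure,
the function 'nabi_before' and the storage of the date as a number like
19700407 are assumptions made for this illustration; they do not describe
the generated code.

```cpp
#include <string>

// Hypothetical in-memory view of the two indexed fields.
struct Entry
{
    std::string name;
    long        birthday;   // assumed to increase with the date,
                            // e.g. 19700407 for April 7th, 1970
};

// Ordering of the 'nabi' index: d:name+a:birthday.
// Returns true if 'a' has to come before 'b'.
bool nabi_before(const Entry &a, const Entry &b)
{
    if (a.name != b.name)
        return a.name > b.name;          // d: descending on name
    return a.birthday < b.birthday;      // a: ascending on birthday
}
```

The first field decides the order; the second field is only consulted for
entries with equal names.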

Example:


#include "iostream.h"
#include "nam.h"

void main(void)
{
    NAM nam;

    nam.open();             // Opening
    nam.order(NAME_INDEX);  // Use the name index.
        ....                // Do something
    nam.order(NABI_INDEX);      // Use the nabi index.
        ....                    // Do something
    nam.order(BINACI_INDEX);    // Use the binaci index.

    nam.top();              // Display the records in order
    do                      // of the binaci index.
    {
        cout<<nam.birthday()<<" ";
        cout<<nam.name()<<" ";
        cout<<nam.city()<<endl;
    } while(nam.skip(1));

    nam.close();
}



17.8.3 Compound & Tokenizing Indexes

It's also possible to use a compound index while a tokenizing technique is
applied on one or more of its fields. The syntax to generate such an index
can be seen in the next example.
Example database definition file:


    class:  NAM
    record: NAM_record
    file:   demo
    field:  name        s   30
    field:  birthday    d   MDY2
    index:  nabi    aT:name+a:birthday


Notice the 'T' in the last line.
The nabi index will now 'tokenize' the name field, concatenate the
birthday and use this as a reference for the record.

Example:

Record 1:
    Bjorn Esverald
    04/07/70

Record 2:
    Bjorn Gensjeng
    03/23/55

Record 3:
    Gata Esverald
    05/11/93


Assume the above three records make up the database.
Which references will the nabi index contain?



    Entry nr:           Key:            Record pointed to:
     

    1           Bjorn       03/23/55        2
    2           Bjorn       04/07/70        1
    3           Esverald    04/07/70        1
    4           Esverald    05/11/93        3
    5           Gata        05/11/93        3
    6           Gensjeng    03/23/55        2




Just as with the ordinary tokenized fields, the length of the
token placed in the index is at most half of that of the original
field.
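
The way the six references come about can be imitated in a few lines of
standard C++. This is a sketch under assumptions, not the library's code:
Entry, entry_less() and add_record() are made-up names, tokens are assumed
to be separated by white space, and the birthday is held as a sortable
number.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One reference in the sketch index: (token, birthday) -> record number.
struct Entry {
    std::string token;
    long        birthday;   // ASSUMPTION: sortable number, e.g. 19550323
    long        record;
};

// aT:name + a:birthday
bool entry_less(const Entry &x, const Entry &y)
{
    if (x.token != y.token) return x.token < y.token;
    return x.birthday < y.birthday;
}

// Adds one entry for every token in 'name' and keeps the index sorted.
// Tokens are cut to half the field length, as described in the text.
void add_record(std::vector<Entry> &index, const std::string &name,
                long birthday, long record, std::size_t field_length)
{
    assert(field_length >= 2);
    std::istringstream words(name);   // ASSUMPTION: white space separates tokens
    std::string token;
    while (words >> token) {
        Entry e;
        e.token    = token.substr(0, field_length / 2);
        e.birthday = birthday;
        e.record   = record;
        index.push_back(e);
    }
    std::sort(index.begin(), index.end(), entry_less);
}
```

Feeding the three records of the example into add_record(), with a field
length of 30, gives the same six entries in the same order as the table.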

Although it's not easy to come up with an application in which it is useful,
it's possible to have a compound index on the concatenation of more
than one tokenized field:

Example:

    class:  NAM
    record: NAM_record
    file:   demo
    field:  name            s   60
    field:  interests       s   50
    index:  nabi    aT:name+dT:interests


This creates a vast index because a reference is generated for
every possible combination of a token in the name field and a
token in the interests field. Let's say the name field contains three
tokens and the interests field four. Then there will be 3*4=12 references in
the index, just for this single record!

Example:


Record 1:
    Roger Sander Barkakati
    Bonsai Trees



Suppose this is the only record in the database. Then the nabi index will
contain the following 6 references:





    Entry nr:           Key:            Record pointed to:
    

    1           Barkakati Trees         1
    2           Barkakati Bonsai        1
    3           Roger     Trees         1
    4           Roger     Bonsai        1
    5           Sander    Trees         1
    6           Sander    Bonsai        1



Note:   'Trees' appears before 'Bonsai' because the interests field is
        indexed in descending order.

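
The 'every combination' rule is nothing more than two nested loops. The
next sketch builds the keys for one record; Key, tokens_of(), key_less()
and compound_keys() are made-up names, and splitting on white space is an
assumption about the tokenizer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> Key;  // (name token, interests token)

// ASSUMPTION: tokens are separated by white space.
std::vector<std::string> tokens_of(const std::string &text)
{
    std::istringstream words(text);
    std::vector<std::string> out;
    std::string t;
    while (words >> t) out.push_back(t);
    return out;
}

// aT:name + dT:interests
bool key_less(const Key &x, const Key &y)
{
    if (x.first != y.first) return x.first < y.first;   // ascending
    return x.second > y.second;                         // descending
}

// One key for every (name token, interests token) combination.
std::vector<Key> compound_keys(const std::string &name,
                               const std::string &interests)
{
    const std::vector<std::string> n = tokens_of(name);
    const std::vector<std::string> i = tokens_of(interests);
    std::vector<Key> keys;
    for (std::size_t a = 0; a < n.size(); ++a)
        for (std::size_t b = 0; b < i.size(); ++b)
            keys.push_back(Key(n[a], i[b]));
    assert(keys.size() == n.size() * i.size());
    std::sort(keys.begin(), keys.end(), key_less);
    return keys;
}
```

For the record above, compound_keys("Roger Sander Barkakati",
"Bonsai Trees") yields 3*2=6 keys, starting with Barkakati/Trees.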

17.8.4 Locating an Entry

With a compound index, locating a specific entry is not all that easy.
Normally, supplying a pointer to the search argument is sufficient, but this
time the pointer needs to point to the concatenation of two or more
values which don't even have to be of the same type.

Therefore, CSDBGEN typedefs a record structure for every compound
index.
Example:

    class:  NAM
    record: NAM_record
    file:   demo
    field:  name        s   60
    field:  interests   s   50
    index:  nabi aT:name+dT:interests

// This definition file will generate a header file
// which, among others, contains the following lines:

    #define NAME_LENGTH     60
    #define INTERESTS_LENGTH    50

    #define UNSORTED        0
    #define NABI_INDEX  1

    typedef struct
    {
       char  name[NAME_LENGTH/2];
       char  interests[INTERESTS_LENGTH/2];
    }NAM_rec_nabi;



Note that the field lengths in the NAM_rec_nabi structure are only half of
the database field lengths.

NAM_rec_nabi is the structure which can be used to search for an entry
in the compound index nabi.

Example:

#include ....

void main(void)
{
    NAM_rec_nabi nrn;
    NAM nam;
    nam.open();

    nam.order(NABI_INDEX);      // Make the NABI index active.

    strcpy(nrn.name,"Gandalf");
    strcpy(nrn.interests,"Witchcraft");

    // Locate. Case INsensitive.

    if(nam.find(&nrn)) cout<<" Found ";
    else               cout<<" Not Found ";

    nam.close();

}
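
Because the members of the search structure are only half as long as the
database fields, a plain strcpy() can run past the end of a member when the
search text is long. A small helper avoids that. The sketch below repeats
the generated definitions so that it stands on its own; fill_key() is a
made-up name, and whether the library wants the key terminated with a zero
byte is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-ins for the generated definitions shown above.
#define NAME_LENGTH       60
#define INTERESTS_LENGTH  50

typedef struct
{
    char name[NAME_LENGTH / 2];
    char interests[INTERESTS_LENGTH / 2];
} NAM_rec_nabi;

// Hypothetical helper: copies 'src' into a key member of 'size' bytes.
// Text that does not fit is cut off, the remainder is filled with zeros.
void fill_key(char *dst, std::size_t size, const char *src)
{
    assert(dst != 0 && src != 0 && size > 0);
    std::strncpy(dst, src, size - 1);   // zero-fills when 'src' is shorter
    dst[size - 1] = '\0';               // ASSUMPTION: key is zero terminated
}
```

It would be called as fill_key(nrn.name, sizeof nrn.name, "Gandalf");
in place of the strcpy() in the example.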



17.9 Export to dBASE

The program generator also produces a member function:

    int to_DBASE(char *filename);

When called, this function produces a file 'filename' which can be read by
dBASE. Index files are NOT written.


17.10 Importing from dBASE

CSDBGEN does not generate a function to directly import a dBASE file.
However, the CS4DBASE utility, discussed in chapter 27 of this
documentation, can read a dBASE file. It produces an ASCII file which is
formatted to be read by the import() function.
The next paragraph explains how to read ASCII files using the import()
function.


17.11 Exporting/Importing to/from ASCII

    int export(char *filename);

This writes out the contents of the database to an ASCII file 'filename'.
That file will also contain information about the fields. In this way the
import() function knows how to process this data, even after changes in
the record layout.

    int import(char *filename);

This member function reads the ASCII file 'filename' and appends the
data to the current database. It is meant to be used in conjunction with
the export() function. The export function starts off by writing the entire
definition file. The import function uses this information to skip fields
which are not in the database and to read fields in the right order. (Only
'familiar' fields are read, the others are ignored.) This mechanism can be
used to make changes in the record layout.
Note: Because it's an ASCII file, the names of the fields can be changed
with a normal editor.
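
The matching of fields by name can be pictured as follows. This is only a
sketch of the idea: Row and remap() are made-up names, the real layout of
the ASCII file is not shown here, and leaving a field empty when it is
missing from the file is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

typedef std::vector<std::string> Row;

// file_fields  : field names found in the header of the exported file.
// class_fields : field names known to the importing class.
// in           : one exported record, values in file order.
// Returns the values in class order. Fields unknown to the class are
// skipped; class fields missing from the file stay empty (ASSUMPTION).
Row remap(const Row &file_fields, const Row &class_fields, const Row &in)
{
    assert(in.size() == file_fields.size());

    std::map<std::string, std::string> by_name;
    for (std::size_t i = 0; i < file_fields.size(); ++i)
        by_name[file_fields[i]] = in[i];

    Row out(class_fields.size());
    for (std::size_t i = 0; i < class_fields.size(); ++i) {
        std::map<std::string, std::string>::const_iterator it =
            by_name.find(class_fields[i]);
        if (it != by_name.end()) out[i] = it->second;
    }
    return out;
}
```

Renaming a field in the header of the ASCII file changes the name under
which its values are looked up, which is why editing the names works.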


17.12 Starting a new database

A member function

    int define(void);

is available to create a new database. If the database already exists, it is
overwritten!


17.13 Opening a database

The member function

    int open(void);

opens an existing database.
Index files are automatically rebuilt if they don't exist.


17.14 Current Record

At any moment there is always one record which is the 'current record'. The
functions to read and write fields all work with this current record.

- After opening, the first record becomes the current record.
- The go_to(), skip(), top(), bottom() and search() functions can be
    used to make another record 'current'.


17.15 Accessing fields

CSDBGEN generates two member functions for each field. One to read
the field, and a second to write the field. The names are the same but the
arguments differ.



Example:

// A part of the definition file:
      class:  NAM
      record: nam_record
      file:   dbtest
      field:  name s 40
      field:  number i

// We have a 'name' field consisting of a string with 40 characters
// and a 'number' field which is an integer.
// Among others, the next member functions are generated by CSDBGEN:

class NAM
{
public:
// For reading
      char  *name(void);
      int   number(void);
// For writing
      void   name(char *s);
      void   number(int i);
};


The next example gives an impression of how the generated class could
be used.


Example:

   void main(void)
   {
      NAM db;               // We now have a class 'NAM'.

      if(!db.open())        // Opens database, assuming it already exists.
      {                     // No file names have to be entered.
        cout<<"error";      // All the indexes are opened automatically.
      }
      puts(db.name());      // Displays the name field of the first record.
                            // After 'open' the first record is 'current'.

      db.name("Pjotr Idaho");   // Changes the contents of the 'name' field to
                        // 'Pjotr Idaho'. Indexes are updated automatically.

      if(!db.close())       // Close database.
      { cout<<"error"; }
   }




17.16 DATE fields

Standard C doesn't support date variables. Therefore, this library has its
own DATE class.

The functions to read and write date-fields are using a string
representation of a date. These strings can represent a date in several
formats. CSDBGEN uses a default of DMY4. This means 2 positions for
the Day, 2 positions for the Month and 4 positions for the Year.


Example:

   "02/04/1994"    // By default interpreted as:  April the 2nd 1994.


When a two position representation of the year is wanted, use Y2 instead
of Y4.
Roughly speaking: Every sensible order of M,D,Y2 or Y4 is accepted.


Example:
    If you want "02/04/94"  to be interpreted as February the 4th 1994, use the format
    MDY2. The line in the database definition file has to be:

    field: birthday d MDY2

    If you want the field to be indexed, add an additional 'Y':

    field: birthday d MDY2 Y


On disk, dates are stored as longs. The sem_jul() function of the DATE
class is used to convert a date to a long.
For more information about the date formats and the DATE class, please
see chapter 30.
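
How a format such as DMY4 or MDY2 steers the interpretation of a date
string can be shown with a small stand-alone parser. It is a sketch, not
the DATE class: Ymd and parse_date() are made-up names, the parts are
assumed to be separated by a non-digit character, and two-digit years are
assumed to mean 19xx.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

struct Ymd { int year; int month; int day; };

// Reads the parts of 'text' in the order given by 'format',
// e.g. "DMY4", "MDY2" or "Y4MD".
Ymd parse_date(const std::string &text, const std::string &format)
{
    Ymd out = {0, 0, 0};
    std::size_t pos = 0;
    for (std::size_t f = 0; f < format.size(); ++f) {
        const char part = format[f];
        if (part != 'D' && part != 'M' && part != 'Y') continue; // the 2 or 4

        // Skip the separator, then read one run of digits.
        while (pos < text.size() &&
               !std::isdigit(static_cast<unsigned char>(text[pos]))) ++pos;
        assert(pos < text.size());   // text has fewer parts than the format
        int value = 0;
        while (pos < text.size() &&
               std::isdigit(static_cast<unsigned char>(text[pos])))
            value = value * 10 + (text[pos++] - '0');

        if (part == 'D')      out.day = value;
        else if (part == 'M') out.month = value;
        else {
            const bool two_digits =
                f + 1 < format.size() && format[f + 1] == '2';
            out.year = two_digits ? 1900 + value : value; // ASSUMPTION: 19xx
        }
    }
    return out;
}
```

With this sketch, parse_date("02/04/1994", "DMY4") gives day 2 and month 4,
while parse_date("02/04/94", "MDY2") gives month 2 and day 4.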


17.17 Changing the record layout.

Even when the database is already in use, the need to make changes in
the record layout may occur. With the next procedure this can be
accomplished quite easily, without the need to re-enter any data
manually.

In a nutshell: save your data to an ASCII file with the old
export() function and reload it with the new import() function.

Or in more detail:

    a) Export with the 'old' export() function.
        This will produce an ASCII file which fully resembles the
        database.
    b) Make a new definition file or alter the old one.
    c) Generate a new Class with CSDBGEN.
    d) Compile & link.
        The last three steps are simply the procedure for creating a
        database using CSDBGEN.
    e) Use the new import() function to reload the data.
        Import the ASCII file created with step 'a'. The import() function
        is doing the actual conversion. It can do this because it has
        knowledge of both the old and the new definition file. The old
        one is on top of the ASCII file and knowledge about the new one
        is hard coded in the import() function by CSDBGEN.


17.18 Member functions in alphabetical order

Next is a list of the public member functions as they appear in the
generated class.
With the sole exception of open() and define(), the database needs to be
open for these functions to work properly.

void append_blank(void);
                Appends an additional record to the database. The
                record is filled with binary zeros and becomes the
                current record.
int bottom(void);
                The current record is set to the last record according to
                the active index. The function returns FALSE if the
                database is empty, otherwise TRUE is returned.
int close(void);
                Closes the open files. All buffers are flushed and all
                allocated memory is released. This function is called
                automatically by the class destructor if needed. The
                function always returns TRUE.
long curr_rec(void);
                Returns the number of the current record. The first
                record is number 1.
int define(void);
                Creates a new database. Files are generated for the
                database and all the indexes.
                If a file already exists, it's overwritten.
                TRUE is returned on success. Otherwise FALSE is
                returned and the error_nr() function will return the error
                generated.
void delet(void);
                Marks the current record for deletion. When the pack()
                function is called all the marked records are removed
                from the database.
int export(char *filename);
                Writes the contents of the database to an ASCII file
                'filename'. This file is meant to be read back by the
                import() function. The exported file contains a header
                which resembles the 'database definition file'. The
                function returns TRUE on success, FALSE otherwise.
int go_to(long rec_nr);
                The record 'rec_nr' becomes the current record.
                Whether or not the record is marked for deletion
                makes no difference. The function returns TRUE on
                success, FALSE otherwise.
int import(char *filename);
                Reads records from an ASCII file 'filename' generated
                by the export() function and appends these records to
                the database. TRUE is returned on success, FALSE
                otherwise.
int is_delet(void);
                This function returns TRUE if the current record is
                marked for deletion, FALSE otherwise.
long numrec(void);
                Returns the number of records currently in the
                database. The records marked for deletion are also
                counted.
int open(void);
                Opens the database for use. The define() function has
                to be called, that is, the database file needs to exist.
                Index files are automatically generated if they are
                missing. TRUE is returned on success. Otherwise
                FALSE is returned and the error_nr() function will
                return the error generated.
int order(void);    Returns the number of the current active index.
int order(int index_number);
                This function controls the use of indexes. The variable
                'index_number' indicates which index has to become
                the active index. All the indexes however, are updated
                when a record is altered. In the header file a
                preprocessor constant is defined for each index. The
                name of this constant is generated by converting the
                field name to upper case and adding _INDEX.

    Example:
       An index on field:       Street
       Preprocessor constant:   STREET_INDEX
       <Class>.order(STREET_INDEX);
       will make the index on the street field the active index.
       <Class>.order(UNSORTED);
       makes all the indexes inactive.

                Changing the active index does not alter the current
                record. The preprocessor constant UNSORTED can
                be used to render all the indexes inactive. The
                database will be browsed in its 'natural' order.
                The function returns TRUE on success, FALSE
                otherwise.
int pack(void);
                Removes all the records marked for deletion. No
                temporary files are used!
                The function returns TRUE on success, FALSE
                otherwise.
int reindex(void);
                Rebuilds all the indexes of the database. The function
                returns TRUE on success, FALSE otherwise.
int search(void *key);
                The active index is searched for value 'key'. The
                current record becomes the first record which matches
                the search value. The function accepts a pointer to
                the search argument. When the search argument is
                not exactly matched, the current record becomes the
                record with the next 'higher' value. In this case the
                function will return TRUE.
                If no 'higher' value is available, the last record
                becomes the current record and the function returns
                FALSE. This strategy proves to work fine when
                searching for names etc.
int skip(int delta=1);
                Moves the current record pointer delta positions.

    Examples:
       skip(1); // The next record becomes the current record.
       skip(-1);// The previous record becomes the current record.
       skip(0); // Nothing happens.
       skip(10);// The record 10 positions towards the end becomes
                // the current record.
       skip();  // Same as skip(1);

                If an attempt is made to go 'before' the first record,
                record number 1 becomes the current record. Similarly,
                the last record becomes the current record if an
                attempt is made to pass beyond the last record. The
                order in which the records are traversed is controlled
                by the current active index. The function returns the
                number of positions actually moved.
int tBOF(void); Test for Beginning Of File.
int tEOF(void); Test for End Of File.
                The functions return TRUE if the beginning (tBOF) or
                the end (tEOF) is reached according to the active
                index. FALSE is returned otherwise.
int to_DBASE(char *filename);
                Exports the database to a file 'filename', which can be
                read by dBASE. Index files (for dBASE) are NOT
                generated. The function returns TRUE on success,
                FALSE otherwise.
int top(void);
                The current record is set to the first record according to
                the active index. The function returns TRUE on
                success, FALSE otherwise.
void undelet(void);
                If the current record is marked for deletion, this
                function removes the marker.
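
The behaviour of skip() and search() described in this list can be imitated
with a small cursor over a sorted array. The sketch below is not the
generated class: Cursor is a made-up name, 'current' is a position in index
order and not a record number, and returning a negative count when moving
backwards is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy cursor over keys which are already in index order.
struct Cursor {
    std::vector<std::string> keys;  // sorted ascending
    long current;                   // 1-based position, 1 .. keys.size()

    explicit Cursor(const std::vector<std::string> &sorted_keys)
        : keys(sorted_keys), current(1)
    {
        assert(!keys.empty());
        assert(std::is_sorted(keys.begin(), keys.end()));
    }

    // Moves 'delta' positions, stopping at the first or last entry.
    // Returns the positions actually moved (ASSUMPTION: signed).
    int skip(int delta = 1)
    {
        const long last = static_cast<long>(keys.size());
        long target = current + delta;
        if (target < 1)    target = 1;
        if (target > last) target = last;
        const int moved = static_cast<int>(target - current);
        current = target;
        return moved;
    }

    // Goes to the first key which is equal to or higher than 'key' and
    // returns true. If there is none, goes to the last entry and
    // returns false.
    bool search(const std::string &key)
    {
        std::vector<std::string>::const_iterator it =
            std::lower_bound(keys.begin(), keys.end(), key);
        if (it == keys.end()) {
            current = static_cast<long>(keys.size());
            return false;
        }
        current = static_cast<long>(it - keys.begin()) + 1;
        return true;
    }
};
```

Because skip() returns 0 once the last entry is reached, a loop of the form
do { ... } while(c.skip(1)); visits every entry exactly once.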


17.19 Warning

The program generator is not 'fool proof'. This means that you
should avoid using names which are reserved C++ keywords.
E.g. if you try to define a field with the name 'delete', the resulting
source will not compile.


17.20 A Large Example

Let's say we want to build a database which stores a person's name and
his/her birthday.

Step 1

First we need to construct a definition file. Next is a working example.


class:  BIRTH
record: BRecord
file:   bdays
field:  name    s 30 T
field:  birthday    d Y4MD Y


Assume the name of this definition file is 'birth.def'.

Step 2

From the definition file we have to generate the source for the database.
We do that by calling CSDBGEN.


c:\borlandc\csutil\test> csdbgen birth.def



This produces two output files: 'birth.cpp' and 'birth.h'.
These names are derived from the name of the definition file, not from
the class name as one might expect from this example.


Step 3

We are now ready to start compiling. Normally, creating the database will
be an option in the main menu of the application, but because this is a
demonstration we do things differently.


#include "iostream.h"
#include "birth.h"

void main(void)
{

   BIRTH db;            // Declare an instance of the new BIRTH class.

   if(!db.define()) // Create the database and its indexes.
   { cout<<"Error "<<endl; }

}


Compile this together with the 'birth.cpp' file and link it.

When run, it should create three files:
- 'bdays.dbf'   The TBASE main database file.
- 'bdays01.idx' The BTREEa index on the field name.
- 'bdays02.idx' The BTREEl index on the field birthday. Remember,
                dates are stored as longs.

If you run CSDIR in the same directory it will show something like this:



Directory C:\BORLANDC\CSUTIL\TEST\

Name                Size      Type      Entries     Created      Updated
--------------------------------------------------------------------------
BDAYS.DBF            174      TBASE           0   Nov 01 1994  Nov 01 1994
BDAYS01.IDX          174      BTREEa          0   Nov 01 1994  Nov 01 1994
BDAYS02.IDX          174      BTREEl          0   Nov 01 1994  Nov 01 1994
--------------------------------------------------------------------------
Total:               522 bytes in   3 files.




Step 4

By now, we have created the database files and we have the class to
work with it. In other words, we are ready to write an 'application'.


// Some error checking omitted for conciseness.


#include "iostream.h"
#include <stdlib.h>     // For exit().
#include "birth.h"

void main(void)
{

   BIRTH db;            // Declare an instance of the new BIRTH class.

   if(!db.open())       // Open it.
   { cout<<" Error "<<endl; exit(1); }

   db.append_blank();   // Because it's empty, add a blank record

   db.name("Luke Skywalker");   // Modify the name.
   db.birthday("2015/07/03");   // Modify the birthday.

   db.append_blank();           // Add a new record. Becomes the current.

   db.name("Al Bundy");         // Modify the name.
   db.birthday("1945/11/30");   // Modify the birthday.

   db.reindex();                // Reindexing. For demonstration purposes.
                                // Shouldn't be necessary.

   db.order(BIRTHDAY_INDEX);    // Make BIRTHDAY the active index.
   db.top();                    // Go to the oldest person.
   do
   { cout<<db.name()<<endl; }   // Display his name.
   while(db.skip());            // Skip to the next.


   db.go_to(1);                 // Make record 1 the current record.
                                // Index INdependent!
   db.delet();                  // Mark it for deletion.
   db.pack();                   // Remove it from the database.

   db.close();                  // Close database and indexes.
}



If you run CSDIR again afterwards, you will see something like this:



Directory C:\BORLANDC\CSUTIL\TEST\

Name                Size      Type      Entries     Created      Updated
--------------------------------------------------------------------------
BDAYS.DBF           4096      TBASE           1   Nov 01 1994  Nov 01 1994
BDAYS01.IDX         3072      BTREEa          1   Nov 01 1994  Nov 01 1994
BDAYS02.IDX         3072      BTREEl          1   Nov 01 1994  Nov 01 1994
--------------------------------------------------------------------------
Total:             10240 bytes in   3 files.
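
Only one entry is left because the program marked record 1 for deletion
and then called pack(). The two-step nature of deleting can be sketched
with an ordinary vector. Rec, mark() and pack() below are made-up
stand-ins, not the generated member functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Rec {
    std::string name;
    bool        deleted;    // the deletion marker
};

bool is_marked(const Rec &r) { return r.deleted; }

// Marks record 'rec_nr' (1-based) for deletion. The record stays in
// place and is still counted.
void mark(std::vector<Rec> &db, long rec_nr)
{
    assert(rec_nr >= 1 && rec_nr <= static_cast<long>(db.size()));
    db[static_cast<std::size_t>(rec_nr - 1)].deleted = true;
}

// Removes all marked records in place; the others keep their order.
void pack(std::vector<Rec> &db)
{
    db.erase(std::remove_if(db.begin(), db.end(), is_marked), db.end());
}
```

Between mark() and pack() the vector still holds both records, just as
numrec() still counts the records marked for deletion.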
























                               Part


                               Three






  Next are some classes which can be used where the traditional
                      database will not do.
  A VRAM class is discussed which makes it possible to maintain
                   pointer structures on disk.
  Two other classes, VBASE and VBAXE, are presented which deal
                  with variable length records.






                             18 VRAM

18.1 Introduction

VRAM is without any doubt the most flexible and versatile class in this
library. Contrary to the traditional database, this one doesn't suffer
from fixed record sizes and doesn't have problems with deletions.
In other words: it isn't a database at all!

Assuming a C++ programmer has a good understanding of a 'heap', it
shouldn't take long to explain this class. In one sentence, VRAM mimics
a 'heap on disk'.

The idea is simple: use functions like 'malloc' and 'free' to manipulate the
necessary space, just like with an ordinary heap, only this time the heap
is in fact a file. In this way the data is not lost when the program exits
while all the flexibility of a heap is still there!
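
To make the idea concrete, here is a toy 'heap in a stream' written with
the standard library only. It is a deliberately crude sketch and not how
VRAM is implemented: ToyHeap and ToyPtr are made-up names, there is no
free(), no buffering and no size limit, and the header is just the current
end of the heap.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <vector>

typedef unsigned long ToyPtr;   // byte offset into the stream

class ToyHeap {
public:
    // Attaches to 'io'. An empty stream becomes a new heap, otherwise
    // the end of the heap is read back from the header.
    explicit ToyHeap(std::iostream &io) : io_(io), end_(0)
    {
        io_.clear();
        io_.seekg(0);
        io_.read(reinterpret_cast<char *>(&end_), sizeof end_);
        if (!io_) {
            end_ = sizeof end_;         // data begins after the header
            save_header();
        }
    }

    // Reserves 'size' zeroed bytes at the end and returns their offset.
    ToyPtr allocate(std::size_t size)
    {
        assert(size > 0);
        const ToyPtr p = end_;
        const std::vector<char> zeros(size, '\0');
        io_.clear();
        io_.seekp(static_cast<std::streamoff>(p));
        io_.write(&zeros[0], static_cast<std::streamsize>(size));
        end_ += static_cast<ToyPtr>(size);
        save_header();
        return p;
    }

    void write(ToyPtr p, const void *data, std::size_t n)
    {
        assert(p >= sizeof end_ && p + n <= end_);
        io_.clear();
        io_.seekp(static_cast<std::streamoff>(p));
        io_.write(static_cast<const char *>(data),
                  static_cast<std::streamsize>(n));
    }

    void read(ToyPtr p, void *out, std::size_t n)
    {
        assert(p >= sizeof end_ && p + n <= end_);
        io_.clear();
        io_.seekg(static_cast<std::streamoff>(p));
        io_.read(static_cast<char *>(out), static_cast<std::streamsize>(n));
    }

private:
    void save_header()
    {
        io_.clear();
        io_.seekp(0);
        io_.write(reinterpret_cast<const char *>(&end_), sizeof end_);
    }

    std::iostream &io_;
    ToyPtr         end_;
};
```

The sketch works on any seekable stream. With a std::stringstream the
'file' lives in memory; with a std::fstream opened for binary reading and
writing, the offsets handed out remain meaningful in the next run of the
program.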


18.2 Creating

        int define(char *name,U16 struclen);

This is the function needed to create a VRAM system.
Contrary to what you might have expected, it takes two parameters. The
first is as usual the file name, the second however, is the maximum size
you are planning to allocate.

This differs from the ordinary heap which simply accepts allocations of
any size right from the start. (Which also explains why the ordinary heap
allocations are so amazingly inefficient.)

In a way, the second parameter 'struclen' is a performance parameter. If
you like, you can always use the maximum, which is 32 Kb, but this would
yield a highly inefficient VRAM. The VRAM system will perform better the
more accurately 'struclen' reflects the true state of affairs. However,
performance option or not, 'struclen' is an upper limit to what you are
allowed to allocate. Any attempt to allocate more will be answered with a
runtime error.


// Example VRAM define()
// Error checking omitted for conciseness.

#include "CSVRAM.H"

void main(void)
{

    VRAM vr;
    vr.define("VRAM.TST",614);  // Allocating at most 614 bytes.

}



The CSDIR utility recognizes VRAM files. When the example program
has run, it will display something like:


Directory C:\BORLANDC\TEST\VRAM\

Name                Size      Type      Entries     Created      Updated
--------------------------------------------------------------------------
VRAM.TST             174      VRAM            0   Nov 18 1994  Nov 18 1994
--------------------------------------------------------------------------
Total:               174 bytes in   1 files.



18.3 Opening & Closing

Like all the database classes in this library, VRAM needs to be 'opened'
before it can be used and, consequently, 'closed' afterwards.

syntax: int open(char *name,U16 kb_buf);

This opens the vram file 'name' and uses 'kb_buf' Kb for buffering.

syntax: int close(void);

This closes the VRAM system. This function is also called by the class
destructor when needed.


// Example VRAM
// Error checking omitted for conciseness.

#include "CSVRAM.H"

void main(void)
{

    VRAM vr;
    vr.define("VRAM.TST",614);  // Allocating at most 614 bytes.

    vr.open("VRAM.TST",300);    // Opens VRAM.TST using 300 Kb buffers.

    // Doing something interesting.

    vr.close();             // Close VRAM system.
}




18.4 VRAM Pointers

The normal malloc() function returns a void pointer; unfortunately VRAM
cannot do that. It uses its own type of pointer: VPOI, which is short for
Virtual POInter. VPOI is a simple 32 bit unsigned long, defined in
'CSVRAM.H'. The VPOI also limits the size of a VRAM system to 4 Gb.
(Of course you can always use more than one VRAM...)


There is another important difference between VRAM and a normal
heap. VRAM distinguishes between reading and writing. The buffer
system used, cannot tell whether you are making changes. Therefore,
the programmer needs to supply that information by calling different
functions for reading and writing.

Reading:    char *R(VPOI p);
Writing:    char *W(VPOI p);


// Example
// Error checking omitted for conciseness.

#include <string.h>         // For strcpy().
#include "CSVRAM.H"

void main(void)
{

    VRAM vr;                // A VRAM system.
    VPOI vp;                // A VRAM pointer.
    char *cp;               // A normal character pointer.

    vr.define("VRAM.TST",614);  // Initially create it.

    vr.open("VRAM.TST",50); // Opening with 50 Kb buffers.

    vp=vr.malloc(20);       // Allocate 20 bytes from the virtual heap.

    cp=vr.W(vp);            // Obtaining a character pointer to
                        // the allocated space. We are planning to
                        // write, so the 'W' function is used.

    strcpy(cp,"Some Data"); // Write data into it.

    vr.close();         // Close the VRAM system.
                        // "Some Data" is now on disk!

}



From the above example it becomes clear how the VPOI pointers can be
used. The method is simple: convert them into normal pointers and apply
standard C++ programming techniques.

Only the last 2 converted VPOI pointers are guaranteed to be
valid. VRAM has a limited number of buffers, so you cannot
expect all data to be in ram forever.
Every time you convert a VPOI pointer into a character pointer by using
the W() or the R() function, VRAM calculates the corresponding position
in the file and loads the required page in ram. The pointer returned,
points directly into this page. Because only the last two pages are
guaranteed to be in ram under all circumstances, the third time you
convert a VPOI pointer, it can overwrite a previously loaded page.

Because at least two pointers are valid, you can copy data from one
VRAM position to another without using temporary storage.

With the W() function, the loaded page is marked 'dirty' which makes
sure it's written back to disk when the page is removed from the buffer
system. This is not so for the R() function. In that case the page is simply
discarded.
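
The difference between W() and R(), and the reason why only the last two
pointers can be trusted, can be seen in a toy cache with exactly two
slots. This is a model for illustration only: TwoPageCache is a made-up
name, the 'disk' is an array in memory, pages are addressed by number
instead of by VPOI, and the real buffer system normally has many more than
two buffers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t PAGE_SIZE = 16;   // tiny pages keep the sketch readable

class TwoPageCache {
public:
    explicit TwoPageCache(std::size_t pages)
        : disk_(pages, std::vector<char>(PAGE_SIZE, '\0')), clock_(0)
    {
        for (int i = 0; i < 2; ++i) {
            slots_[i].loaded = false;
            slots_[i].dirty  = false;
            slots_[i].used   = 0;
        }
    }

    char *R(std::size_t page) { return load(page, false); } // read only
    char *W(std::size_t page) { return load(page, true);  } // will modify

    // What is really on the 'disk', bypassing the cache.
    char on_disk(std::size_t page, std::size_t offset) const
    {
        return disk_[page][offset];
    }

private:
    struct Slot {
        bool          loaded;
        bool          dirty;
        std::size_t   page;
        unsigned long used;             // time of last use
        char          data[PAGE_SIZE];
    };

    char *load(std::size_t page, bool for_writing)
    {
        assert(page < disk_.size());
        Slot *s = 0;
        for (int i = 0; i < 2; ++i)
            if (slots_[i].loaded && slots_[i].page == page) s = &slots_[i];
        if (s == 0) {
            // Not in ram: reuse the least recently used slot.
            s = (slots_[0].used <= slots_[1].used) ? &slots_[0] : &slots_[1];
            if (s->loaded && s->dirty)  // page obtained with W(): write back
                std::copy(s->data, s->data + PAGE_SIZE,
                          disk_[s->page].begin());
            std::copy(disk_[page].begin(), disk_[page].end(), s->data);
            s->loaded = true;
            s->dirty  = false;
            s->page   = page;
        }
        if (for_writing) s->dirty = true;
        s->used = ++clock_;
        return s->data;
    }

    std::vector<std::vector<char> > disk_;
    Slot          slots_[2];
    unsigned long clock_;
};
```

In this model a change made through W() reaches the 'disk' when the page
is pushed out by a later conversion, while a change made through the
pointer returned by R() is silently lost. The third conversion reuses the
slot of the first, so the first pointer then points at other data.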


// Example, copying between two VPOI pointers.
// Error checking omitted for conciseness.

#include <string.h>         // For strcpy() and memcpy().
#include "CSVRAM.H"
void main(void)
{

    VRAM vr;
    VPOI vp1,vp2;

    vr.open("VRAM.TST",50); // Opening with 50 Kb buffers.

    strcpy(vr.W(vp1=vr.malloc(20)),"Some Data");    // Allocate and fill
                                            // one VPOI.

    vp2=vr.malloc(100);                     // Allocate a second.

    memcpy(vr.W(vp2),vr.R(vp1),20);         // Copy!

    vr.close();         // Close the VRAM system.
}



18.5 Fragmentation

Just as with an ordinary heap, VRAM can suffer from fragmentation. The
normal heap can become prematurely exhausted because of
fragmentation, while for the VRAM system it only means the file becomes
larger than strictly necessary.
On the other hand: the normal heap gets a fresh start every time the
program is run while the VRAM files may be in use for years.

Therefore a defrag() function is available. If you decide to use it, it is best
to use it regularly. It mainly does three things:
a)  Joining free space wherever possible.
    This is not done during normal operation because it may involve
    additional IO.
b)  Sorting the empty-data-chains.
    When space is needed, it's taken from the beginning of an
    empty-chain. After sorting the chains, the empty blocks at the
    beginning of the file will also be at the beginning of the chain.
    Eventually this leads to pages at the end becoming completely free
    and pages at the beginning (almost) full.
c)  Empty pages above the highest used location are stripped from the
    file.

The defrag() function links in a lot of code: it uses an entire
btree and a temporary file. In a way this makes the defrag()
function 'bigger' than the rest of the VRAM class combined!
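
Step a), the joining of free space, is the classic 'sort by position and
merge the neighbours' routine. The sketch below shows it on a plain list of
free blocks. FreeBlock, by_offset() and coalesce() are made-up names, and
page boundaries, which the real defrag() will have to respect, are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct FreeBlock {
    unsigned long offset;   // position in the file
    unsigned long size;     // length in bytes
};

bool by_offset(const FreeBlock &a, const FreeBlock &b)
{
    return a.offset < b.offset;
}

// Sorts the free blocks by position and joins blocks which touch.
std::vector<FreeBlock> coalesce(std::vector<FreeBlock> blocks)
{
    std::sort(blocks.begin(), blocks.end(), by_offset);
    std::vector<FreeBlock> merged;
    for (std::size_t i = 0; i < blocks.size(); ++i) {
        if (!merged.empty()) {
            FreeBlock &last = merged.back();
            assert(last.offset + last.size <= blocks[i].offset); // no overlap
            if (last.offset + last.size == blocks[i].offset) {
                last.size += blocks[i].size;    // neighbours: join them
                continue;
            }
        }
        merged.push_back(blocks[i]);
    }
    return merged;
}
```

After the call the list is also sorted by position, which is the property
step b) relies on.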

18.6 Root

Under some circumstances you may need a 'starting point' in the VRAM.
Example:
    Let's say you are writing some flowcharting program and you have
    decided that VRAM is a great help in storing and manipulating a
    flowchart. The flowchart probably consists of several independent
    parts linked together by pointers. Once in it, each part can be reached by the
    VPOI's stored in the data structure. This leaves you with just one
    problem: where does the flowchart start?
    It takes just one VPOI to store that location and it would be a shame
    if you needed an additional configuration file for that.

Therefore two very simple functions are implemented to store and
retrieve a 'special' VPOI.

void root(VPOI p);      Stores VPOI 'p'.
VPOI root(void);        Obtains the VPOI stored with the previous
                        function.

These functions just manipulate this single VPOI. They have absolutely
no effect on the rest of the VRAM system.

18.7 Functions in Alphabetical order.

Prototypes are in 'csvram.h'. With the exception of the define(), open()
and zap() functions, the class needs to be opened for the functions to
work.

U16 alloc(VPOI p);
U16 alloc(void *p);
                Returns the number of allocated bytes at a certain
                location. The pointer may be either a VPOI pointer or a
                normal pointer to the same location.
int close(void);    Closes the VRAM system. Returns TRUE on success,
                    FALSE otherwise.
int define(char *name,U16 struclen);
                Creates the VRAM system 'name' with 'struclen' being
                the maximum size of any allocation. Returns TRUE on
                success, FALSE otherwise.
int defrag(void);   Defragments the virtual heap. Returns TRUE on
                    success, FALSE otherwise.
int empty(void);    Makes the VRAM system empty. The class remains
                    open but all allocations will be undone. Returns TRUE
                    on success, FALSE otherwise.
U32 number(void);
                Returns the number of allocations currently done. This
                is the number of malloc()'s minus the number of
                free()'s.
int open(char *name,S16 kb_buf);
                Opens VRAM 'name' using 'kb_buf' Kb ram for
                buffering. Returns TRUE on success, FALSE
                otherwise.
char *R(VPOI p);
                Converts a VPOI pointer into a character pointer. It is
                assumed no modifications are going to take place.
void  root(VPOI p);
                Stores VPOI 'p'.
VPOI root(void);    Obtains the VPOI stored with the previous function.
int save(void); Saves all buffered data to disk. All the buffers are
                flushed and the header page is updated. Returns
                TRUE on success, FALSE otherwise.
void free(VPOI p);
                Frees the VPOI p.
VPOI malloc(U16  size);
                Allocates 'size' amount of bytes from the virtual heap.
                The corresponding VPOI is returned.
char *W(VPOI p);
                Converts a VPOI pointer into a character pointer. It is
                assumed modifications are going to take place.
int zap(void);  Closes the VRAM system when needed and restores
                all class defaults. Returns TRUE on success, FALSE
                otherwise.



                             19 VBASE


19.1 Introduction

The use and purpose of the VBASE class are very similar to those of
the TBASE class. There is, however, one huge difference: VBASE
supports variable length records! The 'V' in VBASE stands for 'variable'.

Compared with TBASE, the differences in the public member functions
are minimal. The append_rec() function now takes an additional parameter
indicating the length of the record. The same goes for the write_rec()
function. Apart from the 'normal' read_rec() function there is now an
additional read_rec() which returns the length of the obtained record.

VBASE is a 'stand alone' class. It has nothing to do with the
databases produced by CSDBGEN.


19.2 Using VBASE.

Using VBASE is very straightforward.

- Initially create the VBASE system by calling define().
- Open it through a call to open().
- Read, write and append records.
- Close VBASE by calling close().

That's all!


// Example
// Error checking omitted for conciseness.

#include <string.h>     // strlen(), strcpy()
#include "csvbase.h"

void main(void)
{

    VBASE vb;

    vb.relocate_when_shrunk(TRUE);  // Move the record to a better
                                    // fitting position when shrunk.
    vb.define("VBASE.dbf",1230);    // Maximum record length 1230 bytes.

    vb.open("VBASE.dbf", 200);      // Open with 200 Kb buffers.

    char *s="Some chunk of data. ";

    vb.append_rec(s,strlen(s)+1);   // Append a record. Notice the length
                                    // parameter which is not needed with
                                    // TBASE.
    char d[200];
    vb.read_rec(1,d);               // Read record 1 into array 'd'.

    strcpy(d,"New Data");

    vb.write_rec(1,d,strlen(d)+1);  // Overwrite record 1 with a new
                                // block of data. This does not have
                                // to have the same length!

    vb.close();                 // Ready. Close VBASE. Also called by
                                // the class destructor if needed.

}


For more information, please read the documentation on the TBASE
class (Chapter 15).


19.3 Relocating records

When an existing record is overwritten with a new, bigger record, it no
longer fits in its original slot, which means the record has to be relocated.
This is not necessarily so when the record shrinks. In that case you have
the choice between relocating the record, which saves disk space but is
relatively slow, and leaving it where it is, which wastes some disk space.

The function 'relocate_when_shrunk()' is there to choose between these
two strategies. Calling 'relocate_when_shrunk(TRUE);' will relocate a
record when it becomes smaller. 'relocate_when_shrunk(FALSE);' will
leave the records in place when possible.
The default is set to: relocate_when_shrunk(TRUE).

The function has to be called before 'define()' and its setting
cannot be altered afterwards.
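
The rule can be summarised in a few lines. This is only a sketch of the
decision as the text states it, not the code of the library. The names
'must_relocate', 'slot_len' and 'new_len' are hypothetical, and it is an
assumption that the slot is exactly as long as the record it holds.

```cpp
// Sketch of the relocation rule from this paragraph (not library code).
// 'slot_len' is the space the record occupies now, 'new_len' the length
// of the data about to be written. Both names are hypothetical.
bool must_relocate(unsigned slot_len, unsigned new_len,
                   bool relocate_when_shrunk)
{
    if (new_len > slot_len) return true;                  // Grown: no longer fits.
    if (new_len < slot_len) return relocate_when_shrunk;  // Shrunk: policy decides.
    return false;                                         // Same length: stays put.
}
```

With the policy set to FALSE a shrinking record keeps its slot, so only
growing records cause a move.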



19.4 Limitations.

VBASE was designed for databases up to around a million records. This
is not a 'hard' limit; it's possible to add many more records, but under
some unfavourable conditions memory utilization can get out of control.
The way the class uses the available ram is controlled by the open()
function. Therefore, adding a huge number of records in one go poses a
problem. If the records are not appended all at once but with several
close/open sequences in between, VBASE can easily store 16 million
records.

So:
- Under worst case conditions 1 million records.
- Under favourable conditions 16 million records.
- Avoid more than 16 million records.

The above limitations stem from ram utilization. For those drowning in
memory, there are also software limitations:
- maximum file size 4 Gb.
- 4 billion records.


Because there are so many 'buts' and 'ifs', there is another class VBAXE,
discussed in the next chapter, to deal with the larger databases.

As a rule of thumb, use VBASE for databases up to 1 million
records and VBAXE for more than 1 million records.

19.5 Functions in alphabetical order.

The function prototypes are in csvbase.h.


U32 append_rec(void *data,U16 len);
                Append a record to the database. 'data' is a pointer to
                the data and 'len' is the number of bytes of data. The
                function returns the number of the newly created
                record.
int close(void);
                Closes the database. All buffers are flushed and all
                allocated memory is freed. TRUE is returned on
                success, FALSE otherwise.
int define(char *name,U16 struclen);
                Creates a new database. 'name' is the name of the file
                and 'struclen' is the maximum length of a record. Do
                not make 'struclen' unnecessarily large because its
                value controls space efficiency. The maximum value of
                struclen is 32767. TRUE is returned on success,
                FALSE otherwise.
void delet(U32 record);
                Marks record 'record' for deletion. Only the 'delete bit'
                is set. The pack() function needs to be called to
                actually remove the record from the file.
void empty(void);
                Removes all records from the database. Upon return
                the database will contain zero records but will still be
                'open'.
int is_delet(U32 record);
                Returns TRUE if record 'record' is marked for deletion.
                FALSE otherwise.
char *locate_rec(U32  rec);
char *locate_rec_d(U32  rec);
                Functions to return a pointer to record 'rec' directly into
                the buffer system. The returned pointer can be used to
                change the contents of a record but not the length.
                Please, read paragraph 15.8.2 about locating before
                using these functions.
U32 numvrec(void);
                Returns the number of records currently in the
                database.
int open(char *name,U16 kb_buf);
                Opens database 'name', using 'kb_buf' Kb ram for
                buffering. Returns TRUE on success, FALSE
                otherwise.
int pack(void); Removes all records marked for deletion. A temporary
                file is used. TRUE is returned on success, FALSE
                otherwise.
void read_rec(U32 rec,void *ptr,U16  &length);
                Reads record 'rec' and copies it into the buffer 'ptr' is
                pointing at. The variable 'length' is set to the length of
                the retrieved record.
void read_rec(U32 pos,U16  maxlen,void *ptr,U16  &length);
                The same as the previous function but with an
                additional parameter 'maxlen' specifying the maximum
                number of bytes that can be copied into the buffer 'ptr'.
                If the record proves to be longer than 'maxlen', only
                'maxlen' bytes will be copied to 'ptr'.
U16 rec_len(U32 rec);
                Returns the length of record 'rec'.
void relocate_when_shrunk(int TrueOrFalse);
                When called with 'TrueOrFalse' set to TRUE, records
                will be relocated when shrunk. When called with
                FALSE the records will stay at the same place. The
                function has to be called before define(). For more
                information, please see the paragraph about this topic.
int save(void); As a precaution, all 'dirty' buffers are written
                to disk and the header page is updated. The database
                remains open. Returns TRUE on success and FALSE
                otherwise.
void undelet(U32 rec);
                Removes the 'delete' marking from record 'rec'.
void write_rec(U32 rec,void *data,U16 len);
                Overwrites the existing record 'rec'. 'len' bytes are
                copied from 'data'. Afterwards the record will be of
                length 'len'.
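
The interplay of delet(), undelet(), is_delet() and pack() can be pictured
with a small in-memory model. This is a sketch only and not the VBASE
implementation: 'Rec', 'Model' and 'count()' are hypothetical names, and it
is an assumption that the records which survive pack() are renumbered
consecutively.

```cpp
#include <string>
#include <vector>

// In-memory model of the 'delete bit' and pack() behaviour.
// NOT the VBASE implementation; all names are hypothetical.
struct Rec {
    std::string data;
    bool deleted;
};

struct Model {
    std::vector<Rec> recs;              // Record n lives at recs[n-1].

    unsigned long append_rec(const std::string &d) {
        recs.push_back(Rec{d, false});
        return recs.size();             // Number of the new record.
    }
    void delet(unsigned long n)          { recs[n - 1].deleted = true;  }
    void undelet(unsigned long n)        { recs[n - 1].deleted = false; }
    bool is_delet(unsigned long n) const { return recs[n - 1].deleted;  }
    unsigned long count() const          { return recs.size(); }

    // Physically drop every record whose delete bit is set.
    // Assumption: the remaining records are renumbered from 1.
    void pack() {
        std::vector<Rec> kept;
        for (const Rec &r : recs)
            if (!r.deleted) kept.push_back(r);
        recs.swap(kept);
    }
};
```

Until pack() runs, a marked record still counts as a record and can be
brought back with undelet().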

                             20 VBAXE

20.1 Introduction

As explained in the previous chapter, VBAXE is similar to VBASE but
is intended for larger databases. That is, more than 1 million records.

The public member functions of the classes are 100% identical. The
inner workings, however, are completely different. VBAXE uses two files
for a database whereas VBASE uses only one. VBAXE is built from two
other classes, namely TBASE and VRAM.

Building a class for variable length records is not easy, but writing one
that can store millions of records, is fast, uses little ram, doesn't use
unnecessary disk space and still stores everything in one file is next to
impossible.
So, rather than coming up with something slow & clumsy, VBAXE gives
up on storing everything in one file.


20.2 Working.

The working of VBAXE is very simple. It allocates the necessary space
from VRAM and stores the VRAM pointer together with the length in a
TBASE record.
E.g. to obtain record 714 it starts with retrieving record 714 from TBASE.
Because TBASE uses fixed size records, the position of record 714 can
easily be calculated. Once this record is obtained, the VRAM pointer to
the data of record 714 is known. From this pointer the position in the
VRAM file can again be easily calculated. If nothing is in the buffers, it
takes two I/Os to obtain the data, but at least no searching is done. The
positions in the files are always known through simple arithmetic.
(This, by the way, also holds for the VBASE class.)
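
The two-step lookup can be sketched with an in-memory model. This is not
the VBAXE source. The layout of 'IndexEntry', the 4096-byte header and all
other names are assumptions made for the illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Model of the two-step lookup described above (not library code).
struct IndexEntry {
    unsigned long  vram_ptr;   // Where the data starts in the data file.
    unsigned short length;     // Length of the record in bytes.
};

const unsigned long HEADER_SIZE = 4096;   // Hypothetical header page.

// Step 1: fixed-size entries make the file offset plain arithmetic.
unsigned long index_offset(unsigned long recnr)
{
    return HEADER_SIZE + (recnr - 1) * sizeof(IndexEntry);
}

struct TwoFileModel {
    std::vector<IndexEntry> index;   // Stands in for the index file.
    std::string data;                // Stands in for the data file.

    unsigned long append_rec(const std::string &rec) {
        IndexEntry e;
        e.vram_ptr = data.size();
        e.length   = (unsigned short)rec.size();
        data += rec;
        index.push_back(e);
        return index.size();         // Number of the new record.
    }

    // Step 2: follow the stored pointer. Neither step searches.
    std::string read_rec(unsigned long recnr) const {
        const IndexEntry &e = index[recnr - 1];
        return data.substr(e.vram_ptr, e.length);
    }
};
```

Because every index entry has the same size, finding entry 714 costs one
multiplication, whatever the lengths of the records before it.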


20.3 Files

As explained above there are two files to every VBAXE database. The
TBASE part stores its data in a file with extension '.vbi'. The VRAM part
uses extension '.vbd'. If define() or open() is called with a name
which already has an extension, that extension will be removed.

The CSDIR utility recognizes these files and will display the TBASE class
as VBASEi and VRAM as VBASEd.


// Example
// Error checking omitted for conciseness.

#include <stdlib.h>     // random() (Borland)
#include "csvbaxe.h"

void main(void)
{
    char buf[1000];

    VBAXE vb;
    vb.define("demo",390);      // Max record length 390 bytes.

    vb.open("demo",200);        // 200 Kb buffers.

    for(int i=1;i<=100;i++)
    {
      vb.append_rec(buf,1+random(390));   // Append 100 records
                                          // with random length
                                          // and random contents.
    }

    vb.close();             // Close database.
}


Afterwards CSDIR will display something like:



Directory C:\BORLANDC\DEMO

Name                Size      Type      Entries       Created      Updated
--------------------------------------------------------------------------
DEMO.VBI            4096      VBASEi        100   Dec 14 1994  Dec 14 1994
DEMO.VBD           26624      VBASEd        100   Dec 14 1994  Dec 14 1994
--------------------------------------------------------------------------
Total:             30720 bytes in   2 files.



However, CSINFO will still say DEMO.vbi is a TBASE file and DEMO.vbd
a VRAM file.


20.4 Prototypes.

The class definition and its function prototypes are in "CSVBAXE.H".






















                               Part


                               Four







    Part  four discusses the low-level OLAY and DLAY classes.
            Basically, these classes work as a normal
        sequential file but with two whopping differences:
                   insertions and deletions!!

        It also covers IBASE, which implements a database
          class with the ability to insert and delete
                  records anywhere in its file.




                              21 OLAY



21.1 Introduction & Overview

The OLAY class performs the same functions as a normal sequential
file but with two major additions: insertions and deletions!
This makes it possible to insert or delete data anywhere in the file. This is
not done by copying the entire file, but by moving data 'around' inside.

It takes a while to realise the potential of such a system!

It seems to us that several very basic problems in computer
programming are related to the limitations of the file system. E.g.
word processing would be a lot easier if you could simply add or delete
every character/sentence directly in the file. In databases it would also
be a great help, making it possible to delete a record straight away instead
of using a tag/pack technique.

The OLAY class encapsulates the standard file system and adds insert &
delete functionality. Still, through the public member functions, the data
will appear as a contiguous stream of bytes.



21.2 Buffering

Just as the (other) database classes in this library, the OLAY class is
derived from the PAGE class. This means it has a built-in buffering
system.


21.3 Performance

Due to its built-in buffering, the OLAY system normally outperforms the
traditional file system.
However, the OLAY files are not 100% full. It depends on the type of
application, but 70% effectively used disk space seems a typical value.


21.4 Core Functions

The features of the OLAY class are implemented through a small set of
functions called "core functions". Several more functions are discussed in
the remaining parts of the chapter but these functions are not strictly
necessary for using the OLAY class. They are merely implemented for
convenience.
The core functions produce the smallest and fastest code.

This section will only discuss the core functions.

These functions are:

int define()    To create an OLAY file.
int open()      To open it.
U32 read()      To read from it.
int write()     To overwrite existing data.
S32 delet()     To delete data.
int insert()    To insert data.
int append()    To append data.
int close()     To close the file.
U32 filesize()  To return the file size.
U32 bottom()    To return the last position in the file.

The function prototypes are in "CSOLAY.H".
First a working example.


// Error checking omitted for conciseness.


#include <iostream.h>
#include <string.h>     // strcpy(), strlen()
#include "CSOLAY.H"

void main(void)
{
    char buf[100];          // A text buffer.
    OLAY db;                // An instance of the OLAY class.

    db.define("demo.fil");  // Creating the file.

    db.open("demo.fil",100);// Open the file.
                            // Use 100 Kb ram for buffering.

    strcpy(buf,"Some chunk of data");
    db.append(buf,strlen(buf)+1); // The file is empty.
                                  // Append some data.

    db.insert(5," larger",7);  // Insert 7 bytes at position '5'.
                               // (The first byte is at position '1'.)


    db.read(1,buf,db.filesize());
                            // Read everything back.

    cout<<buf;              // Displays: Some larger chunk of data

}



This program will create the file 'demo.fil' on your hard disk. Only the
OLAY class can make sense out of it. E.g. the DOS command 'type' will
only produce garbage. The OLAY files have to be treated as any other
database file. That is: use them with the application they belong to,
nothing else.

21.4.1 Creating
The OLAY class requires a file to be explicitly created before it can be
opened.

int  define(char *name);
                This creates the file 'name' on your harddisk and
                inserts the correct header block. If the file already
                exists, it is overwritten! The function returns TRUE on
                success and FALSE otherwise.

21.4.2 Opening
Before an OLAY file can be used it has to be opened. The open function
does not distinguish between reading, writing or appending.

int  open(char *name,U16 kb_buf=30);
                This opens the file 'name' using 'kb_buf' Kb ram for
                buffering. 'kb_buf' has a default value of 30 Kb. The
                function returns TRUE on success and FALSE
                otherwise.

21.4.3 Reading and Writing
If you like, you can think of the OLAY class as a database with records of
only one byte. The first record/byte is, as always, at position 1. Fortunately,
we are not forced to read and write only one byte at a time.

U32 read(U32 pos,void *p,U32 length);
                Reads 'length' bytes, starting from position 'pos'.
                The data is copied to pointer 'p', which should be
                pointing to a buffer large enough to hold 'length' bytes.
                If the end of file is reached before 'length' bytes are
                read, the copying process stops without an error. The
                function returns the number of bytes actually copied to
                'p'.

int write(U32 pos,void *p,U32 length);
                This function writes, or to be more precise, overwrites
                'length' bytes starting off from position 'pos'. The data
                is copied from 'p'. The already present data is
                overwritten. Write() cannot append data to the file.
                Trying to write more data than exists between 'pos' and
                end-of-file is an error. The function returns TRUE on
                success, FALSE otherwise.

21.4.4 Insert & Delete
The beauty of the OLAY class lies in its ability to instantly insert or
delete data in/from its file. Inserting and deleting also implies the
remaining data in the file changes position. E.g. if you insert 10 bytes at
position 5 the data which was originally at position 120 is now at 130!

S32 delet(U32 pos,S32 length);
                Deletes 'length' bytes, starting from position 'pos'.
                The byte at position 'pos' itself is also deleted.
                Remember: the first byte is position 1. If 'length' is
                less than or equal to 0 the function returns 0. When an
                attempt is made to delete more data than is left in the
                file, all the remaining data will be deleted. The
                function returns the number of bytes actually deleted.

int insert(U32 pos,void *buffer,U32 len);
                The insert() function inserts 'len' bytes copied
                from 'buffer' at position 'pos'. The byte at position 'pos'
                itself is also moved. This means that a call to insert()
                with 'pos' equal to 1 inserts new data before ALL other
                data.

// Example
// Error checking omitted for conciseness.

    // This program will display the string
    // 'Led Zeppelin' on the screen.

#include <stdio.h>      // puts()
#include <string.h>     // strcpy(), strlen()
#include "CSOLAY.H"

void main(void)
{
    char buff[200];
    OLAY db;

    db.define("test.dbf");
    db.open("test.dbf",40);

    strcpy(buff,"Zeppelin");
    db.append(buff,strlen(buff)+1);     //Write terminating zero also.

    strcpy(buff,"Led ");
    db.insert(1,buff,strlen(buff));     //Insert before everything.

    db.read(1,buff,db.filesize());      //Read it all back.
    puts(buff);

}
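
The position rules of insert() and delet() are easy to get wrong by one,
so here is a small in-memory stand-in that follows the rules stated above.
It is a sketch and not the OLAY code: 'ByteStream' and 'at()' are
hypothetical names, the data lives in a std::string instead of a file, and
refusing an insert beyond the last byte is an assumption.

```cpp
#include <string>

// In-memory stand-in for the position rules of insert(), delet(),
// filesize() and bottom(). Positions are 1-based, as in the text.
// Not the OLAY code; it says nothing about the layout on disk.
struct ByteStream {
    std::string bytes;

    unsigned long filesize() const { return bytes.size(); }
    unsigned long bottom() const   { return bytes.size() + 1; }

    void append(const std::string &s) { bytes += s; }

    // The byte at 'pos' and everything after it moves up.
    // Assumption: an insert beyond the last byte is refused.
    bool insert(unsigned long pos, const std::string &s) {
        if (pos < 1 || pos > filesize()) return false;
        bytes.insert(pos - 1, s);
        return true;
    }

    // Deletes 'length' bytes starting with the byte at 'pos' itself.
    // Returns the number of bytes actually deleted.
    long delet(unsigned long pos, long length) {
        if (length <= 0 || pos < 1 || pos > filesize()) return 0;
        unsigned long left = filesize() - (pos - 1);
        unsigned long n = (unsigned long)length < left
                        ? (unsigned long)length : left;
        bytes.erase(pos - 1, n);
        return (long)n;
    }

    char at(unsigned long pos) const { return bytes[pos - 1]; }
};
```

In this model, inserting 10 bytes at position 5 moves the byte that was at
position 120 to position 130, and asking delet() for more bytes than are
left removes only what is there.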



21.4.5 Filesize & bottom

U32 bottom(void);
                The function returns the position directly after the last
                byte in the file: the first 'free' position. If the OLAY file is
                empty, bottom() will return 1.

U32 filesize(void);
                This function returns the number of bytes in the OLAY
                file. This is not the size of the file on disk, but the
                number of bytes that can be read by the 'read()'
                function.  This value is equal to bottom()-1.

21.4.6 Closing

The OLAY class does a lot of buffering. A close() function is needed to
save all the data to disk.

int  close(void);   Closes the class and the associated file. All buffers are
                    flushed and all allocated memory is freed. When
                    needed, the function is automatically called by the
                    class destructor. The function returns TRUE on
                    success and FALSE otherwise.

21.5 Additional functions

Next are some functions to 'make life easy'. They are not essential for
working with the OLAY class.

int  writea(U32 pos,void *p,U32 len);
                The normal write() function cannot append data. This
                function can. It uses the write() function to overwrite
                existing data and calls the append() function when
                data has to be added to the file. It (over)writes 'len'
                bytes starting from position 'pos'. The data is copied
                from pointer 'p'. The function returns TRUE on
                success, FALSE otherwise.


// Example
// Error checking omitted for conciseness.

void main(void)
{
    OLAY db;
    int  i=3;
    long l=4;

    db.define("example.dbf");       //Create an empty file.
    db.open("example.dbf",100);     //Use 100 Kb for buffering.
    db.append(&i,sizeof(int));      //Append 'i'. (2 bytes)
    db.writea(1,&l,sizeof(long));   //Overwrite 'i' with 'l'. (4 bytes)
                                    //The normal write() cannot do this,
                                    //because data has to be appended!
    db.close();
}


int replace(U32 pos,U32 old_len,void *buffer,U32 new_len);
                This function makes it possible to replace a block of
                data with a new block of data of different size. (The
                new block can be smaller or bigger.)
                'pos':      the position of the first byte which is to be
                            replaced.
                'old_len':  length of the chunk which needs to be
                            replaced.
                'buffer':   pointer to the buffer which holds the new
                            data.
                'new_len':  number of bytes which have to replace the
                            original 'old_len' bytes.
                TRUE is returned on success, FALSE otherwise.

int  inserta(U32   pos,void *p,U32 len);
                An inline function which calls insert() or append()
                depending on the value of pos. Its purpose is to
                overcome a limitation of the basic insert() function
                which cannot properly handle inserts beyond the end
                of the file (which of course are in fact appends).
                The function inserts or appends data at position 'pos'.
                If 'pos' is equal to 'bottom()' it calls append(),
                otherwise insert(). 'len' bytes are copied from
                pointer 'p'.
                TRUE is returned on success, FALSE otherwise.
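
How the three convenience functions relate to the core functions can be
sketched on a std::string with 1-based positions. The function bodies
below are assumptions based on the descriptions above, not the OLAY
source, and all 'model_' names are hypothetical. Treating a position
beyond bottom() as an error is also an assumption.

```cpp
#include <cstddef>
#include <string>

typedef std::string Stream;       // Stands in for the OLAY file.

std::size_t model_bottom(const Stream &f) { return f.size() + 1; }

// Overwrite what exists, append whatever sticks out past the end.
bool model_writea(Stream &f, std::size_t pos, const std::string &p)
{
    if (pos < 1 || pos > model_bottom(f)) return false;
    std::size_t room = model_bottom(f) - pos;     // Bytes from 'pos' to end.
    std::size_t over = p.size() < room ? p.size() : room;
    f.replace(pos - 1, over, p, 0, over);         // The write() part.
    f.append(p, over, std::string::npos);         // The append() part.
    return true;
}

// Swap 'old_len' bytes at 'pos' for a block of any size.
bool model_replace(Stream &f, std::size_t pos, std::size_t old_len,
                   const std::string &buffer)
{
    if (pos < 1 || pos - 1 + old_len > f.size()) return false;
    f.erase(pos - 1, old_len);                    // Like delet().
    f.insert(pos - 1, buffer);                    // Like insert() or append().
    return true;
}

// insert() inside the data, append() when 'pos' equals bottom().
bool model_inserta(Stream &f, std::size_t pos, const std::string &p)
{
    if (pos < 1 || pos > model_bottom(f)) return false;
    if (pos == model_bottom(f)) f.append(p);
    else                        f.insert(pos - 1, p);
    return true;
}
```

The writea() example above maps onto this model directly: a 2-byte file
overwritten at position 1 with 4 bytes ends up 4 bytes long.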

21.6 Import & Export

The OLAY class stores its data in a format that is not compatible with
anything else. Therefore two sets of functions are available to convert
to-and-from a normal sequential file.

int  export_bin(char *name);
int  export_asc(char *name);
int  export(char *name,int bin_mode=TRUE);
                Exports all data to the file 'name'. 'Name' will be a
                normal sequential file. If it already exists it will be
                overwritten. If not, it is created. The variable
                'bin_mode' controls the mode in which 'name' is
                opened. 'Bin_mode' equal to TRUE, as is the default,
                will open the export file in binary mode. A value of
                FALSE opens it in ascii mode. In addition, two inline
                functions are defined, export_bin() and export_asc(),
                which call export() with 'bin_mode' set to respectively
                TRUE and FALSE.
                The status of the OLAY class will remain unchanged.
                TRUE is returned on success, FALSE otherwise.

int  import_asc(char *name);
int  import_bin(char *name);
int  import(char *name,int bin_mode=TRUE);
                The import function appends the data from the file
                'name' to the current OLAY system. The file 'name' can
                be opened in binary or in ascii mode, controlled by the
                parameter 'bin_mode'. 'Bin_mode' equal to TRUE, as
                is the default, will open the import file in binary mode.
                A value of FALSE opens it in ascii mode. Two inline
                functions are defined, import_asc() and import_bin(),
                which call the import() function with 'bin_mode' set to
                respectively FALSE and TRUE.
                TRUE is returned on success, FALSE otherwise.


// Example
// Error checking omitted for conciseness.

    OLAY db;
    db.define("example.bin");   // Start off with a new file.
                                // (Not a prerequisite for applying
                                // the import() function.)
    db.open("example.bin",300); // Open the file.
    db.import_bin("somefile.bin");  // Load it with the data from
                                // 'somefile.bin', assuming this
                                // exists.
    db.close();                 // Close the OLAY class.



21.7 Sequential functions

Completely independent of the functions discussed so far, another set of
functions is implemented which follows, as closely as possible, the
standards set by the ANSI committee. Again, these functions are not
strictly necessary to use the OLAY class, but they have some advantages
when traversing a file sequentially. Therefore they are called 'sequential
functions'.
These functions are built around a 'file pointer'. This file pointer is
automatically moved forward with the amount of data read or written.
The OLAY class itself doesn't use a file pointer; it has no need for it. To
implement the sequential functions, a file pointer is simulated by a
variable. This variable is referred to as 'VFP', which is short for 'virtual
file pointer'. However, the OLAY class is capable of using the VFP to
optimize the process of locating a particular byte. (E.g. it is easier for
the OLAY class to locate position 'p' if it already knows where position
'p-1' is.)

Only the sequential functions use the VFP. All the other
functions DO NOT use it and consequently don't update it!
Because of this, it's probably best not to try mixing the sequential
functions with the others. But if you do, you have to reposition the VFP by
calls to the fseek() function every time you have called a function which
does not belong to the set of sequential functions!



The sequential functions are:

int  fseek()        // To position the VFP.
long ftell()        // To return the position in the file.
int  feof()         // To test for end-of-file.
int  fgetc()        // To read a character.
int  fputc()        // To write a character
int  fread()        // To read blocks of data.
int  fwrite()       // To write blocks of data.
char *fgets()       // To read strings.
int  fputs()        // To write strings.
long fdelete()      // To delete data.
int  finsert()      // To insert data.
void fflush()       // To flush the buffers.

There are no sequential functions for opening, closing or
creating the file. You still have to use the 'normal' open(),
close() and define() functions for that.

21.7.1 Sequential functions in alphabetical order

long fdelete(long amount);
                Its working is fully equivalent to the 'delet()' function,
                except in this case the position from which the data is
                deleted is controlled by the virtual file pointer.
                The first byte deleted, is the one pointed at by the VFP.
                The VFP remains unchanged. The function returns the
                number of bytes actually deleted.

int feof(void); Inline function which returns TRUE if the VFP is
                beyond the last byte, and FALSE otherwise. Because
                moving the VFP beyond 'bottom()' produces a runtime
                error, this can only mean that the VFP points exactly to
                'bottom()'.


// Example
// Error checking omitted for conciseness.

    OLAY db;
    char c;

    db.open("example.dbf");     // Accept the default 30 Kb for buffering.
    db.fseek(0);                // Don't assume the VFP is set.
    while(!db.feof())           // Read until the end.
    {
       c=db.fgetc();
       putchar(c);
    }

    db.close();     // If omitted, called by the class destructor.



void fflush(void);
                All dirty buffers are written back to disk. It is important
                to understand that this is all it does. Afterwards the file
                on disk is still in an undefined state. Only the close()
                function produces a disk file which is valid input for the
                next call to open(). Mainly implemented for
                completeness.

int fgetc(void);    The function reads an unsigned char and returns this
                    as an integer. If the end of the file is reached, fgetc()
                    returns -1. NOT EOF.

char *fgets(char *str,int num);
                The function reads up to 'num'-1 characters and copies
                them to 'str'. Bytes are read until a newline character is
                encountered or end-of-file reached. Upon success 'str'
                is returned, otherwise a NULL pointer is returned. The
                VFP is increased with the number of bytes read.

int finsert(void *buf,long amount);
                Its working is fully equivalent to the 'insert()' function,
                except the position at which the data is inserted is
                controlled by the virtual file pointer. The VFP will be
                increased with the number of bytes inserted.
                TRUE is returned on success, FALSE otherwise.

int fputc(int character);
                The function accepts an integer which is converted into
                an unsigned character and written to the position
                indicated by the VFP. (So, only one byte is written.) If
                the VFP points beyond the last byte the character is
                appended, otherwise it overwrites the existing value.
                The VFP is increased with one byte. The return value
                is the value written.

int fputs(char *str);
                It is an inline function which calls fwrite(). The contents
                of string 'str' is written to disk. The terminating zero
                however is NOT written. The VFP is increased with
                strlen(str). TRUE is returned on success, FALSE
                otherwise.

int fread(void *buf,int size,int count);
                It reads 'count' number of blocks, each of size 'size'.
                The data is copied into 'buf'. The function returns the
                number of blocks actually read. This differs from 'count'
                when an error has occurred or the end of file was
                reached. The total amount of data read can exceed
                64Kb. The VFP is increased with the amount of data
                copied to 'buf'.

int fseek(long offset,int origin=SEEK_SET);
                Its purpose is to position the VFP. Depending on the
                value of 'origin' offset is taken from:
                    a) the beginning; origin=SEEK_SET
                    b) the current position; origin=SEEK_CUR
                    c) the end; origin=SEEK_END
                Fseek() has a default value for 'origin' of SEEK_SET.
                Note that 'offset' indeed means offset and NOT
                position. Fseek(2) makes the VFP point to the third
                byte!
                The next two examples both read the first 10 bytes.

    Example 1:

       db.fseek(0);
       db.fread(buffer,10,1);

    Example 2:
       db.read(1,buffer,10);



                Fseek may be used to move the VFP one byte beyond
                the end. This makes the VFP point precisely to
                'bottom()'. Trying to move beyond that is an error. Also,
                the VFP cannot be positioned before the beginning of
                the file. When 'origin' is set to SEEK_END the value of
                'offset' needs to be positive. Fseek(0,SEEK_END)
                makes the VFP point to 'bottom()'.
                TRUE is returned on success, FALSE otherwise.
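
The positioning rules above can be modelled in a few lines. This is only a
sketch, not the library's code: resolve_vfp() is a hypothetical helper, and
the interpretation that a SEEK_END offset is counted backwards from
'bottom()' is an assumption derived from the remark that the offset has to
be positive.

```cpp
#include <cstdio>   // SEEK_SET, SEEK_CUR, SEEK_END

// Hypothetical model of the fseek() rules; not the library's implementation.
// 'size' is the number of bytes in the file and 'current' is the present VFP,
// expressed as an offset from the beginning (the value ftell() would return).
// Returns true and stores the new offset in 'result' when the move is legal.
bool resolve_vfp(long size, long current, long offset, int origin, long &result)
{
    long target;
    switch (origin)
    {
        case SEEK_SET: target = offset;           break;
        case SEEK_CUR: target = current + offset; break;
        case SEEK_END:
            if (offset < 0) return false;  // The text asks for a positive offset.
            target = size - offset;        // ASSUMPTION: counted back from bottom().
            break;
        default: return false;
    }
    // Offset 'size' is one byte beyond the last byte, i.e. bottom().
    // Anything outside 0..size is an error.
    if (target < 0 || target > size) return false;
    result = target;
    return true;
}
```

With a file of 10 bytes, resolve_vfp(10,0,2,SEEK_SET,r) yields offset 2 (the
third byte) and resolve_vfp(10,0,11,SEEK_SET,r) is rejected.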

long ftell(void);   It is an inline function which returns the number of
                    bytes the VFP is removed from the beginning of the
                    file.
                E.g.:   If the VFP points to the very first byte, ftell will
                        return zero.

int fwrite(void *buf,int size,int count);
                It writes 'count' blocks, each of size 'size'.
                The function returns the number of blocks actually
                written. This differs from 'count' only when an error has
                occurred. The total amount of data written can exceed
                64 Kb. The VFP is increased by the amount of data
                written to disk.


21.7.2 Miscellaneous functions

int already_open(void);
                This function returns 1 if the class is 'open' and 0
                otherwise.

int data_2_header(void * ptr,U16 length);
                Inherited function.

int empty(void);    Makes the file empty. The class needs to be open and
                    will still be open afterwards. Upon return the system
                    will contain 0 bytes of data and bottom() points to 1.
                    TRUE is returned on success, FALSE otherwise.

int header_2_data(void * ptr,U16 length)
                Inherited function.

void header_page_size(U16 n)
                Inherited function.


U16 max_data_in_header(void)
                Inherited function.

int pack(void)  After a long series of inserts and/or deletes the data in
                the OLAY file can become scattered. Some pages
                will contain relatively little data while other pages are
                still 100% filled. To put everything back in order it is
                advisable (although not strictly necessary) to call the
                pack() function once in a while. The pack() function
                uses a temporary file. This file will be about the same
                size as the OLAY file. So, make sure sufficient free
                disk space is available. If the free space is inadequate,
                the function will return 0, the temporary file is removed,
                and the OLAY file will be unaltered.

void page_size(U16 t);
                Inherited function.




                              22 DLAY


The DLAY class performs the same functions as the previously
discussed OLAY class. DLAY, however, is meant for far larger files.
Both classes need a complex data structure to locate a specific byte in the
file. The OLAY class keeps this data structure in RAM while DLAY
stores it on disk, in the same file it uses for the data.
As a consequence DLAY can handle files of 'unlimited' size whereas
OLAY files should be kept below 5 Mb.
Because it has its data structure in RAM, OLAY is somewhat faster than
DLAY.

So:
- Use OLAY for files below 5 Mb.
- Use DLAY for files above 5 Mb.
- If you are short on ram, use DLAY.
- If you are not sure, use DLAY.


22.1 Performance

DLAY performs quite well. To test its usefulness as 'data structure' for an
editor or word processor, the class has been tested on a 100 Mb file. No
matter the typing speed, the DLAY class was able to individually insert
every typed character in the middle of this file!
(Tests done on a 486DX2 66 MHz.)

22.2 Member functions

The public member functions of the DLAY class are 100% identical to
those of the OLAY class. Please refer to the documentation of the
OLAY class for more details.
The function prototypes are in: CSDLAY.H.








                             23 IBASE


23.1 Introduction

IBASE is a class similar to TBASE. That is: an easy-to-use class for
reading and writing records, without indexes. The 'I' in IBASE stands for
'insert'. Contrary to TBASE which can only append a record, IBASE can
insert a record anywhere in its file.

The IBASE class is derived from DLAY, which explains why it is in this
section of the documentation. If you have a file-system which can insert
and delete, then a database system which can insert and delete records
is all of a sudden easy to implement!

Deleting records no longer requires the dreaded tag/pack sequence, but
can be accomplished instantaneously.


23.2 Using IBASE

Using IBASE very much follows the same lines as using TBASE. Deleting
is now instantaneous, and records can be inserted.

The performance of IBASE is nowhere near that of TBASE. If
speed is an issue, you should consider using TBASE.

23.3 Using IBASE

IBASE works very much like TBASE. Please read the documentation on
TBASE (chapter 15) as well. Because the classes are so much alike,
IBASE will be discussed in far less detail.

23.3.1 Creating

int define(CSCHAR *name,U16  reclen);
                Creates the IBASE file 'name' for use of records with
                length 'reclen'. TRUE is returned on success, FALSE
                otherwise.

23.3.2 Opening

int open(CSCHAR *name,S16 kb=32);
                Opens the IBASE file 'name' for use. It will use at most
                'kb' Kb ram for buffering. 'kb' has a default value of 32.
                TRUE is returned on success, FALSE otherwise.
int open(void); This function returns TRUE if the class is already
                opened, FALSE otherwise.

23.3.3 Appending Records

S32 append_rec(void *data);
                Appends record 'data' to the database. The function
                returns the record number of the newly added record.
                E.g. if a record is added to an empty database the
                function will return 1. Don't forget; the first record is
                record '1'.
S32 append_rec(void);
                Extends the database with one record. Contrary to the
                previous function this one doesn't fill the new record
                with data. For the time being, the record will contain garbage.

23.3.4 Reading

Contrary to TBASE, IBASE doesn't have 'locate' functions. The reason is
that a record can now be scattered over two (or even more) different
database pages. When such a record is in the buffer system it will no
longer be lying at contiguous memory addresses, making it impossible
to access the record through a single pointer.

void read_rec(  S32 rec, void *data);
                Reads record 'rec' into 'data'. 'data' should be a buffer
                large enough to hold the record.
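
Why a copy into a caller-supplied buffer is needed can be illustrated with
a small model. The sketch below assumes a buffer system that keeps every
fixed-size page in its own memory block. The type PageList and the function
copy_record() are hypothetical names, not part of the library.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical buffer layout: the file is cut into fixed-size pages and every
// page lives in its own memory block, each at least 'page_size' bytes long.
typedef std::vector<std::vector<char> > PageList;

// Copies 'len' bytes starting at byte offset 'start' into 'dest', crossing
// page boundaries where necessary. Returns false when the range runs past
// the last page.
bool copy_record(const PageList &pages, std::size_t page_size,
                 std::size_t start, std::size_t len, void *dest)
{
    if (page_size == 0) return false;
    char *out = static_cast<char *>(dest);
    while (len > 0)
    {
        std::size_t page    = start / page_size;   // Page holding this byte.
        std::size_t in_page = start % page_size;   // Position inside that page.
        if (page >= pages.size()) return false;
        std::size_t chunk = page_size - in_page;   // Bytes left on this page.
        if (chunk > len) chunk = len;
        std::memcpy(out, &pages[page][in_page], chunk);
        out   += chunk;
        start += chunk;
        len   -= chunk;
    }
    return true;
}
```

A record that starts near the end of one page is assembled from two (or
more) separate memory blocks, which is why no single pointer into the
buffers can represent it.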

23.3.5 Writing

void write_rec(  S32    rec, void *data);
                Overwrites the existing record 'rec' with the record
                pointed to by 'data'. Record 'rec' must already exist.
                The function can not be used to add records.

23.3.6 Inserting

Inserting a record means just that. A record is inserted between two
existing records. Let's say you already have a record '2' and a record '3'
but want a new record in between, because that's where it should be
according to the alphabet.
The traditional approach is to add the record at the end of the database
and to maintain an index for the alphabetical order.

But now, thanks to the DLAY class, we can do without the index. The
record can directly be inserted at its correct position.

int insert_rec_b(S32  rec,void *p);
                Insert a new record before record 'rec'. That is: the
                new record will become record number 'rec' and the
                old record 'rec' becomes record number 'rec'+1. The
                pointer 'p' points to the data of the new record. TRUE
                is returned on success, FALSE otherwise.
int insert_rec_a(S32  rec,void *p);
                Insert a new record after record 'rec'. That is: the new
                record will become record number 'rec'+1. The old
                record 'rec' will stay at its place. The pointer 'p' points
                to the data of the new record. TRUE is returned on
                success, FALSE otherwise.

23.3.7 Deleting

Deleting records is now done instantaneously. E.g. the moment you
delete record '8', the old record '9' will become the new record '8' and so
on. In IBASE, there is no such thing as a 'delete bit'.

void delet(S32  rec);
                Deletes record 'rec'. No subsequent 'pack()' is needed.
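
The renumbering caused by inserting and deleting can be tried out on an
in-memory stand-in. The sketch below uses a std::vector instead of an IBASE
file. The model_ functions are hypothetical and only mimic the numbering
rules described for insert_rec_b(), insert_rec_a() and delet(). Unlike
delet(), the model reports an invalid record number through its return
value.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// In-memory stand-in for an IBASE file. Record numbers start at 1.
typedef std::vector<std::string> Records;

// True when 'rec' is the number of an existing record.
static bool valid_rec(const Records &db, long rec)
{
    return rec >= 1 && static_cast<std::size_t>(rec) <= db.size();
}

// The new record becomes record 'rec', the old 'rec' becomes 'rec'+1.
bool model_insert_before(Records &db, long rec, const std::string &data)
{
    if (!valid_rec(db, rec)) return false;
    db.insert(db.begin() + (rec - 1), data);
    return true;
}

// The new record becomes record 'rec'+1, the old 'rec' stays in place.
bool model_insert_after(Records &db, long rec, const std::string &data)
{
    if (!valid_rec(db, rec)) return false;
    db.insert(db.begin() + rec, data);
    return true;
}

// All records after 'rec' move up by one; no delete bit is involved.
bool model_delete(Records &db, long rec)
{
    if (!valid_rec(db, rec)) return false;
    db.erase(db.begin() + (rec - 1));
    return true;
}
```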

23.3.8 Closing

int close(void);    Closing the database. All buffers are written back to
                    disk, all allocated memory is freed. TRUE is returned
                    on success, FALSE otherwise.

23.3.9 Miscellaneous functions

For the next functions to work properly, the database has to be opened.

U16 lengthrec(void);
                Returns the length of a record. This is the same value
                as used in the call to the define() function, when the
                database was created.

S32 numrec(void);
                Returns the number of records currently in the
                database.

int pack(void); This is the pack() function inherited from the DLAY
                class! This function has nothing to do with deleting
                records. Its purpose is to compress the file DLAY uses
                to store its data. TRUE is returned on success, FALSE
                otherwise.

int empty(void);    Makes the database empty. Upon function return the
                    database will contain zero records but the database is
                    still open. TRUE is returned on success, FALSE
                    otherwise.
























                               Part

                               Five






        Part Five will present some command-line utilities.
    Most notably CSDIR, which gives a quick survey of the
               databases in the current directory.
     It also discusses the demonstration application CSADD.
          CSADD is a DOS application to store addresses.




                             24 CSDIR


CSDIR is a command-line utility similar to the well-known MS-DOS dir.
Its purpose is to list the CS-databases. By default it ignores all other
files.

        SYNTAX: csdir [filename] [/A] [/?]

        filename: the file(s) to be listed. Wildcards allowed.
        /A  List all files.
        /?  Display help.

Example of its output:


c:\bin\address>csdir


Directory C:\BIN\ADDRESS\

Name                Size      Type      Entries     Created      Updated
--------------------------------------------------------------------------
CSADR.DBF          98382      TBASE         298   Sep 20 1994  Oct 31 1994
CSADR01.IDX        40960      BTREEa        403   Oct 29 1994  Oct 31 1994
CSADR02.IDX        10752      BTREEa        104   Oct 29 1994  Oct 31 1994
CSADR03.IDX         4608      BTREEl         28   Oct 29 1994  Oct 31 1994
CSADR04.IDX         5120      BTREEa         22   Oct 29 1994  Oct 31 1994
--------------------------------------------------------------------------
Total:            159822 bytes in   5 files.



As can be seen from this example, CSDIR displays:
    - the name of the class involved.
    - the number of entries in the database.
    - in case of a btree, the number of different keys. If the same key is
        entered twice, it is counted as one entry.
    - date of creation.
    - date of last update.
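
The way btree entries are counted (a key that is entered twice counts
once) amounts to counting distinct values. A minimal sketch of that
counting rule, using a hypothetical helper that has nothing to do with how
CSDIR obtains the number:

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Counts every key once, however often it was entered.
std::size_t count_distinct_keys(const std::vector<std::string> &keys)
{
    std::set<std::string> unique(keys.begin(), keys.end());
    return unique.size();
}
```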

Example of the /a option.


c:\bin\adres>csdir /a


Directory C:\BIN\ADRES\

Name                Size      Type      Entries     Created      Updated
--------------------------------------------------------------------------
ADRES.EXE         137872       DOS                             Oct 29 1994
CSDEMIO.DEF          277       DOS                             Apr 17 1994
BACKUP.TXT         34478       DOS                             Oct 29 1994
CSADR.DBF          98382      TBASE         298   Sep 20 1994  Oct 31 1994
CSADR01.IDX        40960      BTREEa        403   Oct 29 1994  Oct 31 1994
CSADR02.IDX        10752      BTREEa        104   Oct 29 1994  Oct 31 1994
CSADR03.IDX         4608      BTREEl         28   Oct 29 1994  Oct 31 1994
CSADR04.IDX         5120      BTREEa         22   Oct 29 1994  Oct 31 1994
ERROR.ERR          12964       DOS                             Oct 27 1994
--------------------------------------------------------------------------
Database files:   159822 bytes in   5 files.
Other files:      185591 bytes in   4 files.
                -------- +        --- +
Total:            345413 bytes in   9 files.




Another example:


c:\bin\adres>csdir cs*.* /a


Directory C:\BIN\ADRES\

Name                Size      Type      Entries     Created      Updated
--------------------------------------------------------------------------
CSDEMIO.DEF          277       DOS                             Apr 17 1994
CSADR.DBF          98382      TBASE         298   Sep 20 1994  Oct 31 1994
CSADR01.IDX        40960      BTREEa        403   Oct 29 1994  Oct 31 1994
CSADR02.IDX        10752      BTREEa        104   Oct 29 1994  Oct 31 1994
CSADR03.IDX         4608      BTREEl         28   Oct 29 1994  Oct 31 1994
CSADR04.IDX         5120      BTREEa         22   Oct 29 1994  Oct 31 1994
--------------------------------------------------------------------------
Database files:   159822 bytes in   5 files.
Other files:         277 bytes in   1 files.
                -------- +        --- +
Total:            160099 bytes in   6 files.





                             25 CSINFO


CSINFO is a command-line utility to display information about a
particular database.
It only recognizes the databases made with the CSDB-library.

An example of its output:



c:\adres>csinfo csadr01.idx


  Information about database: csadr01.idx.

   Type..................:  BTREEa
   Version...............:  1.1.b
   Class compiled at.....:  Apr 25 1994, 04:28:24
   With..................:  Borland C++ 3.1

 NOTE: The above information refers to the version of the
       class used during the CREATION of the database file.

   Btree created at......:  September 20 1994, 10:02:11,47
   Btree last updated at.:  September 26 1994, 23:25:19,96
   Multiple keys allowed.:  YES
   Number of keys........:  622
   Number of blocks......:  111
   Block size............:  511 bytes
   Key size..............:  41 bytes
   Data size.............:  4 bytes
   Data degree...........:  10
   Index degree..........:  10
   Number of levels......:  4



                            26 CSERROR


Normally all the errors are read from the file 'error.err'. It has to be in
the current working directory or it cannot be found.

Using a runtime error file produces smaller executables because the
error messages are not linked in.
However, the error file is not kept open all the time and for opening a file,
some dynamic memory allocations have to be done.
This can lead to problems when the error message that has to be
displayed results from an 'out of memory' condition.
(It needs memory to say 'there is no more memory'.)

To overcome this, and other problems, the command-line utility CSERROR
can be used.
It generates a C source file which, when compiled and linked in, makes
the runtime error file redundant.


Example:
c:\borlandc>cserror error.err


This will produce a file 'error.cpp' in the current directory. Compile this
and link it in with the rest of your application. Make sure it's linked in
before the libraries.
In this way the csmess_read() function which is in the 'error.obj', will
replace the one in the library.


// Example of how the resulting 'error.cpp' file could look:
// Many errors are left out.


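#include <stdlib.h>     // ltoa() (compiler specific, e.g. Borland)
#include <string.h>     // strstr(), strcmp()
#include "csmess.h"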

char *_csa_error[]=
     {
       "Error 9370: TBASE: %s Can't write report file %s. Disk full?",
       "Error 9390: TBASE: %s Out of memory during pack().",
       "Fatal Error 9545: PAGE: %s Header_2_data(): can't perform fseek.",
       "Fatal Error 9550: PAGE: %s Write_header: can't perform fwrite.",
       "Fatal Error 9555: PAGE: %s Header_2_data(): can't perform fread.",
       "Fatal Error 9560: PAGE: %s Can't open file during definition.",
       "Error 9562: PAGE: %s Can't open report file %s.",
       "TheEnd"  //THIS HAS TO BE THE LAST LINE!!
     };

/////////////////////////////////////////////////////////////////////

char *csmess_read(long error)
{
   char tmp[25];
   ltoa(error,tmp,10);
   char **p=_csa_error;
   for(;;)
   {
      if(strstr(*p,tmp)) return *p;          // Message contains the number.
      if(!strcmp(*p,"TheEnd")) return NULL;  // Sentinel reached: not found.
      p++;
   }
}



Notice the 'TheEnd' line, which was not in the original 'error.err'
file. Never remove that line!
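
The lookup can also be written with standard library calls only. The sketch
below is a variation, not the code CSERROR generates: find_message() and
message_table are hypothetical names, and the table holds just three of the
messages shown above. It searches for "Error <number>:" rather than for the
bare digits, so that a number such as 937 cannot accidentally match the
message for 9370. Like the original it needs no dynamic memory.

```cpp
#include <cstdio>
#include <cstring>

// Same layout as the generated table: messages followed by a sentinel.
static const char *message_table[] =
{
    "Error 9370: TBASE: %s Can't write report file %s. Disk full?",
    "Error 9390: TBASE: %s Out of memory during pack().",
    "Fatal Error 9545: PAGE: %s Header_2_data(): can't perform fseek.",
    "TheEnd"  // Sentinel, has to stay the last entry.
};

// Returns the message for 'error', or NULL when the number is unknown.
const char *find_message(long error)
{
    char tag[32];
    std::snprintf(tag, sizeof tag, "Error %ld:", error);
    for (const char **p = message_table; std::strcmp(*p, "TheEnd") != 0; ++p)
    {
        if (std::strstr(*p, tag)) return *p;   // Matches "Error 9390:" etc.
    }
    return NULL;                               // Sentinel reached.
}
```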

                            27 CS4DBASE


27.1 Introduction

The database classes generated by CSDBGEN cannot import a dBASE
file directly. However, they do have a function to import an ASCII file. This
function is called import(), and it expects its input file to be in a specific
format.

CS4DBASE is a command-line utility which is able to convert a dBASE
file into a format required by the import() function.

The generated intermediate ASCII file can be manually edited to adjust
field names.


27.2 Converting

Example:


c:\bin\dbase> cs4dbase  person.dbf



Assuming there is a dBASE file 'person.dbf', this will produce the ASCII
file 'person.txt'.
At the top of the file are some lines indicating field names and types. The
import() function needs this information to correctly parse the remainder of
the file.

In the example ('person.txt') it looked like this:


Class:  CONV
Record: CONV_rec
File:   conver.dbf
field:  NAME s 40
field:  ADRE s 32
field:  CITY s 35
field:  UPDATED d

Museum Langeveld
Langevelderweg 27
Noordwijkerhout
1994/02/18

The Truck Giant
Goeverneurlaan 471
Den Haag
1993/02/16


The lines before the first blank line resemble a 'database definition file'. The
import() function ignores the first three lines but it uses the 'field' lines.
They have to match the names of the fields in the database definition file
which generated the import() function, or these fields will be skipped.

If these names do not match, it may be necessary to manually edit the
field names in the ASCII file. The order of the fields never matters.
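
To make the layout of the header concrete, the sketch below collects the
'field:' lines from such a file. It is not the code of import(). It assumes
that the header ends at the first blank line and that a 'field:' line
consists of a name, a one-letter type and an optional length, as in the
sample above. FieldDef and parse_header() are hypothetical names.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct FieldDef
{
    std::string name;   // e.g. "NAME"
    char        type;   // e.g. 's' or 'd'
    int         length; // 0 when the line carries no length
};

// Reads lines up to the first blank line and keeps the 'field:' lines.
// Other header lines (Class:, Record:, File:) are skipped.
std::vector<FieldDef> parse_header(std::istream &in)
{
    std::vector<FieldDef> fields;
    std::string line;
    while (std::getline(in, line) && !line.empty())
    {
        std::istringstream words(line);
        std::string keyword;
        words >> keyword;
        if (keyword != "field:") continue;

        FieldDef f;
        f.type   = '?';
        f.length = 0;
        words >> f.name >> f.type;
        if (!(words >> f.length)) f.length = 0;   // No length given.
        fields.push_back(f);
    }
    return fields;
}
```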


27.3 Example

Suppose the next database definition file, 'person.def', is used to
generate the class PERSON.


Class:  PERSON
Record: PERSON_rec
File:   pers.dbf
field:  NAME    s 40 T
field:  HOBBY   s 32 Y
field:  PHONE   s 15
field:  UPDATED d MDY2



From this you created a PERSON class by calling CSDBGEN.


c:\test> CSDBGEN person.def



Now you want to import some data from an old dBASE database called
'member.dbf'.
However, 'member.dbf' has different fields.

The dBASE 'display structure' command reveals something like this:


Structure for database : C:member.dbf
Number of data records :       3
Date of last update    : 04/19/95
Field  Field name  Type       Width    Dec
    1  MEMBER      Character     30
    2  TELEPHONE   Character     13
    3  INTERESTS   Character     50
** Total **                      94



Despite the differences, call CS4DBASE.


c:\test>CS4DBASE member.dbf



This produces the file 'member.txt'.

Class:  CONV
Record: CONV_rec
File:   conver.dbf
field:  MEMBER s 30
field:  TELEPHONE s 13
field:  INTERESTS s 50

Rudolf Mandrake
0592-24-2379
Ancient Building Techniques

Rachel Labrosse
913-814-1378
Extraterrestrial Encounters

Victor Plauger
312-241-2808
Home Gardening


It doesn't make much sense to import this file. All the field names are
different so the import() function will skip each and every line.
To alter that, 'member.txt' has to be edited to make the field names
match those in 'person.def'. In this example the edited file is saved as
'import.txt'.

After editing it should resemble something like:
Class:  CONV
Record: CONV_rec
File:   conver.dbf
field:  NAME   s 30
field:  PHONE  s 13
field:  HOBBY  s 50

Rudolf Mandrake
0592-24-2379
Ancient Building Techniques

Rachel Labrosse
913-814-1378
Extraterrestrial Encounters

Victor Plauger
312-241-2808
Home Gardening


Note that the order of the fields is still not the same. However, import() is
smart enough to overcome that obstacle.
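
Renaming the fields is a mechanical step which could also be done by a
small program instead of an editor. A minimal sketch, assuming the 'field:'
line layout shown above; rename_field() is a hypothetical helper and not
part of CS4DBASE or the library:

```cpp
#include <sstream>
#include <string>

// If 'line' is a 'field:' line whose name equals 'from', the line is returned
// with the name replaced by 'to'. Any other line is returned unchanged.
std::string rename_field(const std::string &line,
                         const std::string &from, const std::string &to)
{
    std::istringstream words(line);
    std::string keyword, name, rest;
    words >> keyword >> name;
    if (keyword != "field:" || name != from) return line;
    std::getline(words, rest);      // Type and length, with the leading blank.
    return "field:  " + to + rest;
}
```

Applied to every line of the header with the pairs MEMBER/NAME,
TELEPHONE/PHONE and INTERESTS/HOBBY, this gives the edited header shown
above, apart from the column alignment.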

Now, 'import.txt' can be safely loaded.

The following code could be used for that.

#include "person.h"

void main(void)
{
    PERSON pers;

//  pers.define();  // Uncomment this to create the database.

    pers.open();
    pers.import("import.txt");
    pers.close();
}

27.4 Importing large databases

As can be seen from the previous example, it is sometimes required to
load the entire ASCII file in an editor to change the field names. This is
fine for small databases, but when the number of records increases the
ASCII file can become too large.

Solution:


c:\test>CS4DBASE member.dbf /2



Note the '/2' at the end of the command.
This instructs CS4DBASE to generate two files instead of one, namely
'member.def' and 'member.txt'.

'member.def' will contain:

Class:  CONV
Record: CONV_rec
File:   conver.dbf
field:  MEMBER s 30
field:  TELEPHONE s 13
field:  INTERESTS s 50


and 'member.txt':


Rudolf Mandrake
0592-24-2379
Ancient Building Techniques

Rachel Labrosse
913-814-1378
Extraterrestrial Encounters

Victor Plauger
312-241-2808
Home Gardening


The only things which need editing are the field names, and they are
conveniently placed together in one small ASCII file!
When the editing is done, the two files have to be joined again to create
valid input for the import() function.

Like this:


c:\test>copy member.def+member.txt member.asc



'member.asc' can now be read by the import() function!




























                               Part

                                Six






  Part Six discusses the classes and functions implemented in the
                          CSA-library.
 They have nothing to do with databases so you can use or ignore
                          them at will.
        However, two chapters may require some attention.
Alloc-logging which deals with heap corruption and memory leaks.
The HEAP class for efficiently allocating large numbers of small
                             blocks.




                            28 CSTOOLS


28.1 Introduction

Cstools is a collection of odds & ends, merely intended to support the
other classes, but if you see something to your liking, please feel
free to use it.

The function prototypes are in cstools.h.

int add_path(char *filen,char *path);
                Adds path 'path' to filename 'filen'. Afterwards 'filen'
                contains the new name. It returns TRUE if successful,
                FALSE otherwise.
int  csrand(int amount);
U32 csrand(U32 amount);
                Returns a VERY random number in the range
                0..(amount-1), including both 0 and (amount-1).
int cstmpname(char *name);
                Generates the name of a non-existing file in the 'temp'
                directory. It first searches for the environment variable
                'TMP' and if not found for 'TEMP'. A filename is
                generated which does not already exist in this
                directory. The function has to be called with a
                parameter 'name' pointing to a buffer large enough to
                hold the complete drive, path and filename. If none of
                the environment variables exist, a filename for the
                current directory is produced. It only generates a
                filename, no file is actually created. The function
                returns TRUE if a unique filename was found, FALSE
                otherwise.
int disk(char *s);
                Sets the current drive and path as indicated by string
                's'. If 's' is the empty string, 's' is set to the current drive
                and path! It returns TRUE if successful, FALSE
                otherwise.
void empty_kb(void);
                Empties the keyboard buffer.
int file_exist(char *fnaam);
                Returns TRUE if file 'fnaam' exists.
char *file_ext(char *name,char *ext);
                Adds an extension to filename 'name'. It returns a
                pointer to an internal buffer which contains the new
                name. If 'name' already has an extension, it is
                overwritten. The string 'name' itself is not changed!
long filesize(char *name);
                Returns the size of file 'name'.
void filter_string(char *source,char *allowed);
                All the characters in 'source' which are not in 'allowed'
                are removed from 'source'.
void lower_upper(char *ptr);
                Converts the entire string to upper case.
long lrandom(long amount);
                Returns a long random number in the range
                0..(amount-1), including both 0 and (amount-1).
size_t next_prime(size_t pri);
                Calculates next higher prime number.
char *notabs(char *s);
                Replaces every occurrence of a tab character in 's'
                with a single space. That is: tabs are not expanded,
                but simply replaced.
char *remove_space(char *s);
                Removes ALL the blanks from the string 's'. It returns
                character pointer 's'.
unsigned int  sqrti(unsigned int n);
unsigned long sqrtl(unsigned long n);
                Calculates the square root of n WITHOUT USING
                FLOATING POINT arithmetic.
void str_split(char *source,char ch,char *first,char *last);
                Splits the string 'source' at the first occurrence of
                character 'ch'. 'ch' is included in neither the 'first' nor
                the 'last' string.
void str_strip(char *source,char *remove);
                Characters in 'source' which are also in 'remove' are
                removed from 'source'.
int str_equal(char *s1, char *s2);
                Returns TRUE if 's1' is equal to 's2', discriminating
                between upper and lower case.
void str_left(char *source,char *dest,int len);
                Copies at most 'len' number of characters from the left
                of 'source' to 'dest'.
char *string_replace_ones(char *source,char *d,char *r);
                Replaces the first occurrence of 'd' in 'source' with 'r'.
int string_replace(char *s,char *d,char *r);
                Replaces every occurrence of 'd' in 's' with 'r'. It
                returns an integer number indicating the number of
                times a substitution was made.
long time_stamp(void);
                Returns a higher long number on each successive call,
                starting with zero again when MAXLONG is reached.
void trim_string(char *s);
                Removes leading and trailing blanks from string 's'.
void wait(long msec);
                Waits 'msec' milliseconds.
void waitkb(long msec);
                Waits 'msec' milliseconds or until the next keyboard
                hit.
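
How a square root can be obtained without floating point, as sqrti() and
sqrtl() do, is shown in the sketch below. It uses the well-known
digit-by-digit (binary) method. Whether the library uses the same method is
not known, and isqrt() is a hypothetical name.

```cpp
#include <climits>

// Integer square root: the largest r for which r*r <= n.
// Only shifts, additions and comparisons are used.
unsigned long isqrt(unsigned long n)
{
    unsigned long result = 0;
    // Start with the highest power of four an unsigned long can hold.
    unsigned long bit = 1UL << (sizeof(unsigned long) * CHAR_BIT - 2);

    while (bit > n) bit >>= 2;      // Reduce to the highest power of four <= n.

    while (bit != 0)
    {
        if (n >= result + bit)
        {
            n     -= result + bit;
            result = (result >> 1) + bit;
        }
        else
        {
            result >>= 1;
        }
        bit >>= 2;
    }
    return result;
}
```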

                             29 CSKEYS


Almost all the input for the library functions is done through the cskey()
function.

Syntax:
    int cskey(void);

The return value can be one of the following symbolic constants: (defined
in CSKEYS.H )

     CTRL_A    KEY_A    KEY_a     ALT_A     DELETE      CURSOR_UP
     CTRL_B    KEY_B    KEY_b     ALT_B     END         CURSOR_DOWN
     CTRL_C    KEY_C    KEY_c     ALT_C     HOME        CURSOR_RIGHT
     CTRL_D    KEY_D    KEY_d     ALT_D     PAGE_UP     CURSOR_LEFT
     CTRL_E    KEY_E    KEY_e     ALT_E     PAGE_DOWN
     CTRL_F    KEY_F    KEY_f     ALT_F     INSERT
     CTRL_G    KEY_G    KEY_g     ALT_G     BACKSPACE
     CTRL_H    KEY_H    KEY_h     ALT_H     TAB
     CTRL_I    KEY_I    KEY_i     ALT_I     SHIFT_TAB
     CTRL_J    KEY_J    KEY_j     ALT_J     ENTER
     CTRL_K    KEY_K    KEY_k     ALT_K     ESC
     CTRL_L    KEY_L    KEY_l     ALT_L     SPACE
     CTRL_M    KEY_M    KEY_m     ALT_M
     CTRL_N    KEY_N    KEY_n     ALT_N     CTRL_DELETE
     CTRL_O    KEY_O    KEY_o     ALT_O     CTRL_HOME
     CTRL_P    KEY_P    KEY_p     ALT_P     CTRL_CURSOR_UP
     CTRL_Q    KEY_Q    KEY_q     ALT_Q     CTRL_CURSOR_DOWN
     CTRL_R    KEY_R    KEY_r     ALT_R     CTRL_CURSOR_RIGHT
     CTRL_S    KEY_S    KEY_s     ALT_S     CTRL_CURSOR_LEFT
     CTRL_T    KEY_T    KEY_t     ALT_T     CTRL_PAGE_UP
     CTRL_U    KEY_U    KEY_u     ALT_U     CTRL_PAGE_DOWN
     CTRL_V    KEY_V    KEY_v     ALT_V     CTRL_END
     CTRL_W    KEY_W    KEY_w     ALT_W
     CTRL_X    KEY_X    KEY_x     ALT_X
     CTRL_Y    KEY_Y    KEY_y     ALT_Y
     CTRL_Z    KEY_Z    KEY_z     ALT_Z

     F1     SHIFT_F1    CTRL_F1   ALT_F1    KEY_1   ALT_DELETE
     F2     SHIFT_F2    CTRL_F2   ALT_F2    KEY_2   ALT_HOME
     F3     SHIFT_F3    CTRL_F3   ALT_F3    KEY_3   ALT_CURSOR_UP
     F4     SHIFT_F4    CTRL_F4   ALT_F4    KEY_4   ALT_CURSOR_DOWN
     F5     SHIFT_F5    CTRL_F5   ALT_F5    KEY_5   ALT_CURSOR_RIGHT
     F6     SHIFT_F6    CTRL_F6   ALT_F6    KEY_6   ALT_CURSOR_LEFT
     F7     SHIFT_F7    CTRL_F7   ALT_F7    KEY_7   ALT_PAGE_UP
     F8     SHIFT_F8    CTRL_F8   ALT_F8    KEY_8   ALT_PAGE_DOWN
     F9     SHIFT_F9    CTRL_F9   ALT_F9    KEY_9   ALT_END
     F10    SHIFT_F10   CTRL_F10  ALT_F10   KEY_0
     F11    SHIFT_F11   CTRL_F11  ALT_F11
     F12    SHIFT_F12   CTRL_F12  ALT_F12

The predefined values for the 'normal' keys like 'A', 'a' or '1' are
the same as the ASCII values. This means you are not forced
to type things like

            if( KEY_A==cskey() ) ....

        but can also use:

            if( 'A'==cskey() )  ......


29.1 CSKEYS.exe

This simple command-line utility displays an integer value corresponding
to the pressed key. It is the same value the cskey() function returns and
can therefore be used to make additions to the list of symbolic constants.

It is also a simple utility to test the return value of the cskey() function.


Syntax:


c:\tmp>cskeys



To exit the program press CTRL-END.

                              30 DATE

The csDATE class is implemented to aid in dealing with dates.
It has built-in functions to convert to and from julian dates.
There are also functions to read and write dates as strings.
The julian-date routines require floating point, therefore the csDATE
class tries to avoid these routines whenever possible. As a consequence,
using the csDATE class doesn't necessarily mean a floating point
library has to be linked in; it depends on the functions used.


30.1 Example

A quick example to show what we are talking about:


#include "iostream.h"
#include "csdate.h"

void main(void)
{
    csDATE d;

    d.format(Y4MD); // Choose the Year/Month/Day format.
    d="1967/04/23"; // Set 'd' to April 23rd, 1967.

    d+=100;         // Add 100 days. (Requires floating point)

    csDATE e;       // A new instance of the csDATE class.

    e.now();        // Set 'e' to the system date.

    cout<<endl<<e-d;// Print the number of days in between.
                    // (Requires floating point)

    cout<<endl<<(char *)e;  // Print 'e'.
}




30.2 Initialising

The next functions can be used to initialise a csDATE instance:

void month(int m);   Sets the month to 'm'. January is 1, February is 2, etc.
void year(int y);    Sets the year, using a 4 digit format.
void day(int d);     Sets the day.
void now(void);      Sets the date to the system clock.
void julian(long j); Sets the date to the julian date 'j'.
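
A julian date is a running day count, which is what makes adding days and
subtracting dates simple. The sketch below computes such a count with the
well-known integer formula of Fliegel and Van Flandern. It is a different
technique from the floating point routines of csDATE, the numbers it
produces are not necessarily the ones julian() uses, and
julian_day_number() is a hypothetical name.

```cpp
// Day number for a date in the Gregorian calendar.
// All divisions are integer divisions that truncate toward zero.
long julian_day_number(int year, int month, int day)
{
    long y = year;
    long m = month;
    long d = day;
    long a = (m - 14) / 12;     // -1 for January and February, otherwise 0.

    return (1461 * (y + 4800 + a)) / 4
         + (367 * (m - 2 - 12 * a)) / 12
         - (3 * ((y + 4900 + a) / 100)) / 4
         + d - 32075;
}
```

The difference between two day numbers is the number of days in between.
For example, the day numbers of 1967/04/23 and 1967/08/01 differ by 100.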


30.3 Converting Strings

The assignment operator is overloaded to be able to assign a string to
the date instance. For this to work properly, the class has to know the
format in which the date is represented.

The format used has to be indicated by a call to the format() function.

Example:

#include "csdate.h"

void main(void)
{
    csDATE date;
    date.format(DMY4);
    date="27/02/1994";   // February 27th, 1994

}



In the header file 'CSDATE.H', constants are defined for the following
formats:

    MDY2    MY2D
    Y2MD    Y2DM
    DMY2    DY2M
    MDY4    MY4D
    Y4MD    Y4DM
    DMY4    DY4M


There is logic behind these formats!
'M'         means Month
'D'         means Day
'Y4'        means Year with 4 positions
'Y2'        means Year with 2 positions

So:     Y2MD means: year (two positions)/month/day.
        MDY4 means: month/day/year (four positions).

When 'Y2' is used, years >75 are interpreted as 20th century (19xx),
years <=75 as the 21st century (20xx)!

Example:

#include "csdate.h"

void main(void)
{
    csDATE date;
    date.format(Y2MD);

    date="80/04/15";        // April 15th, 1980
    date="20/04/15";        // April 15th, 2020

}


The default format is DMY4.
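
The rule for two-digit years given above amounts to a simple pivot on
the value 75. The following is only a minimal sketch of that rule, not
the library's own code; the function name 'expand_y2' is hypothetical
and does not appear in CSDATE.H.

```cpp
// Hypothetical helper, not part of CSDATE.H: expands a two-digit year
// according to the rule described for the 'Y2' formats.
int expand_y2(int yy)
{
    // Years above 75 belong to the 20th century (19xx),
    // years up to and including 75 to the 21st century (20xx).
    return (yy > 75) ? 1900 + yy : 2000 + yy;
}
```

Under this rule 80 becomes 1980 and 20 becomes 2020, matching the
comments in the example above.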


30.4 Obtaining date info

The following functions can be used to 'read out' information about the
csDATE instance.

int week_day(void);
                Returns the day of the week. Monday=1, Tuesday=2,
                etc.
int month(void);    Returns the month [1..12].
int month(csCHAR *);
                Also returns the name of the calendar month,
                [January, February ... December].
int year(void); Returns the year in 2 or 4 digits, depending on the
                format used. Remember, the default format is DMY4,
                which implies 4 digits.
int year4(void);    Returns the year with 4 digits, independent of the
                    format chosen.
long julian(void);
                Returns the julian date.
operator char*();
                Casts the date to a string with respect to the format
                used.

Example:


#include "iostream.h"
#include "csdate.h"

void main(void)
{
    csDATE date;
    date.now();
    date.format(DMY2);
    cout<<(char *)date<<endl;   // Displays e.g. 25/04/95

}



30.5 Comparing dates

All the comparison operators, such as <= and !=, are overloaded to compare
instances of the csDATE class. The csDATE instances do NOT have to
use the same format for the comparison operators to work properly.

Example:


#include "iostream.h"
#include "csdate.h"

void main(void)
{
    csDATE d1,d2;

    d1.now();

    d2.format(MDY4);
    d2="01/01/2000";

    if(d1>=d2)  cout<<" The turn of the century!"<<endl;

}



30.6 Arithmetic

It is possible to apply simple arithmetic to dates: adding or
subtracting days, and subtracting one date from another.
This requires the use of Julian dates and consequently the use
of floating point.

Example:



#include "iostream.h"
#include "csdate.h"

void main(void)
{
    csDATE d1,d2;

    d1.now();
    d1+=100;        // Add 100 days
    d1-=300;        // Subtract 300 days.

    d2.format(Y4MD);
    d2="1999/04/20";

    cout<<d2-d1<<endl; // Display the number of days in between.

}
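
Subtracting two dates comes down to converting both to a running day
number and subtracting those numbers. The sketch below illustrates this
with a widely used integer formula for the Julian Day Number of a
Gregorian calendar date. It is an illustration only: the function name
'day_number' is hypothetical, the library itself is documented to use
floating point, and its julian() function may use a different numbering.

```cpp
// Hypothetical helper, not part of CSDATE.H.
// Returns the Julian Day Number of a Gregorian calendar date.
long day_number(int year, int month, int day)
{
    long a = (14 - month) / 12;      // 1 for January and February, else 0
    long y = year + 4800L - a;       // shifted year, starting in March
    long m = month + 12 * a - 3;     // March = 0 ... February = 11
    return day + (153 * m + 2) / 5 + 365 * y
               + y / 4 - y / 100 + y / 400 - 32045L;
}
```

With this helper, day_number(1967,8,1) - day_number(1967,4,23) yields
100, the number of days between the two dates.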



30.7 Miscellaneous

int format(void);   Returns the format used.
int leap_year(void);
                Returns TRUE if the date falls in a leap year, FALSE
                otherwise.
int long_year(void);
                Returns TRUE if 4 positions are used to represent the
                year, FALSE otherwise.
S32 sem_jul(void);
                'sem_jul' is short for 'semi julian'. It converts each date
                into a unique 32-bit number, using the formula:
                year*512+month*32+day. This number is convenient
                for storing or comparing dates.
void sem_jul(S32 l);
                The reverse of the previous function. It sets the csDATE
                instance according to the 'sem_jul' number 'l'.
int valid(void);    Returns TRUE if the date is a valid calendar date,
                    FALSE otherwise.
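
The 'sem_jul' formula can be worked through by hand. The sketch below
shows the packing and the matching unpacking; it is an assumption about
how the reverse step could be done, not the library's implementation.
The names 'to_sem_jul' and 'from_sem_jul' are hypothetical, and 'long'
stands in for the library's S32 type.

```cpp
// Hypothetical stand-ins for the two sem_jul() members.
// 'long' is assumed to hold at least 32 bits.
long to_sem_jul(int year, int month, int day)
{
    return (long)year * 512 + month * 32 + day;
}

void from_sem_jul(long s, int &year, int &month, int &day)
{
    day   = (int)(s % 32);          // the day sits in the lowest 5 bits
    month = (int)((s / 32) % 16);   // the month in the next 4 bits
    year  = (int)(s / 512);         // the remainder is the year
}
```

For April 23rd 1967 the number is 1967*512 + 4*32 + 23 = 1007255.
Because month*32+day never reaches 512, the numbers sort in the same
order as the dates. Note that the difference between two such numbers
is not the number of days in between.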





                              31 HEAP


31.1 Purpose

Some types of applications allocate numerous small blocks from the
heap. This is inefficient in terms of RAM utilization, can lead to heap
fragmentation and is also slow.
To overcome these problems this library contains a special HEAP class.

The idea is to do allocations in chunks of about 2Kb and take the small
amounts from that, when needed.

This approach has considerable advantages.
    - You can release all allocated memory with just one function call
        instead of freeing many small blocks separately.
    - It is a lot faster because normal heap operations are relatively slow.
    - Heap efficiency is also improved. It is much easier for the heap to
        deal with relatively few allocations of about 2Kb than it is to deal
        with numerous small allocations.
    - It can save valuable memory. There is considerable overhead
        involved in using the heap. Apart from what you need, several
        additional bytes are used to link the allocated blocks
        together. In addition, allocations are done in multiples of 16
        bytes. This can lead to a situation where you need only 13 bytes
        while 32 bytes are used!
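
The idea of taking large pages from the heap and handing out small
fixed-size blocks from them can be sketched as follows. This is NOT the
implementation of the HEAP class, only an illustration of the technique;
the class name 'FixedPool' and its member names are hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical illustration of a page-based, fixed-size allocator.
class FixedPool
{
    union Align { void *p; long l; double d; };  // alignment unit
    struct Page { Page *next; };     // header at the start of each page
    struct Free { Free *next; };     // link stored inside a free block

    Page *pages;                     // all pages taken from the heap
    Free *free_list;                 // blocks ready to be handed out
    std::size_t block_size;
    std::size_t page_size;

    static std::size_t round_up(std::size_t n)
    {
        return (n + sizeof(Align) - 1) / sizeof(Align) * sizeof(Align);
    }

    // Take one more page from the heap and cut it into blocks.
    bool grow()
    {
        std::size_t header = round_up(sizeof(Page));
        if (header + block_size > page_size) return false;
        Page *pg = (Page *)std::malloc(page_size);
        if (!pg) return false;
        pg->next = pages;
        pages = pg;
        char *p   = (char *)pg + header;
        char *end = (char *)pg + page_size;
        for (; p + block_size <= end; p += block_size)
        {
            Free *f = (Free *)p;
            f->next = free_list;
            free_list = f;
        }
        return true;
    }

    FixedPool(const FixedPool &);              // copying not supported
    FixedPool &operator=(const FixedPool &);

public:
    FixedPool(std::size_t alloc_size, std::size_t pg_size = 2048)
        : pages(0), free_list(0),
          block_size(round_up(alloc_size ? alloc_size : 1)),
          page_size(pg_size) {}

    ~FixedPool() { release_all(); }

    // Hand out one block; returns 0 when no memory is available.
    void *alloc()
    {
        if (!free_list && !grow()) return 0;
        Free *f = free_list;
        free_list = f->next;
        return f;
    }

    // Put a block back on the free list for reuse.
    void release(void *p)
    {
        Free *f = (Free *)p;
        f->next = free_list;
        free_list = f;
    }

    // Give every page back to the heap in one sweep.
    void release_all()
    {
        while (pages)
        {
            Page *n = pages->next;
            std::free(pages);
            pages = n;
        }
        free_list = 0;
    }
};
```

In this sketch the free list is threaded through the unused blocks
themselves, so handing out or returning a block costs only a few pointer
operations, and the normal heap only ever sees page-sized requests.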


31.2 When to use it?

The HEAP class is particularly useful when dealing with pointer
structures in RAM. Pointer structures are small, all of the same size and
an application will probably use a lot of them.
The BUFFER class described earlier in this documentation also uses the
HEAP class. This means you can use it without enlarging your
application.
The HEAP class assumes allocations of a fixed size. This limits its
usefulness but improves efficiency. It is perfectly possible to use more
than one instance of the HEAP class in an application. It is feasible to
use a different HEAP for every size of allocation needed.
The class was designed with small allocations in mind, something below
50 bytes. It is doubtful whether the HEAP class still makes sense for
allocations above 100 bytes.

Summarizing:
- Allocations have to be of a fixed size.
- Allocations have to be small, below 50 bytes.
- Many allocations of this type are going to take place.


31.3 Using HEAP.

Using the HEAP class starts off with an initialization, stating the size of
the allocations. Afterwards the class has to be 'opened'. From there on
allocations can be made, and blocks can be freed again. When the
work is done, the close or the zap function can be called to free all
allocated memory.



// Example

#include "csheap.h"

void main(void)
{
    typedef struct
    {
        void *next;
        void *prev;
        int  number;
    } pStruct;          // A typical pointer structure.

    HEAP heap;          // HEAP class instance.

    heap.init(sizeof(pStruct)); // Initialize it for the size of the
                            // pointer structure.

    heap.open();            // Open the class so it can be used.

    pStruct *p,*q;

    p=(pStruct *)heap.malloc(); // Allocation.
    q=(pStruct *)heap.malloc(); // Allocation.

    p->next=q->prev=NULL; // Doing something.

    // Doing much more.

    heap.free(q);       // Freeing q.

    heap.close();       // Finally finished.
                        // Freeing all allocated memory.

}




31.4 Functions in alphabetical order.

The function prototypes are in CSHEAP.H.

void close(void);
                Closes the class. All allocated memory is freed. The
                initialisation parameters are retained, which makes it
                possible to reopen the class without calling init(). This
                function is also called by the class destructor.
void empty(void);
                Frees all allocated memory, but the class remains
                open.
void free(void *p);
                Frees the previously allocated block that 'p' points to.
void init(U16 alloc_size,U16 page_size=2048);
                Initializes the class. 'alloc_size' is the size of the
                allocations needed. 'page_size' is the size of the
                chunks which are going to be allocated from the heap.
                This parameter has a default value of 2048 bytes.
void *malloc(void);
                Allocates a block and returns a pointer to it. If no
                memory is available, a NULL pointer is returned.
int open(void); Opens the class. The 'init()' function has to be called
                first. The function returns TRUE on success and
                FALSE otherwise.
void zap(void); Frees all allocated memory and closes the class. The
                initialization parameters set by the init() function are
                reset. This means the class has to be initialized again
                before it can be reopened.




                         32 Alloc-Logging

32.1 Introduction

Dynamic memory allocations can create problems which are difficult
to trace. Therefore, this library contains a set of functions which can
be used to replace the normal malloc() and free() functions. The
replacements can be made to write a record to a log. This log can be
used later to check for memory leaks.

The replacements also test for things like freeing a NULL pointer or a
malloc which returns NULL.

The replacements are preprocessor macros which can be switched on
and off with the preprocessor symbol CS_DEBUG.
If CS_DEBUG is not defined, the normal functions are called.
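
How such a switchable replacement can be built is sketched below. The
names 'demo_malloc', 'demo_free', 'demo_log_malloc' and 'demo_log_free'
are hypothetical and were chosen to avoid any suggestion that this is
how csmalloc() and csfree() are really implemented. For simplicity the
sketch writes to stderr where a real implementation would write to a
log file.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical wrappers, for illustration only.
void *demo_log_malloc(std::size_t size, const char *file, int line)
{
    void *p = std::malloc(size);
    std::fprintf(stderr, "%p file %s line %d: malloc() %lu bytes\n",
                 p, file, line, (unsigned long)size);
    if (p == 0)
        std::fprintf(stderr, "file %s line %d: malloc() returned NULL\n",
                     file, line);
    return p;
}

void demo_log_free(void *p, const char *file, int line)
{
    if (p == 0)
        std::fprintf(stderr, "file %s line %d: free() of a NULL pointer\n",
                     file, line);
    else
        std::fprintf(stderr, "%p file %s line %d: free()\n", p, file, line);
    std::free(p);
}

#ifdef CS_DEBUG
  // Debug build: every call records where it was made.
  #define demo_malloc(n) demo_log_malloc((n), __FILE__, __LINE__)
  #define demo_free(p)   demo_log_free((p), __FILE__, __LINE__)
#else
  // Production build: the normal functions, without extra overhead.
  #define demo_malloc(n) std::malloc(n)
  #define demo_free(p)   std::free(p)
#endif
```

Because the macros expand at the place of the call, __FILE__ and
__LINE__ identify the caller and not the wrapper itself.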


32.2 Replacements

Replacements are available for the following functions.
The prototypes are in csmalloc.h.

    Function:                Replacement:
    malloc                   csmalloc
    calloc                   cscalloc
    realloc                  csrealloc
    free                     csfree
    farmalloc                csfarmalloc
    farcalloc                csfarcalloc
    farrealloc               csfarrealloc
    farfree                  csfarfree

The replacements are used in exactly the same way as the originals.

The allocations in the CS-libraries are always done by calling the
replacements. The production version of the libraries was compiled
without CS_DEBUG being defined, so the normal functions are used and
are called without any additional overhead. When the debug version was
compiled, CS_DEBUG was defined and as a result all the allocations
done by the library functions can be logged!


32.3 Logging

Logging of allocations can be switched on and off through the use of two
functions.

void alloc_logging(int TrueFalse);
                After a call to alloc_logging(TRUE) all the allocations
                are logged in the ASCII file 'malloc.log'. Calling
                alloc_logging(FALSE) switches the logging off.
void alloc_logging(int TrueFalse,char *name);
                This is basically the same as the previous function but
                has the additional option of specifying the name of the
                log file.

The following example shows part of an allocation log.



 4E79:0004         file csedst30.cpp    line 8:  malloc()   8 bytes
 4E7A:0004         file csedst30.cpp    line 8:  malloc()   8 bytes
 4E78:0004         file csedst30.cpp    line 23: free()
 4E78:0004         file csedst30.cpp    line 8:  malloc()   9 bytes
 4E79:0004         file csedst30.cpp    line 23: free()
 4E7B:0004         file csedst30.cpp    line 8:  malloc()   22 bytes
 4E7B:0004         file csedst28.cpp    line 12: realloc  free()
 4E7B:0004         file csedst28.cpp    line 12: realloc  malloc()
 4E7A:0004         file csedstr.cpp     line 15: free()
 4E7B:0004         file csedstr.cpp     line 15: free()




As can be seen, the first column displays the pointer involved, the second
and third display the file and the line where the call was made. For an
allocation, the size is also displayed. Reallocs appear as two
lines.


32.4 Memory Leaks.

With a log like this it is easy to check for memory leaks. In fact, a
command-line utility is supplied to check for that. It is called CSMALLOC.
Only one parameter needs to be supplied: the name of the allocation log.

Example

c:\test>CSMALLOC malloc.log




If it encounters a malloc which is not matched by a free, it displays the
pointer involved.
Like:



 UNMATCHED address: 29CC:0004



If all mallocs are matched by a free, it says something like this:



  NO ERRORS encountered!!


  Number of addresses: 23
  Lowest address:  29CC:0004
  Highest address: 2EAE:0004



If malloc logging is kept on for longer periods, the log file can become
extremely large. However, this poses no problem for CSMALLOC.
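
The matching of allocations against frees can be pictured with the
following sketch. It is not the source of the CSMALLOC utility, which
may work quite differently; the function name 'unmatched_addresses' is
hypothetical, and the sketch assumes log lines laid out like the sample
log shown earlier (address first, then the text "malloc()" or "free()"
somewhere on the line).

```cpp
#include <istream>
#include <set>
#include <sstream>
#include <string>

// Hypothetical sketch: returns the addresses that were allocated
// but never freed according to the log.
std::set<std::string> unmatched_addresses(std::istream &log)
{
    std::set<std::string> open;      // addresses currently allocated
    std::string line;
    while (std::getline(log, line))
    {
        std::istringstream fields(line);
        std::string address;
        if (!(fields >> address)) continue;   // skip blank lines

        // "realloc  malloc()" and "realloc  free()" lines are caught
        // by the same two tests.
        if (line.find("malloc()") != std::string::npos)
            open.insert(address);
        else if (line.find("free()") != std::string::npos)
            open.erase(address);
    }
    return open;
}
```

Every address left in the returned set corresponds to one 'UNMATCHED
address' line in the output described above.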

If you are planning to use this method to detect memory leaks,
it is essential to switch on the logging before the first allocation
is done. Don't forget class constructors do allocations as well!





                              33 csSTR


The class for manipulating strings. Instead of providing the formal
syntax we will clarify things by supplying a large number of examples.

Class name: csSTR.  Prototypes are in CSSTR.H.


   #include "csstr.h"

   void main(void)
   {
      csSTR str;

      str=" A test ";       // Assign a string
      str.upper();          // Convert to upper case
      str.lower();          // Convert to lower case
      str.trim();           // Remove all leading and trailing blanks
                            // str contains now: "a test";
      str+=" At the end ";  // APPEND at the end
                            // str contains now: "a test At the end "
      int i=-345;
      str=i;                // Convert the integer value to string
                            // str contains now: "-345";
      str="1001";
      i=str;                // Assign string to integer
      str="A line";
      str.strip("ijkl");    // Stripping the characters i,j,k,l from
                            // str.
                            // Str now contains: "A ne";
      str="A line";
      str.filter("ijkl");   // Allow only the characters i,j,k,l in
                            // str.
                            // Str now contains: "li";
      str="The quick brown fox";
      str[4]='Q';           // Str now contains: "The Quick brown fox"

      csSTR str2="by C++ !";
      str="Made possible ";
      str=str+str2;         // Str: "Made possible by C++ !";

      if( str<str2)   ..
      if( str>str2)   ..
      if( str<=str2)  ..
      if( str>=str2)  ..
      if( str==str2)  ..    // Comparisons are possible.

  }
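
The difference between strip() and filter() is easily confused. The
sketch below reproduces their behaviour as shown in the example above,
using the standard string class instead of csSTR. The function names
'strip_chars' and 'filter_chars' are hypothetical; this is not the
library's code.

```cpp
#include <string>

// Hypothetical equivalent of strip(): removes every character of 's'
// that occurs in 'set'.
std::string strip_chars(const std::string &s, const std::string &set)
{
    std::string out;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        if (set.find(s[i]) == std::string::npos)
            out += s[i];
    return out;
}

// Hypothetical equivalent of filter(): keeps only the characters of
// 's' that occur in 'set'.
std::string filter_chars(const std::string &s, const std::string &set)
{
    std::string out;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        if (set.find(s[i]) != std::string::npos)
            out += s[i];
    return out;
}
```

Applied to "A line" with the set "ijkl", the first function gives
"A ne" and the second gives "li".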



