
January 25, 2003; TreePad PLUS, TreePad SAFE and TreePad Business Edition file format versions 3.x/4.x/5.x/6.x

Technical documentation on the structure of the TreePad file format.

TreePad file format for TreePad .hjt, .htmhjt (and .tpz) files:

The basic file format is text/ascii.


File version section

Inside the ASCII file, the first line of a TreePad file should contain the version number of the TreePad program which last wrote to the file:

Example: <hj-Treepad version 2.7> 
Example: <Treepad version 4.0>
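
A reader can check this first line before parsing anything else. The following is a minimal sketch in Python, not part of TreePad itself; the function name parse_version_line is hypothetical, and the pattern assumes that only the two header forms shown in the examples above occur (matched case-insensitively, which is also an assumption).

```python
import re

# Matches header lines such as "<Treepad version 4.0>" or "<hj-Treepad version 2.7>".
# Assumption: only these two forms exist; trailing whitespace is tolerated.
_HEADER = re.compile(r"^<(?:hj-)?Treepad version (\d+(?:\.\d+)*)>\s*$", re.IGNORECASE)


def parse_version_line(line):
    """Return the version string from a header line, or None if the line is not a header."""
    match = _HEADER.match(line)
    return match.group(1) if match else None
```

A caller would typically reject or warn about a file whose first line yields None.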


Node section

The TreePad data file consists of a large number of nodes, each representing an article/node combination which is part of the tree.

Each node looks like this:

  (Unique identifier)
  (Data type indicator)
  <node>
  (Node title)
  (level indicator, 0 meaning at tree level 0, 1 meaning one level below that, etc.)
  the following lines are part of the article text, until the closing line ('end node'), which is:
  <end node> 5P9i0s8y19Z

* When 'unique identifier' is not present or is equal to zero, TreePad 3.1 and higher will automatically generate a unique identifier. This tag has been introduced in version 3.1 and will be ignored by earlier versions.

* When 'data type indicator' is not present, it is assumed that the article is of type 'plain text'. The data type indicator has been introduced by TreePad freeware, version 2.7 and will be ignored by earlier TreePad versions, so that the file format is backwards compatible. 
Please note: TreePad freeware will not be able to read html and rtf articles correctly. TreePad PLUS files containing only plain text articles are backwards compatible with any previous TreePad version.

* A TreePad RTF article should start with the string {\rtf, a TreePad HTML article should start with the line '<html>', a TreePad XML article should start with '<?xml'.

* Usually standard TreePad 4.x and 3.x (.hjt) files will contain Rich Text and Plain Text articles, and 'HTML TreePad files' (extension .htmhjt) will contain HTML and Plain Text articles. 


Example 1, a plain-text article

  id=6
  dt=text
  <node>
  mail from the President
  4
  Dear sir,

  I would like to invite all TreePad users into the Oval Office
  to help me better organize the country.
  Sincerely,

  G. Bush,
  the White House
  Washington
  <end node> 5P9i0s8y19Z

Example 2, an HTML article

  id=6
  dt=HTML
  <node>
  mail from the President
  4
  <html>
  <body>
  <font face=arial size=3>
  <p>Dear sir,</p>
  <p>I would like to invite all TreePad users into the Oval Office
  to help me better organize the country.
  Sincerely,</p>
  <p>G. Bush,
  the White House
  Washington</p>
  </body>
  </html>
  <end node> 5P9i0s8y19Z

Example 3, an RTF article

  id=6
  dt=RTF
  <node>
  bush
  1
  {\rtf1\ansi\deff0\deftab254{\fonttbl{\f0\fnil\fcharset0 arial;}}{\colortbl\red0\green0\blue0;\red255\green0\blue0;\red0\green128\blue0;\red0\green0\blue255;\red255\green255\blue0;\red255\green0\blue255;\red128\green0\blue128;\red128\green0\blue0;\red0\green255\blue0;\red0\green255\blue255;\red0\green128\blue128;\red0\green0\blue128;\red255\green255\blue255;\red192\green192\blue192;\red128\green128\blue128;\red0\green0\blue0;}\wpprheadfoot0\paperw12240\paperh15840\margl1880\margr1880\margt1440\margb1440\margh720\margf720{\*\pnseclvl1\pnucrm\pnstart1\pnhang\pnindent720{\pntxtb}{\pntxta{.}}}
  {\*\pnseclvl2\pnucltr\pnstart1\pnhang\pnindent720{\pntxtb}{\pntxta{.}}}
  {\*\pnseclvl3\pndec\pnstart1\pnhang\pnindent720{\pntxtb}{\pntxta{.}}}
  {\*\pnseclvl4\pnlcltr\pnstart1\pnhang\pnindent720{\pntxtb}{\pntxta{)}}}
  {\*\pnseclvl5\pndec\pnstart1\pnhang\pnindent720{\pntxtb{(}}{\pntxta{)}}}
  {\*\pnseclvl6\pnlcltr\pnstart1\pnhang\pnindent720{\pntxtb{(}}{\pntxta{)}}}
  {\*\pnseclvl7\pnlcrm\pnstart1\pnhang\pnindent720{\pntxtb{(}}{\pntxta{)}}}
  {\*\pnseclvl8\pnlcltr\pnstart1\pnhang\pnindent720{\pntxtb{(}}{\pntxta{)}}}
  {\*\pnseclvl9\pndec\pnstart1\pnhang\pnindent720{\pntxtb{(}}{\pntxta{)}}}
  \endnhere\sectdefaultcl{\pard{\ql\li0\fi0\ri0\sb0\sl\sa0 \plain\f0\fs24\cf0  Dear sir,\par
  \ql\li0\fi0\ri0\sb0\sl\sa0 \plain\f0\fs24\cf0 \par
  \ql\li0\fi0\ri0\sb0\sl\sa0 \plain\f0\fs24\cf0   I would like to invite all TreePad users into the Oval Office\par
  \ql\li0\fi0\ri0\sb0\sl\sa0 \plain\f0\fs24\cf0   to help me better organize the country.\par
  \ql\li0\fi0\ri0\sb0\sl\sa0 \plain\f0\fs24\cf0   Sincerely,\par
  \ql\li0\fi0\ri0\sb0\sl\sa0 \plain\f0\fs24\cf0 \par
  \ql\li0\fi0\ri0\sb0\sl\sa0 \plain\f0\fs24\cf0   G. Bush,\par
  \ql\li0\fi0\ri0\sb0\sl\sa0 \plain\f0\fs24\cf0   the White House\par
  \ql\li0\fi0\ri0\sb0\sl\sa0 \plain\f0\fs24\cf0   Washington}}
  }
  <end node> 5P9i0s8y19Z

  
Example 4, an XML article, which will be displayed as a form in TreePad PLUS/SAFE/BIZ 6.0.

  id=6
  dt=XML
  <node>
  My Yahoo account
  1
  <?xml version="1.0"?>
  <passworditem version="1" pxpos_col1="30" pxpos_col2="190">
  <title caption="Title" pxwidth="440">Yahoo ID</title>
  <account caption="Account" pxwidth="440">Fido</account>
  <loginname caption="Login name"  pxwidth="300" >Fido</loginname>
  <password caption="Password" type="password"  pxwidth="300" >MyDog</password>
  <emailaddress caption="Email address" type="email" pxwidth="300">Fido@yahoo.com</emailaddress>
  <URL caption="URL" type="url" pxwidth="440">website.yahoo.com/fido</URL>
  <hint caption="Question/hint" pxwidth="440">What is the name of my dog?</hint>
  <answer caption="Answer" pxwidth="440">Fido</answer>
  <remarks caption="Remarks" pxwidth="440" multirow="true" pxheight="200">No real data. Only for testing purposes</remarks>
  </passworditem>
  <end node> 5P9i0s8y19Z


The string 5P9i0s8y19Z is included to make sure that when someone types
<end node> in the article, the program will not get confused into thinking
the node has ended there (in the middle of the article). 
The chance that someone types <end node> 5P9i0s8y19Z by accident into the article area
is very much smaller. Not a very beautiful implementation, but it's simple
and it works.
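
The node layout and the closing line described above can be read with a small loop. This is a sketch in Python under stated assumptions, not the TreePad implementation: the names read_nodes and SKIPPED_BLOCKS are hypothetical, lines are assumed to arrive without line endings and without the two-space indentation used for the examples in this text, and the bookmark and draft pad blocks are simply skipped.

```python
END_NODE = "<end node> 5P9i0s8y19Z"

# Non-node blocks that are skipped here (assumption: they never nest).
SKIPPED_BLOCKS = {
    "<bmarks>": "<end bmarks> 5P9i0s8y19Z",
    "<scrpbk>": "<end scrpbk> 5P9i0s8y19Z",
}


def read_nodes(lines):
    """Yield one dict per node from an iterable of lines (without line endings)."""
    it = iter(lines)
    tags = {}
    for line in it:
        if line in SKIPPED_BLOCKS:
            end_marker = SKIPPED_BLOCKS[line]
            for skipped in it:
                if skipped == end_marker:
                    break
            continue
        if line != "<node>":
            # Line before <node>: keep key=value pairs, ignore anything else.
            key, sep, value = line.partition("=")
            if sep:
                tags.setdefault(key, []).append(value)
            continue
        title = next(it, "")
        level_text = next(it, "0")
        article = []
        for body_line in it:
            if body_line == END_NODE:
                break
            article.append(body_line)
        yield {
            "tags": tags,
            "title": title,
            "level": int(level_text) if level_text.strip().isdigit() else 0,
            "article": "\n".join(article),
        }
        tags = {}
```

Tag values are kept as lists because a tag can appear more than once before a node. A line that contains only <end node> without the trailing string stays part of the article.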

The order in which the nodes are listed in the TreePad file is determined
by the order in which they would appear in a fully expanded tree,
beginning at the top, and ending at the bottom.

E.g. a sequence of levels
(the numbers are indicating the node levels):

1 - 2 - 3 - 3 - 3 - 2

corresponds to the tree structure
(the numbers are indicating the node levels):

1 ___2
|    |___3
|    |___3
|    |___3
|
|____2
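
The mapping from a level sequence to a tree can be sketched as follows. This is an illustrative Python example, not code from TreePad; the function name build_tree is hypothetical, and it assumes that a node's parent is the nearest preceding node with a smaller level.

```python
def build_tree(levels):
    """Return, for each node in file order, the index of its parent (None for top-level nodes)."""
    parents = []
    stack = []  # indices of the current chain of ancestors, shallowest first
    for index, level in enumerate(levels):
        # Drop candidates that are at the same level as this node, or deeper.
        while stack and levels[stack[-1]] >= level:
            stack.pop()
        parents.append(stack[-1] if stack else None)
        stack.append(index)
    return parents
```

For the sequence 1 - 2 - 3 - 3 - 3 - 2 the three level-3 nodes all get the first level-2 node as parent, and the last level-2 node hangs directly under the level-1 node.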


Warning: if you store this text in TreePad, first remove every occurrence of
"<end node> 5P9i0s8y19Z" from it, otherwise the article will end at that line.



Additional tags (node section): Object tag

This tag indicates an object embedded into the article, such as an image. The tag is optional and may occur more than once.
This example contains three bitmap files, embedded into the article:

id=209
dt=RTF
obj=11.bmp
obj=34.bmp
obj=29.bmp
<node>
Photos of my house
2
...........

Objects are stored in the accompanying .tpz file. Additionally, references to the objects are present inside the article as RTF links to external objects.



Additional tags (node section): Node caption formatting tag

This tag indicates the formatting of a node (like bold, italic, underline, font name, color). If this tag is not present, the default tree font is used for the node.
Example:

nft=0F;00000000;000A;01;arial

(1) The first two digits in the node format string are a hexadecimal number indicating the font style and which font attributes are used from the global tree font parameters:
bit0 = 1 -> bold; bit0 = 0 -> not bold
bit1 = 1 -> italic; bit1 = 0 -> not italic
bit2 = 1 -> underline; bit2 = 0 -> not underline
bit3 = 1 -> strikeout; bit3 = 0 -> not strikeout
bit4 = 1 -> use the default tree font size; bit4 = 0 -> use the specified node font size (see below)
bit5 = 1 -> use the default tree font name and charset; bit5 = 0 -> use the specified node font name and charset (see below)
bit6 = 1 -> use the default tree font color; bit6 = 0 -> use the specified node font color (see below)
bit7 = 1 -> use the default tree font style; bit7 = 0 -> use the specified node font style (see above)

So a hexadecimal number of '01' means only 'bold', and not italic/underline/strikeout.

(2) The second number is an eight-digit hexadecimal number representing the font color. 00FF0000 represents pure blue, 0000FF00 is pure green, and 000000FF is pure red. 00000000 is black and 00FFFFFF is white.

(3) The third number is a four-digit hexadecimal number representing the font size. For example, 000A would represent font size 10.

(4) The fourth number represents the font character set. 
ANSI_CHARSET		00	ANSI characters. 
DEFAULT_CHARSET		01	Font is chosen based solely on Name and Size. 
				If the described font is not available on the 
				system, Windows will substitute another font.
SYMBOL_CHARSET		02	Standard symbol set.
SHIFTJIS_CHARSET 	80	Japanese shift-jis characters.
HANGEUL_CHARSET		81	Korean characters (Wansung).
JOHAB_CHARSET		82	Korean characters (Johab).
GB2312_CHARSET		86	Simplified Chinese characters (mainland China).
CHINESEBIG5_CHARSET	88	Traditional Chinese characters (Taiwanese).
GREEK_CHARSET		A1	Greek characters.
TURKISH_CHARSET		A2	Turkish characters.
VIETNAMESE_CHARSET	A3	Vietnamese characters.
HEBREW_CHARSET		B1	Hebrew characters.
ARABIC_CHARSET		B2	Arabic characters.
BALTIC_CHARSET		BA	Baltic characters. Not available on NT 3.51.
RUSSIAN_CHARSET		CC	Cyrillic characters. Not available on NT 3.51.
THAI_CHARSET		DE	Thai characters. Not available on NT 3.51.
EASTEUROPE_CHARSET	EE	Includes diacritical marks for Eastern European countries. Not available on NT 3.51.
OEM_CHARSET		FF	Depends on the codepage of the operating system.


(5) the fifth part of the node formatting tag is the name of the font.
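
The five parts of the tag can be decoded as in the following Python sketch. It is an illustration, not the TreePad implementation: the names NodeFont and parse_nft are hypothetical, and it assumes that bits 5 to 7 follow the same convention as bit 4 (a value of 1 selects the tree default).

```python
from dataclasses import dataclass


@dataclass
class NodeFont:
    bold: bool
    italic: bool
    underline: bool
    strikeout: bool
    default_size: bool
    default_name: bool
    default_color: bool
    default_style: bool
    color: int
    size: int
    charset: int
    name: str


def parse_nft(value):
    """Parse the text after 'nft=' (five fields separated by ';')."""
    # Split at most four times so the font name is kept whole.
    flags_hex, color_hex, size_hex, charset_hex, name = value.split(";", 4)
    flags = int(flags_hex, 16)
    return NodeFont(
        bold=bool(flags & 0x01),
        italic=bool(flags & 0x02),
        underline=bool(flags & 0x04),
        strikeout=bool(flags & 0x08),
        default_size=bool(flags & 0x10),
        default_name=bool(flags & 0x20),
        default_color=bool(flags & 0x40),
        default_style=bool(flags & 0x80),
        color=int(color_hex, 16),
        size=int(size_hex, 16),
        charset=int(charset_hex, 16),
        name=name,
    )
```

For the example tag above, 0F sets the four style bits and leaves bits 4 to 7 clear, so the size, name, charset and color given in the tag itself apply.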





Additional tags (node section): 


Node color tag

This tag, 'cl', specifies the text background color of an individual node. If this tag is not present, the default tree background color is used.

Example for specifying pure red:

cl=000000FF

This is an eight-digit hexadecimal number representing the background color. 00FF0000 represents pure blue, 0000FF00 is pure green, and 000000FF is pure red. 00000000 is black and 00FFFFFF is white.
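
The sample values imply that the lowest byte holds red, the next byte green, and the third byte blue. A small Python sketch of that decoding follows; the function name decode_color is hypothetical, and the meaning of the highest byte is not described here, so it is ignored.

```python
def decode_color(hex_text):
    """Split an eight-digit hexadecimal color string into (red, green, blue)."""
    value = int(hex_text, 16)
    red = value & 0xFF            # lowest byte
    green = (value >> 8) & 0xFF   # second byte
    blue = (value >> 16) & 0xFF   # third byte
    return red, green, blue
```

Note that this byte order is the reverse of the RRGGBB order used in HTML color codes.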


Export tag:

enableexport=0 means that this node, and all its child nodes will not be included in a single-file html/rtf/txt subtree print or export. Anything else means that this node will be exported and printed.
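
Because nodes are listed in tree order with their levels, the excluded subtrees can be found in a single pass. The following Python sketch is one possible approach, not the TreePad implementation; the function name exported_indices is hypothetical, and it assumes that 'child nodes' covers all descendants of the node.

```python
def exported_indices(nodes):
    """Return the indices of nodes that take part in an export.

    nodes is a list of (level, enableexport) pairs in file order, where
    enableexport is the tag value as text, or None when the tag is absent.
    """
    kept = []
    blocked_level = None  # level of the excluded node whose subtree is being skipped
    for index, (level, flag) in enumerate(nodes):
        if blocked_level is not None and level > blocked_level:
            continue  # still inside an excluded subtree
        blocked_level = None
        if flag == "0":
            blocked_level = level
            continue
        kept.append(index)
    return kept
```

Any value other than 0, including a missing tag, leaves the node in the export.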


Template tag:
istemplate=1 means that this node is a template for all newly created child nodes


Default subtree icon tag:
dsi=1 means that this node contains the default icon for the subtree; all newly created descendants in this subtree will display this icon.


Bookmarks list


The bookmark list is a simple flat list of tree node IDs. Example:

<bmarks>
  id=34
  id=119
  id=7
<end bmarks> 5P9i0s8y19Z

The number in each line after 'id=' is the ID of the node in the tree.

The bookmarks list is placed before the first node tag of the tree inside a TreePad file.
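
Reading the list can be sketched in Python as below. This is an illustration only; the function name read_bookmarks is hypothetical, and it assumes that the IDs are decimal integers and that leading whitespace on the lines may or may not be present.

```python
END_BMARKS = "<end bmarks> 5P9i0s8y19Z"


def read_bookmarks(lines):
    """Return the bookmarked node IDs, in file order, from an iterable of lines."""
    ids = []
    inside = False
    for line in lines:
        text = line.strip()
        if text == "<bmarks>":
            inside = True
        elif text == END_BMARKS:
            break  # the list ends here; node tags follow
        elif inside and text.startswith("id="):
            ids.append(int(text[3:]))
    return ids
```

Stopping at the closing line matters, because the 'id=' lines of the nodes that follow use the same syntax.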

_______________________________________________________


TPZ files

When storing objects such as images in TreePad PLUS, the program will not keep them in the .hjt file, but in an external file with the extension '.tpz'.
TPZ files (TreePad Zipped) are simply zip files! TPZ files can contain images. The .tpz file has the same name as the .hjt file, only the extension is different.
Images are stored in a folder inside the .tpz file, called \1 . The images have numbered names, like 1.bmp, 2.bmp.....346.bmp....etc.
For images these formats are supported: bmp, ico, gif, jpeg, png, wmf, emf
Tree icons are stored in the folder \2 inside the tpz file.
For icons these formats are supported: ico, bmp
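
Since a .tpz file is a zip file, its contents can be listed with any zip library. The following Python sketch uses the standard zipfile module; the function name list_tpz_entries is hypothetical, and because the exact spelling of the entry names (separator, leading slash) is not specified here, the code normalizes them as an assumption.

```python
import zipfile


def list_tpz_entries(tpz):
    """Return (images, icons): file names found in folders 1 and 2 of a .tpz archive.

    tpz may be a path or a binary file object, as accepted by zipfile.ZipFile.
    """
    images, icons = [], []
    with zipfile.ZipFile(tpz) as archive:
        for name in archive.namelist():
            # Assumption: accept both '/' and '\' separators and a leading separator.
            parts = name.replace("\\", "/").strip("/").split("/")
            if len(parts) != 2:
                continue  # folder entries and deeper paths are ignored
            folder, filename = parts
            if folder == "1":
                images.append(filename)
            elif folder == "2":
                icons.append(filename)
    return images, icons
```

The image names returned for folder 1 are the ones referenced by the 'obj=' tags of the nodes.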


_________________________________________________________

TreeBooks and Templates

A TreeBook is a subtree or tree containing nodes that all have the same structure, like an XML form. A TreeBook node contains the template from which the structure for the descendant nodes is copied.

The article of an XML TreeBook root node consists of two sections. The first section is the XML template, this is the model which is applied to new descendant nodes by default. The second section  contains the text that is displayed in the TreeBook root node itself, it is not part of the template. The text in the second section can be Rich Text, plain text, XML or HTML. An RTF section should start with the string {\rtf, an HTML section should start with the line '<html>', an XML section should start with <?xml. Anything else will be interpreted as a plain text section.
The two sections are separated by the line <end section> 5P9i0s8y19Z 

Example 5, an XML treebook node (or an XML template node)

  id=6
  dt=XML
  istemplate=1
  <node>
  Business Addressbook
  1
  <?xml version="1.0"?>
  <Person version="1" pxpos_col1="30" pxpos_col2="190">
  <first_name caption="First name"  pxwidth="440" ></first_name>
  <last_name caption="Last name"  pxwidth="440" ></last_name>
  <company_name caption="Company"  pxwidth="440" ></company_name>
  <address caption="Address" pxwidth="440" multirow="true" pxheight="50"></address>
  <zipcode caption="Zip code" pxwidth="300"></zipcode>
  <city caption="City" pxwidth="440" ></city>
  <state caption="State"  pxwidth="440" ></state>
  <country caption="Country" pxwidth="440" ></country>
  <business_phone type="phone" caption="Business Phone 1" pxwidth="300"></business_phone>
  <business_phone type="phone" caption="Business Phone 2" pxwidth="300"></business_phone>
  <business_mobile type="phone" caption="Business Mobile" pxwidth="300"></business_mobile>
  <private_phone type="phone" caption="Private Phone 1" pxwidth="300"></private_phone>
  <private_phone type="phone" caption="Private Phone 2" pxwidth="300"></private_phone>
  <private_mobile type="phone" caption="Private Mobile" pxwidth="300"></private_mobile>
  <private_email type="email" caption="Personal email address" pxwidth="300" ></private_email>
  <business_email type="email" caption="Business email address" pxwidth="300" ></business_email>
  <business_fax type="fax" caption="Business fax" pxwidth="300" ></business_fax>
  <company_homepage type="URL" caption="Company home page"  pxwidth="440" ></company_homepage>
  <private_homepage type="URL" caption="Personal Home page" pxwidth="440" ></private_homepage>
  <remarks caption="Remarks" pxwidth="440" multirow="true" pxheight="300"></remarks>
  </Person>
  <end section> 5P9i0s8y19Z
  {\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fswiss\fcharset0 Arial;}}
  \viewkind4\uc1\pard\b\f0\fs40 Business Addressbook\par
  \par
  \b0\fs20 By default, any descendent-node of this node will display the Business Addressbook form.\par
  Alternatively, you can also insert standard TreePad nodes containing normal article text and images.\par
  Please note that the text of this article can be altered in any way you like.\par
  \par
  }
  <end node> 5P9i0s8y19Z
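
Splitting such an article into its two sections, and recognizing the type of each section by its first characters, can be sketched in Python as follows. The function names split_treebook_article and guess_section_type are hypothetical, and ignoring leading whitespace before the type marker is an assumption.

```python
END_SECTION = "<end section> 5P9i0s8y19Z"


def split_treebook_article(article):
    """Return (template, display_text); template is None when no separator line is found."""
    lines = article.split("\n")
    for index, line in enumerate(lines):
        if line.strip() == END_SECTION:
            return "\n".join(lines[:index]), "\n".join(lines[index + 1:])
    return None, article


def guess_section_type(text):
    """Classify a section as 'rtf', 'html', 'xml' or 'text' from its first characters."""
    stripped = text.lstrip()
    if stripped.startswith("{\\rtf"):
        return "rtf"
    if stripped.startswith("<html>"):
        return "html"
    if stripped.startswith("<?xml"):
        return "xml"
    return "text"
```

An article without the separator line is returned unchanged as the display text, so ordinary nodes pass through the same code path.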


_________________________________________________________

Draft pad 

The draft pad is an extra editor pane (to be found only in TreePad Business Edition) in the accessory pane, below the article.

The contents of the Draft Pad are stored as a block of Rich Text, between <scrpbk> tags; example:

<scrpbk>
{\rtf1\ansi\deff0\deftab850{\fonttbl{\f0\fnil\fcharset0 arial;}{\f1\fnil\fcharset2 symbol;}{\f2\fnil\fcharset2 WingDings;}}{\colortbl\red0\green0\blue0;\red255\green0\blue0;\red0\green128\blue0;\red0\green0\blue255;\red255\green255\blue0;\red255\green0\blue255;\red128\green0\blue128;\red128\green0\blue0;\red0\green255\blue0;\red0\green255\blue255;\red0\green128\blue128;\red0\green0\blue128;\red255\green255\blue255;\red192\green192\blue192;\red128\green128\blue128;\red0\green0\blue0;}\wpprheadfoot1\paperw11906\paperh16838\margl1417\margr1417\margt1417\margb1417\margh720\margf720{\*\listtable{\list\listtemplateid19690212{\listlevel\leveljc0\levelfollow0\levelstartat1\levelspace0\levelindent360\levelnfc1{\leveltext\'02\'00.;}{\levelnumbers\'01;}}
{\listlevel\leveljc0\levelfollow0\levelstartat1\levelspace0\levelindent360\levelnfc3{\leveltext\'02\'01.;}{\levelnumbers\'01;}}
{\listlevel\leveljc0\levelfollow0\levelstartat1\levelspace0\levelindent360\levelnfc0{\leveltext\'02\'02.;}{\levelnumbers\'01;}}
{\listlevel\leveljc0\levelfollow0\levelstartat1\levelspace0\levelindent360\levelnfc4{\leveltext\'02\'03);}{\levelnumbers\'01;}}
{\listlevel\leveljc0\levelfollow0\levelstartat1\levelspace0\levelindent360\levelnfc2{\leveltext\'02\'04);}{\levelnumbers\'01;}}
{\listlevel\leveljc0\levelfollow0\levelstartat1\levelspace0\levelindent360\levelnfc4{\leveltext\'02\'05);}{\levelnumbers\'01;}}
{\listlevel\leveljc0\levelfollow0\levelstartat1\levelspace0\levelindent360\levelnfc0{\leveltext\'02\'06);}{\levelnumbers\'01;}}
{\listlevel\leveljc0\levelfollow0\levelstartat1\levelspace0\levelindent360\levelnfc0{\leveltext\'02\'07);}{\levelnumbers\'01;}}
{\listlevel\leveljc0\levelfollow0\levelstartat1\levelspace0\levelindent360\levelnfc0{\leveltext\'02\'08);}{\levelnumbers\'01;}}
\listid1194737}}{\*\listoverridetable{\listoverride\listid1194737\listoverridecount0\ls1}}\endnhere\sectdefaultcl{\pard{\tx8480 \ql\li0\fi0\ri0\sb0\sl\sa0 \plain\f0\fs20\cf0 All in one! Multi-featured Tree-structured Organizer, intuitive yet powerful.}}
}
<end scrpbk> 5P9i0s8y19Z



Future TreePad file formats

New tags will be introduced to support node marking, node timestamps, etc. For any program to be compatible with future TreePad versions, the program needs to ignore any tags it does not know (this is how TreePad functions, so that older versions can read files created by newer versions).

Example of a possible new tag which can be introduced in a new TreePad version:

  id=6
  t=200104292343
  dt=text
  <node>
  mail from the President
  4
  Dear sir,

The new element here is 't=200104292343', which could mean that the node was created on April 29, 2001 at 23:43, but that does not really matter. Any current TreePad version will ignore the 't=' tag, because it does not know it. That is what keeps the program compatible with future versions, and it is also the way any TreePad-compatible program should approach .hjt files.
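
One way to get this behavior is to look tags up in a table of known names and skip everything else. The Python sketch below illustrates the idea; the table KNOWN_TAGS and the function parse_header are hypothetical names, and only the 'id' and 'dt' tags are handled.

```python
# Converters for the tags this reader understands; every other tag is ignored.
KNOWN_TAGS = {
    "id": int,
    "dt": str.lower,
}


def parse_header(lines):
    """Return the known tags found in the lines that precede a <node> line."""
    result = {}
    for line in lines:
        key, sep, value = line.strip().partition("=")
        handler = KNOWN_TAGS.get(key) if sep else None
        if handler is None:
            continue  # unknown tag or not a tag at all: skip it
        result[key] = handler(value)
    return result
```

Supporting a new tag later only requires one more entry in the table; files that contain tags missing from the table are still read correctly.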

To help programmers create fully TreePad compatible software (programs which are able to read any future .hjt files), some example programming code is shown below. 

This code example (in a Pascal-like language) shows how you can read a TreePad node without relying on knowing which tags you will encounter, and in which order. 


  type
    TDataType = (dt_text, dt_rtf, dt_html, dt_xml);

    TNode = class
      ID, level: integer;
      caption, article: string;
      DataType: TDataType;
    end;

  var
    f: TextFile; //the opened .hjt file
    line: string;
    node: TNode;

  procedure Process_ID(line: string; node: TNode); //procedure will put ID found in line in node.ID
  procedure Process_DT(line: string; node: TNode); //procedure will put DT if found in node.DataType
  function LineStartsWith(substring, line: string): boolean; //true if line starts with substring


  .........................

  //read one node:

  //read the first part, until the node tag; unknown tags are skipped:
  repeat
    readln(f, line);
    if LineStartsWith('id=', line) then
      Process_ID(line, node)
    else if LineStartsWith('dt=', line) then
      Process_DT(line, node);
  until (line = '<node>') or eof(f);

  //read the node caption and the node level:
  readln(f, line);
  node.caption := line;
  readln(f, line);
  node.level := StrToIntDef(line, 0);

  //finally read the article until the end node tag has been found
  node.article := ''; //initialize the article variable
  while not eof(f) do
  begin
    readln(f, line);
    if line = '<end node> 5P9i0s8y19Z' then
      break; //exit from the loop
    node.article := node.article + line + #13#10; //add one more line, keeping the line break
  end;



Henk Hagedoorn
Freebyte!
Almere, the Netherlands

http://www.treepad.com
http://www.freebyte.com
email: http://www.treepad.com/support



