                            ====================
                            Web2Text v1.2/16-bit
                    Freeware HTML to ASCII text converter
                    =====================================

Unlike all the other such programs to be found on the net (that I know of),
this one attempts to create a text file that retains some of the layout of
the web page being converted. Most other converters merely remove HTML
tags, which can leave you with a total mess, and lots more work to do.
Web2Text also keeps URLs intact, which only one other converter makes any
attempt at (and it does it very badly, so I'll name no names).

What's New in v1.2:
===================

+ No limit on filesizes now; however, conversion is now a bit slower.
+ Now spaces after entities are always kept; in v1.1 they were sometimes lost.
+ Fixed &nbsp; entities; these now produce a space once more (though not a
  non-breaking one).
+ Fixed tab order.
+ Fixed spurious underlines at start and end of converted document.
+ Conversion can now be cancelled; press Esc, Alt+C or click the Cancel
  button during conversion.

Unfortunately I haven't had time to put in some features I had planned for
version 1.2, so these will appear in 1.3, and I'm making no prediction on a
release date for that one!

What's New in v1.1:
===================

+ Now supports <EM>, <STRONG>, <CITE>, <DFN>, <DIR>.
+ Now supports all 100 mnemonic entities plus &#xxx; entities as well.
+ Selectable filemask.
+ Settings saved to INI file.
+ Labels no longer truncated in certain resolutions.
+ Larger window so more files and directories are visible.
+ Progress bars now correctly show progress!
+ Beeps on completion of processing; uses whatever sound you've assigned to
  'Asterisk'.
+ mailto: URLs no longer keep the 'mailto:' bit visible.
+ Font selection button added primarily to enable Japanese users to select a
  Kanji font and see folders and files using Kanji characters. 

Installation:
=============

Unzip WEB2TEXT.EXE to a directory of your choice. Add it to a program group
(File|New...|Program Item in Program Manager) and double-click the icon to run
it. There is no installation routine; this keeps the ZIP file down to around
129 KB, instead of the 700 KB or so it would be if I used InstallShield, for
instance!

Supported HTML:
===============

<CENTER> - centers text. However, the ALIGN= attribute of various tags is *not*
supported. I feel <CENTER> should be used in addition to ALIGN=CENTER because
older browsers may support the former but not the latter. Modern browsers
support both.

<I>, <EM>, <DFN> and <CITE> - text often rendered as italics; surrounded by
                              asterisks *like this*.

<B> or <STRONG> - text often rendered as bold; surrounded by underlines
                  _like this_.
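
As a rough illustration of the two rules above, here is a minimal Python
sketch. It is not Web2Text's actual code; the MARKERS table and the
mark_emphasis name are made up for this example, and only tags without
attributes are handled.

```python
import re

# Hypothetical tag-to-marker table, taken from the rules described above:
# italic-style tags become asterisks, bold-style tags become underlines.
MARKERS = {
    "i": "*", "em": "*", "dfn": "*", "cite": "*",
    "b": "_", "strong": "_",
}

def mark_emphasis(html):
    """Replace opening and closing emphasis tags with their ASCII marker."""
    def repl(match):
        return MARKERS[match.group(1).lower()]
    pattern = r"</?(%s)>" % "|".join(MARKERS)
    return re.sub(pattern, repl, html, flags=re.IGNORECASE)

print(mark_emphasis("<B>bold</B> and <EM>soft</EM>"))
```

Other tags, such as <BR>, are left untouched by this sketch.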

<TITLE> and <H1> through <H4> - text within these is displayed with equals or
minus signs around it, according to the importance of the text; e.g. <TITLE>
text is --===like this===--

Lists <UL>, <DIR> and <OL> are supported, though items longer than one line
will not indent correctly. <LI> produces a number for ordered lists or a
plus sign for unordered/directory lists. Make sure you use <UL>, <DIR> or <OL>
as appropriate, as use of <LI> without doing so can cause problems if you are
already in some other type of list. Netscape does not handle this correctly,
but IE does and will display the same results you get from Web2Text.
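
The list handling described above could be sketched in Python along these
lines. This is an assumption-laden illustration, not the program's own
code: the convert_list_items name is hypothetical, the "1. " number format
is a guess, and an <LI> outside any list is given a plus sign purely for
simplicity.

```python
import re

def convert_list_items(html):
    """Turn <LI> into '1. ', '2. ', ... inside <OL>, or '+ ' inside <UL>/<DIR>."""
    out = []
    stack = []  # one entry per open list: an int counter for <OL>, None otherwise
    for token in re.split(r"(<[^>]+>)", html):
        if not token.startswith("<"):
            out.append(token)  # ordinary text passes through unchanged
            continue
        tag = token.strip("<>").strip().lower()
        if tag == "ol":
            stack.append(0)
        elif tag in ("ul", "dir"):
            stack.append(None)
        elif tag in ("/ol", "/ul", "/dir"):
            if stack:
                stack.pop()
        elif tag == "li":
            if stack and stack[-1] is not None:
                stack[-1] += 1
                out.append("\n%d. " % stack[-1])
            else:
                out.append("\n+ ")
        # any other tag is simply dropped in this sketch
    return "".join(out).strip()

print(convert_list_items("<OL><LI>one<LI>two</OL>"))
print(convert_list_items("<UL><LI>a<LI>b</UL>"))
```

Keeping a stack of open lists is what lets a nested <UL> inside an <OL>
produce plus signs without disturbing the outer numbering.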

Tags that cause a new line: <P> </P> <BR> <TR> </CENTER> </Hx> </TITLE>

All entities are supported, in numeric or mnemonic forms. I've noticed some
browsers accept characters other than ; to end an entity. Well, I don't. The
only incorrectly supported entity is &nbsp; which should give a non-breaking
space, but Web2Text converts it to a normal space.
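
A minimal Python sketch of these entity rules follows. It is only an
illustration and not how Web2Text itself is written: decode_entities is a
hypothetical name, and Python's standard html.entities table stands in for
whatever entity list the program really uses.

```python
import re
from html.entities import name2codepoint

def decode_entities(text):
    """Decode &name; and &#nnn; entities; a missing ';' means no conversion."""
    def repl(match):
        body = match.group(1)
        if body.startswith("#"):
            char = chr(int(body[1:]))
        elif body in name2codepoint:
            char = chr(name2codepoint[body])
        else:
            return match.group(0)  # unknown name: keep the original text
        # A non-breaking space has no plain-text equivalent here,
        # so it becomes an ordinary space, as described above.
        return " " if char == "\u00a0" else char
    # The trailing ';' is mandatory; numeric entities take up to three digits.
    return re.sub(r"&(#[0-9]{1,3}|[A-Za-z][A-Za-z0-9]*);", repl, text)

print(decode_entities("Fish &amp; chips&nbsp;&#169; 1997"))
```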

Images are ignored. URLs, providing they are absolute ones, are retained and
held in square brackets after the descriptive text assigned to them; e.g.
<A HREF="http://blah.com">this url</A> becomes 'this url [http://blah.com]'.
Relative URLs are ignored. URL types supported are: http, gopher, telnet,
news, ftp and mailto. Mailto URLs do not keep the URL type, i.e.
<A HREF="mailto:me"> becomes [me] instead of [mailto:me]. All other URL types
keep the URL type specifier.
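
The URL rules above can be sketched in Python as below. This is a
simplified illustration, not Web2Text's real code: convert_links and
SCHEMES are made-up names, the HREF value is assumed to be double-quoted,
and keeping the link text when a relative URL is dropped is an assumption.

```python
import re

# URL types listed above as supported.
SCHEMES = ("http", "gopher", "telnet", "news", "ftp", "mailto")

def convert_links(html):
    """Rewrite <A HREF="url">text</A> as 'text [url]' for absolute URLs."""
    def repl(match):
        url, text = match.group(1), match.group(2)
        scheme = url.split(":", 1)[0].lower()
        if ":" not in url or scheme not in SCHEMES:
            return text  # relative or unsupported URL: keep only the text
        if scheme == "mailto":
            url = url.split(":", 1)[1]  # drop the 'mailto:' prefix
        return "%s [%s]" % (text, url)
    pattern = r'<A\s+HREF="([^"]*)"\s*>(.*?)</A>'
    return re.sub(pattern, repl, html, flags=re.IGNORECASE | re.DOTALL)

print(convert_links('<A HREF="http://blah.com">this url</A>'))
```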

Problems:
=========

+ Tables are currently poorly supported. <TD> is treated as a tab, so columns
  will not line up correctly. Lines that wrap will not indent correctly. As I
  don't really need better support than that for my own purposes, I'm unlikely
  to change this in the near future but you can always ask.

+ Any tags between an <A HREF="..."> tag and its closing </A> are not
  processed and will be visible in the converted text file. I may fix this at
  some point, but it requires a slight reorganisation of my code so it is not
  a priority.

+ The line-length setting works only for lines without hard tabs in them. If
  your HTML file has hard tabs, they are incorrectly counted as being a single
  space. I couldn't be bothered putting in a tab setting, as you shouldn't be
  using tabs in HTML anyway. You may, however, have used them in <PRE> blocks,
  so watch out for that.
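
To see why hard tabs upset the line-length setting, here is a small Python
sketch comparing a count that treats a tab as one character with the width
the line takes up once tabs are expanded. The function name and the tab
size of 8 are assumptions made for this example only.

```python
def counted_and_displayed_width(line, tab_size=8):
    """Return (width with each tab counted as 1, width with tabs expanded)."""
    counted = len(line)                        # a tab counts as one character
    displayed = len(line.expandtabs(tab_size)) # a tab jumps to the next tab stop
    return counted, displayed

print(counted_and_displayed_width("a\tb"))
```

A line can therefore pass the length check and still run well past the
chosen width when viewed.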

This program is FREEWARE. I accept no responsibility for any harm or loss
caused by the use of it; i.e. if you save over your company's end of year
report, don't come crying to me.

A 32-bit version is available; see the URL below for details. It supports
long filenames and is a bit quicker but is otherwise identical.

Distribution:
=============

Web2Text is freeware. You may not charge money for it, with one exception:
if you are putting out a CD full of shareware/freeware/etc. and want to
include Web2Text you may do so but on the condition that you send me a copy of
the CD. Email me to get my postal address and formal permission.

-- 
Damien Burke
software@jetman.demon.co.uk
http://www.jetman.demon.co.uk/software/index.html
