19 A Hevea tutorial
We now work through an example of using Hevea to generate HTML from a
complex SWP document. We shall be ambitious, and try to translate the text
of Manfred Szabo's Linear Algebra textbook to HTML. The source
(and the graphics) for this document are available on the SWP CD, and you
may want to copy them to your hard disk and work through the details as we
discuss them. I'll rename SzaboLinearAlgebra.tex as szabo.tex, since we're going to be typing the name a lot.
Note that this discussion is based on Hevea 1.05. I don't intend to revise
it every time a new edition of Hevea comes out (which seems to happen about
4 times a year) since it's just supposed to illustrate the basic concepts.
So it's possible that later versions of Hevea will give somewhat different
results.
19.0.1 Intended audience
The SWP approach to document preparation is to shield the user as much as
possible from the details of LATEX macro-writing. With Hevea this will no
longer work: you will usually have to teach the system how to deal with
non-standard macros. This tutorial is therefore written for the SWP user who
has managed to avoid interaction with LATEX, and tries to explain some LATEX macro concepts as the need arises. Experienced macro-writers can
probably skim the details.
19.0.2 Customization strategy
As you'll see, we shall be customizing Hevea for szabo.tex in a
step-by-step fashion, teaching it a few new macros at a time. I strongly
recommend this approach because it makes isolating errors much easier: you
just comment out the most recent set of macros, and restore them one-by-one
until Hevea crashes.
Like LATEX itself, Hevea is very sensitive to typos. One of the most
common is to use parentheses ( or ) instead of the
corresponding braces { or }. In every text editor I've
used, these are hard to tell apart on the screen. If Hevea complains about
an empty stack or about an error reading LaTeX arguments,
this is probably the first thing to check.
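The comment-out strategy and the parenthesis-for-brace typo can be sketched as follows. This is only an illustration: the macros are borrowed from later rounds of this tutorial, and it relies on the TeX convention that a % starts a comment running to the end of the line.

```latex
% Definitions from earlier rounds, already known to work:
\def\blacklozenge{\@symb{\char168}}

% Most recent additions, commented out while hunting for the error;
% restore them one at a time and re-run Hevea after each:
% \def\func#1{\textrm{#1}}
% \def\limfunc#1#2{\textrm{#1}_{#2}}

% The typo to look for first: a parenthesis where a brace belongs.
% \def\func#1{\textrm{#1)}    % wrong: the ) should be }
% \def\func#1{\textrm{#1}}    % right
```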
19.1 Round 1 — indexing
OK, let's get started. szabo.tex is a book created with the Style
Editor; so start a DOS session, switch to the directory containing the
source, take a deep breath, and type sebook szabo.
Hevea responds by printing a huge number of warning messages to the
screen — about 2180, if you're counting. (See Appendix C for a
way to save these into a file). At this point, the most important thing is
not to panic. And indeed, if you watch what's scrolling down the screen,
you'll notice one reassuring feature: although there are many messages, many
of them are the same — that is, Hevea is finding many instances of each of
the macros it doesn't understand. So with any luck we can solve the problems
with comparatively little work. The best strategy is to jot down the three
or four names which seem to flash by most often; we'll try to fix those
first.
We'll begin with the warning about the index, which appears as warning: index structure not found, missing \makeindex. This is easy to fix, but to make the details accessible only to Hevea we
need to put them into a document-specific configuration file. For szabo.tex, this file must
be called szabo.hvb, and it must
reside in the same directory as the source. So use a text editor to create
szabo.hvb, insert
\usepackage{makeidx}
\makeindex
and close the file. Now we're ready to try again.
19.2 Round 2 — substitutions and tryout
Issue the command sebook szabo. The new szabo.hvb file is automatically included. The warnings about the index have
disappeared, and the number of warnings has fallen to 1815. We'll continue
with our strategy of fixing the most common warnings, a few at a time. To
me, it looks as though the likely candidates are \blacklozenge, \blacksquare, and
\sqrt.
At this point you'll need to understand the rudiments of LATEX macro
writing. The critical thing is to make certain that the definition you
provide to Hevea matches the way the macro is used in your document. You
should start by opening your source document in a text editor, and locate a
few instances of the macros, to see the contexts. We begin in this section
with the \blacklozenge macro.
If you look at szabo.tex, you'll see that \blacklozenge is a zero-argument command: that is, it is used as
\blacklozenge, with no arguments. These are
the simplest to deal with, and this one is particularly easy, because if
you look at the symbol font, you'll see that it contains a black diamond
character, and this is close enough for a black lozenge. (Hands up all those
who actually know the difference between a diamond and a lozenge.) So now
the task is to tell Hevea to use it.
You can read about this in the Hevea manual, but there's an easier way:
figure out how Hevea handles other symbols. Look in the html
subdirectory of your Hevea directory, find a likely .hva file and
scroll through it. Your first choice will usually be hevea.hva, but
in this case, since we're specifically interested in symbols, look in
symb.hva. There we see lots of constructs defined as {\@symb{\charxxx}} where xxx is some number. We guess — correctly — that this
is a way of printing out Symbol font character xxx in HTML. If we
look in the Windows Character Map for the symbol font we see that
the black diamond is character 168. So we'll add the following macro to
szabo.hvb:
\def\blacklozenge{\@symb{\char168}}
Here the token after \def
is what's being defined (so, \blacklozenge),
and the actual definition is in braces immediately following.
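The parts of the definition can be labelled as in the sketch below. The commented-out \newcommand line is the same definition in LaTeX's own notation; it is shown on the assumption that Hevea accepts it as an equivalent here, and only one of the two forms should be used.

```latex
% \def <name being defined> {<replacement text>}
\def\blacklozenge{\@symb{\char168}}

% The same definition in LaTeX notation (assumed equivalent for a
% macro that is not yet defined; use one form or the other, not both):
% \newcommand{\blacklozenge}{\@symb{\char168}}
```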
It's often useful to get immediate feedback on whether your new macro works,
and the simplest way to do this is to run Hevea in filter mode. Let's test
our \blacklozenge translation.
- Create a small file, say test.hva. (Note: it must have the
extension .hva.)
- Paste (or type) the \blacklozenge
definition into it.
- Close the file and run fhevea test.hva.
- Now tell Hevea what to translate: type $\blacklozenge$
- Hit Return for a new line, then enter Ctrl+Z to
close input.
Hevea responds with
<FONT FACE=symbol>¿
</FONT>
so you can see that it's doing the right thing, though the exact
character isn't obvious. If you have put redir.exe (see Appendix C) into your PATH, you can generate a tiny HTML file
which you can view with your browser by running
redir -o test.html fhevea test.hva
and then entering your test expression as above.
19.3 Round 3 — more substitutions
We continue in the same vein by dealing with the other two macros, \blacksquare and \sqrt.
\blacksquare:
This is a zero-argument command
just like \blacklozenge, but here we run up
against a more serious problem: the symbol font doesn't contain a black
square. Here are three possibilities.
- Change the symbol. Since \blacksquare is used both as the Halmos end-of-proof symbol and as a marker
for some textual elements, we can probably get away with translating it into
a single black dot, character 183. So we could say:
\def\blacksquare{\@symb{\char183}}
- Ignore it. In this context, this is also reasonable, though
in others it may not be. The TEX for “do nothing” is \relax, so you could say:
\def\blacksquare{\relax}
- Use Unicode. Obviously this will work only when you can be
sure your readers have a Unicode font; but unlike the two previous
solutions, this one will give you the correct symbol. From the Character
Map, a black square is Unicode character 25A0, so we say
\def\blacksquare{\UNICODE{0x25A0}} or \def\blacksquare{\U{25A0}}
Note the difference: the \U macro uses the
hexadecimal number as read off the Character Map, while \UNICODE requires a leading 0x. In both instances, case
is irrelevant. (You could also try a larger square, character 2588).
If you go with the Unicode solution, you will probably want to tell Hevea to
use the UTF-8 character set, which means adding some text to szabo.hvb: see section 12 for the precise statement required.
\sqrt:
The \sqrt macro is a two-argument command: that is, \sqrt{root}{expression}. The two
arguments are in braces immediately following the main command. It's a bit
surprising that Hevea doesn't know about square roots; and when I first ran
into the problem I e-mailed Luc Maranget asking if \sqrt really wasn't available. This illustrates another
nice feature of Hevea: its author is always willing to help. Luc sent me a
solution, and I have included it in sqrt.hva along with some other
commented-out possibilities. You might want to experiment with some of
these, but for now we'll simply use the solution provided in sqrt.hva: add
\input{sqrt.hva}
to szabo.hvb. You could also cut-and-paste the selected
definition directly into the document-specific file.
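Putting the first three rounds together, the document-specific file might now look like the sketch below. It assumes the first of the three \blacksquare options was chosen; substitute whichever one you prefer.

```latex
% szabo.hvb: state after Round 3 (a sketch)

% Round 1: indexing
\usepackage{makeidx}
\makeindex

% Round 2: black lozenge as Symbol-font character 168
\def\blacklozenge{\@symb{\char168}}

% Round 3: black square, option 1 of 3 (a black dot, character 183)
\def\blacksquare{\@symb{\char183}}

% Round 3: square roots, using the definition supplied in sqrt.hva
\input{sqrt.hva}
```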
Let's see what we've accomplished.
19.4 Round 4 — environments
Run sebook szabo once again; we're down to about 1100 warnings,
about half the number we started out with. This time we deal with two
Style-Editor proclamations: Note and Example (with a capital “E”). If you
look at the source code you'll see what's going on: a typical instance is
\begin{Note}
\QTR{NoteLeadin}{Note}
(material)
\end{Note}
Here “Note” is an instance of a LATEX environment –
material enclosed by \begin and \end, with the name in braces. It's clear that the
author wants this environment to set up some special visual appearance; this
may or may not be followed by a \NoteLeadin
command, which is formed by the \QTR macro.
Customizing Hevea to deal with new environments is simple. The canonical
form is
\newenvironment{env_name}[nargs]{pre_material}{post_material}
where env_name is the name of the environment; nargs (0 ≤ nargs ≤ 9) is the number of arguments you are
providing for the environment, and may be omitted if, as here, there are
zero of them; pre_material is anything you want typeset
before the contents of the environment, and post_material is
anything you want typeset afterwards.
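For comparison, here is a sketch of the canonical form with nargs present. The environment name TitledNote and its single title argument are invented for this illustration and do not occur in szabo.tex.

```latex
% Hypothetical environment with one argument (nargs = 1):
% #1 is a title, printed in bold before the contents.
\newenvironment{TitledNote}[1]
  {\par\textbf{#1}\par}   % pre_material: new paragraph, bold title
  {\par}                  % post_material: finish with a paragraph break

% Use in a source document:
% \begin{TitledNote}{Remark}
%   (material)
% \end{TitledNote}
```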
We'll set up the Note environment to start a new paragraph, print out a
horizontal line and start a new line in preparation for the actual note
material. This will be our pre_material. After the note itself,
we'll start a new line, include a rule for symmetry's sake, and then start a
new paragraph in anticipation of whatever's to come next. To get the rule,
we use a special Hevea command \@print, whose
argument, in braces, consists of raw HTML code. Since <HR> is HTML-ese for a rule, we need \@print{<HR>}, and we'll do the same
thing for the newline, which is <BR> in
HTML. So we say
\newenvironment{Note}
{\par\@print{<HR>}\@print{<BR>}}
{\@print{<BR>}\@print{<HR>}\par}
Next, we deal with the \NoteLeadin command
(formed by \QTR{NoteLeadin}), and we'll
just print out its argument in bold:
\def\NoteLeadin#1{\noindent\textbf{#1}}
Finally, Example. This is just like NoteLeadin (except that, if you
look at the source, it's curiously nested; I don't know why) and we'll treat
it the same way:
\def\Example#1{\noindent\textbf{#1}}
19.5 Round 5 — more math
Run sebook szabo once more; the number of warnings is down to 841.
This time we'll deal with some other math constructs.
\func:
If you look at the source, this is a
1-argument command, used like \func{span}.
This is an instance of a so-called “log-like” function where the argument
is to be typeset in upright style, rather than math italics. We can get the
desired effect as follows:
\def\func#1{\textrm{#1}}
where \textrm selects the upright
(“Roman”) font.
\limfunc:
This is basically the same as \func, except that there could be limits present.
We'll set the first argument upright, and place the limit as a subscript:
\def\limfunc#1#2{\textrm{#1}_{#2}}
(Note that we're guaranteed to be in math already, so we don't have to worry
about entering math mode for the subscript.)
\dfrac:
This may take a bit more detective work,
but it turns out that \dfrac is just a way of
printing fractions in a “display” style. In HTML we don't have such fine
control so we'll just make \dfrac synonymous
with the normal fraction construct \frac. We
do this by \let-ing \dfrac equal \frac:
\let\dfrac=\frac
The argument to \let is a single
token, which is then equated to another single token. (Thus you can't
\let something-plus-an-argument equal
something else, because that wouldn't be a single token.)
\lbrack and \rbrack:
\lbrack
is just a macro for an open-bracket; \rbrack is the matching close-bracket:
\def\lbrack{[}
\def\rbrack{]}
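To see what these definitions do, it can help to write out a few replacements by hand. In the sketch below only \func{span} is taken from the source; the other sample inputs are made up for illustration. The last line shows the \def route to \dfrac, which spells out the two arguments that \let has no need to mention.

```latex
% Definitions from this round:
\def\func#1{\textrm{#1}}
\def\limfunc#1#2{\textrm{#1}_{#2}}
\let\dfrac=\frac
\def\lbrack{[}
\def\rbrack{]}

% Sample uses (inside math) and what each is replaced by:
%   \func{span}          -->  \textrm{span}
%   \limfunc{lim}{n}     -->  \textrm{lim}_{n}
%   \dfrac{1}{2}         -->  \frac{1}{2}
%   \lbrack 0,1\rbrack   -->  [0,1]

% An alternative to the \let, not needed alongside it:
% \def\dfrac#1#2{\frac{#1}{#2}}
```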
19.6 Round 6 — some Style Editor constructs
We're down to 612 warnings, and next we'll deal with some other Style Editor
constructs, \MenuDialog and the obviously
paired \STARTGRAYBOX and \ENDGRAYBOX. To get a gray box effect, we'll put the material
into a 1× 1 table with a gray background attribute: this may not be
the best solution, depending on what the author has inside these boxes, but
we can always adjust it later.
\def\STARTGRAYBOX{\par\@print{<TABLE><TR bgcolor="silver"><TD>}}
\def\ENDGRAYBOX{\@print{</TD></TR></TABLE>}\par}
Next, the \MenuDialog command. This just
identifies items on the Scientific Notebook menus; let's set it in
sans-serif type:
\def\MenuDialog#1{\textsf{#1}}
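A sketch of how these three macros might be met in a source document follows. The sentence and the menu names File and Save are invented for the example; the comments describe what the definitions above contribute, not Hevea's complete output.

```latex
% In the source document the pair brackets a block of material:
\STARTGRAYBOX
Choose \MenuDialog{File} and then \MenuDialog{Save}.
\ENDGRAYBOX

% The two \@print arguments are raw HTML, so the material ends up between
%   <TABLE><TR bgcolor="silver"><TD>   and   </TD></TR></TABLE>
% while each \MenuDialog argument is set in sans-serif by \textsf.
```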
19.7 Round 7 — math accents
We're down to 100 warnings; and we'll now deal with \hat and \widehat. The \hat macro is easy, since it's defined in mathaccents.hva: we get it by including
\input{mathaccents}
in our configuration file. Inline, \hat{A} will be rendered as A^, while in display mode
it will actually be placed above the letter using a table.
As you might expect, \widehat places a hat
over several characters; and there's no way that this can be done accurately
with HTML. At this point you may want to re-think your decision to use Hevea
and consider an image-based solution like TeX4ht. But if there aren't many
occurrences, we could consider using \hat for
widehats, too: we can define \widehat to be a
sort of \hat:
\def\widehat#1{\hat{(#1)}}
Note the parentheses around the argument in the definition:
without them \widehat{AB} would be rendered
ambiguously as AB^.
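Written out step by step, the replacement looks like this. The second sample input is made up, and the inline renderings are inferred from the description of \hat above rather than copied from Hevea's output.

```latex
\input{mathaccents}              % supplies \hat
\def\widehat#1{\hat{(#1)}}       % wide hat approximated by an ordinary hat

% Replacement steps for two sample inputs:
%   \widehat{AB}    -->  \hat{(AB)}     shown inline roughly as (AB)^
%   \widehat{x+y}   -->  \hat{(x+y)}    shown inline roughly as (x+y)^
```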
We adopt the same idea for dealing with the overline in text
warning, which occurs about 14 times in the document. We'll print out the
text in parentheses, and follow it by an over-bar character, character 175.
This is far from optimal, and again you may want to consider an image-based
solution. There's one complication: \overline
is already defined (in latexcommon.hva) and if you try to \def it, the new definition will be ignored (see the
discussion in section 15). Moreover, we don't want to
completely over-ride the old definition, which already deals satisfactorily
with the display form. Here's our new definition, where only the material
following the \else (which governs the
non-display, i.e. inline, case) is new. Note that now that we know we want
to over-ride the macro, we can use the non-standard \texdef form.
\texdef\overline#1{\ifdisplay
\begin{array}{c}\hline #1 \\~\end{array}
\else
(#1)\@nostyle\@print{&#175;}
\fi}
Note the appearance of \@nostyle
here: this is to ensure that characters in the succeeding \@print are sent through literally, and not further
interpreted. (Without it, the ampersand & would be rendered as
&amp; in the HTML.)
19.8 Round 8 — The \preface macro
We now need to teach Hevea about the \preface
macro. What's wanted is to make an unnumbered Chapter heading. This
turns out to be an especially subtle problem of macro processing, and is
explained in detail in section 17.2. For now, enter the following,
being very careful to observe the space between the #1#2 and the
{:
\def\preface#1#2 {\chapter*{#1#2}}
19.9 Round 9 — harpoons
\rightleftharpoons is another AMS symbol not
in the Symbol font. (As far as I can tell, it isn't available in Unicode
either). But it only appears a few times, so we adopt the inelegant solution
and just print it out in boldface.
\def\rightleftharpoons{\textbf{ rightleftharpoons }}
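For what it's worth, current Unicode charts do include harpoon pairs: U+21CC is RIGHTWARDS HARPOON OVER LEFTWARDS HARPOON, and U+21CB is its mirror image. So, under the same assumption as for the black square in Round 3 (readers with a Unicode font and a suitable character set), a possible alternative, which has not been tried on szabo.tex, is:

```latex
% Alternative sketch: the Unicode harpoon pair instead of bold text.
% \U takes the bare hexadecimal code, as in the \blacksquare example.
\def\rightleftharpoons{\U{21CC}}
```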
At this point, we're down to 32 warnings, as shown in Appendix E.
Let's take stock.
- Hevea ignores the \text macro defined
at line 164 of swp.hva. \text was
defined in ams.hva (which was loaded with Style Editor support).
swp.hva also defines it in case we haven't loaded the ams
package (as might occur with a non-SE document). So Hevea's ignoring it is
just what we want.
- Hevea ignores the \solution macro at
line 21 of the source document. This is because \solution is also defined in the theorem.hva file, which
is automatically loaded by Style Editor documents. If we want to re-define the macro (perhaps to change the way it's rendered in HTML),
include the redefinition in the document-specific configuration file, via
\renewcommand.
- Hevea reports that the macro \arraystretch is defined via \renewcommand in
the source. Since there's no easy way to control the inter-row spacing in
HTML tables, we can ignore this too. If you really need it, try defining it
via \newcommand.
- Hevea ignores an instance of the \vspace
macro for inserting vertical space. Since HTML can't do this, ignoring the
macro is just what we want.
- Hevea reports about 15 instances of an undefined label.
That's correct — the labels are undefined in the source. At least
one of these (the label on line 18321) is a simple typo (“mutally” for
“mutually”).
- Hevea reports five instances of Multiple definitions for label. As explained earlier (see section 15) this arises because a
section's label is included within a sectioning command; this too can be
ignored.
- Hevea reports three instances of an undefined label where the
original text of the label extends over two lines. This is because of the
way SWP sometimes places labels in the document: see section 16
for a fix.
If we fix the last problem, Hevea will generate as good an HTML document as
we can expect from this highly mathematical source. The result, szabo.html, is just over 1.7 MB long. Trying to load a document that long
will certainly tax, and may even break, many browsers, so the obvious thing
to do is to cut it into bits.
This is exactly what the companion program Hacha does. To cut up szabo.html into smaller files, issue the command
hacha szabo.html
You will see a number of messages from Hacha, of the form
szabo.html:38325: Warning, cannot find anchor: zerokernel
These refer to problems in the source (.tex) file — as
previously noted, the labelling in the source isn't completely correct. When
Hacha has finished, the lead file will be index.html and the
sub-files will be szabo001.html, szabo002.html, etc. You
can view the result in your browser; and when you see the screen version,
you may want to change the way you've asked Hevea to handle SWP's
constructs. Since Hevea is fast, this is relatively painless to do.
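For reference, the definitions accumulated over Rounds 1 to 9 can be collected into a single file along the following lines. This is a sketch rather than a tested file: it assumes the first \blacksquare option from Round 3, writes the over-bar as the numeric entity for character 175, and leaves out the character-set and label fixes that are described in other sections.

```latex
% szabo.hvb: collected definitions from Rounds 1 to 9 (a sketch)

% Round 1: indexing
\usepackage{makeidx}
\makeindex

% Rounds 2 and 3: symbols and square roots
\def\blacklozenge{\@symb{\char168}}
\def\blacksquare{\@symb{\char183}}   % option 1 of 3: a black dot
\input{sqrt.hva}

% Round 4: Style Editor environments and lead-ins
\newenvironment{Note}
  {\par\@print{<HR>}\@print{<BR>}}
  {\@print{<BR>}\@print{<HR>}\par}
\def\NoteLeadin#1{\noindent\textbf{#1}}
\def\Example#1{\noindent\textbf{#1}}

% Round 5: more math
\def\func#1{\textrm{#1}}
\def\limfunc#1#2{\textrm{#1}_{#2}}
\let\dfrac=\frac
\def\lbrack{[}
\def\rbrack{]}

% Round 6: gray boxes and menu items
\def\STARTGRAYBOX{\par\@print{<TABLE><TR bgcolor="silver"><TD>}}
\def\ENDGRAYBOX{\@print{</TD></TR></TABLE>}\par}
\def\MenuDialog#1{\textsf{#1}}

% Round 7: math accents
\input{mathaccents}
\def\widehat#1{\hat{(#1)}}
\texdef\overline#1{\ifdisplay
  \begin{array}{c}\hline #1 \\~\end{array}
\else
  (#1)\@nostyle\@print{&#175;}
\fi}

% Round 8: unnumbered preface chapter (note the space before the brace)
\def\preface#1#2 {\chapter*{#1#2}}

% Round 9: harpoons
\def\rightleftharpoons{\textbf{ rightleftharpoons }}
```

Keeping the rounds in this order, with a comment above each group, makes it easy to go back to the comment-out-and-restore strategy if a later change upsets Hevea.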