Part V
Hevea in Practice

18  A Hevea tutorial

We now work through an example of using Hevea to generate HTML from a complex SWP document. We shall be ambitious, and try to translate the text of Manfred Szabo’s Linear Algebra textbook to HTML. The source (and the graphics) for this document are available on the SWP CD, and you may want to copy them to your hard disk and work through the details as we discuss them. I’ll rename SzaboLinearAlgebra.tex as szabo.tex, since we’re going to be typing the name a lot.

Note that this discussion is based on Hevea 1.05. I don’t intend to revise it every time a new edition of Hevea comes out (which seems to happen about 4 times a year) since it’s just supposed to illustrate the basic concepts. So it’s possible that later versions of Hevea will give somewhat different results.

18.0.1  Intended audience

The SWP approach to document preparation is to shield the user as much as possible from the details of LATEX macro-writing. With Hevea this will no longer work: you will usually have to teach the system how to deal with non-standard macros. This tutorial is therefore written for the SWP user who has managed to avoid interaction with LATEX, and tries to explain some LATEX macro concepts as the need arises. Experienced macro-writers can probably skim the details.

18.0.2  Customization strategy

As you’ll see, we shall be customizing Hevea for szabo.tex in a step-by-step fashion, teaching it a few new macros at a time. I strongly recommend this approach because it makes isolating errors much easier: you just comment out the most recent set of macros, and restore them one-by-one until Hevea crashes.
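One way to keep the rounds separable is to group each batch of definitions under a comment line in the file that holds them. The sketch below is only an illustration of the layout: \firstmacro, \secondmacro and \thirdmacro are placeholder names, not macros from szabo.tex.

```latex
% Round A: known to work
\def\firstmacro{first}

% Round B: known to work
\def\secondmacro{\textbf{second}}

% Round C: most recent batch, commented out while hunting an error.
% Restore these lines one at a time and re-run Hevea after each.
% \def\thirdmacro{\textsf{third}}
```

A line beginning with % is a comment, so disabling a definition is just a matter of putting % in front of it.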

Like LATEX itself, Hevea is very sensitive to typos. One of the most common is to use parentheses ( or ) instead of the corresponding braces { or }. In every text editor I’ve used, these are hard to tell apart on the screen. If Hevea complains about an empty stack or about an error reading LaTeX arguments this is probably the first thing to check.
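Since the difference is easy to miss, here is the same definition typed correctly and with the typo. \shout is a made-up macro used purely for illustration.

```latex
% Correct: braces delimit the definition and the argument of \textbf
\def\shout{\textbf{Look out}}

% Typo (left commented out): parentheses where the braces should be
% \def\shout(\textbf{Look out})
```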

18.1  Round 1 — indexing

OK, let’s get started. szabo.tex is a book created with the Style Editor; so start a DOS session, switch to the directory containing the source, take a deep breath, and type sebook szabo.

Hevea responds by printing a huge number of warning messages to the screen — about 2180, if you’re counting. (See Appendix C for a way to save these into a file). At this point, the most important thing is not to panic. And indeed, if you watch what’s scrolling down the screen, you’ll notice one reassuring feature: although there are many messages, many of them are the same — that is, Hevea is finding many instances of each of the macros it doesn’t understand. So with any luck we can solve the problems with comparatively little work. The best strategy is to jot down the three or four names which seem to flash by most often; we’ll try to fix those first.

We’ll begin with the warning about the index, which appears as warning: index structure not found, missing \makeindex. This is easy to fix, but to make the details accessible only to Hevea we need to put them into a document-specific configuration file. This file must be called (corresponding to szabo.tex) szabo.hvb, and must reside in the same directory as the source. So use a text editor to create szabo.hvb, insert

\usepackage{makeidx}
\makeindex

and close the file. Now we’re ready to try again.

18.2  Round 2 — substitutions and tryout

Issue the command sebook szabo. The new szabo.hvb file is automatically included. The warnings about the index have disappeared, and the number of warnings has fallen to 1815. We’ll continue with our strategy of fixing the most common warnings, a few at a time. To me, it looks as though the likely candidates are \blacklozenge, \blacksquare, and \sqrt.

At this point you’ll need to understand the rudiments of LATEX macro writing. The critical thing is to make certain that the definition you provide to Hevea matches the way the macro is used in your document. You should start by opening your source document in a text editor and locating a few instances of the macros, to see the contexts. We begin in this section with the \blacklozenge macro.

If you look at szabo.tex, you’ll see that \blacklozenge is a zero-argument command: that is, its use is \blacklozenge, with no arguments. These are the simplest to deal with, and this one is particularly easy, because if you look at the symbol font, you’ll see that it contains a black diamond character, and this is close enough for a black lozenge. (Hands up all those who actually know the difference between a diamond and a lozenge). So now the task is to tell Hevea to use it.

You can read about this in the Hevea manual, but there’s an easier way: figure out how Hevea handles other symbols. Look in the html subdirectory of your Hevea directory, find a likely .hva file and scroll through it. Your first choice will usually be hevea.hva, but in this case, since we’re specifically interested in symbols, look in symb.hva. There we see lots of constructs defined as {\@symb{\charxxx}} where xxx is some number. We guess — correctly — that this is a way of printing out Symbol font character xxx in HTML. If we look in the Windows Character Map for the symbol font we see that the black diamond is character 168. So we’ll add the following macro to szabo.hvb:

\def\blacklozenge{\@symb{\char168}}

Here the token after \def is what’s being defined (so, \blacklozenge), and the actual definition is in braces immediately following.

It’s often useful to get immediate feedback on whether your new macro works, and the simplest way to do this is to run Hevea in filter mode. Let’s test our \blacklozenge translation.

  1. Create a small file, say test.hva (Note: it must have the extension .hva).
  2. Paste (or type) the \blacklozenge definition into it
  3. Close the file, run fhevea test.hva
  4. Now tell Hevea what to translate: type   $\blacklozenge$  
  5. Hit Return for a new line, then enter Ctrl+Z to close input.

Hevea responds with

<FONT FACE=symbol>¿ </FONT>

so you can see that it’s doing the right thing, though the exact character isn’t obvious. If you have put redir.exe (see Appendix C) into your PATH, you can generate a tiny HTML file which you can view with your browser by running

redir -o test.html fhevea test.hva

and then entering your test expression as above.

18.3  Round 3 — more substitutions

We continue in the same vein by dealing with the other two macros, \blacksquare and \sqrt.

\blacksquare
This is a zero-argument command just like \blacklozenge, but here we run up against a more serious problem: the symbol font doesn’t contain a black square. Here are three possibilities.
Change the symbol. Since \blacksquare is used both as the Halmos end-of-proof symbol, and as a marker for some textual elements, we can probably get away with translating it into a single black dot, character 183. So we could say:
\def\blacksquare{\@symb{\char183}}
Ignore it. In this context, this is also reasonable, though in others it may not be. The TEX for “do nothing” is \relax, so you could say:
\def\blacksquare{\relax}
Use Unicode. Obviously this will work only when you can be sure your readers have a Unicode font; but unlike the two previous solutions, this one will give you the correct symbol. From the Character Map, a black square is character Unicode: 25A0, so we say
\def\blacksquare{\UNICODE{0x25A0}}  or  \def\blacksquare{\U{25A0}}

Note the difference: the \U macro uses the hexadecimal number as read off the Character Map, while \UNICODE requires a leading 0x. In both instances, case is irrelevant. (You could also try a larger square, character 2588).

If you go with the Unicode solution, you will probably want to tell Hevea to use the UTF-8 character set, which means adding some text to szabo.hvb: see section 11 for the precise statement required.

\sqrt
The \sqrt macro takes an optional argument followed by a mandatory one: that is, \sqrt[root]{expression}. The root, when present, is in square brackets, and the expression is in braces immediately following. It’s a bit surprising that Hevea doesn’t know about square roots; and when I first ran into the problem I e-mailed Luc Maranget asking if \sqrt really wasn’t available. This illustrates another nice feature of Hevea: its author is always willing to help.11 Luc sent me a solution, and I have included it in sqrt.hva along with some other commented-out possibilities. You might want to experiment with some of these, but for now we’ll simply use the solution provided in sqrt.hva: add
\input{sqrt.hva}

to szabo.hvb. You could also cut-and-paste the selected definition directly into the document-specific file.

Let’s see what we’ve accomplished.

18.4  Round 4 — environments

Run sebook szabo once again; we’re down to about 1100 warnings, about half the number we started out with. This time we deal with two Style-Editor proclamations: Note and Example (with a capital “E”). If you look at the source code you’ll see what’s going on: a typical instance is

\begin{Note}
\QTR{NoteLeadin}{Note}
(material)
\end{Note}

Here “Note” is an instance of a LATEX environment – material enclosed by \begin and \end, with the name in braces. It’s clear that the author wants this environment to set up some special visual appearance; this may or may not be followed by a \NoteLeadin command, which is formed by the \QTR macro.12

Customizing Hevea to deal with new environments is simple. The canonical form is

\newenvironment{env_name}[nargs]{pre_material}{post_material}

where env_name is the name of the environment; nargs (0≤ nargs ≤ 9) is the number of arguments you are providing for the environment, and may be omitted if, as here, there are zero of them; pre_material is anything you want typeset before the contents of the environment, and post_material is anything you want typeset afterwards.

We’ll set up the Note environment to start a new paragraph, print out a horizontal line and start a new line in preparation for the actual note material. This will be our pre_material. After the note itself, we’ll start a new line, include a rule for symmetry’s sake, and then start a new paragraph in anticipation of whatever’s to come next. To get the rule, we use a special Hevea command \@print, whose argument, in braces, consists of raw HTML code. Since <HR> is HTML-ese for a rule, we need \@print{<HR>}, and we’ll do the same thing for the newline, which is <BR> in HTML. So we say

\newenvironment{Note}
{\par\@print{<HR>}\@print{<BR>}}
{\@print{<BR>}\@print{<HR>}\par}
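
When an environment does take arguments, nargs is given in square brackets and the arguments are referred to as #1, #2 and so on inside pre_material. As a sketch only, here is a variant of the Note definition with one argument, used as a bold heading; TitledNote is a hypothetical name that does not occur in szabo.tex.

```latex
% Intended use: \begin{TitledNote}{Warning} (material) \end{TitledNote}
\newenvironment{TitledNote}[1]
{\par\@print{<HR>}\textbf{#1}\@print{<BR>}}
{\@print{<BR>}\@print{<HR>}\par}
```

Note that the argument is used only in pre_material; in LATEX, environment arguments are not available in post_material.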

Next, we deal with the \NoteLeadin command (formed by \QTR{NoteLeadin}), and we’ll just print out its argument in bold:13

\def\NoteLeadin#1{\noindent\textbf{#1}}
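
For readers new to \def, the #1 notation deserves a word: in the parameter list it declares that the macro takes one argument, and in the body it marks where that argument is substituted. The annotated copy below spells this out. The expansion given in the comment describes standard macro behaviour, not output captured from Hevea, and \twobold is a made-up name.

```latex
% General shape: \def <name> <parameters> {<body>}
\def\NoteLeadin#1{\noindent\textbf{#1}}

% So \NoteLeadin{Note} stands for \noindent\textbf{Note}.
% With two arguments the parameters would be #1#2, for example:
% \def\twobold#1#2{\textbf{#1} \textbf{#2}}
```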

Finally, Example. This is just like NoteLeadin (except that, if you look at the source, it’s curiously nested; I don’t know why) and we’ll treat it the same way:

\def\Example#1{\noindent\textbf{#1}}

18.5  Round 5 — more math

Run sebook szabo once more; the number of warnings is down to 841. This time we’ll deal with some other math constructs.

\func
If you look at the source, this is a 1-argument command, used like \func{span}. This is an instance of a so-called “log-like” function where the argument is to be typeset in upright style, rather than math italics. We can get the desired effect as follows:
\def\func#1{\textrm{#1}}

where \textrm selects the upright (“Roman”) font.

\limfunc
This is basically the same as \func, except that there could be limits present. We’ll set the first argument upright, and place the limit as a subscript:
\def\limfunc#1#2{\textrm{#1}_{#2}}

(Note that we’re guaranteed to be in math already, so we don’t have to worry about entering math mode for the subscript).

\dfrac
This may take a bit more detective work, but it turns out that \dfrac is just a way of printing fractions in a “display” style. In HTML we don’t have such fine control so we’ll just make \dfrac synonymous with the normal fraction construct \frac. We do this by \let-ing \dfrac equal \frac:
\let\dfrac=\frac

The argument to \let is a single token, which is then equated to another single token. (Thus you can’t \let something-plus-an-argument equal something else, because that wouldn’t be a single token.)

\lbrack
\lbrack is just a macro for an open-bracket; \rbrack is the other one:
\def\lbrack{[}
\def\rbrack{]}
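
To make the single-token restriction on \let concrete, suppose (purely as an illustration) that a document used a macro \half for the fraction one half. Because \frac{1}{2} is more than one token, \def is the right tool there, while \let remains fine for a plain synonym such as \dfrac. \half is a hypothetical name, not something found in szabo.tex.

```latex
% One token made equal to one token: \let works
\let\dfrac=\frac

% \frac{1}{2} is several tokens, so use \def instead
\def\half{\frac{1}{2}}

% Not what you want (left commented out): \let would pick up only \frac,
% and the {1}{2} that follows would be typeset on the spot
% \let\half=\frac{1}{2}
```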

18.6  Round 6 — some Style Editor constructs

We’re down to 612 warnings, and next we’ll deal with some other Style Editor constructs, \MenuDialog and the obviously paired \STARTGRAYBOX and \ENDGRAYBOX. To get a gray box effect, we’ll put the material into a 1×1 table with a gray background attribute: this may not be the best solution, depending on what the author has inside these boxes, but we can always adjust it later.

\def\STARTGRAYBOX{\par\@print{<TABLE><TR bgcolor="silver"><TD>}}
\def\ENDGRAYBOX{\@print{</TD></TR></TABLE>}\par}

Next, the \MenuDialog command. This just identifies items on the Scientific Notebook menus; let’s set it in sans-serif type:

\def\MenuDialog#1{\textsf{#1}}

18.7  Round 7 — math accents

We’re down to 100 warnings; and we’ll now deal with \hat and \widehat. The \hat macro is easy, since it’s defined in mathaccents.hva: we get it by including

\input{mathaccents}

in our configuration file. Inline, \hat{A} will be rendered as A^, while in display mode it will actually be placed above the letter using a table.

As you might expect, \widehat places a hat over several characters; and there’s no way that this can be done accurately with HTML. At this point you may want to re-think your decision to use Hevea and consider an image-based solution like TeX4ht. But if there aren’t many occurrences, we could consider using \hat for widehats, too: we can define \widehat to be a sort of \hat:

\def\widehat#1{\hat{(#1)}}

Note the parentheses around the argument in the definition: without them \widehat{AB} would be rendered ambiguously as AB^.

We adopt the same idea for dealing with the overline in text warning, which occurs about 14 times in the document. We’ll print out the text in parentheses, and follow it by an over-bar character, character 175. This is far from optimal, and again you may want to consider an image-based solution. There’s one complication: \overline is already defined (in latexcommon.hva) and if you try to \def it, the new definition will be ignored (see the discussion in section 14). Moreover, we don’t want to completely over-ride the old definition, which already deals satisfactorily with the display form. Here’s our new definition, where only the material following the \else (which governs the non-display, i.e. inline, case) is new. Note that now that we know we want to over-ride the macro, we can use the non-standard \texdef form.14

\texdef\overline#1{\ifdisplay
\begin{array}{c}\hline #1 \\~\end{array}
\else
(#1)\@nostyle\@print{&#175;}
\fi}

Note the appearance of \@nostyle here: this is to ensure that characters in the succeeding \@print are sent through literally, and not further interpreted. (Without it, the ampersand & would be rendered as &amp; in the HTML).
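
The \ifdisplay ... \else ... \fi skeleton used here can be tried in isolation in filter mode. The sketch below defines a throwaway macro, \whichmode (a made-up name), that merely reports which branch was taken. It assumes, as the \overline definition does, that \ifdisplay is true in displayed math and false inline.

```latex
\def\whichmode{\ifdisplay
\textrm{display}
\else
\textrm{inline}
\fi}

% Suggested test input:
%   $\whichmode$      inline math
%   $$\whichmode$$    displayed math
```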

18.8  Round 8 — The \preface macro

We now need to teach Hevea about the \preface macro.15 What’s wanted is to make an unnumbered Chapter heading. This turns out to be an especially subtle problem of macro processing, and is explained in detail in section 16.2. For now, enter the following, being very careful to observe the space between the #1#2 and the {:

\def\preface#1#2 {\chapter*{#1#2}}

18.9  Round 9 — harpoons

\rightleftharpoons is another AMS symbol not in the Symbol font. (Unicode lists it as character 21CC, but that helps only readers with a suitable Unicode font). It only appears a few times, so we adopt the inelegant solution and just print it out in boldface.

\def\rightleftharpoons{\textbf{ rightleftharpoons }}
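
Where readers can be assumed to have a Unicode font, the Unicode route used for \blacksquare is also open here: the code point for the right-over-left harpoon pair is 21CC. The following is a sketch under that assumption, using the \U macro in the same way as in Round 3. As there, you would probably also want the UTF-8 character set.

```latex
% Assumes the reader's font contains U+21CC
% (RIGHTWARDS HARPOON OVER LEFTWARDS HARPOON)
\def\rightleftharpoons{\U{21CC}}
```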

18.10  The end

At this point, we’re down to 32 warnings, as shown in Appendix E. Let’s take stock.
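
Gathered in one place, the definitions from Rounds 1 to 9 give a szabo.hvb along the following lines. This is a sketch assembled from the text above, taking the black-dot choice for \blacksquare and writing the bgcolor value with plain double quotes, as HTML expects. The comment lines are added here for orientation only.

```latex
% Round 1: index
\usepackage{makeidx}
\makeindex
% Rounds 2 and 3: symbols and roots
\def\blacklozenge{\@symb{\char168}}
\def\blacksquare{\@symb{\char183}}
\input{sqrt.hva}
% Round 4: Style Editor environments
\newenvironment{Note}
{\par\@print{<HR>}\@print{<BR>}}
{\@print{<BR>}\@print{<HR>}\par}
\def\NoteLeadin#1{\noindent\textbf{#1}}
\def\Example#1{\noindent\textbf{#1}}
% Round 5: math
\def\func#1{\textrm{#1}}
\def\limfunc#1#2{\textrm{#1}_{#2}}
\let\dfrac=\frac
\def\lbrack{[}
\def\rbrack{]}
% Round 6: Style Editor constructs
\def\STARTGRAYBOX{\par\@print{<TABLE><TR bgcolor="silver"><TD>}}
\def\ENDGRAYBOX{\@print{</TD></TR></TABLE>}\par}
\def\MenuDialog#1{\textsf{#1}}
% Round 7: math accents
\input{mathaccents}
\def\widehat#1{\hat{(#1)}}
\texdef\overline#1{\ifdisplay
\begin{array}{c}\hline #1 \\~\end{array}
\else
(#1)\@nostyle\@print{&#175;}
\fi}
% Round 8: note the space before the opening brace
\def\preface#1#2 {\chapter*{#1#2}}
% Round 9: harpoons
\def\rightleftharpoons{\textbf{ rightleftharpoons }}
```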

If we fix the last problem, Hevea will generate as good an HTML document as we can expect from this highly mathematical source. The result, szabo.html is just over 1.7 MB long. Trying to load a document that long will certainly tax, and may even break, many browsers, so the obvious thing to do is to cut it into bits.

This is exactly what the companion program Hacha does. To cut up szabo.html into smaller files, issue the command

hacha szabo.html

You will see a number of messages from Hacha, of the form

szabo.html:38325: Warning, cannot find anchor: zerokernel

These refer to problems in the source (.tex) file; as previously noted, the labelling in the source isn’t completely correct. When Hacha has finished, the lead file will be index.html and the sub-files will be szabo001.html, szabo002.html, etc. You can view the result in your browser; and when you see the screen version, you may want to change the way you’ve asked Hevea to handle SWP’s constructs. Since Hevea is fast, this is relatively painless to do.

