Testing SWP4’s HTML Production

Philip A. Viton

January 13, 2002

1 Introduction

With the release of version 4 of the SWP family of products (including Scientific Notebook!), users now enjoy the ability to create HTML directly, without requiring the intervention of some other program, using the internal HTML Export feature (File -> Export Document As ...). Here I compare the new facilities with other hypertext production options currently available.

2 The Source

The source was a simple two-page document created with SWP4. Because the screen style influences what is produced by the SWP internal HTML Export facility, I began with the “Blank — Standard LaTeX Article” shell and then changed all fonts to Times New Roman, for maximum compatibility with the output produced by the other systems. If I hadn’t done this, SWP would have produced HTML with text in Arial — because that’s the way the .cst file set up the default screen font — while the other systems would not specify a font at all, and the viewer would see the text in the default proportional font, typically Times. Again for compatibility reasons, with TeX4ht I used the mathtime package, to get the math fonts to match the text (though this makes very little difference, at least on the screen).

Finally, for the bibliography I made use of the harvard package, and specified \let\cite=\citeasnoun. The actual bibliography style was a personal hack of the American Economic Review style, without the bolding of authors’ names.
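
As a rough illustration, the bibliography setup described above might look like the following preamble. This is only a sketch under stated assumptions: the style name aerhack, the database name refs and the citation key somekey are hypothetical placeholders, not the names used in the test document.

```latex
% Sketch only: "aerhack", "refs" and "somekey" are hypothetical placeholders.
\documentclass{article}
\usepackage{harvard}          % author-year citation support
\let\cite=\citeasnoun         % plain \cite now behaves like \citeasnoun: "Author (Year)"
\bibliographystyle{aerhack}   % stands in for the modified AER style
\begin{document}
This point is discussed by \cite{somekey}.
\bibliography{refs}
\end{document}
```

The \let must come after the harvard package is loaded, since \citeasnoun is not defined before that.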

(The alert reader will note that the link below is to a .txt file, not .tex. It took me a couple of hours to figure out why TeX4ht converted an SWP hyperref reference to the .tex file into a link to the corresponding .html file. This is a “feature” in the TeX4ht support file swpht.sty, which I’d completely forgotten about. The only reason to mention it here is that others may forget, too: this “feature” makes it effectively impossible to display a raw .tex file.)

3 The Results

When viewing the results, it is important to remember that different browsers can differ considerably in the way they display the same document on your screen; it is therefore instructive to view the results in a variety of browsers to get a feel for what your readers may see. (I’m told, in fact, that simply refreshing the screen can change the appearance in some browsers.) In particular, Netscape 4 appears to have real problems in the way it displays inline math produced by SWP4’s HTML-Export facility. On the other hand, Internet Explorer 5+ and Netscape 6.2 are quite similar, and exported HTML looks excellent in both.

So here are the results of the test. When you view any of the HTML files, use the Back button on your browser to get back to this text, so you can go on to the next one. For the curious, I also provide details on file sizes, though, except for Hevea, there’s not much to choose between them.

3.1 SWP 4 Export-as-HTML

This uses the default settings in SWP4; graphics are therefore in .png format.

3.2 Tweaked SWP 4 Export-as-HTML

As you can see from the previous test, SWP’s internal HTML-Export system does not deal well with the Front Matter. Fortunately, this is easily fixed: open htmlout.cfg (in your top-level SWP directory) and add the following lines to the end of the [STYLEHEADER] section before the closing </SCRIPT>:

P.Author {font-size: large ; text-align: center}

P.Title {font-size: large ; text-align: center}

You can use these as a model for other elements in the front-matter; if you want the text to be displayed in bold, you can add font-weight: bold to the descriptors. Note that once you’ve made this change, it will automatically apply to all your conversions, which is presumably what you want; however, there is no provision for document-specific configuration. Finally, this correction is not relevant to Scientific Notebook, which does not include front-matter at all.

If you read the HTML carefully, you’ll notice a couple of places where the English appears a bit off: this is because the HTML-Export doesn’t process labels and markers correctly. For example, near the beginning of the DEA section we have “(Another approach was presented in section )”: obviously something is missing, in this case the section reference. A second tweak can make SWP markers (i.e., LaTeX labels) work, at least if they’re links to material (e.g., sections) in the same document. Open htmlout.dat (in your top-level SWP directory) and replace the definitions given there as follows:

\QTSN{label}#2 modeless textarg "<A NAME=""#2""></A>"

\QTSN#1#2 modeless "<A HREF=""##2"">#2</A>"

The effect of this is that, while there will be nothing visible at the point at which you defined your marker, still references (SWP cross-reference fields) will take the reader to the right place.

Finally, note that although this fixes the problem with sectional cross references, it doesn’t generate cross-references to displayed equations. See for example the line “Then we may write equation (linear) as”, near the end of the “Econometric Approaches” section of the tweaked output (below). The tweak correctly generates a link, but the link doesn’t work, because Export-As-HTML ignores labels in displayed equations, hence there’s nothing to be linked to. I know of no fix for this.

3.3 TeX4ht

Like SWP’s internal HTML-Export, TeX4ht uses graphics (in this case .gif images, though the user can change this if desired) for display math, but uses a combination of images and plain text for inline math. Note that we get a linked Table of Contents, proper handling of BibTeX bibliography items, and section and equation cross-references, none of which can be done with SWP’s internal HTML-Export. In addition, TeX fields are interpreted correctly here, while they are not in the native HTML-Export system. Finally, note that (though it’s not applicable to this test) TeX4ht will handle subdocuments and \include’d files: both of these are ignored by SWP’s internal facility.

3.4 Tweaked TeX4ht

Normally, TeX4ht will use a combination of plain text and images to represent inline math, which can occasionally lead to anomalous results. We can get around this with a configuration file that forces inline math to be represented as gifs (a suggestion of Doyle Cutler and Eitan Gurari in previous correspondence) and sets the image alignment explicitly. This can significantly improve the appearance of inline math; it also makes the output comparable with what you get with the internal SWP Export-as-HTML facility.

The contents of the configuration file are:

\Configure{$}{\PicMath}{\EndPicMath}{}  
\Configure{PicMath}{}{}{}{ class="math" align="absmiddle"}
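
For readers who have not written a TeX4ht configuration file before, the two lines above would normally sit inside the usual configuration-file wrapper. The following is a sketch of a complete file, assuming the standard \Preamble ... \EndPreamble layout; the file name mathgif.cfg is a hypothetical choice, and the html option is an assumption about the desired output format.

```latex
% mathgif.cfg (hypothetical file name): sketch of a complete TeX4ht config file
\Preamble{html}                          % assumed output option
\begin{document}
\Configure{$}{\PicMath}{\EndPicMath}{}   % render inline math as pictures
\Configure{PicMath}{}{}{}{ class="math" align="absmiddle"}  % attributes for the math images
\EndPreamble
```

If TeX4ht is run through the htlatex script, the configuration file is typically named as the first option, along the lines of htlatex mydoc "mathgif"; how it is selected from within SWP may differ.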

3.5 Hevea

Until Scientific Notebook version 4, Hevea was the only viable way to produce HTML from SN documents, because it did not require LaTeX. Unlike the other systems, Hevea uses glyphs from the Symbol font when they are available, and a variety of other non-graphic techniques otherwise, to represent math. Subscript and superscript positioning uses the HTML <sup> and <sub> tags, and accents over letters are handled by little tables. As a result, display math can look quite spread out vertically as compared to the other solutions, and complicated inline math is typically problematic. On the other hand, Hevea is configurable at the document level, and (relevant to SWP/SW users) will handle TOCs, bibliographies, cross-references and other LaTeX tags you may insert in your document.

The results produced here use the file swp.hva, supplied with the SWP support for Hevea, plus a small configuration (.hvb) file:

\input{mathaccents.hva}

\newcommand{\harvarditem}[5][nul]{%
[\@print{<A name="}#4\@print{">}#4\@print{</A>}] #5 (#3)}

The first line allows the \tilde macro to be processed, and the second deals with the Harvard bibliography style.