Generating Output Conditionally

I maintain a set of pages that I need to produce html for in three different ways. The pages must look identical, except the text has to be different in each. One set of pages is completely in English, the second set is completely in Spanish, and the third set is the English pages with some information omitted.

If you've ever grappled with a problem like this, you'll know it can be difficult. Once you start having multiple copies of the same page (i.e., the formatting is the same but the text is different), you're multiplying the amount of work you need to do when it's time to make changes. And who ever heard of a web page that wasn't under construction? The ideal solution is to maintain a single set of pages and have something generate the different html versions automatically.

M4 can skin this cat in many different ways. Here's what I currently do.

I have a single copy of all the pages, written in htm4l, in a directory called (say) ~/html/project/htm4l. These pages all include a macro file with the following definitions (amongst many others):

define(`ALL_ON', `divert(0)')

ifelse(m4_lang, english,
  `define(`ENG_ON', `divert(0)')
   define(`SPA_ON', `divert(-1)')',
  `define(`ENG_ON', `divert(-1)')
   define(`SPA_ON', `divert(0)')')

define(`ENG_SPA', `ifelse(m4_lang, english, `$1', `$2')')

ifdef(`m4_xxx',
  `define(`XXX_OFF', `define(`XXX_ON', `divert'(divnum))divert(-1)')',
  `define(`XXX_OFF') define(`XXX_ON')')

This may look a little complicated and, in fact, it is. But we can unpick it slowly, and the first parts are not too bad. Firstly, there is a macro called ALL_ON which cancels any diversion and makes output go to standard out, as normal. If you don't know what I'm talking about, read the counted table example. If that doesn't help, look at your m4 documentation.

The next piece (that starts with ifelse) makes two macro definitions, depending on the value of another macro: m4_lang. If m4_lang expands to english, then ENG_ON is defined as divert(0) and SPA_ON is defined as divert(-1). This means that any text following an occurrence of ENG_ON will be sent to stdout, and any text following a SPA_ON will be diverted to -1, which is m4's equivalent of catting to /dev/null. That is, the English text is output and the Spanish is discarded.

If m4_lang does not expand to english, then the second pair of definitions is made instead, and the obvious thing happens: English is thrown away and Spanish is output.
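To see the two diversion numbers in isolation, here is a minimal sketch that calls divert directly, without any of the macros above. It is only an illustration, and the sample sentences are made up:

```m4
divert(-1)
This line is discarded.
divert(0)dnl
This line is printed.
```

Everything between divert(-1) and divert(0) disappears, so m4 prints only the last line. The dnl after divert(0) swallows the newline that would otherwise show up as a blank line.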

At this point, we can write htm4l that looks like this:

<center>
ENG_ON
These words are in English.
SPA_ON
Estas palabras son en Espa&ntilde;ol.
ALL_ON
</center>

and everything will work as hoped. To give myself more choice about how I delimit languages, I also added the third piece, which I use for very short pieces of text (e.g., the label on a radio button on a form). It's simpler to write

ENG_SPA(`Surname', `Apellido')

than to use the previous method (which I use extensively for long paragraphs of text).
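Putting the first three pieces together, the following is a minimal, self-contained sketch that can be fed straight to m4. The file name demo.m4 and the label markup are assumptions made for illustration; the real pages pull these definitions from the shared macro file rather than defining them inline.

```m4
dnl demo.m4 (hypothetical file name): run with
dnl   m4 -Dm4_lang=english demo.m4
dnl   m4 -Dm4_lang=spanish demo.m4
define(`ALL_ON', `divert(0)')dnl
ifelse(m4_lang, english,
  `define(`ENG_ON', `divert(0)')define(`SPA_ON', `divert(-1)')',
  `define(`ENG_ON', `divert(-1)')define(`SPA_ON', `divert(0)')')dnl
define(`ENG_SPA', `ifelse(m4_lang, english, `$1', `$2')')dnl
ENG_ON
These words are in English.
SPA_ON
Estas palabras son en Espa&ntilde;ol.
ALL_ON
<label>ENG_SPA(`Surname', `Apellido')</label>
```

With -Dm4_lang=english the output should contain the English sentence and the Surname label; with -Dm4_lang=spanish, the Spanish sentence and Apellido. A few blank lines are left behind by the lines that hold only a macro name.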

So we've solved the problem of switching between English and Spanish. The third set of pages requires that some of the English be thrown away. Again, this could be done in many ways. The simplest would be to define a macro and test whether it holds some specific value. If it does, begin a diversion to -1. If not, to 0. But we cannot blindly do that, because if we begin a diversion to 0 when English is already off, we'll get unwanted English in the Spanish.

Instead, the macro in the last part looks to see if m4_xxx is defined. If it is not, then XXX_ON and XXX_OFF are defined to do nothing (this is the final line above). Otherwise, XXX_ON and XXX_OFF are set up so that they turn English off and then turn it back on only if it was already on. To accomplish this, XXX_OFF defines XXX_ON by making use of the built-in m4 macro divnum, which returns the number of the current diversion. When XXX_OFF is called, it defines XXX_ON to restore the current diversion number, and then begins to throw away output by calling divert(-1).
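The save-and-restore trick can be tried on its own. The sketch below defines XXX_OFF unconditionally, leaving out the ifdef on m4_xxx, and uses made-up sample lines. It assumes an m4 that provides divnum, as GNU m4 does.

```m4
define(`XXX_OFF', `define(`XXX_ON', `divert'(divnum))divert(-1)')dnl
dnl Case 1: output is on (diversion 0), so XXX_ON becomes divert(0).
shown
XXX_OFF
hidden
XXX_ON
shown again
dnl Case 2: output is already off, so XXX_ON becomes divert(-1).
divert(-1)
XXX_OFF
still hidden
XXX_ON
also hidden
divert(0)dnl
```

Only the lines shown and shown again should reach the output. In the second case XXX_ON restores diversion -1 rather than switching output on, which is exactly what keeps stray English out of the Spanish pages.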

The upshot of all this is that I now can write htm4l text that looks like this:

<center>
ENG_ON
These words are in English.
XXX_OFF
Only show this if m4_xxx is not defined.
XXX_ON
SPA_ON
Estas palabras son en Espa&ntilde;ol.
ALL_ON
</center>

which can produce three different html outputs, depending on the value of m4_lang and on whether m4_xxx is defined.

This leaves one question, which is: where do those two macros get defined? The simplest thing is to create a directory to hold the generated html for each of the three possibilities. Then, in the Makefile in the directory, you can pass the definition of the macro to m4 on the command line. If you use GNU make, you can write a Makefile that looks like this:

include /usr/local/www/m4/Makefile

%.html : ../htm4l/%.htm4l
	$(M4) $(M4FLAGS) $< > $@

M4FLAGS = -Dm4_lang=english
HTM4L = $(wildcard ../htm4l/*.htm4l)
HTML = $(patsubst %.htm4l, %.html, $(subst ../htm4l/,, $(HTM4L)))

all: $(HTML)

clean:
	rm -f $(HTML)

$(HTML): /usr/local/www/m4/html-macros

This is in my ~/html/project/english directory. This is a thing of beauty because the directory with this Makefile contains no other files at all (and no symbolic links to the htm4l files either). I just type make in this directory, and the html is generated from the htm4l in the sibling htm4l directory. Doing a make clean gets rid of it all. I have a directory like this for Spanish and another for xxx. The only difference anywhere is in a word or two in the Makefiles. Doing a make in ~/html/project/htm4l does a make in each of the three for me, generating about 60 html files in all.
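One consequence of passing the language only on the command line is worth noting: if -Dm4_lang is forgotten, m4_lang is left unexpanded, the ifelse test fails, and the pages quietly come out in Spanish. A guard along the following lines could go at the top of the macro file. It is a sketch rather than part of the macros shown earlier, and the wording of the message is an assumption:

```m4
ifdef(`m4_lang', `',
  `errprint(`m4_lang is not defined; pass -Dm4_lang=english (any other value selects Spanish)
')m4exit(1)')dnl
```

When m4_lang is defined the guard expands to nothing. Otherwise errprint writes the message to standard error and m4exit(1) stops m4 with a non-zero status, so make halts instead of generating pages in the wrong language.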

If you've waded through all this, you know more than enough about how to use m4 and the htm4l macros. There's probably not a lot left that I can tell you. Have fun...

© Terry Jones (terry <AT> Last modified: Mon Oct 2 02:21:32 CEST 2006