Original in fr John Perr
fr to en:John Perr
Linux user since 1994; he is one of the French editors of LinuxFocus.
This tutorial describes how to ease the maintenance of text or HTML files, using the m4 macro processor.
A macro language is often needed when editing text, and most text editors already have one among their features. Even the C compiler provides such a facility for programmers through the C preprocessor, CPP. When it is used to maintain configuration files or a small web site, the GNU/m4 macro processor can efficiently reduce the workload. The GNU/m4 macro processor is part of all Linux distributions and is a standard among Unix users.
In the following, we show how to use the GNU/m4 macro processor to maintain a set of HTML pages for a small web site. This system will help to keep the whole site coherent. Of course, there are dozens of ways to obtain the same result with Unix tools; that's the beauty of Unix.
This technique is used for the construction of the well-known sendmail.cf: there is an m4 macro kit available from Berkeley university, designed by Eric Allman.
The GNU/m4 macro processor is not limited to text and HTML editing. It can prove very useful to programmers wishing to extend the features of CPP, or to those wishing to have features equivalent to CPP in other languages.
A macro processor is a program which interprets commands (named macros) defined by the user. Macros are often embedded into the text to process. For instance, the following definition:
define(AUTHOR,`Agatha Christie<[email protected]>')
allows the word "AUTHOR" to be used anywhere in the text. It will be replaced with "Agatha Christie<[email protected]>" after the text has been processed with m4. There are, of course, more useful functions, as will be shown next.
Let's suppose we have to maintain a web site that offers the same pages in several languages. Moreover, each page has the same header and footer in order to give the site a coherent look. In order to keep things simple and thus avoid the use of a browser to see the result, our example will only deal with text. This will also allow people using lynx to easily browse our site. Here is the HTML code for one page:
<!-- Start of header -->
<HTML>
<HEAD>
<TITLE>Lynx homepage</TITLE>
<META name="description" content="Site lynx et m4">
</HEAD>
<BODY BGCOLOR="#FFFFFF" LINK="#008000" VLINK="#808080" ALINK="#8080FF">
<TABLE>
<TBODY>
<TR><TD align=middle colspan="2">
<H1>Lynx a fully-featured World Wide Web client for character-cell displays</H1>
<TR><TD align="left" valign="top" width="15%">
<a href="./index-en.html">English</A><BR>
<a href="./index-fr.html">French</A><BR>
<a href="./index-es.html">Spanish</A><BR>
<a href="./index-it.html">Italian</A><BR>
<a href="./index-de.html">German</A><BR>
<TD align=left>
<!-- End of header -->
<P>Links to the current sources and support materials for Lynx are maintained at
<A HREF="http://www.crl.com/~subir/lynx.html">Lynx links</A></P>
<P> and at the Lynx homepage
<A HREF="http://lynx.browser.org/">Lynx Information.</A></P>
<P>View these pages for information about Lynx, including new updates.</P>
<P>Lynx is distributed under the GNU General Public License (GPL) without
restrictions on usage or redistribution. The Lynx copyright statement,
"COPYHEADER", and GNU GPL, "COPYING", are included in the top-level directory
of the distribution. Lynx is supported by the Lynx user community, an entirely
volunteer (and unofficial) organization.</P>
<!-- Start of footer -->
</TBODY>
</TABLE>
<HR size="0" noshadow>
<FONT SIZE=-2>
<EM>Page maintained by John Perr.<BR>
Page updated on 25/07/99 - © <A HREF="mailto:[email protected]">lynx.browser.org</A>1999
</EM></FONT>
</BODY>
</HTML>
<!-- End of footer -->
Here is the result:
[Screenshots: the page rendered with lynx and with netscape]
All pages will have the same header and footer style; only the language and the body of
the page will differ. We are now going to design m4 macros to insert into the
HTML text of our pages in place of all the repetitive data.
Before going into the detail of the macros, let us have a look at the above example
written with such macros:
LYNX_TITRE(Lynx a fully-featured World Wide Web client for character-cell displays)
LYNX_ENTETE(Lynx homepage)
<P>Links to the current sources and support materials for Lynx are maintained at
<A HREF="http://www.crl.com/~subir/lynx.html"> Lynx links</A></P>
<P> and at the Lynx homepage
<A HREF="http://lynx.browser.org/"> Lynx Information.</A></P>
<P>View these pages for information about Lynx, including new updates.</P>
<P>Lynx is distributed under the GNU General Public License (GPL) without
restrictions on usage or redistribution. The Lynx copyright statement,
"COPYHEADER", and GNU GPL, "COPYING", are included in the top-level directory
of the distribution. Lynx is supported by the Lynx user community, an entirely
volunteer (and unofficial) organization.</P>
LYNX_PIED
Written this way, HTML pages are simpler and the text is not lost among HTML tags. To publish the page in other languages, a translation of this file has to be written for each one. The French version would be:
LYNX_TITRE(Lynx un navigateur en mode console)
LYNX_ENTETE(Un site pour les utilisateurs de lynx)
<P>Visitez le <A HREF="http://lynx.browser.org/"> site officiel de lynx</A>
pour plus d'informations sur Lynx, y compris les nouvelles mises à jour.</P>
<P>Les liens vers les sources de la version courante et divers supports pour
Lynx sont tenus à jour sur le site
<A HREF="http://www.crl.com/~subir/lynx.html"> liens Lynx</A>.</P>
<P>Lynx est distribue dans le cadre de la licence GNU (General Public License -
GPL) sans restriction sur son utilisation ni sa distribution. Les mentions des
droits de reproduction de Lynx, "COPYHEADER", et GNU GPL, "COPYING", sont
inclus dans la racine de l'arborescence de la distribution. Lynx est supporte
par la communaute des utilisateurs de Lynx, une communaute entièrement
benevole (et non-officielle).</P>
LYNX_PIED
For each language, the same macros LYNX_TITRE, LYNX_ENTETE and LYNX_PIED are used, but with different arguments. These three macros stand in for the HTML code of the header and footer. This is the main advantage of the system: the header and footer are defined once and stay consistent across the whole site. If their style has to change, only the macro definition file needs to be modified, instead of editing each page by hand.
The three macros used above do most of the formatting. Here is the file that defines them; comments follow:
divert(-1)
# File mac.css
# Version 1.0 M4 macros for Lynx
#
# A file trans-LANG.m4 is defined for each
# language, based on the french one.
# If no translation file exist,
# french is the default.
#
divert(0)
changequote({,})dnl # change quotes to curly braces
ifdef({LANG},,{define({LANG},{fr})})dnl # Default= french
include({trans-}LANG{.m4})dnl # call translation file
undefine({format})dnl # Suppress the format definition
define({_ANNEE_},esyscmd(date +%Y))dnl #Current year
define({LYNX_TITRE},{define(_TITLE_,$1)})dnl # First macro
dnl # Second macro
define({LYNX_ENTETE},{<!-- Header start -->
<HTML>
<HEAD>
<TITLE>$1</TITLE>
<META name="description" content="Site lynx and m4">
<META name="keywords" content="m4, lynx, GPL">
</HEAD>
<BODY BGCOLOR="#FFFFFF" LINK="#008000" VLINK="#808080" ALINK="#8080FF">
<TABLE>
<TBODY>
<TR><TD align=middle colspan="2">
<H1>_TITLE_</H1>
<TR><TD align="left" valign="top" width="15%">
<a href="./index-en.html">_ANGLAIS_</A><BR>
<a href="./index-fr.html">_FRANCAIS_</A><BR>
<a href="./index-es.html">_ESPAGNOL_</A><BR>
<a href="./index-it.html">_ITALIEN_</A><BR>
<a href="./index-de.html">_ALLEMAND_</A><BR>
<TD align=left>
<!-- end of header -->})dnl
dnl # Third macro
define({LYNX_PIED},{<!-- Start of footer -->
</TBODY>
</TABLE>
<HR size="0" noshadow>
<FONT SIZE=-2>
<EM>_MAINTENEUR_.<BR>
_MAJ_ esyscmd(date +%d/%m/%y) - © <A HREF="mailto:[email protected]">
lynx.browser.org</A> _ANNEE_</EM></FONT>
</BODY>
</HTML>
<!-- End of footer -->})dnl
Lines between "divert(-1)" and "divert(0)" act as comments. "divert" is one of the built-in macros of the m4 processor and redirects its output. Using -1 tells the processor to discard the lines that follow instead of writing them to the final HTML file, which is what we want. "divert(0)" restores normal output.
The "changequote" macro redefines the characters used to quote macro arguments. They are replaced here with curly braces because in text files, and especially French ones, quote characters are heavily used and would cause macros to be misinterpreted. Curly braces appear less often in text or HTML, which is why they have been chosen here.
The "ifdef" macro is used to test whether the macro LANG is defined and to default it to "fr" if not. The LANG macro is used to set the language. In the lines below, we shall see how to define it when calling m4, in order to choose the language of the HTML page.
The "include" line works like #include in C and is used to include an external file. We use it to load the language-specific macro definitions used in the header and footer. Here are the contents of that file for French and for English:
divert(-1)
# File trans-fr.m4
# Definitions for french
divert(0)
define({_ANGLAIS_},{Anglais})dnl
define({_FRANCAIS_},{Français})dnl
define({_ITALIEN_},{Italien})dnl
define({_ESPAGNOL_},{Espagnol})dnl
define({_ALLEMAND_},{Allemand})dnl
define({_WEBMASTER_},{John Perr})dnl
define({_MAINTENEUR_},{Page maintenue par _WEBMASTER_})dnl
define({_MAJ_},{Date de mise à jour:})dnl
divert(-1)
# File trans-en.m4
# Definitions for english
divert(0)
define({_ANGLAIS_},{English})dnl
define({_FRANCAIS_},{French})dnl
define({_ITALIEN_},{Italian})dnl
define({_ESPAGNOL_},{Spanish})dnl
define({_ALLEMAND_},{German})dnl
define({_WEBMASTER_},{John Perr})dnl
define({_MAINTENEUR_},{Page maintained by _WEBMASTER_})dnl
define({_MAJ_},{Page updated on })dnl
If you speak Spanish, Italian or German, you should be able to write similar files for these languages.
The "undefine" line removes the default definition of the built-in macro "format", which is not used here. If this line is omitted, every occurrence of the word "format" in the text disappears unless it is quoted, i.e., surrounded with curly braces. Requiring such quoting is not advisable when editing a simple web page.
Next comes the definition of the current year. It is obtained from the macro "esyscmd", which calls the Unix command "date". This command is also used within the definition of the footer in order to print the date at which the page has been updated.
The following line defines the first of our three main macros: LYNX_TITRE. This macro defines another macro called _TITLE_. This double definition is necessary in order to use the title several times within the header and footer of the page from one definition. Note the use of $1 to refer to the first argument of the macro.
The remaining lines define the two other main macros, LYNX_ENTETE and LYNX_PIED, which correspond to the contents of the header and footer of our HTML page except for the variable elements of the page. These are the page title (the argument $1 of LYNX_ENTETE), the heading (_TITLE_), the language names (_ANGLAIS_, _FRANCAIS_, _ESPAGNOL_, _ITALIEN_, _ALLEMAND_), the maintainer line (_MAINTENEUR_), the update label (_MAJ_) followed by the current date, and the current year (_ANNEE_).
The "dnl" which appears at the end of each line is a built-in macro of m4 meaning "Delete to New Line". It discards everything up to and including the next newline, so m4 does not generate an empty line after interpreting a macro definition.
Now that our system is set up, a web page is generated from these files by running m4
with the option -DLANG=XX,
where "XX" is the code to use for each language. Note that the -D option is used, as with gcc, to define a macro from the command line.
The table below presents the files used to generate the HTML pages and their role in this application.
File | Use |
index-XX.html | The body of the page, that is, text written by the author or the translator. It is different for each page and each language (the code is XX=en for English, es for Spanish, etc.). |
mac.css | Standard definitions. This file is common to all pages and all languages. It can be seen as a sort of style sheet. |
trans-XX.m4 | Standard definitions for one language. This file is common to all pages for one language (the code is XX=en for English, es for Spanish, etc.). |
Despite its power, the m4 macro processor cannot be compared to a scripting language like Perl or Tcl. Once its few peculiarities have been mastered, it is a quick and handy tool for processing text files. To learn more, consult the documentation bundled with your distribution. You should find a tutorial on m4, about 30 pages long, that covers all the aspects of the GNU/m4 macro processor. You can also have a look at the site of the Linux User Group of Bordeaux (ABUL), which is maintained with a kit of m4 macros similar to those presented here.
GNU/m4 is available from
ftp://prep.ai.mit.edu/pub/gnu/m4-1.4.tar.gz
Download the files presented here: The Lynx m4 macro kit
Thank you to Paul Kienzle for reviewing this article.