Back to Top

The New C:X Macros

Assembly language programmers of the 60’s had to develop some great tools just to preserve their sanity. Some of those tools, such as X macros, are still potentially useful today.

60年代的汇编程序员不得已要开发一些伟大的工具,仅仅是为了保持一直性。其中一些工具,时至今日依然可能有用处。

Sometimes a good idea just gets lost. Assembly language programmers of the 1960’s invented a clever way to use macros to initialize parallel tables. This technique works perfectly well in C and C++, but seems almost unknown. The technique can also be used in some cases as a replacement for, or as an enhancement to, the C99 feature known as designated initializers.

一些好的思想会随时间而磨灭。19世纪60年代的汇编程序员发明了一种聪明的办法,利用宏来初始化并列的几个表格。 这一技术在C和C++也很好用,但人们却似乎对其知之甚少。这一技术在一些情况下可以替代或增强C99的“分派初始化”特性

Designated Initializers 分派初始化

Designated initializers permit you to name in the brace-enclosed initializer the particular array element or member being initialized when you initialize an aggregate or union. (See my April 2001 column for more details.) As such, designated initializers are a partial solution to the software engineering problem of maintaining an enum type definition and a parallel array giving the names of the enum literals:

分派初始化允许在初始化聚合类型或union时,在大括号括起来的的初始化列表中指定要初始化的数组元素或成员名称。 (具体请参考我在2001年4月份在专栏内的文章。)通过这个办法,软件工程中维护一个枚举类型和一个与之匹配的为枚举值设置字符串名称的问题, 可以通过分派初始化得到部分解决:

#include <stdio.h>

enum COLOR {
    red,
    green,
    blue
};

char *color_name[] = {
    [red]="red",
    [green]="green",
    [blue]="blue"
};

int main() {
    enum COLOR c = red;
    printf("c=%s\n", color_name[c]);
    return 0;
}

Using designated initializers when initializing the array color_name has two advantages. First, it makes the initialization of the array independent of the order that the enum literals are declared in the definition of enum COLOR. Even if you rearrange the definition of the enum literals so that red has the integer value 2, green has the value 0, and blue has the value 1, the array color_name will have its elements initialized with the proper values in the right elements.

用分派初始化来初始化color_name数组时有2个好处: 一,可以使对数组的初始化独立于在COLOR枚举内枚举常出现的顺序。即使你重新排列了枚举常量的顺序,让red具有整数值2, green对应整数0,blue对应1,color_name数组内的元素依然可以正确地初始化。

Second, since the names of the enumeration literals are explicitly used in the initialization of color_name, it serves as a clue that if you add a new enum literal to the definition of COLOR that you should add a new initializer to color_name. (A comment, of course, would also be helpful.) Unfortunately, too often adding a new enum literal to a large program requires searching the program source files for the names of the old enum literals to find all of the switch statements, arrays, and other places that need to be aware of the new enum name. Note that in a large program, it is likely that the definition of enum COLOR would be in a header file along with an extern declaration for array color_name, and the definition and initialization of color_name would be in a separate source file since it must be compiled only once. An explicit reference to an enum literal is thus much better than an implicit one when you have to search several source files for uses of the enum.

二,由于在初始化color_name时,明确使用了枚举常量,这就暗示在为COLOR添加新的枚举常量时,也应该在color_name中添加一个新的初始式。 (当然,一条注释也是有用的)不幸的是,当往一个大的程序内频繁添加新的枚举常量时,就需要搜索整个源代码,查找旧枚举常量使用的地方, 例如switch语句、数组和其它用到该枚举的地方。注意在大型程序内,COLOR枚举的定义通常会和array_name的extern声明放在一个头文件中, color_name的定义和初始化放在另外一个代码文件中,因为这数组必须只编译一次。因此当需要搜索几个文件来查找是否使用了枚举时, 显示地引用一个枚举常量要比隐式地引用要好。

The use of designated initializers by color_name can also be a mixed blessing. Consider if a new enum literal white is added to COLOR between red and green, but no corresponding initialization is made to color_name. The array color_name will have the right number of elements since [blue], the highest numbered element is still initialized. The elements [red], [green], and [blue] will be initialized correctly. However, the element [white] will have not been initialized and will default to a null pointer. (The use of designated initializers does not change the fact that you need not initialize all of the elements of an array, and compilers need not issue a message if you only initialize part of an array.) The program will work as long as you do not attempt to reference the name of white. Hence the mixed blessing: if you manage to put the program with this bug into production, you would probably prefer that it mostly work. However, during testing, you would probably prefer that the program fail spectacularly and obviously.

分派初始化好坏参半。假设在COLOR中,要往red和green之间添加一个新的枚举常量white,但却没有在color_name中添加对应的初始式。 color_name数组仍然会有正确数量的元素,因为具有最高数值的枚举量blue在数组内是被初始化的。[red],[green]和[blue]元素会被正确初始化。 然而,[white]却没有被初始化,从而默认成为一个null指针。 (即使用了分派初始化,也不必初始化数组内的所有元素,编译器也不会因为数组只初始化了一部分而报告消息。) 只要你不使用数组内的white元素,程序就能正常工作。 这就是好坏参半之处:当你把这个bug放入产品时,你可能希望它最好可以工作。而在测试阶段时,你会希望程序会因此而明确失败。

X Macros

A better solution would be to automatically obligate programmers to add the initializer when adding a new enum literal, and to provide a single place to add both, even if the enum definition was in a header file and the array initialization was in a separate source file. The assembly language programmers had a preprocessor-based solution to this problem.

更好的解决办法是自动强制让程序员每添加一个新的枚举常量,就添加一个初始式, 并且在一个地方来把2者都加上 —— 即使枚举定义在一个头文件中,数组初始化在另外一个单独的源文件中。汇编程序员使用预处理器来解决这个问题。

Their solution depended upon features common to most preprocessors for most languages, and that are part of the traditional and standard preprocessors for C and C++. Those features are that macros can call other macros, that macros may be redefined, and that a nested macro call is not expanded until the macro that calls it is expanded during one of its calls. Consider a use of this technique:

他们的解决办法是基于对大多数语言来说都很常见的预处理器,这也属于c和c++传统和标准的预处理器的一部分。 所需要的功能包括嵌套宏调用,可以重定义宏,和内层的宏调用直到调用它们的宏被展开后才展开。考虑该技术的一个应用:

#include <stdio.h>

#define COLOR_TABLE \
    X(red, "red")       \
    X(green, "green")   \
    X(blue, "blue")

#define X(a, b) a,
enum COLOR {
    COLOR_TABLE
};
#undef X

#define X(a, b) b,
char *color_name[] = {
    COLOR_TABLE
};
#undef X

int main() {
    enum COLOR c = red;
    printf("c=%s\n", color_name[c]);
    return 0;
}

In this example, COLOR_TABLE is a macro that expands into a series of calls to a two-argument macro named X. At the time COLOR_TABLE is defined, the macro X has not been defined, but that is okay since the definition of macro X is not needed until COLOR_TABLE is actually used and expanded.

在这个例子中,COLOR_TABLE展开后成为一系列对一个2参数宏X的调用。 在COLOR_TABLED定义时,宏X还没有定义,但这没关系,因为直到COLOR_TABLE实际被使用和展开时才需要宏X的定义。

COLOR_TABLE is expanded twice by this program, and a different definition of the macro X is used by each expansion. The first time COLOR_TABLE is expanded, the macro X is defined to expand to its first argument followed by a comma, and thus the result of expanding COLOR_TABLE is a comma-separated list of the enum literals inside the definition of enum COLOR. After the expansion of COLOR_TABLE, I undefine the macro X, which prevents accidental use of the macro name and also allows X to be redefined later with a different body.

在本程序中,COLOR_TABLE展开了2次,每次使用一份不同的X定义。COLOR_TABLE第一次展开时,宏X被展开为第一个参数跟一个逗号, 使得COLOR_TABLE最终被展开为在COLOR枚举内定义的逗号分隔的一系列枚举常量。 展开COLOR_TABLE后,宏X被解除定义(undef),以避免误用,同时也使得X以后可以被重新定义。

The second time COLOR_TABLE is expanded, the macro X is defined to expand to its second argument followed by a comma, which results in a comma-separated list of string literals that properly initialize the array color_name. Again, after the expansion of COLOR_TABLE, the macro X is undefined to prevent accidental uses of the name.

COLOR_TABLE第2次展开时,X被定义为展开成其第2个参数后跟一个逗号,结果就成为一系列逗号分隔的字符串常量,正好把color_name数组正确地初始化。 在展开COLOR_TABLE后,宏X再次被undef,以避免误用。

The macro COLOR_TABLE, and its use of “X macros,” improves program maintainability in several ways. If you add a new line (X macro call) to COLOR_TABLE in order to declare a new enum literal, you are forced to add the initializer as well. You can reorder the lines in the table, or insert lines, or delete lines, and the color_name array will still be initialized properly. You have one location to edit to add a new enum literal even if the declaration of the enum type is in a header file and the initialization of the array is in a separate source file. For example, the definitions of COLOR_TABLE and enum COLOR could be in a header, and the source file that initializes the array color_name would include that header so it could expand COLOR_TABLE.

COLOR_TABLE宏以及X宏,从多个方面提高了程序的可维护性。如果要往COLOR_TABLE添加一行X宏的调用来声明一个新的枚举常量, 就必须要同时提供初始式。你可以重新排列表格中的行,添加新行或者删除行,color_name数组仍然能够正确初始化。 你只需在一个地方来添加新的枚举常量,即使枚举定义是在一个头文件中,数组的初始化是在一个单独的源文件中。 例如,COLOR_TABLE和COLOR枚举的定义可以放在一个头文件中,而初始化color_name数组的文件则包含该头文件以便能够展开COLOR_TABLE。

There is a potential problem with defining a macro that expands into a series of X macro calls. If the table gets big, you may exceed the compiler limit on the number of characters in a source line when defining the table macro or when expanding the macro. C90 only required that compilers accept lines with fewer than 510 characters after line continuation and macro expansion. C99 increases that limit to 4,095 or fewer characters. In practice, many compilers allow source lines much larger than the minimum limits, but it is wise to avoid defining very large macros.

不过,将一个宏展开为一系列的X宏调用可能会有一个问题。如果表格变得很大,宏展开后可能会超过编译器对于一行代码所允许的最大长度。 C90只要求编译器接受每行代码少于510个字符。C99将这个限制提高到4095。实际中,很多编译器都允许源代码行的长度远远超出标准规定的上限, 但还是应该尽量避免定义过大的宏。

To avoid the line length problem, you can put the X macro calls in a header file and include that header file in the places that would have expanded the table macro, as in the following.

为避免长度问题,可以将X宏调用放在一个头文件中,然后在需要展开表格的地方将头文件包含进来,例如:

// File: color_table.h
X(red, "red")
X(green, "green")
X(blue, "blue")

// File: main.c
#include <stdio.h>

#define X(a, b) a,
enum COLOR {
#include "color_table.h"
};
#undef X

#define X(a, b) b,
char *color_name[] = {
#include "color_table.h"
};
#undef X

int main() {
    enum COLOR c = red;
    printf("c=%s\n", color_name[c]);
    return 0;
}

Putting the X macro calls in a header file also has the advantage that you can use conditional compilation in building the table of X macro calls. For example, if some systems have the color purple instead of red, the header color_table.h could be:

将X宏的调用放在头文件中还有个好处是,可以在构建调用X宏的表时可以使用条件编译。 例如,如果某个系统有紫色而没有红色,头文件color_table.h可以是:

// File: color_table.h
#ifdef NO_RED
    X(purple, "purple")
#else
    X(red, "red")
#endif
    X(green, "green")
    X(blue, "blue")

Note that there is no requirement that X macros take two arguments. The alert reader probably has realized that in the previous examples the second argument to the X macro was a quoted string whose value was the same as the first argument. The X macro calls in COLOR_TABLE could have taken only one argument (the enum name). When COLOR_TABLE was expanded to initialize color_name, the X macro could have been defined to use the stringize operator to create the needed string literal. On the other hand, if you have more that one parallel table indexed by an enum, you might use X macros that take more than two arguments. Since the X macro is defined immediately before expanding calls to it, and undefined immediately afterwards, you might have a table macro containing X macro calls with two arguments and a different table macro containing X macro calls with five arguments. There is no conflict since a different definition of the X macro would be used when expanding the two table macros.

有一点值得注意的是,没有规定说X宏必须接受2个参数。细心的读者可能会发现在前一个例子中,X宏的第2个参数是一个引号括起来的字符串, 其内容和第一个参数相同。这样,在COLOR_TABLE中的X中调用就可以只接受1个参数(枚举名)。当COLOR_TABLE被展开来初始化color_name时, X宏可以使用“字符串化”运算符来创建所需的字符串常量。另一方面,如果需要定义一个以枚举值为索引的多个并列表格,X宏则需要使用多个参数。 因为对X宏的调用紧接在X宏的定义之后,并且在使用完后马上解除定义,这样就可以定义一个表格宏,其中用2个参数调用X宏, 而在另一个表格宏中用5个参数来调用X宏。这样做不会有冲突,因为在不同的表格宏中使用的是不同的X宏。

X macros might expand into something more complex than a copy of one of their arguments. Suppose we wanted to have a small gap in the value of our enum literals. We might use three-argument X macros as follows:

X宏也可以展开为一种更加复杂的形式,而不仅仅是复制某个参数。假设枚举常量之间有一处不是连续的。可以使用一个3参数的X宏,例如:

#define COLOR_TABLE \
    X(green, , "green") \
    X(red, =3, "red")   \
    X(blue, , "blue")

#define X(a, b, c) a b,
enum COLOR {
    COLOR_TABLE
};
#undef X

#define X(a, b, c) [a]=c,
char *color_name[] = {
    COLOR_TABLE
};
#undef X

The second parameter in the X macros calls is an optional initializer for the enum literal itself to set its value. This example also shows the C99 feature wherein a macro argument is allowed to consist of no tokens. Such an argument expands into nothing in the macro body. Thus, the definition for enum COLOR is:

X宏调用的第2个参数是可选的,用来为枚举常量赋初值。这个例子也展示了C99的一个特性,允许一个宏调用的实参为空。 这样一个实参在宏展开后没有任何效果。因此,宏COLOR的定义就是:

enum COLOR {
    green, red=3, blue,
};

Since our enum literals now have gaps in their values, the X macro used during the initialization of color_name expands into designated initializers.

既然我们的枚举常量已经不是连续的了,用于初始化color_name的X宏也就展开为分派初始式了。

The above definition of enum COLOR shows another C99 feature I have been quietly using in this article. C traditionally allows a trailing comma in a bracketed initializer list as an aid to machine-generated source text. C90, however, did not allow a corresponding trailing comma in a enum literal definition list, although many C compilers allowed the comma as an extension. C99 now permits the trailing comma for enum literal definitions. As shown above, the X macro expansions in all of my previous examples generated a trailing comma after the last enum literal definition. If you are using a compiler that does not accept the trailing comma, you can comma-separate the X macro calls in the table macro and not produce a comma in the X macro expansion as in the following.

上面COLOR枚举的定义展示了另外一项我在本文中使用的C99特性。传统的C允许大括号括起来的一系列初始式以逗号结束, 以便帮助机器自动代码生成。然而C90却不允许在枚举常量的定义列表中最后包含一个逗号,虽然很多C编译器都进行了扩展, 允许最后包含一个逗号。C99则正式允许枚举常量定义最后可以是一个逗号。如果编译器不允许这样做, 则可以在表格宏中以逗号来分隔X宏的调用,而在X宏的展开式中则不输出逗号,例如:

#define COLOR_TABLE \
    X(red, "red"),       \
    X(green, "green"),   \
    X(blue, "blue")

#define X(a, b) a
enum COLOR {
    COLOR_TABLE
};
#undef X

Acknowledgments 致谢

The X macro technique was used extensively in the operating system and utilities for the DECsystem-10 as early as 1968, and probably dates back further to PDP-1 and TX-0 programmers at MIT. Alan Martin introduced X macros to me in 1984. I wish to thank Alan Martin, Phil Budne, Bob Clements, Tom Hastings, Alan Kotok, Dave Nixon, and Pete Samson for providing me with historical background for this article.

X宏技术在操作系统和早在1968年的DECsystem-10的工具程序中应用广泛,甚至可以追溯到MIT的PDP-1和TX-0程序员。Alan Martin在1984年向我介绍了X宏。 我希望能够感谢Alan Martin, Phil Budne, Bob Clements, Tom Hastings, Alan Kotok, Dave Nixon, 和Pete Samson,他们为这篇文章提供了历史背景。

Randy Meyers is consultant providing training and mentoring in C, C++, and Java. He is the current chair of J11, the ANSI C committee, and previously was a member of J16 (ANSI C++) and the ISO Java Study Group. He worked on compilers for Digital Equipment Corporation for 16 years and was Project Architect for DEC C and C++. He can be reached at rmeyers@ix.netcom.com.

Randy Meyers是一名提供C、C++和Java培训的顾问。他目前是J11(ANSI C委员会)的主席,以前还是J16(ANSI C++委员会)和ISO Java研究组的成员。 他在DEC公司做了16年的编译器,并且是DEC C和C++的架构师。可以通过rmeyers@ix.netcom.com联系他。

原文地址 作者:Randy Meyers 翻译:李清