The central data structure used by the internal representation is the
tree
. These nodes, while all of the C type tree
, are of
many varieties. A tree
is a pointer type, but the object to
which it points may be of a variety of types. From this point forward,
we will refer to trees in ordinary type, rather than in this
font
, except when talking about the actual C type tree
.
You can tell what kind of node a particular tree is by using the
TREE_CODE
macro. Many, many macros take a trees as input and
return trees as output. However, most macros require a certain kinds of
tree node as input. In other words, there is a type-system for trees,
but it is not reflected in the C type-system.
For safety, it is useful to configure G++ with --enable-checking
.
Although this results in a significant performance penalty (since all
tree types are checked at run-time), and is therefore inappropriate in a
release version, it is extremely helpful during the development process.
Many macros behave as predicates. Many, although not all, of these
predicates end in `_P'. Do not rely on the result type of these
macros being of any particular type. You may, however, rely on the fact
that the type can be compared to 0
, so that statements like
if (TEST_P (t) && !TEST_P (y)) x = 1;
and
int i = (TEST_P (t) != 0);
are legal. Macros that return int
values now may be changed to
return tree
values, or other pointers in the future. Even those
that continue to return int
may return multiple non-zero codes
where previously they returned only zero and one. Therefore, you should
not write code like
if (TEST_P (t) == 1)
as this code is not guaranteed to work correctly in the future.
You should not take the address of values returned by the macros or functions described here. In particular, no guarantee is given that the values are lvalues.
In general, the names of macros are all in uppercase, while the names of functions are entirely in lower case. There are rare exceptions to this rule. You should assume that any macro or function whose name is made up entirely of uppercase letters may evaluate its arguments more than once. You may assume that a macro or function whose name is made up entirely of lowercase letters will evaluate its arguments only once.
The error_mark_node
is a special tree. Its tree code is
ERROR_MARK
, but since there is only ever one node with that code,
the usual practice is to compare the tree against
error_mark_node
. (This test is just a test for pointer
equality.) If an error has occurred during front-end processing the
flag errorcount
will be set. If the front-end has encountered
code it cannot handle, it will issue a message to the user and set
sorrycount
. When these flags are set, any macro or function
which normally returns a tree of a particular kind may instead return
the error_mark_node
. Thus, if you intend to do any processing of
erroneous code, you must be prepared to deal with the
error_mark_node
.
Occasionally, a particular tree slot (like an operand to an expression, or a particular field in a declaration) will be referred to as "reserved for the back-end." These slots are used to store RTL when the tree is converted to RTL for use by the GCC back-end. However, if that process is not taking place (e.g., if the front-end is being hooked up to an intelligent editor), then those slots may be used by the back-end presently in use.
If you encounter situations that do not match this documentation, such as tree nodes of types not mentioned here, or macros documented to return entities of a particular kind that instead return entities of some different kind, you have found a bug, either in the front-end or in the documentation. Please report these bugs as you would any other bug.
This section is not here yet.
An IDENTIFIER_NODE
represents a slightly more general concept
that the standard C or C++ concept of identifier. In particular, an
IDENTIFIER_NODE
may contain a `$', or other extraordinary
characters.
There are never two distinct IDENTIFIER_NODE
s representing the
same identifier. Therefore, you may use pointer equality to compare
IDENTIFIER_NODE
s, rather than using a routine like strcmp
.
You can use the following macros to access identifiers:
IDENTIFIER_POINTER
char*
. This string is always NUL
-terminated, and contains
no embedded NUL
characters.
IDENTIFIER_LENGTH
IDENTIFIER_POINTER
, not
including the trailing NUL
. This value of
IDENTIFIER_POINTER (x)
is always the same as strlen
(IDENTIFIER_POINTER (x))
.
IDENTIFIER_OPNAME_P
IDENTIFIER_POINTER
or the
IDENTIFIER_LENGTH
.
IDENTIFIER_TYPENAME_P
TREE_TYPE
of
the IDENTIFIER_NODE
holds the type to which the conversion
operator converts.
Two common container data structures can be represented directly with
tree nodes. A TREE_LIST
is a singly linked list containing two
trees per node. These are the TREE_PURPOSE
and TREE_VALUE
of each node. (Often, the TREE_PURPOSE
contains some kind of
tag, or additional information, while the TREE_VALUE
contains the
majority of the payload. In other cases, the TREE_PURPOSE
is
simply NULL_TREE
, while in still others both the
TREE_PURPOSE
and TREE_VALUE
are of equal stature.) Given
one TREE_LIST
node, the next node is found by following the
TREE_CHAIN
. If the TREE_CHAIN
is NULL_TREE
, then
you have reached the end of the list.
A TREE_VEC
is a simple vector. The TREE_VEC_LENGTH
is an
integer (not a tree) giving the number of nodes in the vector. The
nodes themselves are accessed using the TREE_VEC_ELT
macro, which
takes two arguments. The first is the TREE_VEC
in question; the
second is an integer indicating which element in the vector is desired.
The elements are indexed from zero.
Go to the first, previous, next, last section, table of contents.