The internal representation for expressions is for the most part quite straightforward. However, there are a few facts that one must bear in mind. In particular, the expression "tree" is actually a directed acyclic graph. (For example, there may be many references to the integer constant zero throughout the source program; many of these will be represented by the same expression node.) You should not rely on certain kinds of nodes being shared, nor should you rely on certain kinds of nodes being unshared.
The following macros can be used with all expression nodes:
TREE_TYPE
In what follows, some nodes that one might expect to always have type
bool are documented to have either integral or boolean type. At
some point in the future, the C front-end may also make use of this same
intermediate representation, and at that point these nodes will
certainly have integral type. The previous sentence is not meant to
imply that the C++ front-end does not or will not give these nodes
integral type.
Below, we list the various kinds of expression nodes. Except where
noted otherwise, the operands to an expression are accessed using the
TREE_OPERAND macro. For example, to access the first operand to
a binary plus expression expr, use:
TREE_OPERAND (expr, 0)
As this example indicates, the operands are zero-indexed.
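To make the zero-based indexing concrete, here is a toy model of an expression node. The toy_node type, the toy_make_binary helper, and the TOY_OPERAND macro are invented for this illustration; they are not GCC's tree type or its TREE_OPERAND macro, and only mimic the access pattern.

```cpp
// Hypothetical stand-in for an expression node: a code plus an operand array.
struct toy_node {
  const char *code;       // e.g. "PLUS_EXPR"; purely descriptive here
  toy_node *operands[3];  // unused slots are null
};

// Hypothetical analogue of TREE_OPERAND: operand I of NODE, counting from 0.
#define TOY_OPERAND(NODE, I) ((NODE)->operands[(I)])

// Builds a binary node whose operand 0 is LHS and whose operand 1 is RHS.
inline toy_node toy_make_binary(const char *code, toy_node *lhs, toy_node *rhs) {
  toy_node n = {code, {lhs, rhs, nullptr}};
  return n;
}
```

Passing the same node as both operands (as in x + x) also shows, in miniature, why the "tree" is really a graph: two operand slots may refer to one shared node.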
The table below begins with constants, moves on to unary expressions, then proceeds to binary expressions, and concludes with various other kinds of expressions:
INTEGER_CST
These nodes represent integer constants. The type of these constants is obtained with TREE_TYPE; they are not always of type
int. In particular, char constants are represented with
INTEGER_CST nodes. The value of the integer constant e is
given by:
((TREE_INT_CST_HIGH (e) << HOST_BITS_PER_WIDE_INT)
+ TREE_INT_CST_LOW (e))
HOST_BITS_PER_WIDE_INT is at least thirty-two on all platforms. Both
TREE_INT_CST_HIGH and TREE_INT_CST_LOW return a
HOST_WIDE_INT. The value of an INTEGER_CST is interpreted
as a signed or unsigned quantity depending on the type of the constant.
In general, the expression given above will overflow, so it should not
be used to calculate the value of the constant.
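One way to see why the expression overflows, and how a consumer might avoid that, is to model the two halves explicitly. The sketch below assumes, purely for illustration, that each half is 32 bits wide and that a 64-bit type is available to hold the combined value; toy_int_cst and both helper functions are hypothetical names, not part of GCC.

```cpp
#include <cstdint>

// Hypothetical model: each half is 32 bits wide, standing in for a
// HOST_WIDE_INT whose width is HOST_BITS_PER_WIDE_INT == 32.
struct toy_int_cst {
  std::uint32_t low;   // plays the role of TREE_INT_CST_LOW
  std::uint32_t high;  // plays the role of TREE_INT_CST_HIGH
};

// Combine the halves in a type twice as wide, so the shift cannot overflow.
inline std::uint64_t toy_unsigned_value(toy_int_cst c) {
  return (static_cast<std::uint64_t>(c.high) << 32) | c.low;
}

// Reinterpret the same 64 bits as a signed quantity. The conversion is
// implementation-defined before C++20; on two's-complement targets it
// simply reinterprets the bit pattern.
inline std::int64_t toy_signed_value(toy_int_cst c) {
  return static_cast<std::int64_t>(toy_unsigned_value(c));
}
```

The same bit pattern yields different values depending on whether the constant's type is signed or unsigned, which is why the type must be consulted before interpreting the halves.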
The variable integer_zero_node is an integer constant with value
zero. Similarly, integer_one_node is an integer constant with
value one. The size_zero_node and size_one_node variables
are analogous, but have type size_t rather than int.
The function tree_int_cst_lt is a predicate which holds if its
first argument is less than its second. Both constants are assumed to
have the same signedness (i.e., either both should be signed or both
should be unsigned). The full width of the constant is used when doing
the comparison; the usual rules about promotions and conversions are
ignored. Similarly, tree_int_cst_equal holds if the two
constants are equal. The tree_int_cst_sgn function returns the
sign of a constant. The value is 1, 0, or -1
according to whether the constant is greater than, equal to, or less
than zero. Again, the signedness of the constant's type is taken into
account; an unsigned constant is never less than zero, no matter what
its bit-pattern.
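The sign rule just described can be sketched for a 64-bit pattern as follows. This is not the implementation of tree_int_cst_sgn; toy_int_cst_sgn is a hypothetical helper, and its is_unsigned parameter stands in for the signedness that GCC would read from the constant's type.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the described behaviour for a 64-bit
// pattern: returns 1, 0, or -1.
inline int toy_int_cst_sgn(std::uint64_t bits, bool is_unsigned) {
  if (bits == 0) return 0;       // zero regardless of signedness
  if (is_unsigned) return 1;     // an unsigned constant is never negative
  return (bits >> 63) ? -1 : 1;  // signed: the top bit decides
}
```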
REAL_CST
COMPLEX_CST
__complex__ whose parts are constant nodes. The
TREE_REALPART and TREE_IMAGPART return the real and the
imaginary parts respectively.
STRING_CST
TREE_STRING_LENGTH
returns the length of the string, as an int. The
TREE_STRING_POINTER is a char* containing the string
itself. The string is not necessarily NUL-terminated, and it may contain
embedded NUL characters. Therefore, the
TREE_STRING_LENGTH includes the trailing NUL if it is
present.
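The practical consequence is that consumers should use the stored length rather than scanning for a NUL. The sketch below uses a hypothetical toy_string_cst struct (not a GCC type) whose two fields play the roles of TREE_STRING_POINTER and TREE_STRING_LENGTH.

```cpp
#include <cstring>

// Hypothetical stand-in for a STRING_CST: a pointer plus an explicit length.
struct toy_string_cst {
  const char *pointer;  // plays the role of TREE_STRING_POINTER
  int length;           // plays the role of TREE_STRING_LENGTH
};

// The literal "ab\0cd" occupies six bytes: five written characters
// (one of them an embedded NUL) plus the terminator the compiler appends.
inline toy_string_cst toy_example_string() {
  static const char bytes[] = "ab\0cd";
  return toy_string_cst{bytes, static_cast<int>(sizeof bytes)};
}
```

Here std::strlen would stop at the embedded NUL and report 2, while the stored length is 6.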
FIXME: How are wide strings represented?
PTRMEM_CST
PTRMEM_CST_CLASS is the class type (either a RECORD_TYPE
or UNION_TYPE) within which the pointer points, and the
PTRMEM_CST_MEMBER is the declaration for the pointed-to object.
Note that the DECL_CONTEXT for the PTRMEM_CST_MEMBER is in
general different from the PTRMEM_CST_CLASS. For example,
given:
struct B { int i; };
struct D : public B {};
int D::*dp = &D::i;
The PTRMEM_CST_CLASS for &D::i is D, even though
the DECL_CONTEXT for the PTRMEM_CST_MEMBER is B,
since B::i is a member of B, not D.
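The same distinction is visible at the language level, without looking at the internal representation at all. In standard C++ the expression &D::i has type "pointer to member of B", and is converted to int D::* only for the initialization. The function name read_through_dp below is made up for the example.

```cpp
#include <type_traits>

struct B { int i; };
struct D : public B {};

// &D::i names the member B::i, so its type is "pointer to member of B".
static_assert(std::is_same<decltype(&D::i), int B::*>::value,
              "&D::i has type int B::*");

// The pointer is implicitly converted to int D::* for the initialization.
inline int read_through_dp(D &d) {
  int D::*dp = &D::i;
  return d.*dp;
}
```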
VAR_DECL
NEGATE_EXPR
BIT_NOT_EXPR
TRUTH_NOT_EXPR
PREDECREMENT_EXPR
PREINCREMENT_EXPR
POSTDECREMENT_EXPR
POSTINCREMENT_EXPR
In the case of PREDECREMENT_EXPR and
PREINCREMENT_EXPR, the value of the expression is the value
resulting after the increment or decrement; in the case of
POSTDECREMENT_EXPR and POSTINCREMENT_EXPR, it is the value
before the increment or decrement occurs. The type of the operand, like
that of the result, will be either integral, boolean, or floating-point.
ADDR_EXPR
When the address of a label is taken (a GNU extension), the operand of the ADDR_EXPR will be a
LABEL_DECL. The type of such an expression is void*.
If the object addressed is not an lvalue, a temporary is created, and
the address of the temporary is used.
INDIRECT_REF
FIX_TRUNC_EXPR
FLOAT_EXPR
COMPLEX_EXPR
CONJ_EXPR
REALPART_EXPR
IMAGPART_EXPR
NON_LVALUE_EXPR
NOP_EXPR
For example, converting a char* to an
int* does not require any code to be generated; such a conversion is
represented by a NOP_EXPR. The single operand is the expression
to be converted. The conversion from a pointer to a reference is also
represented with a NOP_EXPR.
CONVERT_EXPR
These nodes are similar to NOP_EXPRs, but are used in those
situations where code may need to be generated. For example, if an
int* is converted to an int, code may need to be generated
on some platforms. These nodes are never used for C++-specific
conversions, like conversions between pointers to different classes in
an inheritance hierarchy. Any adjustments that need to be made in such
cases are always indicated explicitly. Similarly, a user-defined
conversion is never represented by a CONVERT_EXPR; instead, the
function calls are made explicit.
THROW_EXPR
These nodes represent throw expressions. The single operand is
an expression for the code that should be executed to throw the
exception. However, there is one implicit action not represented in
that expression; namely the call to __throw. This function takes
no arguments. If setjmp/longjmp exceptions are used, the
function __sjthrow is called instead. The normal GCC back-end
uses the function emit_throw to generate this code; you can
examine this function to see what needs to be done.
LSHIFT_EXPR
RSHIFT_EXPR
BIT_IOR_EXPR
BIT_XOR_EXPR
BIT_AND_EXPR
TRUTH_ANDIF_EXPR
TRUTH_ORIF_EXPR
TRUTH_AND_EXPR
TRUTH_OR_EXPR
TRUTH_XOR_EXPR
PLUS_EXPR
MINUS_EXPR
MULT_EXPR
TRUNC_DIV_EXPR
TRUNC_MOD_EXPR
RDIV_EXPR
The result of a TRUNC_DIV_EXPR is always rounded towards zero.
The TRUNC_MOD_EXPR of two operands a and b is
always a - (a/b)*b where the division is as if computed by a
TRUNC_DIV_EXPR.
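The relationship between truncating division and its remainder can be checked with ordinary C++ integer arithmetic, since integer division in C++11 and later also truncates towards zero. The function names below are hypothetical; the sketch only illustrates the arithmetic, in which the remainder equals a - (a/b)*b, and is not how GCC evaluates these nodes.

```cpp
// Since C++11, '/' on integers truncates towards zero, which matches the
// rounding described for TRUNC_DIV_EXPR. B must be non-zero.
inline int toy_trunc_div(int a, int b) { return a / b; }

// Remainder defined from the truncating quotient: a - (a / b) * b.
inline int toy_trunc_mod(int a, int b) { return a - toy_trunc_div(a, b) * b; }
```

With a = -7 and b = 2 the quotient is -3 (not -4) and the remainder is -1, so the remainder takes the sign of the dividend.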
ARRAY_REF
EXACT_DIV_EXPR
LT_EXPR
LE_EXPR
GT_EXPR
GE_EXPR
EQ_EXPR
NE_EXPR
MODIFY_EXPR
The left-hand side will be a VAR_DECL, INDIRECT_REF, COMPONENT_REF, or
other lvalue.
These nodes are used to represent not only assignment with `=' but
also compound assignments (like `+='), by reduction to `='
assignment. In other words, the representation for `i += 3' looks
just like that for `i = i + 3'.
INIT_EXPR
These nodes are just like MODIFY_EXPR, but are used only when a
variable is initialized, rather than assigned to subsequently.
COMPONENT_REF
FIELD_DECL for the data member.
COMPOUND_EXPR
COND_EXPR
These nodes represent ?: expressions. The first operand
is of boolean or integral type. If it evaluates to a non-zero value,
the second operand should be evaluated, and returned as the value of the
expression. Otherwise, the third operand is evaluated, and returned as
the value of the expression. As a GNU extension, the middle operand of
the ?: operator may be omitted in the source, like this:
x ? : 3
which is equivalent to
x ? x : 3
assuming that x is an expression without side-effects. However,
in the case that the first operand causes side-effects, the
side-effects occur only once. Consumers of the internal representation
do not need to worry about this oddity; the second operand will
always be present in the internal representation.
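The "evaluated only once" behaviour can be modelled in standard C++, without relying on the GNU extension itself. In the sketch below, toy_elvis and evaluations_for are hypothetical helpers: the first operand is passed as a callable so that its side-effects can be counted.

```cpp
// Hypothetical helper: evaluates FIRST exactly once and reuses the result
// as the second operand, which is how 'first() ?: fallback' behaves.
template <typename F>
inline int toy_elvis(F first, int fallback) {
  int value = first();  // side-effects happen here, once
  return value ? value : fallback;
}

// Counts how often the first operand is evaluated for a given value X.
inline int evaluations_for(int x) {
  int count = 0;
  toy_elvis([&] { ++count; return x; }, 3);
  return count;
}
```

Whether the first operand is zero or non-zero, the counter ends at one, whereas a literal rewrite to first() ? first() : fallback would evaluate it twice in the non-zero case.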
CALL_EXPR
POINTER_TYPE. The second operand is a TREE_LIST. The
arguments to the call appear left-to-right in the list. The
TREE_VALUE of each list node contains the expression
corresponding to that argument. (The value of TREE_PURPOSE for
these nodes is unspecified, and should be ignored.) For non-static
member functions, there will be an operand corresponding to the
this pointer. There will always be expressions corresponding to
all of the arguments, even if the function is declared with default
arguments and some arguments are not explicitly provided at the call
sites.
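Walking such an argument list amounts to following a singly linked chain and reading the value stored in each node. The sketch below uses a hypothetical toy_list struct in place of a real TREE_LIST; its value and chain fields play the roles of TREE_VALUE and TREE_CHAIN, and plain integers stand in for the argument expressions.

```cpp
#include <vector>

// Hypothetical stand-in for a TREE_LIST node.
struct toy_list {
  int value;        // plays the role of TREE_VALUE (an expression in GCC)
  toy_list *chain;  // plays the role of TREE_CHAIN; null ends the list
};

// Collects the argument values left-to-right, as they appear in the call.
inline std::vector<int> toy_collect_args(const toy_list *args) {
  std::vector<int> out;
  for (const toy_list *p = args; p != nullptr; p = p->chain)
    out.push_back(p->value);
  return out;
}
```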
STMT_EXPR
int f() { return ({ int j; j = 3; j + 7; }); }
In other words, a sequence of statements may occur where a single
expression would normally appear. The STMT_EXPR node represents
such an expression. The STMT_EXPR_STMT gives the statement
contained in the expression; this is always a COMPOUND_STMT. The
value of the expression is the value of the last sub-statement in the
COMPOUND_STMT. More precisely, the value is the value computed
by the last EXPR_STMT in the outermost scope of the
COMPOUND_STMT. For example, in:
({ 3; })
the value is 3 while in:
({ if (x) { 3; } })
(represented by a nested COMPOUND_STMT), there is no value. If
the STMT_EXPR does not yield a value, its type will be
void.
BIND_EXPR
TREE_CHAIN field. These
will never require cleanups. The scope of these variables is just the
body of the BIND_EXPR. The body of the BIND_EXPR is the
second operand.
LOOP_EXPR
LOOP_EXPR_BODY
represents the body of the loop. It should be executed forever, unless
an EXIT_EXPR is encountered.
EXIT_EXPR
LOOP_EXPR. The single operand is the condition; if it is
non-zero, then the loop should be exited. An EXIT_EXPR will only
appear within a LOOP_EXPR.
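In source terms, a LOOP_EXPR containing an EXIT_EXPR corresponds roughly to an unconditional loop with a conditional break. The function below is a hypothetical illustration of that shape, not output produced by the front-end.

```cpp
// Counts iterations of a loop shaped like
//   LOOP_EXPR { EXIT_EXPR (i >= limit); ++i; }
inline int toy_loop_until(int limit) {
  int i = 0;
  for (;;) {                // LOOP_EXPR: repeat forever
    if (i >= limit) break;  // EXIT_EXPR: leave when the condition is non-zero
    ++i;
  }
  return i;
}
```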
CLEANUP_POINT_EXPR
CONSTRUCTOR
TREE_LIST. If the TREE_TYPE of the
CONSTRUCTOR is a RECORD_TYPE or UNION_TYPE, then
the TREE_PURPOSE of each node in the TREE_LIST will be a
FIELD_DECL and the TREE_VALUE of each node will be the
expression used to initialize that field. You should not depend on the
fields appearing in any particular order, nor should you assume that all
fields will be represented. Unrepresented fields may be assigned any
value.
If the TREE_TYPE of the CONSTRUCTOR is an
ARRAY_TYPE, then the TREE_PURPOSE of each element in the
TREE_LIST will be an INTEGER_CST. This constant indicates
which element of the array (indexed from zero) is being assigned to;
again, the TREE_VALUE is the corresponding initializer. If the
TREE_PURPOSE is NULL_TREE, then the initializer is for the
next available array element.
Conceptually, before any initialization is done, the entire area of
storage is initialized to zero.
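The array case can be sketched as a small interpreter over (index, value) pairs. Everything here is hypothetical: toy_init stands in for one TREE_LIST node, a negative index stands in for a NULL_TREE TREE_PURPOSE, and "next available element" is assumed to mean the element following the one most recently initialized.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical initializer entry: INDEX < 0 stands in for a NULL_TREE
// TREE_PURPOSE, meaning "the next available element".
struct toy_init {
  long index;
  int value;  // stands in for the TREE_VALUE initializer expression
};

// Applies INITS to an array of SIZE elements that starts out all zero.
inline std::vector<int> toy_build_array(std::size_t size,
                                        const std::vector<toy_init> &inits) {
  std::vector<int> result(size, 0);  // conceptual zero-initialization
  std::size_t next = 0;              // assumed meaning of "next available"
  for (const toy_init &init : inits) {
    std::size_t at =
        init.index < 0 ? next : static_cast<std::size_t>(init.index);
    if (at < size) result[at] = init.value;
    next = at + 1;
  }
  return result;
}
```

For a five-element array with entries (2, 7) and (unspecified, 8), the result is 0, 0, 7, 8, 0: elements with no initializer keep the conceptual zero.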
SAVE_EXPR
SAVE_EXPR represents an expression (possibly involving
side-effects) that is used more than once. The side-effects should
occur only the first time the expression is evaluated. Subsequent uses
should just reuse the computed value. The first operand to the
SAVE_EXPR is the expression to evaluate. The side-effects should
be executed where the SAVE_EXPR is first encountered in a
depth-first preorder traversal of the expression tree.
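A consumer can implement this behaviour by caching the value the first time the node is reached. The class below is a minimal sketch under that assumption; toy_save_expr is a hypothetical name, and a std::function returning int stands in for the wrapped expression.

```cpp
#include <functional>
#include <utility>

// Hypothetical model of a SAVE_EXPR: wraps an expression, evaluates it on
// first use, and hands back the saved value afterwards.
class toy_save_expr {
 public:
  explicit toy_save_expr(std::function<int()> expr) : expr_(std::move(expr)) {}

  int value() {
    if (!evaluated_) {
      saved_ = expr_();  // side-effects occur here, exactly once
      evaluated_ = true;
    }
    return saved_;
  }

 private:
  std::function<int()> expr_;
  bool evaluated_ = false;
  int saved_ = 0;
};
```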
TARGET_EXPR
TARGET_EXPR represents a temporary object. The first operand
is a VAR_DECL for the temporary variable. The second operand is
the initializer for the temporary. The initializer is evaluated, and
copied (bitwise) into the temporary.
Often, a TARGET_EXPR occurs on the right-hand side of an
assignment, or as the second operand to a comma-expression which is
itself the right-hand side of an assignment, etc. In this case, we say
that the TARGET_EXPR is "normal"; otherwise, we say it is
"orphaned". For a normal TARGET_EXPR the temporary variable
should be treated as an alias for the left-hand side of the assignment,
rather than as a new temporary variable.
The third operand to the TARGET_EXPR, if present, is a
cleanup-expression (i.e., destructor call) for the temporary. If the
TARGET_EXPR is orphaned, then the cleanup must be executed when the
statement containing the TARGET_EXPR is complete. These cleanups must
always be executed in the order opposite to that in which they were
encountered. Note that if a temporary is created on one branch of a
conditional operator (i.e., in the second or third operand to a
COND_EXPR), the cleanup must be run only if that branch is
actually executed.
See STMT_IS_FULL_EXPR_P for more information about running these
cleanups.
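The two ordering rules (reverse order, and only for temporaries that were actually created) can be sketched with a stack of pending cleanups. The toy_cleanup_scope class and toy_cleanup_order function are hypothetical; they model one statement in which a temporary 'a' is always created and either 'b' or 'c' is created on one branch of a conditional.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical bookkeeping for one statement: cleanups are registered as
// temporaries are created and run in reverse order at the end.
class toy_cleanup_scope {
 public:
  void add(std::function<void()> cleanup) {
    pending_.push_back(std::move(cleanup));
  }

  void run_all() {
    while (!pending_.empty()) {
      std::function<void()> cleanup = std::move(pending_.back());
      pending_.pop_back();
      cleanup();
    }
  }

 private:
  std::vector<std::function<void()>> pending_;
};

// Returns the order in which cleanups ran, one letter per temporary.
inline std::string toy_cleanup_order(bool cond) {
  std::string log;
  toy_cleanup_scope scope;
  scope.add([&log] { log += 'a'; });
  if (cond)
    scope.add([&log] { log += 'b'; });  // registered only on this branch
  else
    scope.add([&log] { log += 'c'; });
  scope.run_all();
  return log;
}
```

Registering a cleanup only when its branch executes is what keeps the cleanup for the untaken branch from running.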
AGGR_INIT_EXPR
AGGR_INIT_EXPR represents the initialization as the return
value of a function call, or as the result of a constructor. An
AGGR_INIT_EXPR will only appear as the second operand of a
TARGET_EXPR. The first operand to the AGGR_INIT_EXPR is
the address of a function to call, just as in a CALL_EXPR. The
second operand holds the arguments to pass to that function, as a
TREE_LIST, again in a manner similar to that of a
CALL_EXPR. The value of the expression is that returned by the
function.
If AGGR_INIT_VIA_CTOR_P holds for the AGGR_INIT_EXPR, then
the initialization is via a constructor call. The address of the third
operand of the AGGR_INIT_EXPR, which is always a VAR_DECL,
is taken, and this value replaces the first argument in the argument
list. In this case, the value of the expression is the VAR_DECL
given by the third operand to the AGGR_INIT_EXPR; constructors do
not return a value.