Thursday, August 2, 2007

null pointers

Section 5. Null Pointers

5.1: What is this infamous null pointer, anyway?

A: The language definition states that for each pointer type, there
is a special value -- the "null pointer" -- which is
distinguishable from all other pointer values and which is
"guaranteed to compare unequal to a pointer to any object or
function." That is, the address-of operator & will never yield
a null pointer, nor will a successful call to malloc().
(malloc() does return a null pointer when it fails, and this is
a typical use of null pointers: as a "special" pointer value
with some other meaning, usually "not allocated" or "not
pointing anywhere yet.")

A null pointer is conceptually different from an uninitialized
pointer. A null pointer is known not to point to any object or
function; an uninitialized pointer might point anywhere. See
also questions 1.30, 7.1, and 7.31.
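
As a minimal sketch of the "not allocated" usage described above,
the helper below returns a null pointer when malloc() fails (or
when it is handed a null pointer itself). The name copy_string is
hypothetical, chosen only for illustration; it is not part of any
library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a freshly allocated copy of s,
 * or a null pointer if s is null or malloc() fails. */
char *copy_string(const char *s)
{
    char *copy;

    if (s == NULL)
        return NULL;            /* nothing to copy */

    copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return NULL;            /* allocation failed: "not allocated" */

    strcpy(copy, s);
    return copy;
}
```

A caller would test the result against a null pointer before
using it, and pass it to free() when done.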

As mentioned above, there is a null pointer for each pointer
type, and the internal values of null pointers for different
types may be different. Although programmers need not know the
internal values, the compiler must always be informed which type
of null pointer is required, so that it can make the distinction
if necessary (see questions 5.2, 5.5, and 5.6 below).

References: K&R1 Sec. 5.4 pp. 97-8; K&R2 Sec. 5.4 p. 102; ANSI
Sec. 3.2.2.3; ISO Sec. 6.2.2.3; Rationale Sec. 3.2.2.3; H&S
Sec. 5.3.2 pp. 121-3.

5.2: How do I get a null pointer in my programs?

A: According to the language definition, a constant 0 in a pointer
context is converted into a null pointer at compile time. That
is, in an initialization, assignment, or comparison when one
side is a variable or expression of pointer type, the compiler
can tell that a constant 0 on the other side requests a null
pointer, and generate the correctly-typed null pointer value.
Therefore, the following fragments are perfectly legal:

char *p = 0;
if(p != 0)

(See also question 5.3.)

However, an argument being passed to a function is not
necessarily recognizable as a pointer context, and the compiler
may not be able to tell that an unadorned 0 "means" a null
pointer. To generate a null pointer in a function call context,
an explicit cast may be required, to force the 0 to be
recognized as a pointer. For example, the Unix system call
execl takes a variable-length, null-pointer-terminated list of
character pointer arguments, and is correctly called like this:

execl("/bin/sh", "sh", "-c", "date", (char *)0);

If the (char *) cast on the last argument were omitted, the
compiler would not know to pass a null pointer, and would pass
an integer 0 instead. (Note that many Unix manuals get this
example wrong.)

When function prototypes are in scope, argument passing becomes
an "assignment context," and most casts may safely be omitted,
since the prototype tells the compiler that a pointer is
required, and of which type, enabling it to correctly convert an
unadorned 0. Function prototypes cannot provide the types for
variable arguments in variable-length argument lists, however, so
explicit casts are still required for those arguments. (See
also question 15.3.) It is safest to properly cast all null
pointer constants in function calls: to guard against varargs
functions or those without prototypes, to allow interim use of
non-ANSI compilers, and to demonstrate that you know what you
are doing. (Incidentally, it's also a simpler rule to
remember.)

Summary:

Unadorned 0 okay:           Explicit cast required:

initialization              function call,
                            no prototype in scope
assignment
                            variable argument in
comparison                  varargs function call

function call,
prototype in scope,
fixed argument
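
The two columns of the summary can be sketched as follows. Both
functions are hypothetical, written only for this illustration:
is_null() takes a fixed argument with a prototype in scope, so an
unadorned 0 is converted correctly; total_length() takes a
variable-length argument list, so its terminating null pointer
must be cast explicitly.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical varargs function: sums the lengths of its string
 * arguments.  The list must end with a null pointer of type char *. */
size_t total_length(const char *first, ...)
{
    va_list ap;
    size_t total = 0;
    const char *s;

    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);
    return total;
}

/* Fixed argument, prototype in scope: the comparison with an
 * unadorned 0 is a pointer context. */
int is_null(const char *p)
{
    return p == 0;
}
```

A call such as total_length("ab", "cde", (char *)0) is correct;
writing the terminator as a plain 0 would pass an int, which the
va_arg(ap, char *) fetch is not guaranteed to read back as a null
pointer. By contrast, is_null(0) needs no cast.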

References: K&R1 Sec. A7.7 p. 190, Sec. A7.14 p. 192; K&R2
Sec. A7.10 p. 207, Sec. A7.17 p. 209; ANSI Sec. 3.2.2.3; ISO
Sec. 6.2.2.3; H&S Sec. 4.6.3 p. 95, Sec. 6.2.7 p. 171.

5.3: Is the abbreviated pointer comparison "if(p)" to test for non-
null pointers valid? What if the internal representation for
null pointers is nonzero?

A: When C requires the Boolean value of an expression (in the if,
while, for, and do statements, and with the &&, ||, !, and ?:
operators), a false value is inferred when the expression
compares equal to zero, and a true value otherwise. That is,
whenever one writes

if(expr)

where "expr" is any expression at all, the compiler essentially
acts as if it had been written as

if((expr) != 0)

Substituting the trivial pointer expression "p" for "expr," we
have

if(p) is equivalent to if(p != 0)

and this is a comparison context, so the compiler can tell that
the (implicit) 0 is actually a null pointer constant, and use
the correct null pointer value. There is no trickery involved
here; compilers do work this way, and generate identical code
for both constructs. The internal representation of a null
pointer does *not* matter.

The boolean negation operator, !, can be described as follows:

!expr   is essentially equivalent to    (expr)?0:1
        or to                           ((expr) == 0)

which leads to the conclusion that

if(!p) is equivalent to if(p == 0)

"Abbreviations" such as if(p), though perfectly legal, are
considered by some to be bad style (and by others to be good
style; see question 17.10).
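
To illustrate the equivalence, here is a small sketch with two
list-counting functions that differ only in how the loop test is
spelled. The struct and function names are hypothetical.

```c
#include <stddef.h>

struct node {
    struct node *next;
};

/* Abbreviated test: the loop runs while p is not a null pointer. */
size_t count_short(const struct node *p)
{
    size_t n = 0;

    while (p) {
        n++;
        p = p->next;
    }
    return n;
}

/* Explicit test: same meaning, spelled out. */
size_t count_explicit(const struct node *p)
{
    size_t n = 0;

    while (p != 0) {
        n++;
        p = p->next;
    }
    return n;
}
```

Both functions return the same count for any list, whatever bit
pattern the machine uses for a null struct node pointer.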

See also question 9.2.

References: K&R2 Sec. A7.4.7 p. 204; ANSI Sec. 3.3.3.3,
Sec. 3.3.9, Sec. 3.3.13, Sec. 3.3.14, Sec. 3.3.15, Sec. 3.6.4.1,
Sec. 3.6.5; ISO Sec. 6.3.3.3, Sec. 6.3.9, Sec. 6.3.13,
Sec. 6.3.14, Sec. 6.3.15, Sec. 6.6.4.1, Sec. 6.6.5; H&S
Sec. 5.3.2 p. 122.

5.4: What is NULL and how is it #defined?

A: As a matter of style, many programmers prefer not to have
unadorned 0's scattered through their programs. Therefore, the
preprocessor macro NULL is #defined (by <stdio.h> or <stddef.h>)
with the value 0, possibly cast to (void *) (see also question
5.6). A programmer who wishes to make explicit the distinction
between 0 the integer and 0 the null pointer constant can then
use NULL whenever a null pointer is required.

Using NULL is a stylistic convention only; the preprocessor
turns NULL back into 0 which is then recognized by the compiler,
in pointer contexts, as before. In particular, a cast may still
be necessary before NULL (as before 0) in a function call
argument. The table under question 5.2 above applies for NULL
as well as 0 (an unadorned NULL is equivalent to an unadorned
0).

NULL should *only* be used for pointers; see question 5.9.

References: K&R1 Sec. 5.4 pp. 97-8; K&R2 Sec. 5.4 p. 102; ANSI
Sec. 4.1.5, Sec. 3.2.2.3; ISO Sec. 7.1.6, Sec. 6.2.2.3;
Rationale Sec. 4.1.5; H&S Sec. 5.3.2 p. 122, Sec. 11.1 p. 292.


5.5: How should NULL be defined on a machine which uses a nonzero bit
pattern as the internal representation of a null pointer?

A: The same as on any other machine: as 0 (or ((void *)0)).

Whenever a programmer requests a null pointer, either by writing
"0" or "NULL," it is the compiler's responsibility to generate
whatever bit pattern the machine uses for that null pointer.
Therefore, #defining NULL as 0 on a machine for which internal
null pointers are nonzero is as valid as on any other: the
compiler must always be able to generate the machine's correct
null pointers in response to unadorned 0's seen in pointer
contexts. See also questions 5.2, 5.10, and 5.17.

References: ANSI Sec. 4.1.5; ISO Sec. 7.1.6; Rationale
Sec. 4.1.5.

5.6: If NULL were defined as follows:

#define NULL ((char *)0)

wouldn't that make function calls which pass an uncast NULL
work?

A: Not in general. The problem is that there are machines which
use different internal representations for pointers to different
types of data. The suggested definition would make uncast NULL
arguments to functions expecting pointers to characters work
correctly, but pointer arguments of other types would still be
problematical, and legal constructions such as

FILE *fp = NULL;

could fail.

Nevertheless, ANSI C allows the alternate definition

#define NULL ((void *)0)

for NULL. Besides potentially helping incorrect programs to
work (but only on machines with homogeneous pointers, which makes
the help of questionable value), this definition may catch
programs which use NULL incorrectly (e.g. when the ASCII NUL
character was really intended; see question 5.9).

References: Rationale Sec. 4.1.5.

5.9: If NULL and 0 are equivalent as null pointer constants, which
should I use?

A: Many programmers believe that NULL should be used in all pointer
contexts, as a reminder that the value is to be thought of as a
pointer. Others feel that the confusion surrounding NULL and 0
is only compounded by hiding 0 behind a macro, and prefer to use
unadorned 0 instead. There is no one right answer. (See also
questions 9.2 and 17.10.) C programmers must understand that
NULL and 0 are interchangeable in pointer contexts, and that an
uncast 0 is perfectly acceptable. Any usage of NULL (as opposed
to 0) should be considered a gentle reminder that a pointer is
involved; programmers should not depend on it (either for their
own understanding or the compiler's) for distinguishing pointer
0's from integer 0's.

NULL should *not* be used when another kind of 0 is required,
even though it might work, because doing so sends the wrong
stylistic message. (Furthermore, ANSI allows the definition of
NULL to be ((void *)0), which will not work at all in non-
pointer contexts.) In particular, do not use NULL when the
ASCII null character (NUL) is desired. Provide your own
definition

#define NUL '\0'

if you must.
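
A short sketch of the distinction, using the NUL definition
suggested above. The function name truncate_at is hypothetical.
NULL appears only where a pointer is tested, and NUL only where a
character is stored or compared.

```c
#include <stddef.h>

#define NUL '\0'    /* the null character, as suggested above */

/* Hypothetical helper: shortens the string s to at most n
 * characters.  Does nothing if s is a null pointer. */
void truncate_at(char *s, size_t n)
{
    size_t i;

    if (s == NULL)          /* pointer context: NULL (or 0) */
        return;
    for (i = 0; i < n && s[i] != NUL; i++)
        ;
    s[i] = NUL;             /* character context: never NULL */
}
```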

References: K&R1 Sec. 5.4 pp. 97-8; K&R2 Sec. 5.4 p. 102.

5.10: But wouldn't it be better to use NULL (rather than 0), in case
the value of NULL changes, perhaps on a machine with nonzero
internal null pointers?

A: No. (Using NULL may be preferable, but not for this reason.)
Although symbolic constants are often used in place of numbers
because the numbers might change, this is *not* the reason that
NULL is used in place of 0. Once again, the language guarantees
that source-code 0's (in pointer contexts) generate null
pointers. NULL is used only as a stylistic convention. See
questions 5.5 and 9.2.

5.12: I use the preprocessor macro

#define Nullptr(type) (type *)0

to help me build null pointers of the correct type.

A: This trick, though popular and superficially attractive, does
not buy much. It is not needed in assignments and comparisons;
see question 5.2. It does not even save keystrokes. Its use
may suggest to the reader that the program's author is shaky on
the subject of null pointers, requiring that the #definition of
the macro, its invocations, and *all* other pointer usages be
checked. See also questions 9.1 and 10.2.

5.13: This is strange. NULL is guaranteed to be 0, but the null
pointer is not?

A: When the term "null" or "NULL" is casually used, one of several
things may be meant:

1. The conceptual null pointer, the abstract language concept
defined in question 5.1. It is implemented with...

2. The internal (or run-time) representation of a null
pointer, which may or may not be all-bits-0 and which may
be different for different pointer types. The actual
values should be of concern only to compiler writers.
Authors of C programs never see them, since they use...

3. The null pointer constant, which is a constant integer 0
(see question 5.2). It is often hidden behind...

4. The NULL macro, which is #defined to be "0" or
"((void *)0)" (see question 5.4). Finally, as red
herrings, we have...

5. The ASCII null character (NUL), which does have all bits
zero, but has no necessary relation to the null pointer
except in name; and...

6. The "null string," which is another name for the empty
string (""). Using the term "null string" can be
confusing in C, because an empty string involves a null
('\0') character, but *not* a null pointer, which brings
us full circle...

This article uses the phrase "null pointer" (in lower case) for
sense 1, the character "0" or the phrase "null pointer constant"
for sense 3, and the capitalized word "NULL" for sense 4.
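
The difference between a null pointer (sense 1) and the empty
string (sense 6) can be sketched with a small classifier. The
function name and its return codes are hypothetical.

```c
#include <stddef.h>

/* Hypothetical classifier:
 *   0 = null pointer (points at no object at all)
 *   1 = empty string (a real object whose first character is '\0')
 *   2 = non-empty string */
int classify(const char *s)
{
    if (s == NULL)
        return 0;
    if (*s == '\0')
        return 1;
    return 2;
}
```

Note that the empty string must be tested by dereferencing the
pointer, which is safe only after the null pointer case has been
ruled out.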

5.14: Why is there so much confusion surrounding null pointers? Why
do these questions come up so often?

A: C programmers traditionally like to know more than they need to
about the underlying machine implementation. The fact that null
pointers are represented both in source code, and internally to
most machines, as zero invites unwarranted assumptions. The use
of a preprocessor macro (NULL) may seem to suggest that the
value could change some day, or on some weird machine. The
construct "if(p == 0)" is easily misread as calling for
conversion of p to an integral type, rather than 0 to a pointer
type, before the comparison. Finally, the distinction between
the several uses of the term "null" (listed in question 5.13
above) is often overlooked.

One good way to wade out of the confusion is to imagine that C
used a keyword (perhaps "nil", like Pascal) as a null pointer
constant. The compiler could either turn "nil" into the correct
type of null pointer when it could determine the type from the
source code, or complain when it could not. Now in fact, in C
the keyword for a null pointer constant is not "nil" but "0",
which works almost as well, except that an uncast "0" in a non-
pointer context generates an integer zero instead of an error
message, and if that uncast 0 was supposed to be a null pointer
constant, the code may not work.

5.15: I'm confused. I just can't understand all this null pointer
stuff.

A: Follow these two simple rules:

1. When you want a null pointer constant in source code,
use "0" or "NULL".

2. If the usage of "0" or "NULL" is an argument in a
function call, cast it to the pointer type expected by
the function being called.

The rest of the discussion has to do with other people's
misunderstandings, with the internal representation of null
pointers (which you shouldn't need to know), and with ANSI C
refinements. Understand questions 5.1, 5.2, and 5.4, and
consider 5.3, 5.9, 5.13, and 5.14, and you'll do fine.

5.16: Given all the confusion surrounding null pointers, wouldn't it
be easier simply to require them to be represented internally by
zeroes?

A: If for no other reason, doing so would be ill-advised because it
would unnecessarily constrain implementations which would
otherwise naturally represent null pointers by special, nonzero
bit patterns, particularly when those values would trigger
automatic hardware traps for invalid accesses.

Besides, what would such a requirement really accomplish?
Proper understanding of null pointers does not require knowledge
of the internal representation, whether zero or nonzero.
Assuming that null pointers are internally zero does not make
any code easier to write (except for a certain ill-advised usage
of calloc(); see question 7.31). Known-zero internal pointers
would not obviate casts in function calls, because the *size* of
the pointer might still be different from that of an int. (If
"nil" were used to request null pointers, as mentioned in
question 5.14 above, the urge to assume an internal zero
representation would not even arise.)
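
As a sketch of the portable alternative to the calloc() usage
mentioned above, the hypothetical function below sets each pointer
member by assignment, so the compiler generates the correct null
pointer values instead of the code relying on an all-bits-zero
fill. The struct and function names are assumptions made for this
example.

```c
#include <stdlib.h>

struct entry {
    char *name;
    struct entry *next;
};

/* Hypothetical allocator for an array of n entries (n > 0).
 * Returns a null pointer if malloc() fails.  A real version
 * would also guard against overflow in n * sizeof *e. */
struct entry *make_entries(size_t n)
{
    struct entry *e;
    size_t i;

    e = malloc(n * sizeof *e);
    if (e == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        e[i].name = NULL;   /* assignment: a pointer context */
        e[i].next = NULL;
    }
    return e;
}
```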

5.17: Seriously, have any actual machines really used nonzero null
pointers, or different representations for pointers to different
types?

A: The Prime 50 series used segment 07777, offset 0 for the null
pointer, at least for PL/I. Later models used segment 0, offset
0 for null pointers in C, necessitating new instructions such as
TCNP (Test C Null Pointer), evidently as a sop to all the extant
poorly-written C code which made incorrect assumptions. Older,
word-addressed Prime machines were also notorious for requiring
larger byte pointers (char *'s) than word pointers (int *'s).

The Eclipse MV series from Data General has three
architecturally supported pointer formats (word, byte, and bit
pointers), two of which are used by C compilers: byte pointers
for char * and void *, and word pointers for everything else.

Some Honeywell-Bull mainframes use the bit pattern 06000 for
(internal) null pointers.

The CDC Cyber 180 Series has 48-bit pointers consisting of a
ring, segment, and offset. Most users (in ring 11) have null
pointers of 0xB00000000000. It was common on old CDC ones-
complement machines to use an all-one-bits word as a special
flag for all kinds of data, including invalid addresses.

The old HP 3000 series uses a different addressing scheme for
byte addresses than for word addresses; like several of the
machines above it therefore uses different representations for
char * and void * pointers than for other pointers.

The Symbolics Lisp Machine, a tagged architecture, does not even
have conventional numeric pointers; it uses the pair <NIL, 0>
(basically a nonexistent <object, offset> handle) as a C null
pointer.

Depending on the "memory model" in use, 8086-family processors
(PC compatibles) may use 16-bit data pointers and 32-bit
function pointers, or vice versa.

Some 64-bit Cray machines represent int * in the lower 48 bits
of a word; char * additionally uses the upper 16 bits to
indicate a byte address within a word.

References: K&R1 Sec. A14.4 p. 211.

5.20: What does a run-time "null pointer assignment" error mean? How
do I track it down?

A: This message, which typically occurs with MS-DOS compilers (see,
therefore, section 19), means that you've written, via a null
(perhaps because uninitialized) pointer, to location 0. (See
also question 16.8.)

A debugger may let you set a data breakpoint or watchpoint on
location 0. Alternatively, you could write a bit
of code to stash away a copy of 20 or so bytes from location 0,
and periodically check that the memory at location 0 hasn't
changed.
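
A minimal sketch of the stash-and-check idea follows. The function
names and the WATCH_BYTES constant are hypothetical. The address
to watch is passed in as a parameter so that the code can be tried
on an ordinary buffer; under an MS-DOS compiler of the kind
described above one would pass the address of location 0 in the
data segment, whereas on most other systems reading that address
would itself fault.

```c
#include <string.h>

#define WATCH_BYTES 20

static unsigned char saved[WATCH_BYTES];

/* Stash a copy of the first WATCH_BYTES bytes found at base. */
void watch_save(const void *base)
{
    memcpy(saved, base, WATCH_BYTES);
}

/* Returns nonzero if the bytes at base no longer match the copy. */
int watch_changed(const void *base)
{
    return memcmp(saved, base, WATCH_BYTES) != 0;
}
```

Calling watch_save() once at startup and watch_changed() at
intervals narrows down where in the program the stray write
occurs.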
