Thursday, August 2, 2007

Declarations and Initializations

Section 1. Declarations and Initializations

1.1: How do you decide which integer type to use?

A: If you might need large values (above 32,767 or below -32,767),
use long. Otherwise, if space is very important (i.e. if there
are large arrays or many structures), use short. Otherwise, use
int. If well-defined overflow characteristics are important and
negative values are not, or if you want to steer clear of sign-
extension problems when manipulating bits or bytes, use one of
the corresponding unsigned types. (Beware when mixing signed
and unsigned values in expressions, though.)
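
As a small illustration of the mixing hazard, the sketch below compares
a negative int with an unsigned int. The function name is made up for
the example. Under the usual arithmetic conversions the int operand is
converted to unsigned int first, so the comparison does not come out
the way the source text suggests.

```c
/* Hypothetical helper: returns the result of comparing -1 with 1u. */
int minus_one_is_less_than_one(void)
{
    int s = -1;
    unsigned int u = 1;

    /* s is converted to unsigned int (becoming UINT_MAX) before the
       comparison, so the result is 0 (false), not 1. */
    return s < u;
}
```

Many compilers can be asked to warn about comparisons like this one.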

Although character types (especially unsigned char) can be used
as "tiny" integers, doing so is sometimes more trouble than it's
worth, due to unpredictable sign extension and increased code
size. (Using unsigned char can help; see question 12.1 for a
related problem.)

A similar space/time tradeoff applies when deciding between
float and double. None of the above rules apply if the address
of a variable is taken and must have a particular type.

If for some reason you need to declare something with an *exact*
size (usually the only good reason for doing so is when
attempting to conform to some externally-imposed storage layout,
but see question 20.5), be sure to encapsulate the choice behind
an appropriate typedef.
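
A minimal sketch of such a typedef follows. The name int32 is
hypothetical, and the test on INT_MAX and LONG_MAX assumes the chosen
type has no padding bits, which holds on common machines but is not
guaranteed.

```c
#include <limits.h>

/* The choice of underlying type is made in exactly one place;
   the rest of the program uses only the (hypothetical) name int32. */
#if INT_MAX == 2147483647
typedef int int32;
#elif LONG_MAX == 2147483647
typedef long int32;
#else
#error "no 32-bit integer type known for this compiler"
#endif
```

Porting to a machine with different sizes then means adjusting this one
fragment rather than every declaration.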

References: K&R1 Sec. 2.2 p. 34; K&R2 Sec. 2.2 p. 36, Sec. A4.2
pp. 195-6, Sec. B11 p. 257; ANSI Sec. 2.2.4.2.1, Sec. 3.1.2.5;
ISO Sec. 5.2.4.2.1, Sec. 6.1.2.5; H&S Secs. 5.1,5.2 pp. 110-114.

1.4: What should the 64-bit type on new, 64-bit machines be?

A: Some vendors of C products for 64-bit machines support 64-bit
long ints. Others fear that too much existing code is written
to assume that ints and longs are the same size, or that one or
the other of them is exactly 32 bits, and introduce a new,
nonstandard, 64-bit long long (or __longlong) type instead.

Programmers interested in writing portable code should therefore
insulate their 64-bit type needs behind appropriate typedefs.
Vendors who feel compelled to introduce a new, longer integral
type should advertise it as being "at least 64 bits" (which is
truly new, a type traditional C does not have), and not "exactly
64 bits."

References: ANSI Sec. F.5.6; ISO Sec. G.5.6.

1.7: What's the best way to declare and define global variables?

A: First, though there can be many "declarations" (and in many
translation units) of a single "global" (strictly speaking,
"external") variable or function, there must be exactly one
"definition". (The definition is the declaration that actually
allocates space, and provides an initialization value, if any.)
The best arrangement is to place each definition in some
relevant .c file, with an external declaration in a header
(".h") file, which is #included wherever the declaration is
needed. The .c file containing the definition should also
#include the same header file, so that the compiler can check
that the definition matches the declarations.
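
The arrangement might look like the sketch below. The file names
counter.h and counter.c, and the names counter and next_count, are
invented for the example; the two files are shown one after the other
in a single listing, with comments marking where each would begin.

```c
/* ---- counter.h (hypothetical header) ---- */
extern int counter;              /* declaration: allocates no space */
extern int next_count(void);     /* declaration (prototype) */

/* ---- counter.c (hypothetical source file) ----
   In a real program this file would begin with
   #include "counter.h", so that the compiler can check
   the definitions below against the declarations above. */
int counter = 0;                 /* the one definition */

int next_count(void)
{
    return ++counter;
}
```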

This rule promotes a high degree of portability: it is
consistent with the requirements of the ANSI C Standard, and is
also consistent with most pre-ANSI compilers and linkers. (Unix
compilers and linkers typically use a "common model" which
allows multiple definitions, as long as at most one is
initialized; this behavior is mentioned as a "common extension"
by the ANSI Standard, no pun intended. A few very odd systems
may require an explicit initializer to distinguish a definition
from an external declaration.)

It is possible to use preprocessor tricks to arrange that a line
like

DEFINE(int, i);

need only be entered once in one header file, and turned into a
definition or a declaration depending on the setting of some
macro, but it's not clear if this is worth the trouble.
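
One possible shape for the trick is sketched here. The controlling
macro DEFINE_GLOBALS and the file name globals.h are assumptions made
for the example; exactly one source file would define DEFINE_GLOBALS
before including the header.

```c
/* In the single source file that is to hold the definitions: */
#define DEFINE_GLOBALS

/* ---- globals.h (hypothetical header) ---- */
#ifdef DEFINE_GLOBALS
#define DEFINE(type, name) type name          /* definition */
#else
#define DEFINE(type, name) extern type name   /* declaration */
#endif

DEFINE(int, i);
```

Note that a macro this simple copes only with declarations in which the
type precedes the name; arrays and pointers to functions would need
something more elaborate.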

It's especially important to put global declarations in header
files if you want the compiler to catch inconsistent
declarations for you. In particular, never place a prototype
for an external function in a .c file: it wouldn't generally be
checked for consistency with the definition, and an incompatible
prototype is worse than useless.

See also questions 10.6 and 18.8.

References: K&R1 Sec. 4.5 pp. 76-7; K&R2 Sec. 4.4 pp. 80-1; ANSI
Sec. 3.1.2.2, Sec. 3.7, Sec. 3.7.2, Sec. F.5.11; ISO
Sec. 6.1.2.2, Sec. 6.7, Sec. 6.7.2, Sec. G.5.11; Rationale
Sec. 3.1.2.2; H&S Sec. 4.8 pp. 101-104, Sec. 9.2.3 p. 267; CT&P
Sec. 4.2 pp. 54-56.

1.11: What does extern mean in a function declaration?

A: It can be used as a stylistic hint to indicate that the
function's definition is probably in another source file, but
there is no formal difference between

extern int f();

and

int f();

References: ANSI Sec. 3.1.2.2, Sec. 3.5.1; ISO Sec. 6.1.2.2,
Sec. 6.5.1; Rationale Sec. 3.1.2.2; H&S Secs. 4.3,4.3.1 pp. 75-
6.

1.12: What's the auto keyword good for?

A: Nothing; it's archaic. See also question 20.37.

References: K&R1 Sec. A8.1 p. 193; ANSI Sec. 3.1.2.4,
Sec. 3.5.1; ISO Sec. 6.1.2.4, Sec. 6.5.1; H&S Sec. 4.3 p. 75,
Sec. 4.3.1 p. 76.

1.14: I can't seem to define a linked list successfully. I tried

	typedef struct {
		char *item;
		NODEPTR next;
	} *NODEPTR;

but the compiler gave me error messages. Can't a structure in C
contain a pointer to itself?

A: Structures in C can certainly contain pointers to themselves;
the discussion and example in section 6.5 of K&R make this
clear. The problem with the NODEPTR example is that the typedef
has not been defined at the point where the "next" field is
declared. To fix this code, first give the structure a tag
("struct node"). Then, declare the "next" field as a simple
"struct node *", or disentangle the typedef declaration from the
structure definition, or both. One corrected version would be

	struct node {
		char *item;
		struct node *next;
	};

typedef struct node *NODEPTR;

and there are at least three other equivalently correct ways of
arranging it.
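
One of those other arrangements declares the typedef first, while the
structure type is still incomplete, and then uses it inside the
structure. The helper list_length is hypothetical, added only to show
the type in use.

```c
#include <stddef.h>

typedef struct node *NODEPTR;   /* struct node is incomplete here, which
                                   is fine for declaring a pointer to it */

struct node {
    char *item;
    NODEPTR next;               /* the typedef is now in scope */
};

/* Hypothetical helper: count the nodes in a list. */
size_t list_length(NODEPTR head)
{
    size_t n = 0;

    while (head != NULL) {
        n++;
        head = head->next;
    }
    return n;
}
```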

A similar problem, with a similar solution, can arise when
attempting to declare a pair of typedef'ed mutually referential
structures.
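
For the mutually referential case, one workable arrangement is to
declare both typedefs ahead of both structure definitions. The names
below are illustrative only.

```c
typedef struct a *APTR;   /* both structure types are still incomplete */
typedef struct b *BPTR;

struct a {
    int afield;
    BPTR bpointer;        /* refers to struct b, defined below */
};

struct b {
    int bfield;
    APTR apointer;        /* refers back to struct a */
};
```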

See also question 2.1.

References: K&R1 Sec. 6.5 p. 101; K&R2 Sec. 6.5 p. 139; ANSI
Sec. 3.5.2, Sec. 3.5.2.3, esp. examples; ISO Sec. 6.5.2,
Sec. 6.5.2.3; H&S Sec. 5.6.1 pp. 132-3.

1.21: How do I declare an array of N pointers to functions returning
pointers to functions returning pointers to characters?

A: The first part of this question can be answered in at least
three ways:

1. char *(*(*a[N])())();

2. Build the declaration up incrementally, using typedefs:

	typedef char *pc;	/* pointer to char */
	typedef pc fpc();	/* function returning pointer to char */
	typedef fpc *pfpc;	/* pointer to above */
	typedef pfpc fpfpc();	/* function returning... */
	typedef fpfpc *pfpfpc;	/* pointer to... */
	pfpfpc a[N];		/* array of... */

3. Use the cdecl program, which turns English into C and vice
versa:

cdecl> declare a as array of pointer to function returning
pointer to function returning pointer to char
char *(*(*a[])())()

cdecl can also explain complicated declarations, help with
casts, and indicate which set of parentheses the arguments
go in (for complicated function definitions, like the one
above). Versions of cdecl are in volume 14 of
comp.sources.unix (see question 18.16) and K&R2.

Any good book on C should explain how to read these complicated
C declarations "inside out" to understand them ("declaration
mimics use").

The pointer-to-function declarations in the examples above have
not included parameter type information. When the parameters
have complicated types, declarations can *really* get messy.
(Modern versions of cdecl can help here, too.)

References: K&R2 Sec. 5.12 p. 122; ANSI Sec. 3.5ff (esp.
Sec. 3.5.4); ISO Sec. 6.5ff (esp. Sec. 6.5.4); H&S Sec. 4.5 pp.
85-92, Sec. 5.10.1 pp. 149-50.

1.22: How can I declare a function that can return a pointer to a
function of the same type? I'm building a state machine with
one function for each state, each of which returns a pointer to
the function for the next state. But I can't find a way to
declare the functions.

A: You can't quite do it directly. Either have the function return
a generic function pointer, with some judicious casts to adjust
the types as the pointers are passed around; or have it return a
structure containing only a pointer to a function returning that
structure.
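
The second approach might be sketched as follows. The state names idle
and running, the int input parameter, and the driver count_steps are
all invented for the example; a null function pointer is used here to
mean "stop".

```c
#include <stddef.h>

/* A structure containing only a pointer to a function
   that returns that same structure. */
struct state {
    struct state (*next)(int input);
};

static struct state idle(int input);
static struct state running(int input);

/* Stay idle until a nonzero input arrives. */
static struct state idle(int input)
{
    struct state s;
    s.next = input ? running : idle;
    return s;
}

/* Keep running on nonzero input; stop on zero. */
static struct state running(int input)
{
    struct state s;
    s.next = input ? running : NULL;
    return s;
}

/* Hypothetical driver: feed inputs to the machine and
   return the number of steps taken before it stops. */
int count_steps(const int *inputs, int n)
{
    struct state s;
    int steps = 0;

    s.next = idle;
    while (steps < n && s.next != NULL) {
        s = s.next(inputs[steps]);
        steps++;
    }
    return steps;
}
```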

1.25: My compiler is complaining about an invalid redeclaration of a
function, but I only define it once and call it once.

A: Functions which are called without a declaration in scope
(perhaps because the first call precedes the function's
definition) are assumed to be declared as returning int (and
without any argument type information), leading to discrepancies
if the function is later declared or defined otherwise. Non-int
functions must be declared before they are called.
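
A sketch of the usual fix: put a declaration of the non-int function
ahead of its first call. The function names are made up for the
example.

```c
double half(double x);      /* declaration in scope before the first call */

double quarter(double x)
{
    return half(half(x));   /* without the declaration above, half()
                               would be assumed to return int */
}

double half(double x)       /* the definition agrees with the declaration */
{
    return x / 2.0;
}
```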

Another possible source of this problem is that the function has
the same name as another one declared in some header file.

See also questions 11.3 and 15.1.

References: K&R1 Sec. 4.2 p. 70; K&R2 Sec. 4.2 p. 72; ANSI
Sec. 3.3.2.2; ISO Sec. 6.3.2.2; H&S Sec. 4.7 p. 101.

1.30: What can I safely assume about the initial values of variables
which are not explicitly initialized? If global variables start
out as "zero," is that good enough for null pointers and
floating-point zeroes?

A: Variables with "static" duration (that is, those declared
outside of functions, and those declared with the storage class
static), are guaranteed initialized (just once, at program
startup) to zero, as if the programmer had typed "= 0".
Therefore, such variables are initialized to the null pointer
(of the correct type; see also section 5) if they are pointers,
and to 0.0 if they are floating-point.
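
A short sketch of what this guarantee means in practice; the variable
and function names are hypothetical.

```c
#include <stddef.h>

static int count;       /* starts out as 0 */
static double ratio;    /* starts out as 0.0 */
static char *name;      /* starts out as a null pointer */

/* Returns 1 if all three variables still hold their
   guaranteed initial values, and 0 otherwise. */
int statics_start_zeroed(void)
{
    return count == 0 && ratio == 0.0 && name == NULL;
}
```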

Variables with "automatic" duration (i.e. local variables
without the static storage class) start out containing garbage,
unless they are explicitly initialized. (Nothing useful can be
predicted about the garbage.)

Dynamically-allocated memory obtained with malloc() and
realloc() is also likely to contain garbage, and must be
initialized by the calling program, as appropriate. Memory
obtained with calloc() is all-bits-0, but this is not
necessarily useful for pointer or floating-point values (see
question 7.31, and section 5).

References: K&R1 Sec. 4.9 pp. 82-4; K&R2 Sec. 4.9 pp. 85-86;
ANSI Sec. 3.5.7, Sec. 4.10.3.1, Sec. 4.10.5.3; ISO Sec. 6.5.7,
Sec. 7.10.3.1, Sec. 7.10.5.3; H&S Sec. 4.2.8 pp. 72-3, Sec. 4.6
pp. 92-3, Sec. 4.6.2 pp. 94-5, Sec. 4.6.3 p. 96, Sec. 16.1 p.
386.

1.31: This code, straight out of a book, isn't compiling:

	f()
	{
		char a[] = "Hello, world!";
	}

A: Perhaps you have a pre-ANSI compiler, which doesn't allow
initialization of "automatic aggregates" (i.e. non-static local
arrays, structures, and unions). As a workaround, you can make
the array global or static (if you won't need a fresh copy
during any subsequent calls), or replace it with a pointer (if
the array won't be written to). (You can always initialize
local char * variables to point to string literals, but see
question 1.32 below.) If neither of these conditions holds,
you'll have to initialize the array by hand with strcpy() when
f() is called. See also question 11.29.
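
The strcpy() workaround might look like the sketch below. The function
name and the int return value are assumptions added so that the effect
can be observed; note that the array size must now be written out
explicitly, with room for the terminating '\0'.

```c
#include <string.h>

/* Hypothetical variant of f(): fills the local array by hand
   on every call and returns the length of the string. */
int greeting_length(void)
{
    char a[14];                     /* 13 characters plus '\0' */

    strcpy(a, "Hello, world!");     /* initialize by hand */
    return (int)strlen(a);
}
```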

1.32: What is the difference between these initializations?

char a[] = "string literal";
char *p = "string literal";

My program crashes if I try to assign a new value to p[i].

A: A string literal can be used in two slightly different ways. As
an array initializer (as in the declaration of char a[]), it
specifies the initial values of the characters in that array.
Anywhere else, it turns into an unnamed, static array of
characters, which may be stored in read-only memory, which is
why you can't safely modify it. In an expression context, the
array is converted at once to a pointer, as usual (see section
6), so the second declaration initializes p to point to the
unnamed array's first element.
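
The difference can be sketched as follows. Declaring p as a pointer to
const char is a suggestion made here, not something the question's code
does: it lets the compiler reject accidental writes through p. The
function name is hypothetical.

```c
char a[] = "string literal";        /* a writable array of 15 chars,
                                       initialized from the literal */
const char *p = "string literal";   /* points at an unnamed array which
                                       must not be modified */

/* Modifying the array is fine; the same assignment through p
   would not compile, thanks to the const qualifier. */
char capitalize_array(void)
{
    a[0] = 'S';
    return a[0];
}
```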

(For compiling old code, some compilers have a switch
controlling whether strings are writable or not.)

See also questions 1.31, 6.1, 6.2, and 6.8.

References: K&R2 Sec. 5.5 p. 104; ANSI Sec. 3.1.4, Sec. 3.5.7;
ISO Sec. 6.1.4, Sec. 6.5.7; Rationale Sec. 3.1.4; H&S Sec. 2.7.4
pp. 31-2.

1.34: I finally figured out the syntax for declaring pointers to
functions, but now how do I initialize one?

A: Use something like

extern int func();
int (*fp)() = func;

When the name of a function appears in an expression like this,
it "decays" into a pointer (that is, it has its address
implicitly taken), much as an array name does.

An explicit declaration for the function is normally needed,
since implicit external function declaration does not happen in
this case (because the function name in the initialization is
not part of a function call).
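
A slightly fuller sketch, using prototypes. The body of func and the
wrapper call_via_pointer are invented for the example.

```c
int func(void);             /* explicit declaration, needed before use */

int (*fp)(void) = func;     /* func "decays" into a pointer here */

int func(void)
{
    return 42;
}

/* Call through the pointer; (*fp)() would work equally well. */
int call_via_pointer(void)
{
    return fp();
}
```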

See also question 4.12.
