Thursday, August 2, 2007

memory allocation

Section 7. Memory Allocation

7.1: Why doesn't this fragment work?

char *answer;
printf("Type something:\n");
gets(answer);
printf("You typed \"%s\"\n", answer);

A: The pointer variable answer, which is handed to gets() as the
location into which the response should be stored, has not been
set to point to any valid storage. That is, we cannot say where
the pointer answer points. (Since local variables are not
initialized, and typically contain garbage, it is not even
guaranteed that answer starts out as a null pointer. See
questions 1.30 and 5.1.)

The simplest way to correct the question-asking program is to
use a local array, instead of a pointer, and let the compiler
worry about allocation:

#include <stdio.h>
#include <string.h>

char answer[100], *p;
printf("Type something:\n");
fgets(answer, sizeof answer, stdin);
if((p = strchr(answer, '\n')) != NULL)
	*p = '\0';
printf("You typed \"%s\"\n", answer);

This example also uses fgets() instead of gets(), so that the
end of the array cannot be overwritten. (See question 12.23.
Unfortunately for this example, fgets() does not automatically
delete the trailing \n, as gets() would.) It would also be
possible to use malloc() to allocate the answer buffer.
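
The malloc() alternative might look something like the
following sketch. The helper name read_answer() and its size
parameter are assumptions made for illustration; they are not
part of the original fragment.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line from fp into a malloc'ed
 * buffer of the given size.  Returns NULL on allocation failure
 * or end-of-file.  The caller must free() the result. */
char *read_answer(FILE *fp, size_t size)
{
	char *answer, *p;

	if(size == 0 || (answer = malloc(size)) == NULL)
		return NULL;
	if(fgets(answer, (int)size, fp) == NULL) {
		free(answer);
		return NULL;
	}
	if((p = strchr(answer, '\n')) != NULL)
		*p = '\0';	/* strip the trailing newline, if any */
	return answer;
}
```

A caller would write something like
char *answer = read_answer(stdin, 100); and, after checking for
a null return and using the string, call free(answer).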

7.2: I can't get strcat() to work. I tried

char *s1 = "Hello, ";
char *s2 = "world!";
char *s3 = strcat(s1, s2);

but I got strange results.

A: As in question 7.1 above, the main problem here is that space
for the concatenated result is not properly allocated. C does
not provide an automatically-managed string type. C compilers
only allocate memory for objects explicitly mentioned in the
source code (in the case of "strings," this includes character
arrays and string literals). The programmer must arrange for
sufficient space for the results of run-time operations such as
string concatenation, typically by declaring arrays, or by
calling malloc().

strcat() performs no allocation; the second string is appended
to the first one, in place. Therefore, one fix would be to
declare the first string as an array:

char s1[20] = "Hello, ";

Since strcat() returns the value of its first argument (s1, in
this case), the variable s3 is superfluous.

The original call to strcat() in the question actually has two
problems: the string literal pointed to by s1, besides not being
big enough for any concatenated text, is not necessarily
writable at all. See question 1.32.
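
When the lengths are not known in advance, the malloc()
approach can be sketched as follows. The function name concat()
is hypothetical, chosen only for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly malloc'ed string holding
 * a followed by b, or NULL if malloc() fails.  The caller must
 * free() the result. */
char *concat(const char *a, const char *b)
{
	/* +1 leaves room for the terminating '\0' */
	char *result = malloc(strlen(a) + strlen(b) + 1);

	if(result == NULL)
		return NULL;
	strcpy(result, a);	/* the buffer is writable and big enough */
	strcat(result, b);
	return result;
}
```

Neither argument is modified, so string literals may safely be
passed for both.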

References: CT&P Sec. 3.2 p. 32.

7.3: But the man page for strcat() says that it takes two char *'s as
arguments. How am I supposed to know to allocate things?

A: In general, when using pointers you *always* have to consider
memory allocation, if only to make sure that the compiler is
doing it for you. If a library function's documentation does
not explicitly mention allocation, it is usually the caller's
problem.

The Synopsis section at the top of a Unix-style man page or in
the ANSI C standard can be misleading. The code fragments
presented there are closer to the function definitions used by
an implementor than the invocations used by the caller. In
particular, many functions which accept pointers (e.g. to
structures or strings) are usually called with the address of
some object (a structure, or an array -- see questions 6.3 and
6.4). Other common examples are time() (see question 13.12)
and stat().
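
As a small illustration with time(), whose synopsis shows a
time_t * parameter: the caller declares an actual time_t object
and passes its address. The wrapper name current_time() is an
assumption made for this sketch.

```c
#include <time.h>

/* Hypothetical wrapper, for illustration only. */
time_t current_time(void)
{
	time_t now;	/* the caller provides the storage... */

	time(&now);	/* ...and passes its address */
	return now;

	/* WRONG would be:  time_t *tp;  time(tp);
	 * since tp would not point to any valid storage. */
}
```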

7.5: I have a function that is supposed to return a string, but when
it returns to its caller, the returned string is garbage.

A: Make sure that the pointed-to memory is properly allocated. The
returned pointer should be to a statically-allocated buffer, or
to a buffer passed in by the caller, or to memory obtained with
malloc(), but *not* to a local (automatic) array. In other
words, never do something like

char *itoa(int n)
{
	char retbuf[20];		/* WRONG */
	sprintf(retbuf, "%d", n);
	return retbuf;			/* WRONG */
}

One fix (which is imperfect, especially if the function in
question is called recursively, or if several of its return
values are needed simultaneously) would be to declare the return
buffer as

static char retbuf[20];
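
The caller-supplied-buffer alternative avoids the drawbacks of
a static buffer. A minimal sketch follows; the name itoa_buf()
and its parameters are hypothetical, and it assumes a C99
snprintf(), which the original ANSI C library does not provide.

```c
#include <stdio.h>

/* Hypothetical variant: formats n into the caller's buffer.
 * Returns buf, or NULL if the result does not fit.
 * Assumes C99 snprintf(). */
char *itoa_buf(int n, char *buf, size_t bufsize)
{
	int len = snprintf(buf, bufsize, "%d", n);

	if(len < 0 || (size_t)len >= bufsize)
		return NULL;	/* error, or output was truncated */
	return buf;
}
```

Because each caller owns its own buffer, several results can
be held at once, and recursion causes no trouble.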

See also questions 12.21 and 20.1.

References: ANSI Sec. 3.1.2.4; ISO Sec. 6.1.2.4.

7.6: Why am I getting "warning: assignment of pointer from integer
lacks a cast" for calls to malloc()?

A: Have you #included <stdlib.h>, or otherwise arranged for
malloc() to be declared properly?

References: H&S Sec. 4.7 p. 101.

7.7: Why does some code carefully cast the values returned by malloc
to the pointer type being allocated?

A: Before ANSI/ISO Standard C introduced the void * generic pointer
type, these casts were typically required to silence warnings
(and perhaps induce conversions) when assigning between
incompatible pointer types. (Under ANSI/ISO Standard C, these
casts are no longer necessary.)

References: H&S Sec. 16.1 pp. 386-7.

7.8: I see code like

char *p = malloc(strlen(s) + 1);
strcpy(p, s);

Shouldn't that be malloc((strlen(s) + 1) * sizeof(char))?

A: It's never necessary to multiply by sizeof(char), since
sizeof(char) is, by definition, exactly 1. (On the other hand,
multiplying by sizeof(char) doesn't hurt, and may help by
introducing a size_t into the expression.) See also question
8.9.

References: ANSI Sec. 3.3.3.4; ISO Sec. 6.3.3.4; H&S Sec. 7.5.2
p. 195.

7.14: I've heard that some operating systems don't actually allocate
malloc'ed memory until the program tries to use it. Is this
legal?

A: It's hard to say. The Standard doesn't say that systems can act
this way, but it doesn't explicitly say that they can't, either.

References: ANSI Sec. 4.10.3; ISO Sec. 7.10.3.

7.16: I'm allocating a large array for some numeric work, using the
line

double *array = malloc(256 * 256 * sizeof(double));

malloc() isn't returning null, but the program is acting
strangely, as if it's overwriting memory, or malloc() isn't
allocating as much as I asked for, or something.

A: Notice that 256 x 256 is 65,536, which will not fit in a 16-bit
int, even before you multiply it by sizeof(double). If you need
to allocate this much memory, you'll have to be careful. If
size_t (the type accepted by malloc()) is a 32-bit type on your
machine, but int is 16 bits, you might be able to get away with
writing 256 * (256 * sizeof(double)) (see question 3.14).
Otherwise, you'll have to break your data structure up into
smaller chunks, or use a 32-bit machine, or use some nonstandard
memory allocation routines. See also question 19.23.
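
One way to keep the arithmetic in size_t, and to notice when
even size_t is too small, is sketched below. The function name
alloc_matrix() is hypothetical. The sketch does not help if
size_t itself is only 16 bits, beyond making the request fail
cleanly instead of silently wrapping around.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates rows * cols doubles, or returns
 * NULL if the byte count would overflow size_t or malloc() fails. */
double *alloc_matrix(size_t rows, size_t cols)
{
	/* (size_t)-1 is the largest value a size_t can hold */
	if(cols != 0 && rows > (size_t)-1 / sizeof(double) / cols)
		return NULL;
	/* all three operands are size_t, so no int arithmetic occurs */
	return malloc(rows * cols * sizeof(double));
}
```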

7.17: I've got 8 meg of memory in my PC. Why can I only seem to
malloc() 640K or so?

A: Under the segmented architecture of PC compatibles, it can be
difficult to use more than 640K with any degree of transparency.
See also question 19.23.

7.19: My program is crashing, apparently somewhere down inside malloc,
but I can't see anything wrong with it.

A: It is unfortunately very easy to corrupt malloc's internal data
structures, and the resulting problems can be stubborn. The
most common source of problems is writing more to a malloc'ed
region than it was allocated to hold; a particularly common bug
is to malloc(strlen(s)) instead of strlen(s) + 1. Other
problems may involve using pointers to freed storage, freeing
pointers twice, freeing pointers not obtained from malloc, or
trying to realloc a null pointer (see question 7.30).

See also questions 7.26, 16.8, and 18.2.

7.20: You can't use dynamically-allocated memory after you free it,
can you?

A: No. Some early documentation for malloc() stated that the
contents of freed memory were "left undisturbed," but this ill-
advised guarantee was never universal and is not required by the
C Standard.

Few programmers would use the contents of freed memory
deliberately, but it is easy to do so accidentally. Consider
the following (correct) code for freeing a singly-linked list:

struct list *listp, *nextp;
for(listp = base; listp != NULL; listp = nextp) {
	nextp = listp->next;
	free((void *)listp);
}

and notice what would happen if the more-obvious loop iteration
expression listp = listp->next were used, without the temporary
nextp pointer.

References: K&R2 Sec. 7.8.5 p. 167; ANSI Sec. 4.10.3; ISO
Sec. 7.10.3; Rationale Sec. 4.10.3.2; H&S Sec. 16.2 p. 387; CT&P
Sec. 7.10 p. 95.

7.21: Why isn't a pointer null after calling free()?
How unsafe is it to use (assign, compare) a pointer value after
it's been freed?

A: When you call free(), the memory pointed to by the passed
pointer is freed, but the value of the pointer in the caller
remains unchanged, because C's pass-by-value semantics mean that
called functions never permanently change the values of their
arguments. (See also question 4.8.)

A pointer value which has been freed is, strictly speaking,
invalid, and *any* use of it, even if it is not dereferenced,
can theoretically lead to trouble, though as a quality of
implementation issue, most implementations will probably not go
out of their way to generate exceptions for innocuous uses of
invalid pointers.
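
Some programmers sidestep the issue by setting the pointer to
null immediately after freeing it. The macro below is one
common idiom, not something the Standard or this list
prescribes; the name FREE_AND_NULL is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical convenience macro: frees p, then sets it to null.
 * Note that p is evaluated twice, so it must be a plain lvalue
 * with no side effects. */
#define FREE_AND_NULL(p) (free(p), (p) = NULL)
```

Since free() of a null pointer does nothing under ANSI C,
applying the macro twice to the same variable is harmless. It
does nothing, however, for other copies of the pointer value
held elsewhere in the program.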

References: ANSI Sec. 4.10.3; ISO Sec. 7.10.3; Rationale
Sec. 3.2.2.3.

7.22: When I call malloc() to allocate memory for a local pointer, do
I have to explicitly free() it?

A: Yes. Remember that a pointer is different from what it points
to. Local variables are deallocated when the function returns,
but in the case of a pointer variable, this means that the
pointer is deallocated, *not* what it points to. Memory
allocated with malloc() always persists until you explicitly
free it. In general, for every call to malloc(), there should
be a corresponding call to free().

7.23: I'm allocating structures which contain pointers to other
dynamically-allocated objects. When I free a structure, do I
have to free each subsidiary pointer first?

A: Yes. In general, you must arrange that each pointer returned
from malloc() be individually passed to free(), exactly once (if
it is freed at all).

A good rule of thumb is that for each call to malloc() in a
program, you should be able to point at the call to free() which
frees the memory allocated by that malloc() call.
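
A sketch of the usual pattern pairs each allocation function
with a matching release function. The structure and function
names here (struct person, person_new(), person_free()) are
invented for the example.

```c
#include <stdlib.h>
#include <string.h>

struct person {		/* hypothetical structure */
	char *name;	/* each member is separately malloc'ed */
	char *email;
};

struct person *person_new(const char *name, const char *email)
{
	struct person *p = malloc(sizeof *p);

	if(p == NULL)
		return NULL;
	p->name = malloc(strlen(name) + 1);
	p->email = malloc(strlen(email) + 1);
	if(p->name == NULL || p->email == NULL) {
		free(p->name);	/* free(NULL) is a no-op in ANSI C */
		free(p->email);
		free(p);
		return NULL;
	}
	strcpy(p->name, name);
	strcpy(p->email, email);
	return p;
}

void person_free(struct person *p)
{
	if(p == NULL)
		return;
	free(p->name);	/* subsidiary pointers first... */
	free(p->email);
	free(p);	/* ...then the structure itself */
}
```

Freeing the structure first would leave no valid way to reach
p->name and p->email afterwards.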

See also question 7.24.

7.24: Must I free allocated memory before the program exits?

A: You shouldn't have to. A real operating system definitively
reclaims all memory when a program exits. Nevertheless, some
personal computers are said not to reliably recover memory, and
all that can be inferred from the ANSI/ISO C Standard is that
this is a "quality of implementation issue."

References: ANSI Sec. 4.10.3.2; ISO Sec. 7.10.3.2.

7.25: I have a program which mallocs and later frees a lot of memory,
but memory usage (as reported by ps) doesn't seem to go back
down.

A: Most implementations of malloc/free do not return freed memory
to the operating system (if there is one), but merely make it
available for future malloc() calls within the same program.

7.26: How does free() know how many bytes to free?

A: The malloc/free implementation remembers the size of each block
allocated and returned, so it is not necessary to remind it of
the size when freeing.

7.27: So can I query the malloc package to find out how big an
allocated block is?

A: Not portably.

7.30: Is it legal to pass a null pointer as the first argument to
realloc()? Why would you want to?

A: ANSI C sanctions this usage (and the related realloc(..., 0),
which frees), although several earlier implementations do not
support it, so it may not be fully portable. Passing an
initially-null pointer to realloc() can make it easier to write
a self-starting incremental allocation algorithm.
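
A self-starting growable array might be sketched like this.
The names struct intvec and intvec_push(), and the initial
capacity of 8, are assumptions for illustration. The structure
must start out as { NULL, 0, 0 }.

```c
#include <stdlib.h>

struct intvec {		/* hypothetical growable array of int */
	int *data;	/* initially NULL */
	size_t len;	/* elements in use */
	size_t cap;	/* elements allocated */
};

/* Appends value, growing the array as needed.
 * Returns 0 on success, -1 if realloc() fails. */
int intvec_push(struct intvec *v, int value)
{
	if(v->len == v->cap) {
		size_t newcap = v->cap ? v->cap * 2 : 8;
		/* on the first call v->data is null, so this
		 * realloc() behaves like malloc() */
		int *newdata = realloc(v->data, newcap * sizeof *newdata);

		if(newdata == NULL)
			return -1;	/* v->data is still valid */
		v->data = newdata;
		v->cap = newcap;
	}
	v->data[v->len++] = value;
	return 0;
}
```

No special case is needed for the first allocation, which is
the convenience the answer refers to. On a pre-ANSI library
that rejects realloc(NULL, ...), an explicit malloc() for the
first call would be required.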

References: ANSI Sec. 4.10.3.4; ISO Sec. 7.10.3.4; H&S Sec. 16.3
p. 388.

7.31: What's the difference between calloc() and malloc()? Is it safe
to take advantage of calloc's zero-filling? Does free() work
on memory allocated with calloc(), or do you need a cfree()?

A: calloc(m, n) is essentially equivalent to

p = malloc(m * n);
memset(p, 0, m * n);

The zero fill is all-bits-zero, and does *not* therefore
guarantee useful null pointer values (see section 5 of this
list) or floating-point zero values. free() is properly used to
free the memory allocated by calloc().
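
If portable null pointers or floating-point zeros are needed,
the members can be assigned explicitly after allocation, as in
this sketch. The names struct node and alloc_nodes() are
hypothetical.

```c
#include <stdlib.h>

struct node {		/* hypothetical structure */
	double weight;
	struct node *next;
};

/* Allocates n nodes with well-defined initial values,
 * or returns NULL on failure. */
struct node *alloc_nodes(size_t n)
{
	struct node *nodes = calloc(n, sizeof *nodes);
	size_t i;

	if(nodes == NULL)
		return NULL;
	for(i = 0; i < n; i++) {
		nodes[i].weight = 0.0;	/* a true floating-point zero */
		nodes[i].next = NULL;	/* a true null pointer */
	}
	return nodes;
}
```

Since every member is assigned anyway, malloc() would serve
equally well here; the result is released with free() in
either case.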

References: ANSI Sec. 4.10.3 to 4.10.3.2; ISO Sec. 7.10.3 to
7.10.3.2; H&S Sec. 16.1 p. 386, Sec. 16.2 p. 386; PCS Sec. 11
pp. 141,142.

7.32: What is alloca() and why is its use discouraged?

A: alloca() allocates memory which is automatically freed when the
function which called alloca() returns. That is, memory
allocated with alloca is local to a particular function's "stack
frame" or context.

alloca() cannot be written portably, and is difficult to
implement on machines without a conventional stack. Its use is
problematical (and the obvious implementation on a stack-based
machine fails) when its return value is passed directly to
another function, as in fgets(alloca(100), 100, stdin).

For these reasons, alloca() is not Standard and cannot be used
in programs which must be widely portable, no matter how useful
it might be.

See also question 7.22.

References: Rationale Sec. 4.10.3.
