2009-10-07

Post date: Oct 8, 2009 2:06:29 AM

Overview of program's perception of memory

    • Each program is loaded into an isolated, private address space. It cannot see the memory of another program. The hardware and the OS work together to enforce the address space protection.
    • The address space on a 32-bit machine ranges from 0 through 0xffffffff (2^32 - 1); on a 64-bit machine, it ranges from 0 through 0xffffffffffffffff (2^64 - 1). A number that represents a memory location in the address space is called a pointer.
    • At address 0, there is a reserved NULL page, intentionally made inaccessible in order to catch NULL pointer dereferences. Because a function can return only one value, a NULL pointer (which is numeric 0) conventionally indicates an error condition. If a programmer forgets to check for the error, the program can dereference the NULL pointer and crash. This is the best-case scenario, because the crash shows you where to debug. It is better than the program continuing to run while reading random garbage from memory. A NULL pointer dereference causes a segmentation fault (see null.c below).
      • The asterisk serves two purposes. In null.c, char *p declares variable p to be a character pointer. In an expression, *p causes the character value at pointer p to be read out, or dereferenced.
    • Run the command objdump -h a.out to list the sections in your executable. The rodata, data, and bss sections store global variables: rodata (read-only) stores string literals (what you write in the program inside double quotes) and other constant data; data stores initialized read-write variables; bss stores uninitialized global variables (or those initialized to 0). You can observe how the bss and data section sizes change (see section.c below).
    • On typical platforms, the heap grows upward (not going to be observed for now), but the stack grows downward (see stack.c). Local variables are placed on the stack, and a function call forces the stack to grow, so the char c in main() occupies a higher address than the char c in foo(). &c (unary & operator) gives you the address of the variable c.
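
The points above can be sketched in a few lines of C. This is not the null.c, section.c, or stack.c referred to in these notes. The names counter, scratch, greeting, read_char_checked, and show_stack_addresses are made up for illustration, and the section placement noted in the comments is what typical toolchains do, not something the C language guarantees.

```c
#include <stddef.h>
#include <stdio.h>

int counter = 42;             /* initialized global: typically placed in data */
int scratch[1024];            /* uninitialized global: typically placed in bss */
const char *greeting = "hi";  /* the literal "hi" typically lives in rodata */

/* Hypothetical helper: reads the char at p, but returns -1 for NULL
   instead of dereferencing it, which would normally end in a
   segmentation fault. */
int read_char_checked(const char *p)
{
    if (p == NULL)
        return -1;
    return (unsigned char)*p;   /* in an expression, *p dereferences p */
}

/* Prints the address of the caller's local next to the address of a
   local in this function. On common platforms the second address is the
   lower one, but the exact values depend on compiler and OS. */
void show_stack_addresses(char *caller_local)
{
    char c = 'y';
    printf("caller local at %p, callee local at %p\n",
           (void *)caller_local, (void *)&c);
}
```

Calling show_stack_addresses(&c) from main() with a local char c prints the two addresses side by side. Changing the initializer of counter or the length of scratch and re-running objdump -h should show the data and bss sizes move accordingly.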

Array, pointer arithmetic, and string

    • Summation of an integer array (see sum.c below).
      • Use size_t instead of int for the size of the array. size_t is an unsigned integer type that can hold the size of any object; on most platforms it has the same width as a pointer. On a 64-bit machine, an int (usually 32 bits) could overflow when representing the size of a large memory object; size_t makes your program portable. size_t can be 0 or positive, but there cannot be negative sizes.
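      • In C, pointers and arrays are closely related, though not identical: an array name used in an expression is converted to a pointer to its first element, and in a function parameter list int *p is the same as int p[]. We can dereference pointer p using either *p or p[0]. Furthermore, p[i] is the same as *(p + i), and &p[i] is the same as (p + i).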
      • *p++ is parsed as *(p++): it yields the value at p and then increments the pointer. On the other hand, (*p)++ increments the value stored at the memory location p points to.
      • Prefix ++x and --x increment or decrement the variable x and yield the new value. Postfix x++ and x-- yield the old value of x, and the increment or decrement takes effect afterwards.
    • NUL '\0' terminator of a string (see nulterm.c below).
      • NUL character is a byte of value 0, denoted as '\0', not the character for number zero, denoted as '0'.
      • String is a character array that is NUL terminated. This is so we can pass a string without also needing to pass the size of the string.
      • If we pass a char buffer without a NUL terminator, puts() keeps reading past the end of the buffer and may print garbage. The int x increases the likelihood that buf[] will not be immediately followed by a zero byte.
    • Computing the length of a string (see string.c below).
      • const char *s means we do not intend to modify the chars pointed to by s, so the content of the string is supposed to be constant. The pointer s itself can still be changed to point elsewhere.
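
As a rough sketch of the ideas in this section (not the sum.c or string.c referred to above), the following functions sum an array by index and by pointer, and compute a string length by scanning for the NUL terminator. The names sum_array, sum_array_ptr, and string_length are hypothetical, and returning long from the sum is a choice made here, not something taken from the notes.

```c
#include <stddef.h>

/* Sums n ints starting at a. n is a size_t because a size is never negative. */
long sum_array(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];          /* a[i] is the same as *(a + i) */
    return total;
}

/* Same sum, walking the array with a pointer:
   *p++ reads the int at p, then advances p to the next element. */
long sum_array_ptr(const int *p, size_t n)
{
    long total = 0;
    while (n-- > 0)
        total += *p++;
    return total;
}

/* Counts the chars before the NUL terminator, as the standard strlen() does.
   const char * promises that the chars are only read, never modified. */
size_t string_length(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);     /* pointer difference = number of chars */
}
```

Both sum functions take the array as a pointer plus a separate size, because an int array has no terminator. string_length needs no size argument: it relies on the '\0' byte at the end, so it must only be given properly terminated strings.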