The C Programming Language - Pointers and Arrays

Edusagar - notes - Pointers and Arrays
  • The unary operator & gives the address of an object. It applies only to objects in memory: variables and array elements. It cannot be applied to expressions, constants, or register variables.

  • A pointer is constrained to point to a particular kind of object (pointer to int, float, char, and so on). The exception is a pointer to void, which can hold any type of pointer but cannot itself be dereferenced.

  • *ip + 1  // yields the value pointed to by ip plus 1, since * has higher precedence than +; *ip itself is unchanged (use *ip += 1 to increment it)
    ++*ip    // increments the value pointed to by ip
    (*ip)++  // increments the value pointed to by ip; the parentheses are needed because unary operators like * and ++ associate right to left, so *ip++ would increment ip instead
  • Pointer arguments enable a function to access and change objects in the calling routine. They can also be used to pass several values back to the caller (otherwise a function can return only one value through its return statement).
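
    As a sketch of the point above, the two functions below change the caller's variables through pointer arguments. swap follows the usual textbook form; divide is a hypothetical helper, introduced here only for illustration, that hands back two results at once.

```c
/* swap: interchange *px and *py in the caller */
void swap(int *px, int *py)
{
	int temp = *px;

	*px = *py;
	*py = temp;
}

/* divide (hypothetical helper): store quotient and remainder through
 * the pointers; return 0 on success, -1 if den is zero */
int divide(int num, int den, int *quot, int *rem)
{
	if (den == 0)
		return -1;
	*quot = num / den;
	*rem = num % den;
	return 0;
}
```

    A caller would write swap(&a, &b) or divide(17, 5, &q, &r). Passing a and b by value could not change them, because the function would only see copies.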

  • Adding 1 to a pointer yields a pointer to the next object, regardless of the data type, e.g.


    int arr[5];
    int *p;
    p = &arr[0];  // p contains the address of the first element of the array; let us assume 0x1000.
    p + 1         // points to the second element, arr[1]; its value would be 0x1004 if an int is 4 bytes. p itself changes only with p++ or p = p + 1.
  • An array-and-index expression is equivalent to one written as pointer and offset. e.g.

    a[i] <==> *(a+i)
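
    The equivalence can be checked with a small sketch. The function names below are made up for this example; the first two compute the same sum in the two notations, and the third measures how far apart p and p + 1 are in bytes.

```c
#include <stddef.h>

/* sum_index: add up n ints using array-and-index notation */
int sum_index(const int a[], size_t n)
{
	int total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += a[i];
	return total;
}

/* sum_pointer: the same loop written as pointer and offset */
int sum_pointer(const int *a, size_t n)
{
	int total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += *(a + i);
	return total;
}

/* step_bytes: how many bytes apart p and p + 1 are for an int pointer;
 * p must point into (or at the start of) an array of int */
size_t step_bytes(const int *p)
{
	return (size_t)((const char *)(p + 1) - (const char *)p);
}
```

    For any int array v, step_bytes(v) equals sizeof(int), which is the scaling that pointer arithmetic applies automatically.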

  • There is one difference between an array name and a pointer that must be kept in mind. A pointer is a variable, so pa=a and pa++ are legal. But an array name is not a variable; constructions like a=pa and a++ are illegal.

  • printf("%d", strlen(strarr));  // char strarr[]
    printf("%d", strlen("Hello, World"));   // string constant
    printf("%d", strlen(ptr));   // char *ptr
    
    int strlen(char *s)   // alternate -> int strlen(char s[])
    {
    	int n;
    	
    	for(n=0; *s != '\0'; s++) {
    		n++;
    	}
    	
    	return n;
    }

    When an array name is passed to a function, what is actually passed is the address of its first element, which is a pointer. Hence it is legal to do s++ in the function above; the increment does not affect the caller, because s is a local variable of strlen holding a private copy of that pointer.

  • Simple Array allocator

    #define ALLOCSIZE 10000 /* size of available space */
    static char allocbuf[ALLOCSIZE]; /* storage for alloc */
    static char *allocp = allocbuf; /* next free position */
    
    /* return pointer to n characters */
    char *alloc(int n)
    {
    	if (allocbuf + ALLOCSIZE - allocp >= n) { /* it fits */
    		allocp += n;
    		return allocp - n; /* old p */
    	} else /* not enough room */
    		return 0;
    }
    
    void afree(char *p) /* free storage pointed to by p */
    {
    	if (p >= allocbuf && p < allocbuf + ALLOCSIZE)
    		allocp = p;
    }

    C guarantees that zero is never a valid address for data, so a return value of zero can be used to signal an abnormal event, in this case no space.
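
    The allocator works like a stack: storage must be released in the reverse order of allocation. The sketch below repeats the two routines so that it compiles on its own. The n >= 0 guard and the alloc_demo function are additions made for illustration and are not part of the original routines.

```c
#include <stddef.h>

#define ALLOCSIZE 10000              /* size of available space */

static char allocbuf[ALLOCSIZE];     /* storage for alloc */
static char *allocp = allocbuf;      /* next free position */

/* alloc: return pointer to n characters, or NULL if there is no room */
char *alloc(int n)
{
	if (n >= 0 && allocbuf + ALLOCSIZE - allocp >= n) {
		allocp += n;
		return allocp - n;
	}
	return NULL;
}

/* afree: free storage pointed to by p and everything allocated after it */
void afree(char *p)
{
	if (p >= allocbuf && p < allocbuf + ALLOCSIZE)
		allocp = p;
}

/* alloc_demo (hypothetical): returns 1 if the allocator behaves like a stack */
int alloc_demo(void)
{
	char *first = alloc(100);
	char *second = alloc(200);
	int ok = first != NULL && second == first + 100;

	afree(second);               /* release in reverse order of allocation */
	afree(first);
	return ok && alloc(ALLOCSIZE + 1) == NULL;
}
```

    Because afree simply resets allocp, freeing first before second would also discard second's storage.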

  • Pointers and integers are not interchangeable, with one exception: zero. The constant zero may be assigned to a pointer, and a pointer may be compared with the constant zero. In fact, the symbolic constant NULL is often used in place of zero; it is defined in <stdio.h> (among other standard headers).

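  • If p and q point to members of the same array, then relations like ==, !=, <, >=, etc. work properly. Any pointer can also be compared for equality or inequality with zero. Arithmetic and ordering comparisons on pointers that do not point to members of the same array are undefined; the one exception is that the address of the first element past the end of an array may be used in pointer arithmetic.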

  • An integer can be added to or subtracted from a pointer. The change is scaled by the size of the type the pointer points to: if an int is 4 bytes, an int pointer moves by 4 bytes with each increment.

  • Pointer subtraction is also valid. See the use in the following program:


    int strlen(char *s) {
    	char *p = s;
    	
    	while (*p != '\0') {
    		p++;
    	}
    	
    	return p - s;
    }

    The header <stddef.h> defines a type ptrdiff_t that is large enough to hold the signed difference of two pointer values. If we were being cautious, however, we would use size_t for the return value of strlen, to match the standard library version. size_t is the unsigned integer type returned by the sizeof operator.
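
    A minimal sketch of the cautious variant described above. The name str_length is made up here to avoid clashing with the library strlen; the const qualifier is also an addition.

```c
#include <stddef.h>

/* str_length: length of s, returned as size_t like the library strlen */
size_t str_length(const char *s)
{
	const char *p = s;
	ptrdiff_t diff;

	while (*p != '\0')
		p++;
	diff = p - s;            /* pointer subtraction yields a ptrdiff_t */
	return (size_t)diff;     /* never negative here, since p >= s */
}
```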

  • Pointer arithmetic:

    Pointer arithmetic is consistent: if we had been dealing with floats, the earlier version of alloc and afree could easily be reused by just modifying char to float throughout the code. All the pointer manipulations automatically take into account the size of the objects pointed to.

    The valid pointer operations are assignment of pointers of the same type, adding or subtracting a pointer and an integer, subtracting or comparing two pointers to members of the same array, and assigning or comparing to zero. All other pointer arithmetic is illegal. It is not legal to add two pointers, or to multiply or divide or shift or mask them, or to add float or double to them, or even, except for void *, to assign a pointer of one type to a pointer of another type without a cast.

  • A string is an array of characters. Internally, the array is terminated with the null character '\0' so that programs can find the end of the string.

    C does not provide any operators for processing an entire string of characters as a unit. It has to be done character by character.

  • char amessage[] = "now is the time";  /* an array */
    char *pmessage  = "now is the time";  /* a pointer */

    amessage is an array, just big enough to hold the sequence of characters and the '\0' that initializes it. Individual characters can be changed at will, but amessage will always refer to the same storage. On the other hand, pmessage is a pointer, initialized to hold the address of a string constant. The pointer can be modified to point elsewhere, but the result is undefined if you try to modify the string contents (the string constant may be stored in read-only memory).
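
    The difference can be sketched as follows. The function names are hypothetical, and the const on pmessage is an addition so that the compiler rejects accidental writes through the pointer.

```c
#include <stddef.h>

/* array_bytes: size of an array initialized from the string */
size_t array_bytes(void)
{
	char amessage[] = "now is the time";

	return sizeof amessage;      /* 15 characters plus the '\0' */
}

/* first_after_edit: modify the array copy, then redirect the pointer */
char first_after_edit(void)
{
	char amessage[] = "now is the time";        /* writable array */
	const char *pmessage = "now is the time";   /* pointer to a constant */

	amessage[0] = 'N';       /* allowed: the array owns its storage */
	pmessage = amessage;     /* allowed: the pointer can be re-aimed */
	return pmessage[0];
}
```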

  • /* strcpy: copy t to s; pointer version 2 */
    void strcpy(char *s, char *t)
    {
    	while (*s++ = *t++)
    		;
    }
    
    /* strcmp: return <0 if s<t, 0 if s==t, >0 if s>t */
    int strcmp(char *s, char *t)
    {
    	for ( ; *s == *t; s++, t++)
    		if (*s == '\0')
    			return 0;
    
    	return *s - *t;
    }
  • *p++ = val; /* push into stack */
    val = *--p; /* pop from stack */
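
    The two idioms above can be wrapped into a small stack, sketched here with bounds checks added. The names STACKSIZE, stack, and sp, and the 0 / -1 return convention, are assumptions made for this example.

```c
#define STACKSIZE 100

static int stack[STACKSIZE];
static int *sp = stack;      /* next free stack position */

/* push: store val and advance sp; returns 0 on success, -1 if full */
int push(int val)
{
	if (sp >= stack + STACKSIZE)
		return -1;
	*sp++ = val;
	return 0;
}

/* pop: step sp back and fetch the value; returns 0 on success, -1 if empty */
int pop(int *val)
{
	if (sp <= stack)
		return -1;
	*val = *--sp;
	return 0;
}
```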
  • Since pointers are variables themselves, they can be stored in arrays just as other variables can. Such an array is called a pointer array and is declared like this:

    char *lineptr[MAXLINES];

    Such a representation is necessary when we have to store a set of variable-length strings. Using pointers, we can dynamically allocate only the storage each string requires and keep all those pointers in an array for easy access later on. For example, to sort these strings we can use the qsort() written earlier: it compares two strings with strcmp() and swaps only the pointers in the array, not the text itself.
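
    To show the idea of sorting by moving pointers, here is a minimal sketch. It uses a plain insertion sort rather than the qsort mentioned above, purely to keep the example short; sort_lines is a name made up for this example.

```c
#include <string.h>

/* sort_lines: sort v[0]..v[n-1] into increasing order by moving pointers only */
void sort_lines(char *v[], int n)
{
	int i, j;

	for (i = 1; i < n; i++) {
		char *key = v[i];

		/* shift larger strings one slot right; only pointers move */
		for (j = i - 1; j >= 0 && strcmp(v[j], key) > 0; j--)
			v[j + 1] = v[j];
		v[j + 1] = key;
	}
}
```

    The characters of each line never move; only the entries of the pointer array are rearranged.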

  • C also provides rectangular multi-dimensional arrays.


    static char daytab[2][13] = {
    	{0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31},
    	{0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}
    };
    
    /* day_of_year: set day of year from month & day */
    int day_of_year(int year, int month, int day)
    {
    	int i, leap;
    	leap = year%4 == 0 && year%100 != 0 || year%400 == 0;
    	for (i = 1; i < month; i++)
    		day += daytab[leap][i];
    	return day;
    }
    
    /* month_day: set month, day from day of year */
    void month_day(int year, int yearday, int *pmonth, int *pday)
    {
    	int i, leap;
    
    	leap = year%4 == 0 && year%100 != 0 || year%400 == 0;
    	for (i = 1; yearday > daytab[leap][i]; i++)
    		yearday -= daytab[leap][i];
    	*pmonth = i;
    	*pday = yearday;
    }
  • Note that a 2-dimensional array is really a 1-dimensional array, each of whose elements is itself a 1-dimensional array. To access an element in a 2-d matrix, use the following notation:

    daytab[i][j] /* [row][col] */

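    Initialization is done row by row as shown above; each row is initialized by its own sub-list. We started the array daytab with a column of zero so that month numbers can run from the natural 1 to 12 instead of 0 to 11. Since space is not at a premium here, this is clearer than adjusting the indices.

    If a two-dimensional array is to be passed to a function, the parameter declaration in the function must include the number of columns. If the array daytab is to be passed to a function f, the declaration of f would be: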

    f(int daytab[2][13]) { ... }

    It could also be

    f(int daytab[][13]) { ... }

    since the number of rows is irrelevant, or it could be:

    f(int (*daytab)[13]) { ... }

    which says that the parameter is a pointer to an array of 13 integers. The parentheses are necessary since brackets [] have higher precedence than *. Without parentheses, the declaration

    int *daytab[13]

    is an array of 13 pointers to integers. More generally, only the first dimension (subscript) of an array is free; all the others have to be specified.
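
    As a sketch of the pointer-to-row form, the hypothetical function days_in_year below receives the table as int (*tab)[13] and sums one row. The table is declared as int here to match the parameter declarations above.

```c
static int daytab[2][13] = {
	{0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31},
	{0, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}
};

/* days_in_year (hypothetical): sum one row of a table passed as a
 * pointer to rows of 13 ints; leap selects row 0 or row 1 */
int days_in_year(int (*tab)[13], int leap)
{
	int i, total = 0;

	for (i = 1; i <= 12; i++)
		total += tab[leap][i];
	return total;
}
```

    The call days_in_year(daytab, 0) works because the array name converts to a pointer to its first row, which has exactly the type int (*)[13].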

  • /* month_name: return name of n-th month */
    char *month_name(int n)
    {
    	static char *name[] = {
    		"Illegal month",
    		"January", "February", "March",
    		"April", "May", "June",
    		"July", "August", "September",
    		"October", "November", "December"
    	};
    	return (n < 1 || n > 12) ? name[0] : name[n];
    }

    Each array element here is a pointer to char, and each pointer is initialized with a constant character string. The characters of the i-th string are placed somewhere, and a pointer to them is stored in name[i]. Since the size of the array name is not specified, the compiler counts the initializers and fills in the correct number.

  • int a[10][20];  /* 2 dimensional array with size fixed at 200 ints*/
    int *b[10];     /* pointer array */
    

    Referring to an element looks exactly the same for both arrays: a[3][4] and b[3][4] are both legal references to a single int. However, a is a true two-dimensional array with 200 int-sized locations set aside, whereas b is just an array of 10 pointers that must be initialized explicitly. Assuming that every element of b points to a 20-element array, the total would be 200 int-sized locations plus 10 extra cells for the pointers. The important advantage of the pointer array is that the rows may be of different lengths. That is, each element of b need not point to a twenty-element vector; some may point to two elements, some to fifty, and some to none at all.
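
    A small sketch of the contrast, with smaller dimensions than above so the initializers stay readable. The row arrays and the element function are hypothetical names introduced for this example.

```c
#include <stddef.h>

static int a[2][4] = { {1, 2, 3, 4}, {5, 6, 7, 8} };   /* 8 ints, fixed shape */

static int short_row[] = {1, 2};
static int long_row[]  = {5, 6, 7, 8};
static int *b[3] = { short_row, long_row, NULL };     /* rows of length 2, 4, none */

/* element: fetch rows[i][j]; the caller must know row i has more than j elements */
int element(int *rows[], int i, int j)
{
	return rows[i][j];
}
```

    Both a[1][3] and b[1][3] name a single int, but a[1][3] is found by arithmetic on one block of storage, while b[1][3] first follows the pointer stored in b[1].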

  • In environments that support C, there is a way to pass command-line arguments or parameters to main function. main can be called with two arguments.

    The first (conventionally called argc, for argument count) is the number of command-line arguments the program was invoked with; the second (argv, for argument vector) is a pointer to an array of character strings that contain the arguments, one per string.

  • By convention, argv[0] is the name by which the program was invoked, so argc is at least 1. Consider an example:

    echo hello world

    Here argc will be 3, with argv[0] pointing to echo, argv[1] to hello, and argv[2] to world. Additionally, the standard requires that argv[argc] be a null pointer.
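
    Because argv[argc] is a null pointer, the argument list can be walked without looking at argc at all. A minimal sketch, using a hypothetical helper named count_args:

```c
#include <stddef.h>

/* count_args (hypothetical): count entries of argv up to the null pointer */
int count_args(char *argv[])
{
	int n = 0;

	while (argv[n] != NULL)
		n++;
	return n;
}
```

    For the echo example above, count_args(argv) would give the same value as argc.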


    #include <stdio.h>
    /* echo command-line arguments; 2nd version */
    int main(int argc, char *argv[])
    {
    	while (--argc > 0)
    		printf("%s%s", *++argv, (argc > 1) ? " " : "");
    	printf("\n");
    	return 0;
    }
    
  • /* find: print lines that match pattern from 1st arg */
    #include <stdio.h>
    #include <string.h>
    #define MAXLINE 1000
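    /* getline is the line reader from the earlier notes; rename it if <stdio.h> on your system already declares a POSIX getline */
    int getline(char *line, int max);
    
    int main(int argc, char *argv[])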
    {
    	char line[MAXLINE];
    	int found = 0;
    	if (argc != 2)
    		printf("Usage: find pattern\n");
    	else
    		while (getline(line, MAXLINE) > 0)
    			if (strstr(line, argv[1]) != NULL) {
    				printf("%s", line);
    				found++;
    			}
    	return found;
    }
    
  • /* 
     * find -nx pattern 
     * x - find all except this pattern
     * n - print line number too
    */
    #include <stdio.h>
    #include <string.h>
    #define MAXLINE 100
    
    int getline(char *line, int max);
    
    int main(int argc, char *argv[])
    {
    	char line[MAXLINE];
    	long lineno = 0;
    	int c, except = 0, number = 0, found = 0;
    	
    	while (--argc > 0 && (*++argv)[0] == '-') {
    		while ((c = *++argv[0]) != '\0') {
    			switch(c) {
    			case 'x':
    				except = 1;
    				break;
    			case 'n':
    				number = 1;
    				break;
    			default:
    				printf("find: illegal option %c\n", c);
    				argc = 0;
    				found = -1;
    				break;
    			}
    		}
    	}
    	
    	if (argc != 1) {
    		printf("Usage: find -x -n pattern\n");
    	} else {
    		while (getline(line, MAXLINE) > 0) {
    			lineno++;
    			if ((strstr(line, *argv) != NULL) != except) {
    				if (number)
    					printf("%ld:", lineno);
    				printf("%s", line);	
    				found++;
    			}
    		}
    	}
    	return found;
    }
  • ++argv is a pointer to an argument string, so (*++argv)[0] is its first character. (An alternate valid form would be **++argv.) Because [] binds tighter than * and ++, the parentheses are necessary; without them the expression would be taken as *++(argv[0]). In fact, that is what we use in the inner loop, where the task is to walk along a specific argument string. There, the expression *++argv[0] increments the pointer argv[0] (a particular string) and fetches the character it then points to.
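
    The two expressions can be exercised in isolation. The sketch below is a hypothetical helper, collect_flags, that gathers the option letters from leading "-..." arguments using the same pair of loops; the output buffer and the return convention are assumptions made for this example.

```c
/* collect_flags (hypothetical): copy option letters from leading "-..."
 * arguments into out (outsize must be at least 1); returns the number of
 * arguments left, counting the first non-option argument */
int collect_flags(int argc, char *argv[], char *out, int outsize)
{
	int c, n = 0;

	while (--argc > 0 && (*++argv)[0] == '-')   /* first char of next argument */
		while ((c = *++argv[0]) != '\0')        /* step along that argument */
			if (n < outsize - 1)
				out[n++] = (char)c;
	out[n] = '\0';
	return argc;
}
```

    Note that *++argv[0] changes the pointer stored in the argv array, not the characters of the argument itself.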

  • In C, a function itself is not a variable, but it is possible to define pointers to functions, which can be assigned, placed in arrays, passed to functions, returned by functions, and so on.

    int (*comp)(void*, void*) : comp is a pointer to a function that accepts two void * arguments and returns an integer. The parentheses around *comp are necessary, since () has higher precedence than *. Without them the declaration becomes int *comp(void*, void*), which declares a function comp returning a pointer to an integer.

    (*comp)(v[i], v[j]) : calls the function pointed to by comp. Note the parentheses.

    qsort((void**) lineptr, 0, nlines-1, (int (*)(void*,void*))(numeric ? numcmp : strcmp)) : passing pointer to function as an argument to qsort()

    numcmp and strcmp represent the addresses of the corresponding functions. Since they are known to be functions, the & operator is not required to get the address, just as it is not needed before an array name.
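
    A minimal sketch of choosing a comparison through a pointer to function. It uses the standard library qsort from <stdlib.h>, whose signature differs from the qsort used in these notes; ascending, descending, and sort_ints are names made up for this example.

```c
#include <stdlib.h>

/* ascending: compare two ints for qsort, smallest first */
static int ascending(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

/* descending: compare two ints for qsort, largest first */
static int descending(const void *a, const void *b)
{
	return ascending(b, a);
}

/* sort_ints: choose the comparison through a pointer to function */
void sort_ints(int v[], size_t n, int reverse)
{
	int (*comp)(const void *, const void *) = reverse ? descending : ascending;

	qsort(v, n, sizeof v[0], comp);
}
```

    As with numcmp and strcmp above, the function names ascending and descending are used without & when they are assigned to comp.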

  • Some complex declarations:

    char **argv
    argv: pointer to pointer to char
    
    int (*daytab)[13]
    daytab: pointer to array[13] of int
    
    int *daytab[13]
    daytab: array[13] of pointer to int
    
    void *comp()
    comp: function returning pointer to void
    
    void (*comp)()
    comp: pointer to function returning void
    
    char (*(*x())[])()
    x: function returning pointer to array[] of
    pointer to function returning char
    
    char (*(*x[3])())[5]
    x: array[3] of pointer to function returning
    pointer to array[5] of char
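
    One way to check a reading of such a declaration is to build the same type in steps with typedefs. The sketch below does this for the last declaration; buf, get_buf, row5, row5_fn, and y are hypothetical names, and the parameter lists are written as (void) instead of the empty () used above.

```c
static char buf[5] = "abcd";

/* get_buf: function returning pointer to array[5] of char */
char (*get_buf(void))[5]
{
	return &buf;
}

/* x: array[3] of pointer to function returning pointer to array[5] of char */
char (*(*x[3])(void))[5] = { get_buf, get_buf, get_buf };

/* the same type built up in steps */
typedef char row5[5];               /* array[5] of char */
typedef row5 *row5_fn(void);        /* function returning pointer to row5 */
row5_fn *y[3] = { get_buf, get_buf, get_buf };   /* array[3] of pointer to row5_fn */
```

    An element is used by reading the declaration inside out: (*x[0]())[1] calls the first function, dereferences the returned pointer to get the array, and indexes it.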