Pointers


One of the main characteristics of a scientific program is that large amounts of numerical information are exchanged between the various functions which make up the program. It is generally most convenient to pass this information via the argument lists, rather than the names, of these functions. After all, only one number can be passed via a function name, whereas scientific programs generally require far more than one number to be passed during a function call. Hence, the functions employed in scientific programs generally return no values via their names (i.e., they tend to be of data type void) but possess long lists of arguments. There is one obvious problem with this approach. Namely, a void function which passes all of its arguments by value is incapable of returning any information to the program segment from which it was called. Fortunately, there is a way of getting around this difficulty: we can pass the arguments of a function by reference, rather than by value, using constructs known as pointers. This allows the two-way communication of information via arguments during function calls. Pointers are discussed in the following.

Suppose that v is a variable in a C program which represents some particular data item. Of course, the program stores this data item at some particular location in the computer's memory. The data item can thus be accessed if we know its location, or address, in computer memory. The address of v's memory location is determined by the expression &v, where & is a unary operator known as the address operator.

Suppose that we assign the address of v to another variable pv. In other words,

pv = &v

This new variable is called a pointer to v, since it points to the location where v is stored in memory. Remember, however, that pv represents v's address, and not its value.

The data item represented by v (i.e., the data item stored at v's memory location) can be accessed via the expression *pv, where * is a unary operator, called the indirection operator, which only operates on pointer variables. Thus, *pv and v both represent the same data item. Furthermore, if we write pv = &v and u = *pv then both u and v represent the same value.

The simple program listed below illustrates some of the points made above:

/* pointer.c */
/* 
   Simple illustration of the action of pointers
*/

#include <stdio.h>

int main() 
{
  int u = 5;     
  int v;
  int *pu;      // Declare pointer to an integer variable
  int *pv;      // Declare pointer to an integer variable

  pu = &u;      // Assign address of u to pu
  v = *pu;      // Assign value of u to v
  pv = &v;      // Assign address of v to pv

  printf("\nu = %d  &u = %X  pu = %X  *pu = %d", u, &u, pu, *pu);
  printf("\nv = %d  &v = %X  pv = %X  *pv = %d\n", v, &v, pv, *pv);

  return 0;
}

Note that pu is a pointer to u, whereas pv is a pointer to v. Incidentally, the conversion character X, which appears in the control strings of the above printf() function calls, indicates that the associated data item should be output as a hexadecimal number--this is the conventional method of representing an address in computer memory. (Strictly speaking, X expects an unsigned integer argument; the conversion which the C standard provides for addresses is p, with the pointer cast to void *.) Execution of the above program yields the following output:

u = 5  &u = BFFFFA24  pu = BFFFFA24  *pu = 5
v = 5  &v = BFFFFA20  pv = BFFFFA20  *pv = 5         
%

In the first line, we see that u represents the value 5, as specified in its declaration statement. The address of u is determined automatically by the compiler to be BFFFFA24 (hexadecimal). The pointer pu is assigned this value. Finally, the value to which pu points is 5, as expected. Similarly, the second line shows that v also represents the value 5. This is as expected, since we assigned the value *pu to v. The address of v is BFFFFA20. Of course, u and v have different addresses.

The unary operators & and * are members of the same precedence group as the other unary operators (e.g., ++ and --). The address operator (&) can only act upon operands which possess a unique address, such as ordinary variables. Thus, the address operator cannot act upon an arithmetic expression, such as 2 * (u + v). The indirection operator (*) can only act upon operands which are pointers.

Pointer variables, like all other variables, must be declared before they can appear in executable statements. A pointer declaration takes the general form

data-type  *ptvar;

where ptvar is the name of the pointer variable, and data-type is the data type of the data item towards which the pointer points. Note that an asterisk must always precede the name of a pointer variable in a pointer declaration.

Referring to Sect. 2.6, we can now appreciate that the mysterious asterisk which appears in the declaration of an input/output stream, e.g.,

FILE  *stream;

is there because stream is a pointer variable (pointing towards an object of the special data type FILE). In fact, stream points towards a FILE object in memory, which holds the information that the standard library needs in order to control the associated input/output stream.

Pointers are often passed to a function as arguments. This allows data items within the calling part of the program to be accessed by the function, altered within the function, and then passed back to the calling portion of the program in altered form. This use of pointers is referred to as passing arguments by reference, rather than by value.

When an argument is passed by value, the associated data item is simply copied to the function. Thus, any alteration to the data item within the function is not passed back to the calling routine. When an argument is passed by reference, however, the address of the associated data item is passed to the function. The contents of this address can be freely accessed by both the function and the calling routine. Furthermore, any changes made to the data item stored at this address are recognized by both the function and the calling routine. Thus, the use of a pointer as an argument allows the two-way communication of information between a function and its calling routine.

The program listed below, which is yet another modified version of printfact.c, uses a pointer to pass back information from a function to its calling routine:

/* printfact3.c */
/*
  Program to print factorials of all integers
  between 0 and 20
*/

#include <stdio.h>
#include <stdlib.h>

/* Prototype for function factorial() */
void factorial(int, double *);    

int main() 
{
  int j;
  double fact;

  /* Print factorials of all integers between 0 and 20 */
  for (j = 0; j <= 20; ++j) 
   {
    factorial(j, &fact);
    printf("j = %3d    factorial(j) = %12.3e\n", j, fact);
   }
  return 0;
}

//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

void factorial(int n, double *fact) 
{
  /* 
     Function to evaluate factorial *fact (in floating-point form)
     of non-negative integer n.
  */

  *fact = 1.;

  /* Abort if n is negative integer */
  if (n < 0) 
   {
    printf("\nError: factorial of negative integer not defined\n");
    exit(1);
   }

  /* Calculate factorial */
  for (; n > 0; --n) *fact *= (double) n;

  return;      
}

The output from this program is again identical to that from printfact.c. Note that the function factorial() has been modified such that there is no data item associated with its name (i.e., the function is of data type void). However, the argument list of this function has been extended such that there are now two arguments. As before, the first argument is the value of the non-negative integer n whose factorial is to be evaluated by the function. The second argument, fact, is a pointer which passes back the factorial of n (in the form of a floating-point number) to the main part of the program. Incidentally, the compiler knows that fact is a pointer because its name is preceded by an asterisk in the argument declaration for factorial(). Of course, in the body of the function, reference is made to *fact (i.e., the value of the data item stored in the memory location towards which fact points) rather than fact (i.e., the address of the memory location towards which fact points). Note that a void function, which returns no value, can only be called via a statement consisting of the function name followed by a list of its arguments (in parentheses and separated by commas). Thus, the function factorial() is called in the main part of the program via the statement

factorial(j, &fact);

This statement passes the integer value j to factorial(), which, in turn, passes back the value of the factorial of j via its second argument. Note that since the second argument is passed by reference, rather than by value, it is written &fact (i.e., the address of the memory location where the floating-point value fact is stored) rather than fact (i.e., the value of the floating-point variable fact). Note, finally, that the function prototype for factorial() takes the form

void factorial(int, double *);

Here, the asterisk after double indicates that the second argument is a pointer to a floating-point data item.

We can now appreciate that the mysterious ampersands which must precede variable names in scanf() calls: e.g.,

scanf("%d %lf %lf", &k, &x, &y);

are not so mysterious, after all. scanf() is a function which returns data to its calling routine via its arguments (excluding its first argument, which is a control string). Hence, these arguments must be passed to scanf() by reference, rather than by value, otherwise they would be unable to pass information back to the calling routine. It follows that we must pass the addresses of variables (e.g., &k) to scanf(), rather than the values of these variables (e.g., k). Note that since the printf() function does not return any information to its calling routine via its arguments, there is no need to pass these arguments by reference--passing by value is fine. This explains why there are no ampersands in the argument list of a printf() function.

A pointer to a function can be passed to another function as an argument. This allows one function to be transferred to another, as though the first function were a variable. This is very useful in scientific programming. Imagine that we have a routine which numerically integrates a general one-dimensional function. Ideally, we would like to use this routine to integrate more than one specific function. We can achieve this by passing (to the routine) the name of the function to be integrated as an argument. Thus, for example, we can use the same routine to integrate a polynomial, a trigonometric function, or a logarithmic function.

Let us refer to the function whose name is passed as an argument as the guest function. Likewise, the function to which this name is passed is called the host function. A pointer to a guest function is identified in the host function definition by an entry of the form

data-type  (*function-name)(type 1,  type 2, ...)

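in the host function's argument declaration. Here, data-type is the data type of the guest function, function-name is the local name of the guest function in the host function definition, and type 1, type 2, ... are the data types of the guest function's arguments. The parentheses around *function-name are essential: without them, the entry would declare a function which returns a pointer to data-type, rather than a pointer to a function which returns data-type. The pointer to the guest function also requires an entry of the form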

data-type  (*)(type 1,  type 2,  ...)

in the argument declaration of the host function's prototype. The guest function can be accessed within the host function definition by means of the indirection operator. To achieve this, the indirection operator must precede the guest function name, and both the indirection operator and the guest function name must be enclosed in parentheses: i.e.,

(*function-name)(arg 1,  arg 2,  ...)

Here, arg 1, arg 2,... are the arguments passed to the guest function. Finally, the name of a guest function is passed to the host function, during a call to the latter function, via an entry like

function-name

in the host function's argument list.

The program listed below is a rather silly example which illustrates the passing of function names as arguments to another function:

/* passfunction.c */
/*
  Program to illustrate the passing of function names as
  arguments to other functions via pointers
*/

#include <stdio.h>

/* Function prototype for host function */
void cube(double (*)(double), double, double *); 
                                                
double fun1(double);    // Function prototype for first guest function
double fun2(double);    // Function prototype for second guest function

int main() 
{
  double x, res1, res2;

  /* Input value of x */
  printf("\nx = ");
  if (scanf("%lf", &x) != 1) return 1;   // Abort if input is not a number

  /* Evaluate cube of value of first guest function at x */
  cube(fun1, x, &res1);

  /* Evaluate cube of value of second guest function at x */
  cube(fun2, x, &res2);

  /* Output results */
  printf("\nx = %8.4f   res1 = %8.4f   res2 = %8.4f\n", x, res1, res2);
  
  return 0;
}

//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

void cube(double (*fun)(double), double x, double *result) 
{
  /*
    Host function: accepts name of floating-point guest function 
    with single floating-point argument as its first argument, 
    evaluates this function at x (the value of its second argument), 
    cubes the result, and returns final result via its third argument.
  */

  double y;

  y = (*fun)(x);        // Evaluate guest function at x
  *result = y * y * y;  // Cube value of guest function at x

  return;
}

//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

double fun1(double z) 
{
  /*
    First guest function
  */

  return 3.0 * z * z - z;
}

//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

double fun2(double z) 
{
  /*
    Second guest function
  */
  
  return 4.0 * z  - 5.0 * z * z * z;
}

In the above program, the function cube() accepts the name of a guest function (with one argument) as its first argument, evaluates this function at x (the value of its second argument, which is ultimately specified by the user), cubes the result, and then passes the final result back to the main part of the program via its third argument (which, of course, is a pointer). The two guest functions, fun1() and fun2(), whose names are passed to cube(), are both simple polynomials. The output from the above program looks like:

x = 2  

x =   2.0000   res1 = 1000.0000   res2 = -32768.0000
%


Richard Fitzpatrick 2006-03-29