Chapter 10
Pointers and Dynamic Arrays
0. Introduction
The topic of this chapter is the notion of pointer and some of the uses of pointers.
Benjamin Whorf said that the language one uses has a great effect on how one thinks, even to the extent of determining what one can think. Few natural language experts think the human language a person uses has quite that control over thoughts. Programming languages, on the other hand, have a quite strong influence on what we can code. Some programming languages do not have the idea of an address in the language, making some kinds of programming very difficult.1 Most modern programming languages support the memory address abstraction we call pointer.
The notion of pointer gives us the abstraction of memory addresses in C and C++ programming. The main use of pointers is to construct and use linked structures and dynamically allocated arrays. (We will study linked structures in Chapter 17.) The C and C++ pointer construct gives almost free rein to our ability to allocate, structure, and deallocate memory.
A pointer variable holds a pointer value. Pointer variables are typed, like all objects in C++. Typed means the pointer variable’s pointer value can only point to an object of the type with which the pointer variable was declared. Following a common abuse of the language, we will use the word pointer indiscriminately to refer to either a pointer variable or a pointer value.
1. Outline of topics in the chapter
10.1 Pointers
Pointer Variables
Basic Memory Management
Dynamic Variables and Automatic Variables
Uses for Pointers
10.2 Dynamic Arrays
Array Variables and Pointer Variables
Creating and Using Dynamic Arrays
Example: A Function that Returns an Array
Pointer Arithmetic
Multidimensional Dynamic Arrays
10.3 Classes, Pointers, and Dynamic Arrays
The -> Operator
The this pointer
Overloading the Assignment Operator
Example: A class for Partially Filled Arrays
Destructors
Copy Constructors
From this point in the book, you will find it very difficult to use any version of Windows 9x or Windows Me to run programs you develop. Avoiding errors while developing programs that use pointers is nearly impossible. With these operating systems, almost any pointer error will either crash the system or, worse, render the system unstable so that it crashes shortly afterward.
Preemptively rebooting after a pointer error isn’t easy. My Windows 98 2nd edition system continues to claim I have a “program” it does not name that has to be ended before rebooting is possible, but the system makes killing these “programs” difficult.
I strongly recommend that you obtain and use some version of the (free) Linux operating system or Windows NT, 2000 or XP. Some Linux distributions are Red Hat, Debian, Suse, and Mandrake. I use and recommend Debian because of the ease of upgrading from one release to the next. I have found Windows NT and 2000 stable enough for development of student programs. I have not used XP. However, I have been told by XP users that it is sufficiently stable for this use.
Most Linux distributions provide GCC, the GNU Compiler Collection, which includes g++, the GNU C++ compiler. This is free software. There are a few commercial compilers for Linux. (A few are inexpensive. These compete with free compilers, so it is reasonable to expect that they are all very good compilers.)
For Windows, Borland’s C++ command line compiler may be downloaded free from Borland’s web site. It comes with no GUI development environment. Borland Builder uses this same compiler, and is very good. Microsoft’s VC++6.0 is bundled with the book.
At the time of this writing, to use the introductory version of VC++ 6.0 bundled with the text, you must go to the Microsoft download web site2 and download and install the level 5 patch. You can compile the following program to determine whether your version needs patching:
class A
{
public:
    friend A operator+(A&, A&);
private:
    int a;
};

A operator+(A& lhs, A& rhs)
{
    int x;
    x = lhs.a + rhs.a;
    return A();
}

int main()
{
    A u, v, w;
    u = v + w;
    return 0;
}
If your compiler requires the patch, the compiler will ignore the friend declaration. It complains about the access to private class data by the friend function operator+().
You must have a required Microsoft program, MDAC version 2.6 or later to successfully install the patch for all Visual Studio components. Version 2.5 will install the patches for VC++6.0. I do not know how to determine what version of MDAC is on a system other than downloading the patch and trying to install it.
I know nothing about Macintosh systems or development environments, so I refrain from commenting on them.
10.1 Pointers
The text states that a pointer is the address of a variable. A variable is the name of a memory location. A memory location has both an address and an extent. An example is the array, which has an address, the address of the index 0 element, and extent, which is the number of elements times the size of the base type. The two examples the text gives for pointers that we have already used are passing an array to a function and passing any variable by reference.
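The address-and-extent picture of an array can be checked directly. The following is only a minimal sketch; the function names extentInBytes and startsAtElementZero are my own, not the text's.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the extent, in bytes, of a 5-element int array.
std::size_t extentInBytes()
{
    int a[5] = {0, 1, 2, 3, 4};
    return sizeof(a);   // size of the whole array object
}

// Hypothetical helper: true if the array begins where its
// index 0 element begins.
bool startsAtElementZero()
{
    int a[5] = {0, 1, 2, 3, 4};
    const void *arrayAddress = &a;      // address of the array as a whole
    const void *firstAddress = &a[0];   // address of the index 0 element
    return arrayAddress == firstAddress;
}
```

Called from any main, extentInBytes() should equal 5 * sizeof(int), that is, the number of elements times the size of the base type.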
Pointer Variables
A pointer variable holds a pointer value. A pointer value is the address of a variable in memory. The C++ type system requires that there be a type for every object. Pointers and pointer variables are no exception. A pointer holds an address, and the type of object that the pointer is able to point at must be specified when the pointer is declared.
The following statement3 declares dPtr to be a pointer type (that is what the * says) capable of holding the address of a double.
double *dPtr;
The strong typing of the C++ language requires that a pointer declared to point to a particular type of object be used only to point to objects of that type. An attempt to assign a variable of type pointer to int to a variable of type pointer to long will get a compiler error message to the effect that there has been an "assignment between incompatible pointer types." Each type of object requires a pointer having type pointer-to-that-type. The type of dPtr is "pointer to double" or, in C++ parlance, "dPtr has type double*".
The text already warns of one of the pitfalls of pointer declarations:
int *x, y;
This is a declaration of x, with type pointer-to-int, and y, with type int. My students tend to have a problem declaring several pointers in one declaration. The problem is that the following appears to declare three pointer-to-int variables. Really, this declares only one pointer-to-int, and two int variables.
int* p1, p2, p3; // p1 is pointer-to-int, but p2 and p3 are int.
The reason for this is that the pointer declarator, *, binds to the identifier. If the student will declare one variable per line, this problem does not occur. Of course, we could use a typedef. For example,
typedef int* intPtr;
intPtr p1, p2, p3; //p1 p2 and p3 all have type int*
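Students who doubt this can ask the compiler. The sketch below assumes a C++11 or later compiler (it uses decltype, std::is_same, and nullptr, none of which the text relies on), and the two function names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

typedef int* intPtr;

// Hypothetical check: with "int* p1, p2, p3;" only p1 is a pointer.
bool onlyFirstIsPointer()
{
    int* p1 = nullptr, p2 = 0, p3 = 0;
    return std::is_same<decltype(p1), int*>::value
        && std::is_same<decltype(p2), int>::value
        && std::is_same<decltype(p3), int>::value;
}

// Hypothetical check: with the typedef, all three are pointers.
bool typedefMakesAllPointers()
{
    intPtr q1 = nullptr, q2 = nullptr, q3 = nullptr;
    return std::is_same<decltype(q1), int*>::value
        && std::is_same<decltype(q2), int*>::value
        && std::is_same<decltype(q3), int*>::value;
}
```

Both functions return true. The typedef works because intPtr names the complete type int*, so the * is no longer attached to just the first identifier.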
Programming Language Notes (This material is not in the text.)
We mentioned overloading of operators in C++. The language overloads the operators for us with regard to the primitive types. For example, + is able to add any of char, short, int, long, float, double, or long double. Most languages have operator overloading at this level. This operator overloading is carried out by the compiler, just as operator overloading is done by the programmer. The compiler recognizes the type of the operands of the + operator (or whatever operator we are discussing) then generates the appropriate machine instructions to carry out the operation. The machine instruction is usually different for each primitive type.
A primitive type is a type that is built into the language. For C++, the list of primitive types is bool, char, short, int, long, float, double, long double, and the unsigned variants of these types. This is by contrast with user defined types such as arrays, classes, and enums. Building these requires the use of machinery provided in the language.
The * operator is overloaded in a more extensive way than + is overloaded. We have the * for multiplication. The * also stands for 'pointer to' in a pointer declaration (a pointer declarator), and it stands for 'dereference' or 'follow the pointer' when it occurs before a pointer expression (a dereference operator). A dereferenced pointer can be either an l-value or an r-value.4
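A small sketch that puts all of these meanings of * within a few lines of each other may help. The function name starThreeWays is my own invention.

```cpp
#include <cassert>

// Hypothetical helper showing the three roles of *.
int starThreeWays()
{
    int x = 6;
    int *p = &x;    // * as pointer declarator
    *p = *p * 7;    // left *p:  dereference used as an l-value
                    // right *p: dereference used as an r-value
                    // middle *: ordinary multiplication
    return x;       // x was changed through p
}
```

The function returns 42, since the assignment stores 6 * 7 in the location p points to, which is x.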
The text points out that from the point of view of the C++ language, a pointer is not an integer. An integer would be a char, short, int, or long, or perhaps an unsigned variant. A pointer type isn't any of these. A pointer is declared as a pointer to something, and its value is the address of an object of that type. The type 'pointer to some type' is itself a type.
Pointer arithmetic does some special things that will be discussed in the text and in this IRM shortly. Here is a place where adding 1, in a sense, adds more than 1, most of the time. Adding and multiplying pointers themselves is not defined. Subtracting pointers gives an integer in certain limited circumstances, namely when both point into the same array. At other times, subtracting pointers gives undefined results.
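As a preview, here is a minimal sketch of both facts: adding 1 to a double* moves the address by sizeof(double) bytes, and subtracting two pointers into the same array gives a count of elements. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: how many bytes does "p + 1" move a double*?
std::size_t bytesAdvancedByOne()
{
    double a[4] = {1.0, 2.0, 3.0, 4.0};
    double *p = a;
    double *q = p + 1;   // points at a[1]
    // View both addresses as char* so the difference is in bytes.
    return static_cast<std::size_t>(
        reinterpret_cast<char*>(q) - reinterpret_cast<char*>(p));
}

// Hypothetical helper: difference of two pointers into the same array.
std::ptrdiff_t elementsBetween()
{
    double a[4] = {1.0, 2.0, 3.0, 4.0};
    double *first = &a[0];
    double *last = &a[3];
    return last - first;   // measured in elements, not bytes
}
```

bytesAdvancedByOne() equals sizeof(double), and elementsBetween() is 3. Subtracting pointers that do not point into the same array is the undefined case.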
Operators: pointer declarator: *, indirection operator: *, address of: &
If used in the declaration,
int * iPtr;
the * is called a "pointer declarator" by language lawyers. This statement declares iPtr as a "pointer to int". Read it in reverse: iPtr is a pointer (*) to an int. This works on all but the most complex declarations.
If used in the assignment,
*iPtr = 7;
we are using the * as the indirection operator or dereferencing operator. This statement says, store 7 where the pointer, iPtr, points.
We haven't said where iPtr points. We should have. Let's do it:
int x = 49;
iPtr = &x;
The & in front of the x is called the "address of operator". It takes the address of x. The assignment makes iPtr point at (have the value of the address of) the int variable x. (The & is another overloaded operator. If it is used between the type and the formal parameter in a function declaration (prototype), language lawyers call this a "reference declarator" and it means in that context, "this is a reference parameter." If it is used in an expression where it has two arguments, it is the bitwise AND operator.5)
Now let us execute the above assignment to the dereferenced pointer:
*iPtr = 7;
We have changed the value stored in x from 49 to 7. As long as iPtr contains a pointer that points to x, *iPtr and x refer to the same variable (or memory location).
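The pieces above can be gathered into compilable functions. This is only a sketch; the function names are mine, and the last two functions are there just to show the other two roles of & side by side with the address-of operator.

```cpp
#include <cassert>

// Hypothetical helper: store through a pointer that points at x.
int storeThroughPointer()
{
    int x = 49;
    int *iPtr;     // * as pointer declarator
    iPtr = &x;     // & as address-of operator: iPtr now points at x
    *iPtr = 7;     // * as indirection operator: store 7 where iPtr points
    return x;      // x and *iPtr name the same memory location
}

// & as reference declarator: target is a reference parameter.
void setToSeven(int& target)
{
    target = 7;
}

// Hypothetical helper: the same change made through a reference.
int storeThroughReference()
{
    int x = 49;
    setToSeven(x);
    return x;
}

// & with two arguments: bitwise AND (binary 1101 & 0111 is 0101).
int bitwiseAndExample()
{
    return 13 & 7;
}
```

Both storeThroughPointer() and storeThroughReference() return 7, and bitwiseAndExample() returns 5.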
Incidentally, you cannot assign to the address of a variable; that is, the result of the “address of” operator is not an l-value.
int x;
int *p;
&x = p; //invalid l-value in assignment
The text carefully distinguishes between assignment of pointers and assignment of the places where the pointers point. Pay particular attention to these ideas, and to Displays 10.1, 10.3, and 10.5 in the text, which illustrate them. Here is an example:
int x = 7;
int y = -15;
int *int_ptr1 = &x;
int *int_ptr2 = &y;
int_ptr1 = int_ptr2; //Pointer assignment. This makes
//int_ptr1 point at the same place
//int_ptr2 points.
*int_ptr1 = *int_ptr2; // Dereferenced pointer
//assignment. This assigns the
//values stored where the pointers
//point.
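To see the difference in the values that result, the two kinds of assignment can be run separately from the same starting state. This is a sketch only; the struct Result and the two function names are hypothetical, and the value 100 is an arbitrary marker.

```cpp
#include <cassert>

// Hypothetical record of the state after the assignment.
struct Result
{
    int x;
    int y;
    bool samePlace;   // do the two pointers hold the same address?
};

// Pointer assignment: only the pointer changes; x and y are untouched.
Result pointerAssignment()
{
    int x = 7;
    int y = -15;
    int *int_ptr1 = &x;
    int *int_ptr2 = &y;
    int_ptr1 = int_ptr2;   // int_ptr1 now points at y
    *int_ptr1 = 100;       // so this changes y, not x
    Result r = {x, y, int_ptr1 == int_ptr2};
    return r;
}

// Dereferenced pointer assignment: the pointed-to value changes;
// the pointers still point at different variables.
Result dereferencedAssignment()
{
    int x = 7;
    int y = -15;
    int *int_ptr1 = &x;
    int *int_ptr2 = &y;
    *int_ptr1 = *int_ptr2;   // copies -15 into x
    Result r = {x, y, int_ptr1 == int_ptr2};
    return r;
}
```

After pointerAssignment(), x is still 7, y is 100, and the pointers are equal. After dereferencedAssignment(), x and y are both -15 and the pointers are still different.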
The most important use of pointers is the case where pointers point to variables that have no name. The operator new is used to create variables that have no identifiers to serve as their names.
The phrase "creating a variable" is used to denote a request to the operating system to allocate a chunk of memory to the program during run time. In C++, this tells the compiler to only allow storage of objects having the type specified in the argument for new. If the type is a class, and a proper constructor is provided, the constructor is called to initialize the variable so created.
A common description of these anonymous variables is "dynamic variables". The word "dynamic" is used because these variables are not allocated statically, that is, not allocated at compile time. Rather, they are allocated at the request of the program, during execution. When and how much allocation is done is usually a decision made during the execution of the program based on the state of the program.
The area of memory where dynamic variables are stored is called the heap (the text's usage) or the free store. The student should be aware that both terms are used interchangeably. Heap seems to be usage inherited from the C language. The terms mean exactly the same thing.
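A minimal sketch of creating, using, and destroying dynamic variables follows. The class Counter and the function names are made up for illustration; the text's own examples are in its displays.

```cpp
#include <cassert>

// Hypothetical class whose default constructor sets count to 10.
class Counter
{
public:
    Counter() : count(10) {}
    int count;
};

// A dynamic int: it has no name, so we reach it only through p.
int dynamicInt()
{
    int *p = new int;   // allocated at run time
    *p = 42;
    int value = *p;
    delete p;           // return the memory when done
    return value;
}

// A dynamic object: new calls the default constructor.
int dynamicObject()
{
    Counter *c = new Counter;   // constructor runs here
    int value = (*c).count;     // follow the pointer, then select the member
    delete c;
    return value;
}
```

dynamicInt() returns 42 and dynamicObject() returns 10, the value the constructor stored. Each new is paired with a delete so the memory goes back for reuse.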
There is one caution. I find that there is a little confusion between 'heap', meaning a tree-like data structure with certain order relationships imposed on the values in the nodes, and a memory management heap, used for allocation of dynamic variables.
A memory allocation 'heap' is a linked list of chunks of memory waiting for allocation, together with another collection of memory chunks that have been already allocated to the program, usually with library functions that carry out the management.
(Aside: I use 'heap' as often as 'free store', partly because I am (still) a Pascal/C refugee, and that was the usage there. There is a program language designer's maxim that the C++ developers have not violated. Perhaps they should have. The maxim reads, "Every designer of a (new) programming language must invent new terminology for every (old) language concept.")