ChiLib Documentation

ChiLib Library Documentation

Array is a `smart array' class template. Unlike the more dangerous and inconvenient C arrays, Array objects handle their own storage: They have associated operations that make accessing and manipulating their contents less prone to memory overruns and indexing mistakes.

An Array object is a container and so `contains' other objects: usually user-defined classes, but also pointers or numeric types (doubles, for instance). Here is an array of Customer objects:

      Customer joe, harry, tran;

      Array<Customer> a(1, 10); // declare an array of 10 Customers
      Array<double> d(0,4); // an array of five doubles.
      a[1] = joe;
      a[10] = tran;
      Array<Customer> b = a; // declare another array as a copy of a
      // b has space for elements 1...10, just like a.
      // b[1] contains joe, b[10] contains tran
      b[1] = harry;
      // a[1] still contains joe, b[1] now contains harry.

Contrast this with C arrays and pointers:

      Customer a[11];   // C arrays always start at 0
      a[1] = joe;
      a[10] = tran;
      Customer* b = a; // Customer b[11] = a;  is illegal
      b[1] = harry;
      // now both a[1] and b[1] contain harry

The documentation just refers to the template as Array. Of course, you must add the <...> to instantiate a specific array class.

2.2. Construction

      Array();
      Array(int low, int high);

The default constructor initializes the instance to an empty array; you must use the set or grow operation (described in the following section) to allocate a range for elements. The second constructor establishes the initial low and high indices of the array. Note that any integers are valid as long as low ó high. In particular, low need not be zero, and low, high, or both can be negative.

      Array<Customer> a; 
         // creates an empty array
      Array<double> b(-10, 10);      
         // an array, with valid range b[-10]..b[10]

2.3. Resizing

      void grow(int low, int high);
      void shrink(int low, int high);
      void empty();

The grow operation ensures that the array contains at least the index range low ... high, expanding the low or high bound, or both, as necessary. The array is not reduced in size. That is, elements that are below low or above high are not removed.

The shrink operation ensures that the array contains no more than the index range low ... high, reducing the low or high bound, or both, as necessary. The array is not increased in size. That is, no new elements are created if the bounds are already less than those specified in the arguments.

The empty operation removes all elements of the array, making it into an empty array.

2.4. Element access

      void set(int i, const X& t);
      X get(int i) const;
      X& operator[](int i);
      const X& operator[](int i) const;

The set operation sets the ith element of the array to the value of t. If i is not within the current array bounds, the array grows to include i.

The get operation returns a copy of the value at position i. If i is not within the current array bounds, a default value is returned. Don't use get to try to change an array element. The code

      a.get(i).change(); // DON'T

changes the copy of a[i] that get returns and then abandons the copy.

The [] operator allows both read and write access to the element at position i. But applying [] does not grow the array. Instead, it triggers a bounds error. This was not done just to be difficult, but for technical reasons.

Use get/set when possible. They are always safe.

There is, however, one reason to use []: for in-place modification. It is easier and more efficient to change an object in place,

      a[i].change();

than it is to do a get/change/set sequence,

      X x = a.get(i);
      x.change();
      a.set(i, x);

Here are some examples:

      Array<double> a(1, 10); // a is a 10-element array, a[1]...a[10]
      a.set(20, 400.0); // now bounds are 1...20
      a[30] = 900.0;  // NO--Bounds error
      double x = a.get(30); // OK--x is set to 0
      x = a[30]; // NO--Bounds error
      a[20] += 10.5; // In-place modification. Now a[20] is 410.5

2.5. Bounds range

      int length() const;
      int low() const;
      int high() const;

The length operation returns the number of elements in the array. The low and high operations return the lowest and highest array index, respectively. The latter are typically used to traverse an array:

      for (int i = a.low(); i <= a.high(); i++)
         do something with a[i];

An empty array arbitrarily has low bound 0, high bound -1, and, of course, length 0.

2.6. Inserting elements

Inserting elements into the middle of an array is not efficient. The elements above the insertion point must be moved up, or the entire array needs to be reallocated if there isn't enough space to hold the inserted values. Nevertheless, it is a convenient operation, and for many programs the inefficiency is not important.

    void insert(int i, X x)
    void insert_range(int start, int count);
    void append(X t);

The insert operation inserts x at location i, moving the elements above i up by 1. The insert_range operation inserts count elements, beginning at element start, and shifts higher elements up as appropriate. The newly created elements are filled with default values of type X.

The append operation places x into the element with index high()+1. No elements are shifted. If the array is empty, x is inserted at location 0.

      Array<int> a(1, 10);
      a[3] = 100;
      a.insert(3, 90); // a[3] is now 90, a[4] is now 100, etc.
      a.append(410); // a[11] is 410
      a.insert_range(3, 10); // now a has 20 elements
         // a[3] ... a[12] are 0, a[13] is 90

2.7. Removing elements

      X remove(int i);
      void remove_range(int start, int count);
      void empty();

The remove operation removes the element at index i from the array, shifting higher element down as needed, and returning the value. If i is out of bounds, a default value is returned. The remove_range operation removes count elements, beginning with the position start, shifting elements with higher index down.

The empty operation removes all elements from the array.

      Array<int> a(1, 10);
      for (int j = a.low(); j <= a.high(); j++)
         a[j] = j*j;
      int c = a.remove(5); // c is 25
      a.remove_range(2, 4);
      int d = a.get(2); // d is 49
      a.empty(); // now a is empty

2.8. Sorting

    void qsort (int (*compare) (const X&, const X&));
    void qsort (int (*compare) (const X&, const X&), 
         int from, int count);

These perform a quicksort on the array. In the latter, only the subrange of count elements of the array, beginning at from, is sorted.

To sort an array, you must specify how to compare elements. You do that by supplying a comparison function. A comparison function takes two objects a and b of type X and compares them. (Actually, for efficiency, the comparison function takes two constant references to the objects.) It should behave like this:

      if a < b, then return an integer less than 0
      if a = b, then return 0
      if a > b, then return an integer greater than 0

Here is a typical example. We want to sort an array of customers, first by age, and by name within the same age. Write the following function:

      int compare_cust(const Customer& a, const Customer& b)
      {  int d = a.age() - b.age();
         if (d != 0) return d;
         return a.name().compare(b.name());
      }

Then call

      Array<Customer> a;
      a.sort(compare_cust);

2.9. Array parameters

You can use smart arrays as function parameters. Like all C++ objects, but unlike C arrays, the array parameter is a copy of the array that is passed to the function. Modifying the copy is legal but has no effect on the array supplied by the caller. If you want to write a function that modifies the array, then pass it by reference:

      void raise_salaries(Array<Employee>& staff, double perc)
      {  for (int i = staff.low(); i <= staff.high(); i++)
            staff[i].raise_salary(perc); 
      }

If you do not write to the array, it is strongly recommend that you do not use [] to access the array for reading. The [] operator cannot tell whether it is invoked for reading or for writing, so it forces a copy of all array elements the first time it is applied. Just use get.

      double average_salary(Array<Employee> staff)
      {  double d = 0;
         for (int i = staff.low(); i <= staff.high(); i++)
            d += staff.get(i).salary();
         if (staff.length() > 0) d = d / staff.length();
         return d;
      }

Alternatively, if you declare the array argument as a const or const&, you can use [] because the const version of [] cannot be used for writing.

      double average_salary(const Array<Employee>& staff)
      {  double d = 0;
         for (int i = staff.low(); i <= staff.high(); i++)
            d += staff.get(i).salary();
         if (staff.length() > 0) d = d / staff.length();
         return d;
      }

This is a subtle point, and you can ignore it if you are not concerned about efficiency.

2.10. Reallocation

An array object always stores its elements as a single block of contiguous memory. That makes element access efficient, and it also makes it easy to locate and inspect elements during debugging. When a block is allocated, a few additional elements are supplied to anticipate further growth. (However, upon construction the array is sized exactly as specified in the construction arguments.) Eventually, as the array grows, it runs out of space, and the associated memory block must be reallocated.

This means that you should never take the address of an array element a[i]. That address becomes invalid as soon as the memory block relocates, and the element is moved elsewhere. This is the reason the [] operator does not grow the array. If it did, statements like

      swap(a[i], a[i + 100]);

would fail mysteriously if the compiler evaluated the reference a[i], and then a[i+100], and the array was relocated during the second computation.

All functions that relocate the array have been designed to have a void return value. Hence, no subexpression can trigger array relocation and invalidate a [] reference in another subexpression.

2.11. Template variants

A PtrArray template implements arrays of pointers. Semantically, PtrArray<X> is equivalent to Array<X*>. A NumArray<X> template implements arrays of numeric types (int, double, and so forth). For types without a copy constructor and destructor, NumArray<X> and Array<X> are semantically equivalent.

There are two reasons to use the template variants. First, they are much more efficient, by sharing common code and performing bitwise copies. More importantly, some compilers have faulty template implementations that make it impossible to use the Array template on non-class types. However, the template variants have one major disadvantage: The code sharing is achieved through the use of char blocks or void* pointers, and the array elements cannot be inspected in the debugger.

2.12. Default values

A few operations on arrays, as well as the other containers, are designed to return a default value when an out-of-range access is performed. That default value is guaranteed to be zero for numeric types and pointers. For class types, it is an object constructed with the default constructors. If a class doesn't have a default constructor, you cannot store it in a ChiLib container. The remedy is usually simple: Add a default constructor that does nothing to the fields of class type, but zeroes out the numeric and pointer fields:

      class Customer
      {
      public:
         Customer(); 
         Customer(String _name, int _age);
         // ...
      private:
         int _age;
         String _name;
      };

      Customer::Customer() 
      :  _age(0) 
      {}

That is not always acceptable, and it may be impossible to produce a reasonable default that fulfills the class invariant. (The Date class has that problem, and it `solves' it by producing an unreasonable default.)

If you cannot add a default constructor, you can still store pointers to classes in the containers, but you are then responsible for heap allocation and deallocation of the objects to which those pointers point.

2. Arrays

2.1. Introduction