CS 46B - Lecture 20

Pre-class reading

Section 16.4

Implementing a Hash Table

Set implementation
Adding, finding, removing an element must be fast
Order doesn't matter
Basic idea: put objects in an array
Index computed from object
```
buffer[index(obj)] = obj
```

Finding duplicates is easy:

if (buffer[index(obj)] != null) // obj already present

Assumptions
- Computing index(obj) is fast
- No two objects occupy the same location

Hash Functions

Object class has a hashCode method
Computes an integer for each object
Override in your own class to be compatible with equals
Hash code should depend on object contents
Hash code should “spread around” objects into different numbers

Example: Hash code for String

final int HASH_MULTIPLIER = 31;
int h = 0;
for (int i = 0; i < s.length(); i++)
   h = HASH_MULTIPLIER * h + s.charAt(i);
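
The loop above can be wrapped in a method and tried out. This is a minimal sketch: the class name `StringHashDemo` and the method name `hash` are made up for illustration. It shows that strings with the same letters in a different order get different hash codes, and that the result agrees with the library's `String.hashCode`.

```java
class StringHashDemo
{
   static final int HASH_MULTIPLIER = 31;

   // Computes the hash code of s with the loop from the slide.
   static int hash(String s)
   {
      int h = 0;
      for (int i = 0; i < s.length(); i++)
      {
         h = HASH_MULTIPLIER * h + s.charAt(i);
      }
      return h;
   }

   public static void main(String[] args)
   {
      // Same letters, different order: the hash codes differ.
      System.out.println(hash("eat")); // 100184
      System.out.println(hash("tea")); // 114704
      // The library method computes the same value.
      System.out.println(hash("eat") == "eat".hashCode()); // true
   }
}
```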

Lecture 20 Clicker Question 1

Trivia fact of the day: 'd' has code 100.

What is the hash code for the string "cab"?

294
98244
97284
Something else

Implementing Hash Functions

Form hash code of each instance variable that is compared in equals
Combine these hash codes

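public class Country
{
   // Instance variables compared in equals
   private String name;
   private double area;

   public int hashCode()
   {
      final int HASH_MULTIPLIER = 31;
      int h1 = name.hashCode();
      // Double.hashCode(area) replaces new Double(area).hashCode(),
      // whose constructor is deprecated since Java 9
      int h2 = Double.hashCode(area);
      return HASH_MULTIPLIER * h1 + h2;
   }
}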
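
To see why the hash code has to be compatible with equals, it may help to look at both methods side by side. The sketch below is one possible pairing, not the textbook's full class: the constructor, the `final` fields, and the exact form of `equals` are assumptions made for illustration. Both methods use the same two instance variables, so equal objects always get equal hash codes.

```java
class Country
{
   private final String name;
   private final double area;

   Country(String name, double area)
   {
      this.name = name;
      this.area = area;
   }

   // Takes Object, so it overrides Object.equals rather than overloading it.
   @Override
   public boolean equals(Object other)
   {
      if (this == other) { return true; }
      if (other == null || getClass() != other.getClass()) { return false; }
      Country c = (Country) other;
      return name.equals(c.name) && Double.compare(area, c.area) == 0;
   }

   // Combines the hash codes of exactly the fields compared in equals.
   @Override
   public int hashCode()
   {
      final int HASH_MULTIPLIER = 31;
      int h1 = name.hashCode();
      int h2 = Double.hashCode(area);
      return HASH_MULTIPLIER * h1 + h2;
   }
}
```

With this pairing, two separately constructed `Country` objects with the same name and area are treated as one element by a hash set.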

Collisions

index(obj) = Math.abs(obj.hashCode()) % tableSize
What if two objects occupy the same index?
A collision
Could move the colliding element somewhere else in the array
- E.g. the next empty position
- Gets complicated (Special Topic 16.2)
Easier approach: Put all collisions in a linked list
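
A small experiment makes collisions concrete. In this sketch the class name `BucketIndexDemo` and the table size 101 are arbitrary choices. It also swaps in `Math.floorMod` for the `Math.abs(...) % tableSize` formula: the two can pick different buckets for negative hash codes, but `floorMod` stays non-negative even for `Integer.MIN_VALUE`, where `Math.abs` returns a negative number.

```java
class BucketIndexDemo
{
   // Maps obj's hash code to an array index in the range 0 ... tableSize - 1.
   static int index(Object obj, int tableSize)
   {
      // floorMod is non-negative whenever tableSize is positive.
      return Math.floorMod(obj.hashCode(), tableSize);
   }

   public static void main(String[] args)
   {
      int tableSize = 101;
      // Two different strings with the same hash code.
      System.out.println("Aa".hashCode()); // 2112
      System.out.println("BB".hashCode()); // 2112
      // Same hash code, so same index: a collision.
      System.out.println(index("Aa", tableSize) == index("BB", tableSize)); // true
   }
}
```

Objects with different hash codes can collide as well, because many hash codes map to the same index.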

Finding an Element

Check whether obj is already present
Compute index(obj) from hash code
Look at all elements in bucket
For each of them, call element.equals(obj)
Until one of the calls returns true or you checked them all
How efficient?
Assume hashCode and equals are O(1)
Assume buckets are short
- When table gets too full, grow it
Finding element is O(1)
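
The search steps above can be sketched as follows. This is an illustration under assumptions, not the textbook's code: buckets are singly linked chains of a hypothetical `Node` class, the table is a plain `Node[]`, and `index` uses `Math.floorMod` to stay non-negative.

```java
class FindDemo
{
   // One link in a bucket's chain.
   static class Node
   {
      Object data;
      Node next;

      Node(Object data, Node next)
      {
         this.data = data;
         this.next = next;
      }
   }

   static int index(Object obj, int tableSize)
   {
      return Math.floorMod(obj.hashCode(), tableSize);
   }

   // Returns true if some element in obj's bucket equals obj.
   static boolean contains(Node[] buckets, Object obj)
   {
      Node current = buckets[index(obj, buckets.length)];
      while (current != null)
      {
         if (current.data.equals(obj)) { return true; }
         current = current.next;
      }
      return false; // checked the whole bucket without a match
   }

   public static void main(String[] args)
   {
      Node[] buckets = new Node[11];
      // Place two colliding strings by hand, each at the front of its bucket.
      for (String s : new String[] { "Aa", "BB" })
      {
         int i = index(s, buckets.length);
         buckets[i] = new Node(s, buckets[i]);
      }
      System.out.println(contains(buckets, "BB")); // true
      System.out.println(contains(buckets, "Cc")); // false
   }
}
```

Only one bucket is ever examined, which is why the search is O(1) as long as buckets stay short.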

Adding an Element

Compute index(obj) from hash code
Check if element is present in the bucket
If any of them equals(obj), return
Otherwise, add to bucket (as first element)
O(1)+
Removal is similar, also O(1)+
Why +? If the table gets too full or too sparse, it is reallocated; O(1)+ is the amortized cost
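
Here is a rough sketch of adding and removing with separate chaining. Several details are assumptions for illustration: the class name `SimpleHashSet`, the initial table size of 4, and the policy of doubling the table when there would be more elements than buckets. Shrinking a sparse table is left out to keep the sketch short.

```java
class SimpleHashSet
{
   private static class Node
   {
      Object data;
      Node next;
   }

   private Node[] buckets = new Node[4]; // assumed initial size
   private int size = 0;

   private static int index(Object obj, int length)
   {
      return Math.floorMod(obj.hashCode(), length);
   }

   public boolean contains(Object obj)
   {
      for (Node cur = buckets[index(obj, buckets.length)]; cur != null; cur = cur.next)
      {
         if (cur.data.equals(obj)) { return true; }
      }
      return false;
   }

   // Adds obj unless an equal element is present. Returns true if it was added.
   public boolean add(Object obj)
   {
      if (contains(obj)) { return false; }
      if (size >= buckets.length) { resize(2 * buckets.length); } // the "+" in O(1)+
      Node n = new Node();
      n.data = obj;
      int i = index(obj, buckets.length);
      n.next = buckets[i]; // new element becomes the first in its bucket
      buckets[i] = n;
      size++;
      return true;
   }

   // Removes the element equal to obj. Returns true if one was found.
   public boolean remove(Object obj)
   {
      int i = index(obj, buckets.length);
      Node prev = null;
      for (Node cur = buckets[i]; cur != null; prev = cur, cur = cur.next)
      {
         if (cur.data.equals(obj))
         {
            if (prev == null) { buckets[i] = cur.next; }
            else { prev.next = cur.next; }
            size--;
            return true;
         }
      }
      return false;
   }

   // Moves every node into a new table; indexes change with the table length.
   private void resize(int newLength)
   {
      Node[] old = buckets;
      buckets = new Node[newLength];
      for (Node head : old)
      {
         Node cur = head;
         while (cur != null)
         {
            Node next = cur.next;
            int i = index(cur.data, newLength);
            cur.next = buckets[i];
            buckets[i] = cur;
            cur = next;
         }
      }
   }

   public int size() { return size; }
}
```

A resize touches every element, but with doubling it happens rarely enough that the cost averages out over many calls to `add`.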

Lecture 20 Clicker Question 2

Fred makes a hash table of Employee objects and defines an equals method

public boolean equals(Object other) { return id == ((Employee) other).id; }

But he forgets to implement a hashCode method for the Employee class. What happens?

He gets a compile-time error
He gets a run-time exception
When adding multiple employee objects with the same ID, he sometimes ends up with more than one
When adding multiple employee objects with the same ID, he always ends up with more than one

Lecture 20 Clicker Question 3

Fred makes a hash table of Employee objects and defines a hashCode method

public int hashCode() { return id; }

But he forgets to implement an equals method for the Employee class. What happens?

He gets a compile-time error
He gets a run-time exception
When adding multiple employee objects with the same ID, he sometimes ends up with more than one
When adding multiple employee objects with the same ID, he always ends up with more than one

Lecture 20 Clicker Question 4

Fred makes a hash table of Employee objects and defines a hashCode method

public int hashCode() { return id; }

and an equals method

public boolean equals(Employee other) { return id == other.id; }

But his hash table doesn't work right. What happens?

He gets a compile-time error
He gets a run-time exception
When adding multiple employee objects with the same ID, he sometimes ends up with more than one
When adding multiple employee objects with the same ID, he always ends up with more than one

Lecture 20 Clicker Question 5

Fred makes a hash table of Employee objects and defines an equals method

public boolean equals(Object other) { return id == ((Employee) other).id; }

and a hashCode method

public int hashCode() { return -1; }

What happens?

He gets a compile-time error
He gets a run-time exception
When adding multiple employee objects with the same ID, he sometimes ends up with more than one
The hash table works correctly, but its performance is disappointing

Iterating over a Hash Table

Iterator has bucket index + reference to current element
next advances to next element in current bucket
or if there aren't any more, to first element in next bucket
Assuming the table isn't too sparse, each call to next is O(1)
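
One way such an iterator might look is sketched below. The names `BucketIteratorDemo`, `Node`, and `BucketIterator` are hypothetical, and the table is assumed to be an array of singly linked chains. The iterator keeps exactly the two pieces of state named above: a bucket index and a reference to the current element.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class BucketIteratorDemo
{
   static class Node
   {
      Object data;
      Node next;

      Node(Object data, Node next)
      {
         this.data = data;
         this.next = next;
      }
   }

   static class BucketIterator implements Iterator<Object>
   {
      private final Node[] buckets;
      private int bucketIndex = -1; // bucket of the element returned last
      private Node current = null;  // element returned last; null before the first call

      BucketIterator(Node[] buckets)
      {
         this.buckets = buckets;
      }

      @Override
      public boolean hasNext()
      {
         if (current != null && current.next != null) { return true; }
         for (int b = bucketIndex + 1; b < buckets.length; b++)
         {
            if (buckets[b] != null) { return true; }
         }
         return false;
      }

      @Override
      public Object next()
      {
         if (!hasNext()) { throw new NoSuchElementException(); }
         if (current != null && current.next != null)
         {
            current = current.next; // stay in the current bucket
         }
         else
         {
            // Skip ahead to the next nonempty bucket; hasNext guarantees one exists.
            do { bucketIndex++; } while (buckets[bucketIndex] == null);
            current = buckets[bucketIndex];
         }
         return current.data;
      }
   }

   public static void main(String[] args)
   {
      Node[] buckets = new Node[5];
      buckets[1] = new Node("a", new Node("b", null));
      buckets[3] = new Node("c", null);
      Iterator<Object> iter = new BucketIterator(buckets);
      while (iter.hasNext())
      {
         System.out.print(iter.next()); // prints abc in total
      }
      System.out.println();
   }
}
```

The loop that skips empty buckets is the reason for the "not too sparse" assumption: in a mostly empty table, a single call to `next` could scan many buckets.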

Lecture 20 Clicker Question 6

Suppose you have two hash tables, each with n elements. To find the elements that are in both tables, you iterate over the first table, and for each element, check whether it is contained in the second table. What is the big-Oh efficiency of this algorithm?

O(1)
O(1)+
O(n)
O(n²)