Professor Abdul-Quader
Lesson 9 (Sets / Hashtable Intro)
The following question was posed to me on a phone interview.
Given a list of 2,000,000 unique 7-digit phone numbers, output them in order using no more than 4MB of additional memory.
Each integer requires 4 bytes of space. Can we store them all in memory? If not, how do we sort them if we can’t store all at once?
Discuss. Design your best solution. (5-10 minutes.)
Prove (by induction) that there are
To prove a statement for every number
A set is a gathering together into a whole of distinct objects of our thought – which are called elements of the set.
(Georg Cantor)
Mathematically, a set is a collection of distinct objects (no repetitions). Two sets are considered equal if they have the same elements.
For our perspective, we consider the following operations on a set:
Sometimes “insert” / “remove” are boolean methods, to indicate whether you were able to successfully modify the set (returning false, for example, if you tried to insert a duplicate).
Exercise: Using an ArrayList, implement the Set interface. Analyze the time complexity of contains / insert / remove.
table
, size table[f(item)]
table[f(item)] == null
Suppose we have a large array of size
Simple idea: think of a word as a number in base 27 (“a” represents the digit 1, and “z” represents the digit 27), then convert it to an integer.
Let’s look at the built-in hash function for Strings in Java.
The math here is a little bit of a challenge, let’s do some examples!
Insert
Two main strategies for collision resolution:
As we learn about these, keep this question in mind: in what scenarios (in terms of the table size
contains
: compute the hash function, check the relevant list.