Practical Examples

Hash tables have many practical use cases, but here we’re going to focus on using them to increase algorithm speed.

In Chapter 1, Why Data Structures Matter, we learned about array-based sets—arrays that ensure that no duplicate values exist. There we found that every time a new value is inserted into the set, we need to run a linear search (if the set is unordered) to make sure that the value we’re inserting isn’t already in the set.

If we’re dealing with a large set in which we’re making lots of insertions, this can get unwieldy very quickly, since every insertion runs at O(N), which isn’t very fast.

In many cases, we can use a hash table to serve as a set.

When we use an array as a set, we simply place each piece of data in a cell within the array. When we use a hash table as a set, however, each piece of data is a key within the hash table, and the value can be anything, such as a 1 or a Boolean value of true.

Let’s say we set up our hash table set in JavaScript as follows:

 var​ ​set​ = {};

Let’s add a few values to the set:

 set​[​"apple"​] = 1;
 set​[​"banana"​] = 1;
 set​[​"cucumber"​] = 1;

Every time we want to insert a new value, instead of a O(N) linear search, we can simply add it in O(1) time. This is true even if we add a key that already exists:

 set​[​"banana"​] = 1;

When we attempt to add another "banana" to the set, we don’t need to check whether "banana" already exists, since even if it does, we’re simply overwriting the value for that key with the number 1.

The truth is that hash tables are perfect for any situation where we want to keep track of which values exist within a dataset. In Chapter 4, Speeding Up Your Code with Big O, we discussed writing a JavaScript function that would tell us whether any duplicate numbers occur within an array. The first solution we provided was:

 function​ hasDuplicateValue(array) {
 for​(​var​ i = 0; i < array.length; i++) {
 for​(​var​ j = 0; j < array.length; j++) {
 if​(i !== j && array[i] == array[j]) {
 return​ ​true​;
  }
  }
  }
 return​ ​false​;
 }

We noted there that this code, which contains nested loops, runs at O(N2).

The second solution we suggested there runs at O(N), but only works if the array exclusively contains positive integers. What if the array contains something else, like strings?

With a hash table (which is called an object in JavaScript), we can achieve a similar solution that would effectively handle strings:

 function​ hasDuplicateValue(array) {
 var​ existingValues = {};
 for​(​var​ i = 0; i < array.length; i++) {
 if​(existingValues[array[i]] === ​undefined​) {
  existingValues[array[i]] = 1;
  } ​else​ {
 return​ ​true​;
  }
  }
 return​ ​false​;
 }

This approach also runs at O(N). Here existingValues is a hash table rather than an array, so its keys can include strings as well as integers.

Let’s say we are building an electronic voting machine, in which voters can choose from a list of candidates or write in another candidate. If the only time we counted the final tally of votes was at the end of the election, we could store the votes as a simple array, and insert each vote at the end as it comes in:

 var​ votes = [];
 
 function​ addVote(candidate) {
  votes.push(candidate);
 }

We’d end up with a very long array that would look something like

 ["Thomas Jefferson", "John Adams", "John Adams", "Thomas Jefferson",
  "John Adams", ...]

With this approach, each insertion would only take O(1).

But what about the final tally? Well, we could count the votes by running a loop and keeping track of the results in a hash table:

 function​ countVotes(votes) {
 var​ tally = {};
 for​(​var​ i = 0; i < votes.length; i++) {
 if​(tally[votes[i]]) {
  tally[votes[i]]++;
  } ​else​ {
  tally[votes[i]] = 1;
  }
  }
 
 return​ tally;
 }

However, since we’d have to count each vote at the end of the day, the countVotes function would take O(N). This might take too long!

Instead, we may consider using the hash table to store the data in the first place:

 var​ votes = {};
 
 function​ addVote(candidate) {
 if​(votes[candidate]) {
  votes[candidate]++;
  } ​else​ {
  votes[candidate] = 1;
  }
 }
 
 function​ countVotes() {
 return​ votes;
 }

With this technique, not only are insertions O(1), but our tally is as well, since it’s already kept as the voting takes place.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset