Quick & Dirty Notes / Mini-Quiz: HashTables

A quick cheat sheet / mini-quiz on HashTables to brush up on the fundamentals.

This blog post will be formatted with a question and then a corresponding answer. I find that having a question posed can help solidify my ability to recall information as well as force me to have a better understanding of the topic. See if it does the same for you.

As you go through the blog try to answer the questions in your own words before looking at the provided answer.

What are some of the other names for a Hash Table?

  • Maps
  • Unordered Maps
  • Dictionaries
  • Objects

*depending on the language there will be slight differences, but they’ll have a built-in hashTable

JS → Object
Python → Dictionaries
Java → Maps
Ruby → Hashes

What is a hash table?

  • Keys are used as the index to locate the data in memory.
  • The data is known as the ‘value’ → Hash Tables create ‘key value pairs’.

*This is done with a hash function. The hash function takes the name of the key and creates a key into an index in memory and stores both the key and its value within the memory slot/address.

What is a hash function?

i.e. it takes a set of inputs of any arbitrary size and fits them into a table or other data structure that contains fixed-size elements.

What are the Key Aspects of Hash Functions?

  • Once the hash is generated, if you pass the same input to the hash function it will reproduce the exact same hash. If the input is changed at all it will produce a different corresponding hash.

* Hash Functions are idempotent: A function given the same input will always produce the same output. → pure functions are idempotent by nature.

What’s the benefit of a Hash Function? Why would we want to use it in a data structure? How does it do it?

  • The hash function will take the key → convert it to a hash → then convert the hash into an index/address space where it will store the data for the key.
  • When we look up data, we simply have to pass the key → the function generates the hash → and we immediately know where to retrieve our data.
  • Because we have the hash/index address we can lookup data very quickly.
  • Hash functions built into languages for Hash Tables are intentionally very fast in order to allow for quick lookups. We can assume their bigO to be O(1) → constant time.

Briefly go over the procedure of adding/retrieving data to a hash table.

What are the BigO’s for Hash Table operations? How do they work under the hood?

  • lookup → O(1)
    *same → pass the key→ the key gets hashed → we use the hash to look up the exact space to find our value/data
    **unless there’s a collision → O(n)
  • delete → O(1)
    *same → use the key → right away we know where to delete the item from → because it's not ordered, we do not need to shift indices like arrays so no O(n)
  • search → O(1)
    *simply use the hash function to check to see if the value is there

What can you store within a Hash table?

However, typically the key and the value can be of any data type → all data types are allowed

That said. Within Javascript ES6.

Hash Tables:
* will always have their keys as strings → whether an int or function it will be stringified. The values can be of any data type including functions.
** has no order → data is stored randomly

Hash Maps:
* will allow for keys to be of any data type including a function. Same for values. You can Save any data type as a key.
** also allow for data to be stored in sequence → maintains the insertion order unlike Hash Tables → therefore when you loop through a Hash Map → you loop through in order.

Hash Sets:
* stores only the keys and no values

What is a collision?

Note: this occurs when two different inputs produce the same output in hashing (in hashing X inputs are mapped to a fixed # characters)

* there is no way to predict collisions and is considered an acceptable risk — therefore hashing is still used for integrity and non-repudiation

How does a collision occur?

Because the hashing function doesn’t have anything to tell it to distribute space evenly, it randomly assigns the same space to two separate keys when the hashing function returns the same hash for both keys.

How does a collision impact runtime?

There are a few common ways of resolving collisions:

- separate chaining
- open addressing
- robin hood hashing

(see wiki page to learn more)

Why use a hash table over an array?

  • Search → O(1)
    *often used in databases
  • insert → O(1)
    *most of the time we don’t need to worry about collision
  • lookup → O(1)
  • delete → O(1)

Compared to Array:

  • search → O(n)
  • insert → O(n)
  • lookup → O(1)
  • *push → O(1)
  • delete → O(n)

What are some strengths and weaknesses of Hash Tables?

  • Slower key iteration → O(n).
    Unordered / Scattered Memory → makes it take longer to locate all addresses in memory

What's the trade-off with Hash Tables?

  • Cost → requires more memory → greater Space Complexity of O(n).

How would you implement your own Hash Table?

That’s it for this mini-quiz/cheatsheet. Hope it tickled your brain and you had some fun!

Just another one of those dreamers with a sparkle in his eyes.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store