Tuesday, June 28, 2016

Ruby Hashes

Need a reference text for the Ruby programming language? Get the Well-Grounded Rubyist.

An important data structure in Ruby is the hash. A hash is a collection of key-value pairs. In other programming languages, it might be called a dictionary or an associative array. If you know JavaScript, Ruby hashes are very similar to objects. Let us take a look at the most basic aspects of Ruby hashes.

Creating a Hash

You can create a hash using literal notation, with curly braces:

hash = {}

The above is very similar to creating an empty array, except that a hash uses curly braces instead of square brackets.

Populating the Hash

To populate the hash, you can use bracket notation. Let us say we want to keep track of a todo list, using the day of the week as the key and what we need to do as the value. Populate the hash for Monday like so:

todo = {}
todo[:monday] = "clean the dishes"

Verify the contents of the hash:

=> {:monday=>"clean the dishes"}

A hash is a data structure of key-value pairs, so in the code above, we are associating the key :monday with the value "clean the dishes". One thing to notice is that we are using a symbol as the key. You could have used a string as the key, but it is common to use symbols as keys in Ruby hashes. Symbol keys are generally a little faster: every occurrence of a given symbol refers to the same object, so Ruby can hash and compare it cheaply, whereas a string has to be hashed and compared by its contents. Note that a string key is not converted to a symbol, so "monday" and :monday are two different keys.
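
Here is a small sketch you can paste into irb to see how symbol keys and string keys relate. The variable name keys_demo is made up for this illustration and is not part of the todo example:

```ruby
keys_demo = {}
keys_demo[:monday]  = "clean the dishes"   # symbol key
keys_demo["monday"] = "walk the dog"       # string key: a separate entry

keys_demo.size        # => 2
keys_demo[:monday]    # => "clean the dishes"
keys_demo["monday"]   # => "walk the dog"
```

The two entries live side by side, so pick one kind of key and stick with it.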

Let us add one more todo to the hash:

todo[:wednesday] = "karate practice"

Now, the contents of the hash are:

=> {:monday=>"clean the dishes", :wednesday=>"karate practice"}

Hashes are represented using curly braces, with each key-value pair separated by a comma. The mapping between a key and a value is represented using the rocket or arrow symbol =>.

In general, you could have the key or value as any kind of object, not just a string or a symbol. For instance, you could use an integer as the key, but this would seem much like an array. You could also use, say, an array as the value to a certain key. That is perfectly fine in Ruby.
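
As a rough illustration of that flexibility, here is a sketch with integer keys and with an array as a value. The names rooms and chores are invented for this example:

```ruby
# Integer keys: this reads like array indexing, but the keys need not be contiguous.
rooms = { 101 => "Alice", 305 => "Bob" }
rooms[305]   # => "Bob"
rooms[102]   # => nil (no such key)

# An array as the value: several todos for the same day.
chores = { :monday => ["clean the dishes", "wash the car"] }
chores[:monday] << "do the laundry"   # append to the array stored under :monday
chores[:monday].size                  # => 3
```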

Accessing Nonexistent Key-Value Pairs

Now, what happens if you try to access a key for which the key-value pair does not exist?

=> {:monday=>"clean the dishes", :wednesday=>"karate practice"}

todo[:friday]
=> nil

The answer is nil. If you try to access a key that has no value associated with it, you will get nil. This behavior, however, can be changed if you create the hash using Hash.new and give it an argument for the default value. To start, here is Hash.new without an argument:

todo = Hash.new
=> {}
todo[:tuesday]
=> nil

The above creates a new hash using Hash.new instead of using just {}. That results in the same outcome. When you try to refer to a key that points to no value, you get nil. Now, if you give new a default value:

todo = Hash.new("DOES NOT EXIST")
=> {}
todo[:tuesday]
=> "DOES NOT EXIST"

todo
=> {}

In the case above, whenever you try to access some key that does not have an associated value, you will get the string "DOES NOT EXIST", because you told Hash.new to use that as the default value. Mind, however, that the actual hash is still empty even though you got back "DOES NOT EXIST".
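
One common use of a default value is counting things. The following is only a sketch: the words array and the counts variable are made up for illustration, and 0 is chosen as the default so that += works on a key that has not been seen yet:

```ruby
words  = ["dishes", "laundry", "dishes"]
counts = Hash.new(0)      # missing keys read as 0 instead of nil

words.each do |word|
  counts[word] += 1       # reads the default 0 the first time, then stores the new count
end

counts["dishes"]     # => 2
counts["laundry"]    # => 1
counts["karate"]     # => 0 (the default; reading does not store the key)
counts.size          # => 2
```

Reading a missing key only returns the default. A key is stored only when you assign to it, which is what += does here.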

Creating a Hash with Initial Key-Value Pairs

So far we had to create a hash from scratch and add key-value pairs one by one. You could also have defined a hash with initial key-value pairs like so:

todo = { :monday => "clean the dishes", :wednesday => "karate practice" }

You can access the value that a key points to using bracket notation, passing the key inside the brackets:

todo[:monday]
=> "clean the dishes"

todo[:wednesday]
=> "karate practice"

Alternative Notation

There is an alternative notation to write hashes. If you are familiar with JavaScript, this will look just like writing JavaScript objects. Given the following definition:

todo = { :monday => "clean the dishes", :wednesday => "karate practice" }

Remove the arrow => and move the colon from the left-hand side of the symbol name to its right-hand side:

todo = { monday: "clean the dishes", wednesday: "karate practice" }

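One thing to keep in mind is that you can only use that notation if the key is a symbol. Also, both notations create exactly the same hash, and irb has traditionally displayed it using the => notation (Ruby 3.4 and later print symbol keys as monday: instead):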

todo
=> {:monday=>"clean the dishes", :wednesday=>"karate practice"}
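
To convince yourself that the two notations build the same hash, you can compare them directly. This is a sketch; the variable names rocket_style, colon_style and mixed are invented for the example:

```ruby
rocket_style = { :monday => "clean the dishes", :wednesday => "karate practice" }
colon_style  = { monday: "clean the dishes", wednesday: "karate practice" }

rocket_style == colon_style   # => true
colon_style.keys              # => [:monday, :wednesday]

# Keys that are not symbols still need the arrow, and both styles can be mixed.
mixed = { "monday" => "clean the dishes", tuesday: "do the laundry" }
mixed.keys                    # => ["monday", :tuesday]
```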

Hash Size and Searching the Hash for Existence of Key/Values

You can find out how many key-value pairs there are in the hash using the size method:

todo
=> {:monday=>"clean the dishes", :wednesday=>"karate practice"}

todo.size
=> 2

Two useful methods to determine the existence of a certain key or certain value are: has_key? and has_value?

Here is an example:

todo.has_key? :monday
=> true

todo.has_key? :tuesday
=> false

Because we have :monday as one of the keys in the todo hash, we get true. However, the hash does not have the key :tuesday, so we get false for that.

Similarly, you can check whether some value is present in the hash:

todo.has_value? "wash the car"
=> false

todo.has_value? "clean the dishes"
=> true

There are many other useful methods that you can find in the Ruby documentation, so check it out.
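
Two things worth knowing from that documentation: has_key? and has_value? have the shorter aliases key? and value?, and checking for a key is the only reliable way to tell a stored nil apart from a missing key. A small sketch, using a made-up hash named week:

```ruby
week = { :monday => "clean the dishes", :sunday => nil }

week.key?(:monday)            # => true  (same as has_key?)
week.value?("wash the car")   # => false (same as has_value?)

# Bracket notation gives nil in both cases...
week[:sunday]        # => nil (key exists, value is nil)
week[:friday]        # => nil (key does not exist)

# ...but key? tells them apart.
week.key?(:sunday)   # => true
week.key?(:friday)   # => false
```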

Hashes as the Last Argument to Methods

Say we have a method like so:

def greeting(name, hash)
  puts "Hello, #{name} !"
end

greeting("James", {})

Output:

Hello, James !

The method will simply say Hello followed by whatever name you give as the first argument. For the second argument, I just passed in an empty hash. Let us work with that hash next:

def greeting(name, hash)
  puts "Hello, #{name} !"
  puts "It seems that you have to #{hash[:monday]} on Monday"
end

greeting("James", { :monday => "wash the car" })

Output:

Hello, James !
It seems that you have to wash the car on Monday

So we passed a hash as an argument to greeting. That method then took the value in the hash whose key is :monday and used it to display a message about what the person needs to do.

Now that you understand what the method does, let us get to the point: if the last argument to a method call is a hash, you can omit the curly braces:

greeting("James", :monday => "wash the car")

You will see that a lot. Furthermore, you can also use the alternative notation:

greeting("James", monday: "wash the car")

To me, that looks a lot better. But you have to watch out for what it really means! That is just a hash in disguise. Keep in mind that because the hash is the last argument to the method call, you can omit the curly braces. And then you can also use the alternative notation, because the key is a symbol. It does not matter how many key-value pairs the hash has. The following would be totally okay too:

greeting("James", monday: "wash the car", tuesday: "do the laundry")
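
To make it obvious that the method really receives a single hash, here is a sketch of a variation that walks over every pair it was given. The method name todo_lines is hypothetical and the wording of the messages is my own:

```ruby
# Builds one line per key-value pair in the hash that was passed in.
def todo_lines(name, hash)
  lines = ["Hello, #{name} !"]
  hash.each do |day, task|
    lines << "On #{day.to_s.capitalize} you have to #{task}"
  end
  lines
end

lines = todo_lines("James", monday: "wash the car", tuesday: "do the laundry")
puts lines
# Hello, James !
# On Monday you have to wash the car
# On Tuesday you have to do the laundry
```

Inside the method, hash is an ordinary hash with two pairs, even though the call site has no curly braces.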

Fetch versus Bracket Notation to Retrieve a Value

Given the example:

todo
=> {:monday=>"clean the dishes", :wednesday=>"karate practice"}

You already know that to access a hash's value, you have to give the key within the square brackets:

todo[:monday]
=> "clean the dishes"

You can achieve the same outcome using the fetch method:

todo.fetch(:monday)
=> "clean the dishes"

But is there any difference at all? Consider the case where the key does not exist (i.e. no such key-value pair exists in the hash):

todo[:friday]
=> nil

todo.fetch(:friday)
KeyError: key not found: :friday
from (irb):63:in `fetch'
from (irb):63
from /usr/bin/irb:12:in `<main>'

While using bracket notation returns nil for a key that does not map to any value, using the fetch method will actually raise an exception! Let us go back to our example with the greeting method:

def greeting(name, hash)
  puts "Hello, #{name} !"
  puts "It seems that you have to #{hash[:friday]} on Monday"
  p hash[:friday]
end

greeting("James", { :monday => "wash the car" })

I changed the hash key in the greeting method to :friday (which does not have an associated value) and added a p statement to check the value of hash[:friday]. The output is:

Hello, James !
It seems that you have to  on Monday
nil

Now, if we use fetch instead:

def greeting(name, hash)
  puts "Hello, #{name} !"
  puts "It seems that you have to #{hash.fetch(:friday)} on Monday"
  p hash[:friday]
end

greeting("James", { :monday => "wash the car" })

Output:

Hello, James !
KeyError: key not found: :friday
from (irb):80:in `fetch'
from (irb):80:in `greeting'
from (irb):84
from /usr/bin/irb:12:in `<main>'

When we used bracket notation, the program kept executing: accessing a nonexistent key just gave us nil (or the hash's default value, if you set one with Hash.new). But when we used fetch, it raised a KeyError exception and halted program execution right away. That is the difference between fetch and [] notation: the former raises an exception while the latter only returns nil. If you use fetch, make sure to rescue the exception and do something about it; otherwise, your program will come to a halt. With bracket notation, the program will keep going, but having nil in certain places might have undesired effects on what you are trying to do. So handle the nil as well! :)
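
Here is a sketch of a few ways to deal with a missing key. Besides rescuing KeyError, fetch also accepts a fallback value as a second argument, or a block that receives the missing key. The fallback strings and the message variable are made up for this example:

```ruby
todo = { :monday => "clean the dishes", :wednesday => "karate practice" }

# fetch with a second argument: returned when the key is missing.
fallback = todo.fetch(:friday, "nothing planned")             # => "nothing planned"

# fetch with a block: the block receives the missing key.
from_block = todo.fetch(:friday) { |day| "no entry for #{day}" }   # => "no entry for friday"

# Bracket notation: supply the fallback yourself when you get nil.
with_or = todo[:friday] || "nothing planned"                  # => "nothing planned"

# Rescuing the exception explicitly.
begin
  todo.fetch(:friday)
rescue KeyError => e
  message = "Could not find it: #{e.message}"
end
message   # => "Could not find it: key not found: :friday"
```

None of these add :friday to the hash; they only decide what you get back.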

Conclusion

Ruby hashes are an important data structure that allows you to store data in key-value pairs. Make sure to understand them well, play with irb and make your own hashes. Try doing crazy things with them. Give them different kinds of keys and values and see what you get. Try accessing something that does not exist. Have fun! :)

