Tuesday, June 28, 2016

Ruby Hashes

Need a reference text for the Ruby programming language? Get the Well-Grounded Rubyist.

An important data structure in Ruby is the hash. A hash is a collection of key-value pairs. In other programming languages, it might be called a dictionary or an associative array. If you know JavaScript, Ruby hashes are very similar to objects. Let us take a look at the most basic aspects of Ruby hashes.

Creating a Hash

You can create a hash using literal notation, with curly braces:

hash = {}

The above is very similar to creating empty arrays, except for hashes you have to use curly braces.

Populating the Hash

To populate the hash, you can use bracket notation. Let us say we want to keep track of a todo list, using the day of the week as the key and what we need to do as the value. Populate the hash for Monday like so:

todo = {}
todo[:monday] = "clean the dishes"

Verify the contents of the hash:

todo
=> {:monday=>"clean the dishes"}

A hash is a data structure of key-value pairs, so in the code above, we are associating the key :monday with the value "clean the dishes". One thing to notice is that we are using a symbol as the key. You could have used a string as the key, but it is common to use symbols as keys in Ruby hashes. Symbols are immutable, and a given symbol always refers to the same object, so comparing and hashing them tends to be cheaper than doing the same with strings. Note that a string key is not converted to a symbol: "monday" and :monday are two different keys.

Let us add one more todo to the hash:

todo[:wednesday] = "karate practice"

Now, the contents of the hash are:

todo
=> {:monday=>"clean the dishes", :wednesday=>"karate practice"}

Hashes are represented using curly braces, with each key-value pair separated by a comma. The mapping between a key and a value is represented using the rocket or arrow symbol =>.

In general, you could have the key or value as any kind of object, not just a string or a symbol. For instance, you could use an integer as the key, but this would seem much like an array. You could also use, say, an array as the value to a certain key. That is perfectly fine in Ruby.
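As a small sketch of that flexibility, here is a hash that mixes key and value types. The variable name mixed and its contents are made up for illustration only:

```ruby
# Hypothetical example: keys and values can be objects of different types.
mixed = {}
mixed[1] = "an integer key"                          # integer as the key
mixed[:weekend] = ["mow the lawn", "call grandma"]   # array as the value
mixed["note"] = 3.5                                  # string key, float value

puts mixed[1]                # prints: an integer key
puts mixed[:weekend].first   # prints: mow the lawn
puts mixed.size              # prints: 3
```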

Accessing Nonexistent Key-Value Pairs

Now, what happens if you try to access a key for which the key-value pair does not exist?

todo
=> {:monday=>"clean the dishes", :wednesday=>"karate practice"}

todo[:friday]
=> nil

The answer is nil. If you try to access a key that has no value associated with it, you will get nil. This behavior, however, can be changed if you create the hash using Hash.new and give it an argument for the default value. For instance, start without an argument:

todo = Hash.new
=> {}
todo[:tuesday]
=> nil

The above creates a new hash using Hash.new instead of using just {}. That results in the same outcome. When you try to refer to a key that points to no value, you get nil. Now, if you give new a default value:

todo = Hash.new("DOES NOT EXIST")
=> {}
todo[:tuesday]
=> "DOES NOT EXIST"

todo
=> {}

In the case above, whenever you try to access some key that does not have an associated value, you will get the string "DOES NOT EXIST", because you told Hash.new that that would be the default value in that situation. Mind, however, that the actual hash is still empty even though you got back "DOES NOT EXIST."
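The following sketch repeats the point that a default value is returned without being stored. It also shows the block form of Hash.new, which runs a block for each missing key; the placeholder string "nothing planned" and the name planner are assumptions made for this example:

```ruby
# Default value: returned for missing keys, but nothing is stored.
todo = Hash.new("DOES NOT EXIST")
puts todo[:tuesday]            # prints: DOES NOT EXIST
puts todo.size                 # prints: 0
puts todo.has_key?(:tuesday)   # prints: false

# Block form: the block runs for each missing key. Here it also stores
# a placeholder, so the key exists in the hash afterwards.
planner = Hash.new { |hash, key| hash[key] = "nothing planned" }
puts planner[:tuesday]         # prints: nothing planned
puts planner.size              # prints: 1
```

Whether you want the missing key to be stored depends on your use case; the plain default value leaves the hash untouched, while the block above changes it on every miss.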

Creating a Hash with Initial Key-Value Pairs

So far we had to create a hash from scratch and add key-value pairs one by one. You could also have defined a hash with initial key-value pairs like so:

todo = { :monday => "clean the dishes", :wednesday => "karate practice" }

You can access the value that a key points to by using bracket notation and passing in the key:

todo[:monday]
=> "clean the dishes"

todo[:wednesday]
=> "karate practice"

Alternative Notation

There is an alternative notation to write hashes. If you are familiar with JavaScript, this will look just like writing JavaScript objects. Given the following definition:

todo = { :monday => "clean the dishes", :wednesday => "karate practice" }

Remove the arrow => and move the colon from the left-hand side of the symbol name to its right-hand side:

todo = { monday: "clean the dishes", wednesday: "karate practice" }

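One thing to keep in mind is that you can only use that notation if the key is a symbol. Also, both notations create exactly the same hash. In the Ruby version used here, irb still displays it using the => notation (Ruby 3.4 and later display symbol keys in the shorthand form instead):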

todo
=> {:monday=>"clean the dishes", :wednesday=>"karate practice"}
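To check that the two notations really build the same hash, you can compare them directly. This is only a quick sketch; the variable names rocket, shorthand, and by_number are made up for the example:

```ruby
rocket    = { :monday => "clean the dishes", :wednesday => "karate practice" }
shorthand = { monday: "clean the dishes", wednesday: "karate practice" }

puts rocket == shorthand          # prints: true
puts shorthand.keys.first.class   # prints: Symbol

# Keys that are not symbols still need the rocket notation.
by_number = { 1 => "first", "two" => "second" }
puts by_number[1]                 # prints: first
```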

Hash Size and Searching the Hash for Existence of Key/Values

You can find out how many key-value pairs there are in the hash using the size method:

todo
=> {:monday=>"clean the dishes", :wednesday=>"karate practice"}

todo.size
=> 2

Two useful methods to determine the existence of a certain key or certain value are: has_key? and has_value?

Here is an example:

todo.has_key? :monday
=> true

todo.has_key? :tuesday
=> false

Because we have :monday as one of the keys in the todo hash, we get true. However, the hash does not have the key :tuesday, so we get false for that.

Similarly, you can check whether some value is present in the hash:

todo.has_value? "wash the car"
=> false

todo.has_value? "clean the dishes"
=> true

There are many other useful methods that you can find in the Ruby documentation, so check it out.
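As a starting point, here is a short sketch of a few of those other methods (keys, values, each, and delete), using the same todo hash as above:

```ruby
todo = { monday: "clean the dishes", wednesday: "karate practice" }

p todo.keys      # prints: [:monday, :wednesday]
p todo.values    # prints: ["clean the dishes", "karate practice"]

# each yields the key and the value of every pair, in insertion order.
todo.each do |day, task|
  puts "#{day}: #{task}"
end

# delete removes a pair and returns its value.
removed = todo.delete(:monday)
puts removed     # prints: clean the dishes
puts todo.size   # prints: 1
```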

Hashes as the Last Argument to Methods

Say we have a method like so:

def greeting(name, hash)
  puts "Hello, #{name} !"
end

greeting("James", {})

Output:

Hello, James !

The method will simply say Hello followed by whatever name you give as the first argument. For the second argument, I just passed in an empty hash. Let us work with that hash next:

def greeting(name, hash)
  puts "Hello, #{name} !"
  puts "It seems that you have to #{hash[:monday]} on Monday"
end

greeting("James", { :monday => "wash the car" })

Output:

Hello, James !
It seems that you have to wash the car on Monday

So we passed a hash as an argument to greeting. That method then took the value in the hash whose key is :monday and used it to display a message about what the person needs to do.

Now that you understand what the method does, let us get to the point: if the last argument to a method call is a hash, you can omit the curly braces:

greeting("James", :monday => "wash the car")

You will see that a lot. Furthermore, you can also use the alternative notation:

greeting("James", monday: "wash the car")

To me, that looks a lot better. But you have to watch out for what it really means! That is just a hash in disguise. Keep in mind that because the hash is the last argument to the method call, you can omit the curly braces. And then you can also use the alternative notation, because the key is a symbol. It does not matter how many key-value pairs the hash has; the following would be totally okay too:

greeting("James", monday: "wash the car", tuesday: "do the laundry")
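To see that all of these calls hand the method the same hash, here is a sketch with a hypothetical variation of greeting. Unlike the version above, it returns its lines as an array (instead of printing them) so the results can be compared:

```ruby
# Hypothetical variation of greeting: builds one line per task in the hash.
def greeting(name, hash)
  lines = ["Hello, #{name} !"]
  hash.each do |day, task|
    lines << "On #{day} you have to #{task}"
  end
  lines
end

# All three calls pass the same hash as the second argument.
with_braces = greeting("James", { :monday => "wash the car", :tuesday => "do the laundry" })
with_rocket = greeting("James", :monday => "wash the car", :tuesday => "do the laundry")
shorthand   = greeting("James", monday: "wash the car", tuesday: "do the laundry")

puts shorthand
puts with_braces == with_rocket && with_rocket == shorthand   # prints: true
```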

Fetch versus Bracket Notation to Retrieve a Value

Given the example:

todo
=> {:monday=>"clean the dishes", :wednesday=>"karate practice"}

You already know that to access a hash's value, you have to give the key within the square brackets:

todo[:monday]
=> "clean the dishes"

You can achieve the same outcome using the fetch method:

todo.fetch(:monday)
=> "clean the dishes"

But is there any difference at all? Consider the case where the key does not exist (i.e. no such key-value pair exists in the hash):

todo[:friday]
=> nil

todo.fetch(:friday)
KeyError: key not found: :friday
from (irb):63:in `fetch'
from (irb):63
from /usr/bin/irb:12:in `<main>'

While using bracket notation returns nil for a key that does not map to any value, using the fetch method will actually raise an exception! Let us go back to our example with the greeting method:

def greeting(name, hash)
  puts "Hello, #{name} !"
  puts "It seems that you have to #{hash[:friday]} on Monday"
  p hash[:friday]
end

greeting("James", { :monday => "wash the car" })

I changed the hash key in the greeting method to :friday (which does not have an associated value) and added a p statement to check the value of hash[:friday]. The output is:

Hello, James !
It seems that you have to  on Monday
nil

Now, if we use fetch instead:

def greeting(name, hash)
  puts "Hello, #{name} !"
  puts "It seems that you have to #{hash.fetch(:friday)} on Monday"
  p hash[:friday]
end

greeting("James", { :monday => "wash the car" })

Output:

Hello, James !
KeyError: key not found: :friday
from (irb):80:in `fetch'
from (irb):80:in `greeting'
from (irb):84
from /usr/bin/irb:12:in `<main>'

When we used bracket notation, the program kept executing: whenever we tried to access a nonexistent key, we just got nil and execution continued. But when we used fetch, it raised a KeyError exception and halted program execution right away. That is the difference between fetch and [] notation: the former raises an exception while the latter only returns nil (or the hash's default value, if one was set). If you use fetch, make sure to rescue the exception and do something about it; otherwise, your program will come to a halt. With bracket notation, the program will keep going, but a nil in certain places might have undesired effects on what you are trying to do. So handle the nil as well! :)
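Here is a minimal sketch of a few ways to deal with a missing key. Besides rescuing the KeyError, fetch also accepts a fallback value or a block; the fallback strings below are just placeholders chosen for this example:

```ruby
todo = { monday: "clean the dishes", wednesday: "karate practice" }

# 1. A second argument to fetch is returned when the key is missing.
with_default = todo.fetch(:friday, "nothing planned")
puts with_default      # prints: nothing planned

# 2. A block is called with the missing key instead.
with_block = todo.fetch(:friday) { |day| "no entry for #{day}" }
puts with_block        # prints: no entry for friday

# 3. Rescue the KeyError explicitly.
rescued = false
begin
  todo.fetch(:friday)
rescue KeyError => error
  rescued = true
  puts "Rescued: #{error.message}"
end

# With bracket notation, handle the nil yourself.
task = todo[:friday] || "nothing planned"
puts task              # prints: nothing planned
```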

Conclusion

Ruby hashes are an important data structure that allows you to store data in key-value pairs. Make sure to understand them well: play with irb and make your own hashes. Try doing crazy things with them. Give them different kinds of keys and values and see what you get. Try accessing something that does not exist. Have fun! :)

Monday, June 27, 2016

Ruby Arrays

You can use arrays to group similar data into a single entity, instead of having to create multiple variables just to hold that same kind of data. Ruby makes it really convenient for you to work with arrays: you can use bracket notation to access and set elements and use stack/queue-like methods to manipulate their data (namely push, pop, unshift, and shift). Furthermore, you can easily go through an array and use that data in some form using iterators like each, map, select, and reject.

Creating an Array

You can either create an empty array or start it off with a few elements. To start off from scratch, you can use the [] empty array:

animals = []

Then, you would populate the array with elements using the operations we will learn later.

Otherwise, you can just initialize it with some elements:

animals = ["dog", "cat", "bear", "horse", "eagle"]

Arrays are denoted within brackets and have to be comma-separated. The array above is an array of strings and has 5 elements. Although we used strings above, you could use just about any data type because Ruby allows for arrays to have a mix of different types. For example, the following would be okay too:

array = [12, "dog", 3, "cat", 4.65, 0, "eagle"]

The above is an array of mixed data types -- it has integers, floats, and strings.

Accessing Elements

Coming back to our animals example, we can access each element using bracket notation:

animals = ["dog", "cat", "bear", "horse", "eagle"]

animals[0]   # the first element of the array ("dog")
animals[4]   # the last element of the array ("eagle")

To access elements in an array, you start counting from zero: so the first element will be at position 0 and the last element will be at position given by the size of the array MINUS one. In the case above, we have a 5-element array, so the position of the last element is 5 - 1, which is 4.

You can determine the size of the array in Ruby using the length or size method:

animals.size
=> 5

animals.length
=> 5

They both return the same thing and will give you the number of elements in the array.
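A short sketch tying the size to the position of the last element. It also shows the first and last methods and negative indexes, which count from the end of the array:

```ruby
animals = ["dog", "cat", "bear", "horse", "eagle"]

puts animals.first               # prints: dog
puts animals.last                # prints: eagle
puts animals[animals.size - 1]   # prints: eagle
puts animals[-1]                 # prints: eagle (negative indexes count from the end)
p animals[10]                    # prints: nil (there is no element at that position)
```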

Modifying Array At the End

You can add new elements to the end of the array using the push method:

animals = ["dog", "cat", "bear", "horse", "eagle"]

animals.push("pigeon")
=> ["dog", "cat", "bear", "horse", "eagle", "pigeon"]

Note how the new element "pigeon" was added to the end of the array. This is like a stack for those of you familiar with basic data structures.

To remove an element from the end of the array, you can use pop:

animals = ["dog", "cat", "bear", "horse", "eagle"]

animals.pop
=> "eagle"

animals
=> ["dog", "cat", "bear", "horse"]

Ruby gives you the popped element as the return value of the pop method, so you can do something with it. The element itself is removed from the end of the array.

Modifying the Array At the Beginning

There are two other methods that allow you to add/remove elements from the beginning of the array instead of at the end: they are unshift and shift.

You can use unshift to add an element to the beginning of an array:

animals = ["dog", "cat", "bear", "horse", "eagle"]

animals.unshift("dinosaur")
=> ["dinosaur", "dog", "cat", "bear", "horse", "eagle"]

The new element has been added to the front of the array. For those of you familiar with basic data structures, pairing unshift with pop (or push with shift) gives you queue-like, first-in-first-out behavior.

If you want to remove an element from the beginning of the array, use shift:

animals = ["dog", "cat", "bear", "horse", "eagle"]

animals.shift
=> "dog"

animals
=> ["cat", "bear", "horse", "eagle"]

When you do a shift operation, the element at the beginning of the array is removed and then returned to you so you can do something with it.
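Putting the four methods together, here is a minimal sketch of using an array as a stack and as a queue. The variable names and the choice of push with shift for the queue are just one possible arrangement:

```ruby
# Stack (last in, first out): add and remove at the same end.
stack = []
stack.push("dog")
stack.push("cat")
top = stack.pop
puts top       # prints: cat

# Queue (first in, first out): add at the end, remove from the front.
queue = []
queue.push("dog")
queue.push("cat")
front = queue.shift
puts front     # prints: dog
```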

Modifying Specific Elements

You can use bracket notation to modify specific elements of an array at a certain index. For example, given the array of animals, you could replace the second element, "cat", with something else:

animals = ["dog", "cat", "bear", "horse", "eagle"]
animals[1] = "turtle"
animals
=> ["dog", "turtle", "bear", "horse", "eagle"]

Notice how the second element (whose index is 1) was replaced by "turtle."

Iterating Over Array

You can go through each element in the array using the each iterator. Array defines each itself, and the Enumerable module builds many other useful methods on top of it, so check it out. Anyway, here is an example:

animals = ["dog", "turtle", "bear", "horse", "eagle"]

animals.each { |animal| puts animal }

The above will go through each element in the array of animals and display its value (each on its own line). The each iterator accepts a block (denoted within braces) whose block parameter variable will take in each array element, one at a time, starting from the first one up to the last one, and then do something with it. In this case, I am using the puts statement to display each element's value.

Output:

dog
turtle
bear
horse
eagle

Other useful iterators are map, select, and reject. I invite you to look them up in the Ruby documentation. Now, let us see how to iterate over an array using a plain while loop:

animals = ["dog", "turtle", "bear", "horse", "eagle"]

index = 0

while index < animals.length
  puts animals[index]
  index += 1
end

The above uses the index variable to keep track of the location in the array. It starts off at 0 because the first element of the array is at position 0. Then, the loop condition is that the index is less than the number of elements in the array. Remember that the position of the last element is always the array size MINUS one. Inside the loop, you just display the value of the element at position index in the animals array. At the end of the loop body, just before the end keyword, make sure to increment index, so we do not end up with an infinite loop. The output of the above construct will be:

dog
turtle
bear
horse
eagle
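For comparison, here is a brief sketch of the other iterators mentioned earlier (map, select, and reject), plus each_with_index, which gives you the position without a manual counter. The length threshold of 4 is an arbitrary choice for this example:

```ruby
animals = ["dog", "turtle", "bear", "horse", "eagle"]

# map builds a new array from the block's return values.
shouted = animals.map { |animal| animal.upcase }
# select keeps the elements for which the block is true; reject drops them.
long    = animals.select { |animal| animal.length > 4 }
short   = animals.reject { |animal| animal.length > 4 }

p shouted   # prints: ["DOG", "TURTLE", "BEAR", "HORSE", "EAGLE"]
p long      # prints: ["turtle", "horse", "eagle"]
p short     # prints: ["dog", "bear"]

# each_with_index passes the element and its position to the block.
animals.each_with_index do |animal, index|
  puts "#{index}: #{animal}"
end
```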

Conclusion

Ruby makes it really nice and intuitive to work with arrays. You can think of them as stacks or queues, too. You can easily create, access, and modify their elements using Ruby's built-in methods. Furthermore, the Enumerable module has some interesting methods and iterators that allow you to extract certain pieces of information from arrays. Don't forget to check that out as well! :) Thank you for reading and have fun!