Tuesday 11 March 2014

Hashes

Hashes are another kind of variable and in my opinion they are pretty similar to arrays. They are indexed, the same as arrays, but instead of being indexed with a number, they are indexed with a user-defined string, and this index is called a key. This gives us a key/value pair. A good example of this is a dictionary (an actual book dictionary) where the word is the key and the definition is the value.
The value in a hash is always a scalar: it can be a number, a string, or a reference to an array or another hash (which is how you nest them, more on that further down).

You can tell that you are looking at a hash if it's prefixed by a % sigil. I think it makes more sense to have a hash (#) as the sigil if you're going to call it a hash but never mind, a hash is already reserved for comments.
%hash
A hash can actually be modelled by using arrays like this:
my @french_words = ("bonjour", "au revoir", "merci");
my $hello = 0;
my $goodbye = 1;
my $thank_you = 2;
print $french_words[$goodbye];
But it looks really strange and clumsy and who wants to bother with all that when a hash works perfectly well?
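For comparison, here is a minimal sketch of the same lookup written with a hash. The name %french_for is made up for this illustration, and the syntax is explained in the next section.

```perl
use strict;
use warnings;

# The English word is the key, the French word is the value.
my %french_for = (
    hello     => "bonjour",
    goodbye   => "au revoir",
    thank_you => "merci",
);

print $french_for{goodbye}, "\n";    # prints "au revoir"
```

There is no need for the three helper variables any more because the key itself says what you are looking up.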


Declaring and assigning

Again there are lots and lots and lots of ways of declaring and assigning a hash.
You can declare the hash and then add one element at a time:
1. my %hash;
2. $hash{"up"} = "down";
3. $hash{"top"} = "bottom";
4. $hash{"charm"} = "strange";
What does this all mean? First, I've declared the hash in line 1 using the % sigil, and in lines 2 to 4 I've added individual elements to it. The individual elements of the hash are prefixed with a $ because each one holds an individual item - a scalar. Then comes the hash name followed by the key you want in curly brackets; the key can be any string. You then assign this to what you want as a value; more on what you can assign to it later.

Or you can declare everything all together:
my %hash = ("up", "down", "top", "bottom", "charm", "strange");
Here I've assigned a list to the hash. The list needs to have an even number of elements because consecutive elements are paired up into key/value pairs. If you have an odd number of elements, the last key is going to get undef as its value (and perl will complain about it if you have warnings turned on). In this hash the key/value pairs are up-down, top-bottom and charm-strange.

This version is more readable:
my %hash = ("up" => "down", "top" => "bottom", "charm" => "strange");
so you can see which are the keys and which are the values more easily.

Or you can use this layout, which is even more readable and, I think, the one preferred by the perl community:
1.   my %hash = (
2.     up => "down",
3.     top => "bottom",
4.     charm => "strange",
5. );
This is exactly the same as the method above it, only spaced out to be clearer.
And of course if I was coding properly all of the "fat commas" (these things "=>") as they're called would all line up, but this blog thing won't let me do that!

Also the thing about the comma at the end of line 4 applies here as well - you don't need it but it's a good idea to put it there to prevent unnecessary line changes if extra elements are added to the end of the hash. See my array post for more details.

You may have noticed that in my last code example, I didn't put quotes around the keys. You don't actually need them when the key is a simple word (letters, digits and underscores, not starting with a digit), because perl automatically treats such a word as a string when it appears to the left of a fat comma or inside the curly brackets of a hash element. You do need quotes if the key includes whitespace or other special characters such as "-", so in these cases you need to explicitly stringify (make into a string) the key by using the quotes.
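A small sketch of when the quotes matter. The keys "top-left" and "two words" are invented here just to show the two cases.

```perl
use strict;
use warnings;

my %hash = (
    up          => "down",      # simple word: quotes are optional
    "top-left"  => "corner",    # contains "-": quotes are needed
    "two words" => "spaced",    # contains whitespace: quotes are needed
);

print $hash{up}, "\n";            # a bare word is fine inside the braces too
print $hash{"top-left"}, "\n";    # without quotes perl would try to parse top-left as an expression
```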


Printing
As far as I can see, when you print out a hash, the pairs don't necessarily come out in the same order that you declared them in. Within a single run of the program they do come out in the same order every time, as long as you don't change the hash. This is because they are printed in their internal order, which can't be relied on: it can change when you add or delete key/value pairs, it changes between versions of perl because the way the keys are ordered has changed several times, and since perl 5.18 it can even differ from one run of the same program to the next.

This is one easy way to print a hash that involves a for loop, which I haven't covered yet. But for the moment, just trust me that it works. This will print out all of the key/value pairs, one pair per line with a space between the key and the value:
print "$_ $hash{$_}\n" for (keys %hash);
Writing print %hash prints the same keys and values but squashes everything together with no spaces or line breaks.
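If you need a predictable order, one common approach is to sort the keys before looping over them. This is only a sketch using the example hash from earlier; sort with no arguments puts the keys in alphabetical (string) order.

```perl
use strict;
use warnings;

my %hash = (up => "down", top => "bottom", charm => "strange");

# Sorting the keys gives the same order no matter how the hash is stored internally.
for my $key (sort keys %hash) {
    print "$key $hash{$key}\n";
}
# prints:
# charm strange
# top bottom
# up down
```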

Note: The trick I showed to print arrays in my last post doesn't work at all here, because hashes are not interpolated inside double quotes. If you type print "%hash\n" you will get "%hash" printed to the screen.


Accessing and using elements of the hash

Deleting Elements
Hashes are not a fixed size, so you can add to them and delete from them as you like. Unlike with arrays, deleting an element from a hash removes the key and the value together, so no undef is left behind. (You only get an undef if you assign undef to the value yourself, in which case the key stays in the hash.) This means you don't need to worry about having any gaps in your hash.

To delete a key/value pair, you can do this:
delete $hash{"up"};
Note that you only have to specify the key and the value will be automatically found and deleted as well. I guess this would be useful in cases where you might not know the value. Maybe.
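One thing that can make this handy: delete also returns the value it removed, so you can catch it in a variable. A minimal sketch, with $removed being a name I've picked for the example:

```perl
use strict;
use warnings;

my %hash = (up => "down", top => "bottom", charm => "strange");

# delete removes the whole pair and hands back the value that was stored.
my $removed = delete $hash{up};

print "Removed: $removed\n";                       # prints "Removed: down"
print "Pairs left: ", scalar(keys %hash), "\n";    # prints "Pairs left: 2"
```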

Adding Elements
This is exactly the same as when you're first creating the hash and adding one element at a time. If you want to add more elements later on, you just do exactly the same thing:
$hash{key} = "value";
There is no need to use "my" because the hash has already been declared; you're just adding to an existing variable.

What can I put in my hash?
You don't always have to have the value as a string. It could be a number or a variable containing a string or a number:
$hash{key} = $value;
Or an array or a hash:
$hash{key} = \@values;
Be very careful when doing this: you have to make sure that you give it a reference to an array or a reference to a hash rather than the thing itself. I'll go into references in more detail later on, but this is how to do it for now. You either put a backslash in front if you're using a variable as above or, if you're putting the hash or array straight in, you need to use [] for arrays and {} for hashes.
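Here is a short sketch of the three ways of storing a reference described above, plus how to read the data back out. The key names (from_variable, inline_array, inline_hash) are made up for the example.

```perl
use strict;
use warnings;

my @values = (1, 2, 3);
my %hash;

$hash{from_variable} = \@values;            # backslash: reference to a named array
$hash{inline_array}  = ["red", "green"];    # [] builds an anonymous array reference
$hash{inline_hash}   = { up => "down" };    # {} builds an anonymous hash reference

# To reach the nested data, add another subscript after the first one.
print $hash{from_variable}[1], "\n";    # prints 2
print $hash{inline_array}[0], "\n";     # prints "red"
print $hash{inline_hash}{up}, "\n";     # prints "down"
```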

The reason for doing this is that, if you don't, the array or hash will be flattened into a plain list and its contents will become extra keys and values in the hash you're assigning to. Hopefully this example will explain what I mean:

1. my %address = ('Line One' => "5 The Street", 'Town/City' => "London", 'Post Code' => "W15 9QT");
2.
3. my %person = (
4.     Name => "Emma",
5.     Age => 23,
6.     Height => "164cm",
7.     Address => \%address,
8. );

The hash drawn out will now look something like this (as you would expect):

Name => Emma
Age => 23
Height => 164cm
Address => (
    Line One => 5 The Street
    Town/City => London
    Post Code => W15 9QT
)

I created a script that would run the code above but I took out the backslash on line 7. Here is what came out:

Name => Emma
Age => 23
Height => 164cm
Address => Post Code
W15 9QT => Line One
5 The Street => Town/City
London => undef

This is clearly not what we wanted. Instead of a hash within a hash, there is only one big hash.

And also, if you were reading carefully before, you'll see that I said each element of the hash contains a scalar. Arrays and hashes aren't scalars but their references are, so this is another reason why you must make any hashes or arrays into references if you don't want them to be flattened into the outer hash.

Moral of the story, make sure you use a reference if you're going to do a hash within a hash or an array within a hash!!!


Editing Elements
If you want to change a value, you just assign the new value to the key and the old value will be overwritten:
$hash{key} = "new value";
If the key doesn't already exist, a new key/value pair is created. So you need to be careful with the spelling of the key when you want to edit a pair that's already there, or you could end up with the original key and a misspelled version of the key.

Changing the key of a key/value pair is a bit more tricky because a key can't be renamed in place. The way to do it is to delete the old key/value pair and store the value again under the new key. Since delete hands back the value it removed, the two steps can be combined into a single assignment.
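A minimal sketch of moving a value to a new key, using a deliberately misspelled key (Wieght) that I've invented for the example:

```perl
use strict;
use warnings;

my %hash = (Wieght => "70kg");    # oops, misspelled key

# delete returns the old value, which is stored straight away under the new key.
$hash{Weight} = delete $hash{Wieght};

print join(", ", keys %hash), "\n";    # prints "Weight"
print $hash{Weight}, "\n";             # prints "70kg"
```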

Duplication
Duplicate keys are not allowed although duplicate values are. If there are duplicate keys declared, only the last one will be acknowledged and the rest will be disregarded. So if we did something like this:
my %hash = (
    "Name" => "Fred",
    "Weight" => "70kg",
    "Height" => "190cm",
    "Weight" => "75kg",
);
print $hash{Weight};
The answer printed will be 75kg.

Can you get the element key from the value?
Kind of. With some coding. There's no built-in way to look up a key by its value, so you have to search through the pairs or build a flipped copy of the hash. Also values don't have to be unique so you could end up with the wrong key.
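Two possible ways of doing it, sketched with the example hash from earlier. The names @matches and %key_for are my own, and the second option assumes every value in the hash is unique.

```perl
use strict;
use warnings;

my %hash = (up => "down", top => "bottom", charm => "strange");

# Option 1: collect every key whose value matches.
my @matches = grep { $hash{$_} eq "bottom" } keys %hash;
print "@matches\n";                # prints "top"

# Option 2: flip keys and values. Only safe when every value is unique,
# because a repeated value would keep just one of its keys.
my %key_for = reverse %hash;
print $key_for{strange}, "\n";     # prints "charm"
```

The grep version copes with duplicate values because it gives you back all of the matching keys rather than just one.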

Adding hashes together

This is really easy. Assuming you already have two hashes that already have things in them (%hash_one and %hash_two), you can just do this:
my %hash_three = (%hash_one, %hash_two);
Easy!
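One thing to watch out for is what happens when both hashes contain the same key. Following the duplicate key rule from earlier, the later pair wins. A small sketch with made-up contents for the two hashes:

```perl
use strict;
use warnings;

my %hash_one = (Name => "Fred", Weight => "70kg");
my %hash_two = (Weight => "75kg", Height => "190cm");

# Both hashes are flattened into one list; for a repeated key the later pair wins.
my %hash_three = (%hash_one, %hash_two);

print $hash_three{Weight}, "\n";         # prints "75kg"
print scalar(keys %hash_three), "\n";    # prints 3
```

So if you want the values from %hash_one to take priority, put it second in the list.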

Exists - is the key already in the hash?
This is useful because duplicate keys aren't allowed and assigning to an existing key silently overwrites its value, so you can use exists to check whether the key is already there before you add a new key/value pair.
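A minimal sketch of exists in action. I've given the key "top" an undef value on purpose to show that exists checks for the key and not for the value.

```perl
use strict;
use warnings;

my %hash = (up => "down", top => undef);

# exists tests for the key itself, even when the value is undef.
print "top is a key\n"     if exists $hash{top};
print "top has no value\n" if !defined $hash{top};

# Only add the pair when the key is not there yet.
if (!exists $hash{charm}) {
    $hash{charm} = "strange";
}
print $hash{charm}, "\n";    # prints "strange"
```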

And finally...
To get a list of all the keys in the hash:
print keys(%hash);
And to get a list of all the values in the hash - you guessed it:
print values(%hash);
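Both of those squash their output together just like print %hash does. Here is a sketch of a tidier version that uses join to separate the items and sort to fix the order, plus a way to count the pairs:

```perl
use strict;
use warnings;

my %hash = (up => "down", top => "bottom", charm => "strange");

print join(", ", sort keys %hash), "\n";      # prints "charm, top, up"
print join(", ", sort values %hash), "\n";    # prints "bottom, down, strange"

# In scalar context, keys gives the number of pairs in the hash.
my $count = keys %hash;
print "$count pairs\n";                       # prints "3 pairs"
```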