Thursday 10 April 2014

Loops and conditionals: For each

Arrays, lists and the like are all very well, but we often need a way of going through each element in turn to make use of them, maybe to search for something or to apply some function to each element. This is where for and foreach loops come in.

A for(each) loop goes through each item in an array, a list or anything else you want to iterate through, for example the numbers 1 to 10. In fact, if the thing you put inside the brackets returns a list, it can be iterated over.

Its general layout looks something like this:

for(each) (array/list/whatever) {
    code block
}
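
As a concrete sketch of that layout, the loop below iterates first over a range and then over the list returned by a function. The helper name squares_up_to is made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the squares 1, 4, 9, ... up to $n squared.
sub squares_up_to {
    my ($n) = @_;
    my @squares;
    for my $i (1 .. $n) {          # the range 1..$n produces a list to iterate over
        push @squares, $i * $i;
    }
    return @squares;
}

# Anything that returns a list can go inside the brackets.
for my $square (squares_up_to(4)) {
    print "$square\n";
}
```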

For or foreach?
I've just discovered that for and foreach are synonyms in Perl: the two keywords do exactly the same thing and can be used interchangeably. That made me wonder why there are two different words for it, but unfortunately I don't think there is a definitive answer.
I wanted to find out which one programmers prefer, so I asked around at work. It seems the only reason they pick one over the other is convention. Most of the people I asked use for and foreach in the following way:

For is generally used if you don't have something specific to iterate through. You initialise a variable, check a condition and increment the variable.
for (my $i = 1; $i < 9; $i++) {
    print "$i ";
}
This just prints out numbers 1 to 8 with a space between each.

For those who are unfamiliar with the above -
my $i = 1 - initialising the variable to use.
$i < 9 - the condition that is checked before each pass; the loop keeps running while it is true.
$i++ - showing how much to increment or decrement the variable with each pass of the loop - in this case, it means plus one.

Basically all of this put together means, there is a variable called $i with a value of one. Start with $i = 1 in the first pass of the loop and add one to it each time you go through the loop until $i is no longer less than 9, then exit the loop.
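
To make the three parts easier to see, here is a sketch that writes the same count out as a while loop, with the initialisation, the condition and the increment each on their own line. The sub names count_with_for and count_with_while are made up for this comparison.

```perl
use strict;
use warnings;

# Collects the values the C-style for loop visits.
sub count_with_for {
    my @seen;
    for (my $i = 1; $i < 9; $i++) {
        push @seen, $i;
    }
    return @seen;
}

# The same three parts spelled out as a while loop.
sub count_with_while {
    my @seen;
    my $i = 1;                 # initialise
    while ($i < 9) {           # condition, checked before every pass
        push @seen, $i;
        $i++;                  # increment at the end of each pass
    }
    return @seen;
}

print join(' ', count_with_for()),   "\n";
print join(' ', count_with_while()), "\n";
```

Both lines of output should list the same numbers, 1 to 8.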


Foreach is often used if you have an array, list or hash and you want to go through each of its elements. For example:
foreach (@myarray) {
    print "$_\n";
}
This just goes through each item in the array and prints them out on individual lines.

You can, however, swap the for and foreach around, or use the same word for both usages. It's totally up to you, but I think I'm going to stick with the convention described above.
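
A small sketch of the swap, assuming nothing beyond core Perl: "for" is used here as the list iterator and "foreach" with C-style loop control. The sub names are invented for the example.

```perl
use strict;
use warnings;

# "for" used as a list iterator.
sub list_with_for {
    my @out;
    for my $item (@_) {
        push @out, $item;
    }
    return @out;
}

# "foreach" used with C-style loop control.
sub count_with_foreach {
    my @out;
    foreach (my $i = 1; $i < 9; $i++) {
        push @out, $i;
    }
    return @out;
}

print join(' ', list_with_for('a', 'b', 'c')), "\n";
print join(' ', count_with_foreach()), "\n";
```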

The good thing about foreach loops in Perl, which I haven't come across before, is that you don't have to explicitly say how big the array (or whatever you're iterating through) is, and you don't have to tell it to move on to the next item once it has finished with the current one.
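
For comparison, here is a sketch of the same walk done by index, where you manage the bounds and the counter yourself, next to the foreach version. The array contents are made up.

```perl
use strict;
use warnings;

my @myarray = ('red', 'green', 'blue');   # sample data for this sketch

# Index-based: you track the position and the end of the array yourself.
my @by_index;
for (my $i = 0; $i <= $#myarray; $i++) {  # $#myarray is the last index
    push @by_index, $myarray[$i];
}

# foreach: Perl works out where the list ends and moves on to the next item.
my @by_foreach;
foreach my $colour (@myarray) {
    push @by_foreach, $colour;
}

print "@by_index\n";
print "@by_foreach\n";
```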

What is this $_?
In this case, it is a quick and anonymous way of referring to each individual item of the array. It's called the "default" variable. It represents the scalar that's being focussed on, so in the case of a for loop, it's the list item or array item that's currently being looked at. It's kind of like using the word "it" in the English language - you know what you are referring to, but you're using a general word.
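
Several built-in functions fall back to $_ when you give them no argument, which is where the "it" comparison really shows. A minimal sketch with made-up data, using print and length:

```perl
use strict;
use warnings;

my @myarray = ('one', 'two');   # sample data for this sketch

foreach (@myarray) {
    print;          # with no arguments, print outputs $_
    print "\n";
}

# length also works on $_ when called without an argument.
my @lengths;
foreach (@myarray) {
    push @lengths, length;
}
print "@lengths\n";
```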

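Naming the loop variable yourself often makes the code easier to read, as I've done below, where $item refers to each individual array item. Be aware that $item (like $_) is an alias for the actual array element, so assigning to it changes the original array; copy the value into another variable first if you want to leave the array untouched:
foreach my $item (@myarray) {
    print "$item\n";
}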
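
The following sketch shows what happens when you assign to the loop variable. In a Perl foreach loop the variable is an alias for the element itself, so the first loop changes the array, while the second works on a copy. The array names and values are made up.

```perl
use strict;
use warnings;

my @myarray = (1, 2, 3);

# $item is an alias: assigning to it writes back into @myarray.
foreach my $item (@myarray) {
    $item = $item * 10;
}
print "@myarray\n";     # 10 20 30

# Working on a copy leaves the original array alone.
my @original = (1, 2, 3);
foreach my $item (@original) {
    my $copy = $item;
    $copy = $copy * 10;
}
print "@original\n";    # 1 2 3
```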
You can also use foreach with hashes. It's very similar to arrays:
foreach my $key (keys %myhash) {
    print "this is the key: $key\n";
    print "this is the value: ".$myhash{$key}."\n";
}
This will print out the statements followed by the key on one line and the value on the next for each of the items in the hash.
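Note that the "my" matters: it makes $key a lexical variable that only exists inside the loop. If you leave it out, Perl uses a variable of that name from the enclosing scope (or a global), localised to the loop, and under "use strict" an undeclared loop variable is a compile-time error.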

You need to write the word "keys" inside the brackets before the name of the hash because what you are doing is getting a list of the keys in the hash and then iterating over them. Within the code block, you can then use each key to get its value.
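
Because keys simply returns a list, you can pass that list through other list functions before iterating. Perl does not guarantee any particular order for hash keys, so a common variation is to sort them first. A sketch with made-up data:

```perl
use strict;
use warnings;

my %myhash = (apple => 3, banana => 5, cherry => 7);   # sample data

# sort puts the list of keys into a predictable order before the loop starts.
foreach my $key (sort keys %myhash) {
    print "$key => $myhash{$key}\n";
}
```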

I then wondered about not just getting a list of the keys and iterating over them, but getting the keys and the values and iterating over them and I was told about "each". You can't really use it with a for loop and it looks something like this:
while (my ($key, $value) = each %hash) {
    print "key is $key, value is $value\n";
}
As you can see, you use a while loop, which I haven't covered yet, but basically it just goes through all of the keys and values and prints them out.
A warning does come with using each: you need to make sure that nothing else in your program is changing the hash you're iterating over, because if keys are added or deleted during the while loop you may end up skipping or duplicating entries.
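
If you do need to remove entries while looping, one way around the problem is to loop over keys instead. This is only a sketch with made-up data: keys builds the whole list of keys before the loop starts, so the loop walks that snapshot rather than the live hash.

```perl
use strict;
use warnings;

my %hash = (a => 1, b => 2, c => 3, d => 4);   # sample data

# The list of keys is built once, up front, so deleting inside the loop is safe.
foreach my $key (keys %hash) {
    delete $hash{$key} if $hash{$key} % 2;     # drop entries with odd values
}

print join(', ', sort keys %hash), "\n";       # b, d
```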

Map
I think I have mentioned map before but this is definitely a place to write a reminder. It's just a slightly cleaner way of writing code that takes each member of an array/list and modifies or uses it in the same way to create a new list.
1.   my @new_array = map {
2.     print "this is the key: \n";
3.     print "this is the value: $_ \n"
4. } @myarray;
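
A more typical use of map, sketched here with made-up data, has the block return a value for each element, and those return values become the new list. The original array is left unchanged.

```perl
use strict;
use warnings;

my @myarray = (1, 2, 3);   # sample data

# The block's value for each element ($_) becomes an element of the new list.
my @doubled = map { $_ * 2 } @myarray;

print "@doubled\n";   # 2 4 6
print "@myarray\n";   # 1 2 3
```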

5 comments:

  1. Looks like you're missing an opening brace on line 1 of the map example.

  2. The C-style loop control can be replaced by:
    for ( 1..8 ) { say } (or say $_ if you want to be explicit). say for (1..8) also works. Note that say needs "use feature 'say'" (or "use v5.10") and prints each number on its own line.

    Non-numeric ranges are also possible: "say for ('a'..'d')" works. Be careful with character-set order if you do this.

    Replies
    1. This is true, but the C-style for loop is still useful for when you want to step by a different value than one.

      Also, while newer perls know to recognize the range operator and optimize for it, old perls will create a temporary array with all those values in it, which can eat up lots of memory if your numbers are big enough.

  3. In very early versions of perl (maybe until sometime in perl 2) for and foreach were not interchangeable. For was the C-style variant, and foreach was the list iterator variant. They later got merged so that you could use either keyword for either purpose. I think the reason was that the C-style for loop was being recognized as mostly unused (separating it by moving the final section into a continue block was much more readable) while the iterator foreach loop was very heavily used and it was desired to have the shorter keyword be available for that purpose.

    Your map example is rather confusing. You are initializing the array to the list of return values from the second print done within the map code block, which is a rather obfuscated way of getting a list of integer 1 values (that might not be one if the print fails).
