Python Collections

2012 April 29 at 06:41 » Tagged as: python, ruby, map, reduce

Lists and Arrays.

Correct me if I am wrong here, but lists in Python are what everyone else calls arrays, right? Check this out:

a = ['spam', 'eggs', 100, 1234]

That's an example of a list given in the Python docs, but it's how most other languages define an array. Getting a list slice is rather cool, and it works exactly the way we did it with strings: slice = a[2:4]. What they call parallel assignment in Ruby becomes multiple assignment in Python:

a, b = b, a+b
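Here is a small sketch of both ideas on the list from the docs. The names middle, x, y, p, q and fib are my own, made up for illustration, and I'm assuming the Fibonacci-style loop is where that multiple assignment line comes from.

```python
a = ['spam', 'eggs', 100, 1234]

# A slice takes the items from index 2 up to (but not including) index 4.
middle = a[2:4]

# Multiple assignment: the right-hand side is evaluated first,
# so this swaps two values without a temporary variable.
x, y = 1, 2
x, y = y, x

# The same trick updates two variables in step, Fibonacci style.
fib = []
p, q = 0, 1
while q < 20:
    fib.append(q)
    p, q = q, p + q
```

The slice is a new list, so changing middle afterwards leaves a alone.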

Many operations are available on lists, such as append(), extend(), insert(), remove(), pop(), index(), count(), sort() and reverse(). These are pretty standard in most languages, aren't they? It is the parameters they accept that are weird. For example, remove(n) does not remove the item at position n; it removes the first item whose value is n. count() and index() exhibit similar behavior: they take a value, not a position. On the other hand, pop(n) does remove the element at position n (and returns it).
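To make the value versus position difference concrete, here is a quick sketch. The list nums and the other variable names are mine, not from the docs.

```python
nums = [10, 20, 30, 20]

nums.remove(20)               # removes the first item whose VALUE is 20
after_remove = list(nums)     # snapshot of the list at this point

popped = nums.pop(0)          # removes and returns the item at INDEX 0
after_pop = list(nums)        # snapshot of the list at this point

position = nums.index(20)     # index of the first item equal to 20
occurrences = nums.count(20)  # number of items equal to 20
```

Note that remove() only took out the first 20; the second one is still in the list afterwards.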

Lists can easily be used as stacks and queues. There is also a collections.deque object, a list on steroids that is specially suited to being used as a queue because it can add and remove items efficiently at both ends. It's apparently pronounced 'deck'.
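A minimal sketch of the two uses, loosely following the style of the examples in the Python tutorial. The variable names top and first are my own.

```python
from collections import deque

# A plain list works as a stack: push and pop at the end.
stack = [1, 2, 3]
stack.append(4)
top = stack.pop()          # last in, first out

# A deque works as a queue: append on the right, pop from the left.
queue = deque(['Eric', 'John', 'Michael'])
queue.append('Terry')
first = queue.popleft()    # first in, first out
```

You could pop from the front of a plain list with pop(0) as well, but every remaining item then has to shift down by one, which is why the deque is the better fit for queues.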

Filter, Map and Reduce

Python has a set of cool built-in functions that operate on lists: filter, map and reduce. In fact reduce is so cool that it has been removed from the built-ins in Python 3 and moved to the functools module.

"Removed reduce(). Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable." (python docs).

But what exactly does reduce do? "reduce(function, sequence) returns a single value constructed by calling the binary function function on the first two items of the sequence, then on the result and the next item, and so on". Or so says the documentation. This is the equivalent of array_reduce in PHP.
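Here is a small sketch of reduce next to the explicit loop that the Python 3 notes recommend. The add helper and the variable names are made up by me; the import is needed on Python 3 and, as far as I know, is also available from Python 2.6 onwards.

```python
from functools import reduce


def add(x, y):
    """Binary function that reduce applies pairwise."""
    return x + y


# Evaluated as ((((1 + 2) + 3) + 4) + 5)
total = reduce(add, [1, 2, 3, 4, 5])

# The same computation written as an explicit for loop.
loop_total = 0
for n in [1, 2, 3, 4, 5]:
    loop_total = add(loop_total, n)
```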

Now you may ask, what about map? "map(function, sequence) calls function(item) for each of the sequence’s items and returns a list of the return values". This is the equivalent of PHP's array_map (or array_walk, depending on the situation). In English, this means: apply a function to each element in the sequence.
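A quick sketch of map in action; the cube function and the cubes name are my own. One assumption worth flagging: the quote above describes Python 2, where map returns a list. On Python 3 it returns an iterator, so I wrap it in list() to get the same result on both.

```python
def cube(x):
    """Function applied to every item of the sequence."""
    return x * x * x


# list() is only required on Python 3; on Python 2 it is harmless.
cubes = list(map(cube, range(1, 6)))
```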

Last but not least is filter: "filter(function, sequence) returns a sequence consisting of those items from the sequence for which function(item) is true" (python docs). This is the equivalent of the php array_filter function.
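And a matching sketch for filter, with a made-up is_odd helper. As with map, I'm assuming you might be on Python 3, where filter returns an iterator, hence the list() wrapper.

```python
def is_odd(x):
    """Predicate: True for the items we want to keep."""
    return x % 2 != 0


# Keep only the items for which is_odd(item) is true.
odds = list(filter(is_odd, range(10)))
```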

Now a small digression: have you noticed how Ruby tries to persuade you to use these functions all the time, but the Python and PHP documentation does not? Rubyists seem to think that reduce is the panacea for all ills. They love it so much that there are three different functions that do the same thing! PHP and Python, on the other hand, give you the options and let you do things your own way; with Ruby, you have to do things the way Ruby wants you to. Whether you agree or disagree, you will find a great explanation of the Ruby version of these functions at Railspikes.

 

List Comprehensions and Lambda

Python and Ruby both have some nifty shortcuts like the ones below:

squares = [x**2 for x in range(10)]
squares = map(lambda x: x**2, range(10))

The first line is what Python calls a list comprehension; the second does the same thing with map and a lambda (an anonymous function). List comprehensions are inspired by set-builder notation in maths. The example above demonstrates a nice and easy way of creating an array that contains the squares of a set of integers. The PHP or Java code that does the same will arguably be longer. Though it may not look like much, the concept can be used to achieve quite a lot with just one line of code, as we will see in the next example from the Python docs:

# apply a function to all the elements
vec = [-4, -2, 0, 2, 4]
[abs(x) for x in vec]

This is like Ruby statement modifiers but even better. Here is another one, maybe a bit harder to read than the earlier example:

# flatten a list using a listcomp with two 'for'
vec = [[1,2,3], [4,5,6], [7,8,9]]
[num for elem in vec for num in elem]

Don't worry, when you stare at it long enough you will suddenly go 'el comprehendo'. If that doesn't happen, please do wait for the follow-up post, which will be out in a day or two. But here is the output:

 [1, 2, 3, 4, 5, 6, 7, 8, 9]
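If staring doesn't help, it may be easier to read the comprehension as a pair of nested loops. This is my own unrolled version (the names flat and flat_comp are made up); the 'for' clauses in the listcomp run in the same order as the loops, outermost first.

```python
vec = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

flat = []
for elem in vec:        # first 'for' in the comprehension: each inner list
    for num in elem:    # second 'for': each number in that inner list
        flat.append(num)

# The one-line version from above, for comparison.
flat_comp = [num for elem in vec for num in elem]
```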

Let's wind down this section with a look at del. del is a statement rather than a function, and it seems to have been designed to address the shortcoming of remove(): it deletes an element by index instead of by value. It can be used to cut out whole slices too.
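A short sketch of del on a list, along the lines of the tutorial example. The after_index and after_slice snapshots are my own additions so the intermediate states can be inspected.

```python
a = [-1, 1, 66.25, 333, 333, 1234.5]

del a[0]                 # delete by index, not by value
after_index = list(a)    # copy of the list at this point

del a[2:4]               # cut out the slice at indices 2 and 3
after_slice = list(a)    # copy of the list at this point

del a[:]                 # empty the list, keeping the variable
```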

Tuples and Dictionaries.

We are able to call del and remove() on lists because lists are mutable. Python also has an immutable collection called a tuple. Tuples are written with parentheses rather than square brackets, so an empty tuple is () instead of [], but the parentheses happen to be optional: t = 'Hello', 'World' will create a two-element tuple. What about a single-element tuple? It goes like this: t = 'Hello', (make a note of the trailing comma). Tuples can be unpacked as var1, var2, ..., varn = tup
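Here is a sketch that pulls those tuple rules together. All the variable names are mine, and the try/except is only there to show what happens when you try to change a tuple.

```python
empty = ()
pair = 'Hello', 'World'      # parentheses are optional
single = 'Hello',            # the trailing comma makes this a tuple
not_a_tuple = ('Hello')      # no comma: just a string in parentheses

# Unpacking: one variable per element.
first, second = pair

# Tuples are immutable, so item assignment raises TypeError.
try:
    pair[0] = 'Goodbye'
    mutated = True
except TypeError:
    mutated = False
```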

Python also has sets and dictionaries. Sets, like their counterparts in Java, are collections that do not contain duplicates, and dictionaries are like associative arrays (which means they cannot contain duplicate keys either). While we use => in PHP and Ruby when defining an associative array or a Hash, in Python we use ':'. So we have:

myDict = {'jack': 4098, 'sape': 4139}

You can also use the dict function to build dictionaries from key-value pairs, like so:

myDict = dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
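To see the 'no duplicates' rule at work in both collections, here is a small sketch. The basket list and the phone-book style values are just sample data I picked for illustration.

```python
basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
fruit = set(basket)          # duplicates disappear

my_dict = {'jack': 4098, 'sape': 4139}
my_dict['jack'] = 1111       # existing key: the old value is overwritten
my_dict['guido'] = 4127      # new key: a new entry is added

has_pear = 'pear' in fruit   # membership test on a set
has_jack = 'jack' in my_dict # membership test on dictionary keys
```

So a dictionary never ends up with two 'jack' entries; assigning to the same key simply replaces the value.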

zip

While we are still on the subject of collections, Python provides a nice way of looping through two lists at once with the help of the zip function. It's best described with an example:

questions = ['name', 'quest', 'favorite color']
answers = ['lancelot', 'the holy grail', 'blue']
for q, a in zip(questions, answers):
    # print(...) with a single argument works on both Python 2 and 3
    print('What is your {0}?  It is {1}.'.format(q, a))

One more thing before I go: the format() method, which belongs to the string (str) class, is an alternative formatting technique to the familiar %s, %d etc.

Footnote: this post is a section of my study notes from my crash course on Python.