The Artima Developer Community
Sponsored Link

Python Buzz Forum
Zippy triples served with Python

0 replies on 1 page.

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 0 replies on 1 page
Thomas Guest

Posts: 236
Nickname: tag
Registered: Nov, 2004

Thomas Guest likes dynamic languages.
Zippy triples served with Python Posted: Nov 21, 2007 1:24 AM
Reply to this message Reply

This post originated from an RSS feed registered with Python Buzz by Thomas Guest.
Original Post: Zippy triples served with Python
Feed Title: Word Aligned: Category Python
Feed URL: http://feeds.feedburner.com/WordAlignedCategoryPython
Feed Description: Dynamic languages in general. Python in particular. The adventures of a space sensitive programmer.
Latest Python Buzz Posts
Latest Python Buzz Posts by Thomas Guest
Latest Posts From Word Aligned: Category Python

Advertisement

The Problem

Here’s a problem I encountered when writing the HTML generator for this site. Logically, Word Aligned is a time-ordered collection of articles. I wanted each article to link to its predecessor and successor. So the general problem is: How do you iterate through a collection yielding (previous, this, next) triples?

Specification

Some test cases make things clearer. Let’s name the function we’re developing prev_this_next().

>>> long_weekend = 'Fri', 'Sat', 'Sun', 'Mon',
>>> yesterday_today_tomorrow = prev_this_next(long_weekend)
>>> yesterday_today_tomorrow.next()
(None, 'Fri', 'Sat')
>>> yesterday_today_tomorrow.next()
('Fri', 'Sat', 'Sun')
>>> yesterday_today_tomorrow.next()
('Sat', 'Sun', 'Mon')
>>> yesterday_today_tomorrow.next()
('Sun', 'Mon', None)
>>> yesterday_today_tomorrow.next()
Traceback (most recent call last):
    ...
StopIteration
>>> for ptn in prev_this_next(range(5)):
...     print ptn
... 
(None, 0, 1)
(0, 1, 2)
(1, 2, 3)
(2, 3, 4)
(3, 4, None)
>>> print "\n".join(map(repr, prev_this_next("XYZ")))
(None, 'X', 'Y')
('X', 'Y', 'Z')
('Y', 'Z', None)

You’ll notice we’ve specified behaviour at the boundaries: the first item in the collection has no predecessor, thus the first triple returned has its first item set to None; and similarly the final triple has its third item set to None. We might equally well have chosen to return a user supplied default, or to wrap the collection at its ends. For now, let’s go with the simple behaviour shown.

You’ll also have noticed I’m writing Python — fair enough, since this web site is generated off-line using Python. The long_weekend example drives the Python iterator protocol by hand, calling yesterday_today_tomorrow.next() until a StopIteration exception terminates the iteration. It’s quite rare to use iterators in this way: more commonly, you just loop through them using for, or plug them into container operations. The second and third test cases show more typical usage.

First Implementation

If this were C++, we’d prefer our collection to support bi-directional iteration: think of a doubly-linked std::list, or a plain old random access std::vector. Then we could just decrement/increment each iterator from the collection to find its neighbours.

In Python, we might decide to assume a random access container and write something like this:

def get_default(items):
    "Return an item getter function."
    n_items = len(items)
    def inner(index):
        "Return items[index] or None if index is out of range."
        if index < 0 or index >= n_items:
            return None
        else:
            return items[index]
    return inner

def prev_this_next(items):
    get = get_default(items)
    for ix, item in enumerate(items):
        yield get(ix - 1), item, get(ix + 1)

This code isn’t elegant but it does pass our tests. Incidentally, an attempt to implement get_default using EAFP would fail. Can you see why?

def inner(index):
    try:
        return items[index]
    except IndexError:
        return None

If our collection of items is a stream — by which I mean a lazily-evaluated collection — this code raises an exception. We don’t know how long the stream will be (indeed, it could be infinite) and we can’t just access elements from it at random. For C++ programmers, think of sequentially reading a file using an input iterator.

Stream Test Case

Let’s adapt one of our test cases to expose this flaw.

>>> long_weekend = iter(('Fri', 'Sat', 'Sun', 'Mon'))
>>> yesterday_today_tomorrow = prev_this_next(long_weekend)
...

Running this code raises an exception:

Traceback (most recent call last):
    ...
TypeError: object of type 'tupleiterator' has no len()

Stream Solution

Thinking of this problem in terms of streams gives us a solution which is both more general and more simple. All we have to do is tee up three independent iterators into the stream, stagger them, then zip them back together. The itertools module supplies the components. We connect.

import itertools

def prev_this_next(items):
    extend = itertools.chain([None], items, [None])
    prev, this, next = itertools.tee(extend, 3)
    try:
        this.next()
        next.next()
        next.next()
    except StopIteration:
        pass
    return itertools.izip(prev, this, next)

This code works on any iterable, infinite, finite or empty, lazy or eager. Some more testcases:

>>> [ptn for ptn in prev_this_next(list())]
[]
>>> [ptn for ptn in prev_this_next(set([1]))]
[(None, 1, None)]
>>> ptn = prev_this_next(itertools.count())
>>> itertools.islice(ptn, 100, 101).next()
(99, 100, 101)

Another Triples problem

Now suppose you want to peel items from an iterable, three at a time. Let’s call this function three_at_a_time() and let’s specify its behaviour with some simple tests:

>>> t = three_at_a_time((1, 2, 3, 4, 5, 6))
>>> t.next()
(1, 2, 3)
>>> t.next()
(4, 5, 6)
>>> t.next()
Traceback (most recent call last):
    ...
StopIteration
>>> ttt = three_at_a_time((1, 2, 3, 4))
>>> [t for t in ttt]
[(1, 2, 3)]
>>> ttt = three_at_a_time(itertools.count())
>>> ttt = itertools.islice(ttt, 3, 6)
>>> [t for t in ttt]
>>> "".join(chain(*three_at_a_time("Word Aligned")))
'Word Aligned'

Note that any trailing single element or pair at the end of the collection is discarded. We might equally have decided to pad the collection with a user-supplied default or throw an exception.

Here’s one implementation.

def three_at_a_time(items):
    it = iter(items)
    return itertools.izip(it, it, it)

And here’s another.

def n_at_a_time(items, n):
    it = iter(items)
    return itertools.izip(* [it] * n)

three_at_a_time = lambda items: n_at_a_time(items, 3)

Read: Zippy triples served with Python

Topic: Mocks and Stubs and Fakes, Oh My! Previous Topic   Next Topic Topic: New Zealand!

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use