The Artima Developer Community
Sponsored Link

Python Buzz Forum
Run-length encoding in Python

0 replies on 1 page.

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 0 replies on 1 page
Thomas Guest

Posts: 236
Nickname: tag
Registered: Nov, 2004

Thomas Guest likes dynamic languages.
Run-length encoding in Python Posted: Jun 1, 2009 2:06 PM
Reply to this message Reply

This post originated from an RSS feed registered with Python Buzz by Thomas Guest.
Original Post: Run-length encoding in Python
Feed Title: Word Aligned: Category Python
Feed URL: http://feeds.feedburner.com/WordAlignedCategoryPython
Feed Description: Dynamic languages in general. Python in particular. The adventures of a space sensitive programmer.
Latest Python Buzz Posts
Latest Python Buzz Posts by Thomas Guest
Latest Posts From Word Aligned: Category Python

Advertisement

Recently I discussed run-length encoding and DEFLATE compression. I never actually showed a Python implementation of a run-length encoder, so here’s one now.

import itertools as its

def ilen(it):
    '''Return the length of an iterable.
    
    >>> ilen(range(7))
    7
    '''
    return sum(1 for _ in it)

def runlength_enc(xs):
    '''Return a run-length encoded version of the stream, xs.
    
    The resulting stream consists of (count, x) pairs.
    
    >>> ys = runlength_enc('AAABBCCC')
    >>> next(ys)
    (3, 'A')
    >>> list(ys)
    [(2, 'B'), (3, 'C')]
    '''
    return ((ilen(gp), x) for x, gp in its.groupby(xs))

The decoder is equally simple. Itertools.repeat expands a (count, value) pair into an iterable which will generate count elements. Itertools.chain flattens these iterables into a single stream.

def runlength_dec(xs):
    '''Expand a run-length encoded stream.
    
    Each element of xs is a pair, (count, x).
    
    >>> ys = runlength_dec(((3, 'A'), (2, 'B')))
    >>> next(ys)
    'A'
    >>> ''.join(ys)
    'AABB'
    '''
    return its.chain.from_iterable(its.repeat(x, n) for n, x in xs)

If you haven’t seen itertools.chain.from_iterable() yet, it was introduced at Python 3.0/2.6. The important feature here is that it lazily works its way through a single iterable argument. If instead we’d written:

def runlength_dec(xs):
    ....
    return its.chain(*(its.repeat(x, n) for n, x in xs))

then our run-length decoder would need to consume all of xs before yielding results (which is why we must interrupt the interpreter’s execution below).

>>> xs = its.cycle((3, 'A'), (2, 'B'))
>>> runlength_dec(xs)
  C-c C-cTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 25, in runlength_dec
  File "<string>", line 25, in <genexpr>
KeyboardInterrupt

Named tuples for clarity

Streams of pairs (as shown above) are perfectly Pythonic. If we run-length encode a stream of numbers, clients will just have to read the manual and remember that item[0] is a repeat count and item[1] is a value.

If this seems fragile, a new-ish member of the collections module can give the pair more structure.

>>> from collections import namedtuple
>>> Run = namedtuple('Run', 'count value') 
>>> run1 = Run(count=10, value=2)
>>> run2 = Run(value=2, count=10)
>>> run1
Run(count=10, value=2)
>>> run2
Run(count=10, value=2)
>>> run1.count
10
>>> run1[0]
10

Here’s how we’d change runlength_enc() to use the new type.

def runlength_enc(xs):
    '''Return a run-length encoded version of the stream, xs.
    
    >>> ys = runlength_enc('AAABBCCC')
    >>> next(ys)
    Run(count=3, value='A')
    >>> list(ys)
    [Run(count=2, value='B'), Run(count=3, value='C')]
    '''
    return (Run(ilen(gp), x) for x, gp in its.groupby(xs))

Read: Run-length encoding in Python

Topic: Run-length encoding in Python Previous Topic   Next Topic Topic: DEFLATE: run-length encoding, but better

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use