This post originated from an RSS feed registered with Python Buzz
by Ian Bicking.
Original Post: Python nit, chapter 3
Feed Title: Ian Bicking
Feed URL: http://www.ianbicking.org/feeds/atom.xml
Feed Description: Thoughts on Python and Programming.
While iterable strings are the occasional enabler of a cute answer on
comp.lang.python, in reality this is a total non-feature; no one needs
to iterate over strings.
But it's worse than that! Who among us has not accidentally passed in
a string where a list is expected, then puzzled over the odd results
that occur when the function thinks it has received a list of
single-character values? This has been the source of great annoyance
for me. One of the important design features of a dynamically typed
language is that operations not be ambiguous. If two different types
of objects have the method foo, that's okay -- if they both do
conceptually the same thing! If they both have a method by the same name,
and they do conceptually different things, then you set yourself up
for the Dark Side of dynamic typing -- when type errors silently
insinuate themselves into the depths of your program, resulting in
disconnected errors, or even worse in no error but incorrect results.
In this case, iteration is supported by strings, but strings are not
collections (at least, that's not how anyone uses them), and they
shouldn't implement something that makes them pretend to be
collections.
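To make the failure mode concrete, here is a minimal sketch. The function shout_all is a hypothetical helper invented for illustration; it is written on the assumption that it will receive a list of names, and nothing stops a caller from handing it a bare string instead.

```python
def shout_all(names):
    """Hypothetical helper that expects a list of names."""
    return [name.upper() + "!" for name in names]


# Intended use: a list of strings.
as_list = shout_all(["ann", "bob"])

# Accidental use: a bare string. No exception is raised; the string is
# silently treated as a sequence of one-character strings.
as_string = shout_all("ann")

print(as_list)    # ['ANN!', 'BOB!']
print(as_string)  # ['A!', 'N!', 'N!']
```

The second call is the kind of type error that produces no error at all, only a wrong result somewhere downstream.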
Strings probably do this because it seemed natural. Before Python had
a formal notion of iteration, the for loop simply called __getitem__
with an increasing index until it got an IndexError -- and
strings have a __getitem__ which returns characters (which are
themselves strings, thus being an infinitely recursively iterative
structure). It would have taken a special rule to keep strings from
being iterable. Such a rule should have been written, but it wasn't. Now that
__iter__ covers this specific instance, such a rule could
more easily be implemented:
    class goodstr(str):
        def __iter__(self):
            raise TypeError("iteration over a non-sequence")
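The history described above can still be observed. The following is a rough sketch, assuming a current Python 3 interpreter: Countdown is a hypothetical class that defines only __getitem__, yet it is iterable through the old index-based protocol, while goodstr (repeated here so the sketch runs on its own) defines an __iter__ that raises, and that takes precedence over the inherited __getitem__.

```python
class Countdown:
    """Hypothetical class: defines __getitem__ but no __iter__."""

    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        # The legacy protocol calls this with 0, 1, 2, ... and stops
        # at the first IndexError.
        if index >= self.start:
            raise IndexError(index)
        return self.start - index


class goodstr(str):
    """The subclass from the post, repeated so this sketch runs alone."""

    def __iter__(self):
        raise TypeError("iteration over a non-sequence")


# Iterable purely by way of __getitem__.
fallback_result = list(Countdown(3))

# Not iterable, even though str.__getitem__ is still inherited.
try:
    for ch in goodstr("abc"):
        pass
    blocked = False
except TypeError:
    blocked = True

print(fallback_result)  # [3, 2, 1]
print(blocked)          # True
```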
Of course, this string subclass does me no good, because no one
(including myself) would bother to use it. Maybe in Python
3... though I'd be curious -- I suspect that if this were changed even now,
very little code would be affected, because iterating over strings
just doesn't make much sense. That said, list(s) is probably more
common than using a string in a for loop (e.g.,
urllib.quote), and it too works by way of iterability.
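As a rough way to gauge what such a change would touch, the sketch below (again assuming Python 3, with goodstr redefined so the block is self-contained and raises_type_error as a hypothetical helper) checks which common operations go through iteration and which do not.

```python
class goodstr(str):
    def __iter__(self):
        raise TypeError("iteration over a non-sequence")


def raises_type_error(func):
    """Hypothetical helper: True if calling func() raises TypeError."""
    try:
        func()
    except TypeError:
        return True
    return False


s = goodstr("a b")

# These rely on iterability, so they would break.
list_breaks = raises_type_error(lambda: list(s))
join_breaks = raises_type_error(lambda: "-".join(s))

# These do not go through __iter__ and keep working.
first = s[0]
has_space = " " in s
chars = [s[i] for i in range(len(s))]

print(list_breaks, join_breaks)  # True True
print(first, has_space, chars)   # a True ['a', ' ', 'b']
```

Indexing, len, and the in operator are unaffected, so code that really wants the characters can still get them explicitly.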