The Artima Developer Community

Looking for Memories of Python Old-Timers

Jim Roskind


Re: Looking for Memories of Python Old-Timers Posted: Jun 4, 2006 1:56 AM
Guido asked that I try to contribute something from my memories… so here are some things that come to mind….

It was in early to mid 1994 that Steve Kirsch suggested (he often suggests things in an aggressive manner) that we try using Python at InfoSeek to create a full-text searching system with a pay-by-the-drink model. We had already implemented a version in Perl, and I had spent too many hours debugging the Perl interpreter (and sadly seeing numerous cut/paste copies of the same interpreter bugs).

I was initially a bit put off by the Python indentation approach (I was a grammar guy… and had gotten used to parsing braces and parens, and didn’t like the idea of white space having so much meaning), but I soon came to love it. Eventually I came to apply one of my favorite computer science sayings to Python as an endorsement of its indentation-based blocking approach: “The fundamental evil in computer science is the replication of code or data.” In almost all languages I had previously worked in, the indentation of code had always been a critical PART of program nesting structure (at least for the reader). As I read and wrote more Python, its use of indentation as the ONLY way to specify blocks began to look better and better. All the silly bugs related to indentation errors (misleading the human reader) were gone in Python. The redundant use in other languages of braces AS WELL AS indentation (the former to help the parser, and the latter to help the human reader) was an effective duplication of the author’s intent. That duplication in other languages, a fundamental original sin, was missing in Python. It was cool! ;-)

I recall being blown away by the cleanliness of the Python interpreter source, and the fact that there was (from what I could see) never cut/paste code. It was nice. It was clean. And it worked very well. I certainly didn’t feel like an old timer then. I felt like I had just walked into a highly evolved kingdom. Lurking in the related forums it became clear that this “guy named Guido” was riding a tight shotgun over language evolution, and it was his will that was preventing the language from acquiring warts. I still smile today as I see his vision directing the language, and I know Python would never have gotten anywhere if not for his vision and strength of will.

One giant gain we saw in moving to Python was that it was very easy to rapidly prototype, and it was almost believable that we could write production code in Python. As a small startup, we were constantly redirecting ourselves and modifying specifications for our product. This rapid prototyping nature meant that the programmers could keep pace with the marketing changes and not work to shut down creative suggestions. We were also worried about the cost of computation to provide the services we were planning, and the vague hope that we could use our prototype to make money was always in our minds.

I spent time trying to speed up some of our prototype code segments (aiming for production code), and found that the existing profiler was not giving me results which matched my tiny experiments (my tiny experiments were giving me great insight into how to speed up our code). I stepped back, read the old profiler source, and realized it was not deducting the time spent in the profiler from the code being measured. Worse yet, it was writing to disk intermittently during profiling, and even that time was being charged to the profiled code (oops). Given the weakness of the old profiler, it was then no surprise that our code seemed (according to the profiler) to be devoid of hot spots. The profiler was adding disk-write delays uniformly across the code, and dominating all actual measurements. This was perchance the first bit of a hint that the language was newer than I would have suspected, but the profiler was seemingly not central to the language, and I assumed it was a tangential contribution. I set off to write a more careful profiler, and that is the one that has survived till today.

It is interesting (if the documentation I wrote is to be trusted) that the profiler was written after I had been using Python for only about a month, which shows what the learning curve for Python was like, even in “those days.” It was fun writing the Python profiler because this was the first profiler that I wrote that properly handled recursion-versus-iteration comparisons. The key to this feature was that the profiler was written in Python. In Python it was a snap to have a dynamic dictionary on hand to list all the function blocks residing on the current call stack. Usually, when I had written a profiler in the past, I didn’t have a lot of tools to work with. With Python, most support is written in Python (debugger, profiler, etc.), and powerful tools abound. It was and is a nice virtuous cycle.
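
For readers who want to see the "dictionary of functions on the call stack" idea in concrete form, here is a minimal sketch in modern Python. It is not the code of the profiler described above: the class name `TinyProfiler` and its attributes are made up for illustration, and it deliberately skips the overhead-deduction step (it does not subtract its own running time). It only shows how a per-function "how many activations are on the stack" dictionary lets cumulative time be charged once per outermost call, so a recursive function is not counted once per nested frame.

```python
import sys
import time
from collections import defaultdict


class TinyProfiler:
    """Toy call profiler (hypothetical; not the standard `profile` module)."""

    def __init__(self):
        self.calls = defaultdict(int)         # function name -> number of calls
        self.cumulative = defaultdict(float)  # function name -> cumulative seconds
        self.on_stack = defaultdict(int)      # function name -> active frames
        self._starts = []                     # (name, start time) per active frame

    def _hook(self, frame, event, arg):
        # Only Python-level "call" and "return" events are handled here;
        # C-function events are ignored in this sketch.
        if event == "call":
            name = frame.f_code.co_name
            self.calls[name] += 1
            self.on_stack[name] += 1
            self._starts.append((name, time.perf_counter()))
        elif event == "return" and self._starts:
            name, started = self._starts.pop()
            self.on_stack[name] -= 1
            # Charge cumulative time only when the outermost activation
            # returns, so nested recursive calls are not counted twice.
            if self.on_stack[name] == 0:
                self.cumulative[name] += time.perf_counter() - started

    def run(self, func, *args):
        sys.setprofile(self._hook)
        try:
            return func(*args)
        finally:
            sys.setprofile(None)


def factorial(n):
    return 1 if n <= 1 else n * factorial(n - 1)


profiler = TinyProfiler()
profiler.run(factorial, 5)
print(profiler.calls["factorial"])       # 5 calls, one per recursion level
print(profiler.cumulative["factorial"])  # time of the outermost call only
```

A fuller version would also estimate the cost of the hook itself and subtract it from each measurement, which is the flaw in the old profiler that the paragraphs above describe.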

I remember one early result from my profiling experiments. In retrospect, this story also hinted at how new the language was, but at the time I just saw it as something that was too hard to see with the previous profiler. One area of our code was running surprisingly slowly and almost dominating our performance. The hot area was a comparison function which was called by a sort routine. (Recall that with the old profiler, nothing ever stood out as a hot spot). At first I was surprised that there was no way I could speed up this “hot spot,” as it was already a very simple compare. Then I looked at call counts for this routine. We were typically sorting an array of length 1000. If I remember the numbers correctly, the profiler revealed that the hot comparison function was being called over 400,000 times per sort. This was a whole lot closer to what I’d expect from an n-squared sort algorithm than from an n * log(n) quick sort. Remembering that algorithms 101 taught that quick sort *can* be an n-squared sort if the pivots are “chosen poorly,” and this could commonly happen (with a simple quick sort implementation) when the list was already mostly sorted… I took a wild guess that this was the problem. To take a shot at working around this quick sort implementation, I needed to randomize the list prior to the sort. With Python, this was a snap (to try). I just put the list into a hash table, and then pulled it out (in its pseudo random order) before doing the sort. Sure enough, a factor of 20 performance improvement resulted (for our application), and a simple bug report went off to Guido. I always liked this story because it was a case where adding wasteful code was VERY helpful.
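
The effect is easy to reproduce with a toy quick sort. The sketch below is an assumption-laden stand-in, not the sort routine Python shipped at the time: it always pivots on the first element, counts comparisons instead of timing them, and uses `random.shuffle` in place of the hash-table trick described above. The function name `count_comparisons` is made up for illustration.

```python
import random


def count_comparisons(items):
    """Count comparisons made by a naive quick sort that pivots on the first element."""
    comparisons = 0
    pending = [list(items)]  # explicit stack, so sorted input cannot overflow recursion
    while pending:
        chunk = pending.pop()
        if len(chunk) <= 1:
            continue
        pivot, rest = chunk[0], chunk[1:]
        comparisons += len(rest)  # every remaining item is compared to the pivot once
        pending.append([x for x in rest if x < pivot])
        pending.append([x for x in rest if x >= pivot])
    return comparisons


already_sorted = list(range(1000))
shuffled = list(already_sorted)
random.Random(0).shuffle(shuffled)  # fixed seed so runs are repeatable

print(count_comparisons(already_sorted))  # 499500, i.e. n * (n - 1) / 2
print(count_comparisons(shuffled))        # far fewer, on the order of n * log(n)
```

For 1000 already-sorted items this naive scheme needs 999 + 998 + … + 1 = 499,500 comparisons, which is in the same ballpark as the call count the profiler reported.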

I also remember one difficult aspect of our Python development. We needed to perform bookkeeping (remember, we were making a pay-by-the-drink model search system), and we ended up writing a persistent object store to record these events and charges. All the features of Python involving introspection were beyond a pleasure to use to build this system. It was “a little challenging” to handle the automatic storage and resurrection of randomly inter-related objects. The hard part was making sure that a single saved object was resurrected as a single object, no matter how many other objects pointed to it in arbitrary circuitous ways. Here the exception system provided a simple way to effectively backtrack as needed during the saving of objects in unpredictable graphs. I remember thinking that IF I didn’t have all the tools on hand in the language, the coding of this stuff would have been between hard and impossible (at least hard for a human, or certainly nearly if not fully impossible for me). In the end we had a production system running 24/7, where we evolved the types of objects stored in the database dynamically as we improved and enhanced our code. After struggling a lot I asked a very technical academic colleague if there was any clever way that we *should* have been doing all this work. I was happy to hear that it was generally considered a very hard (academic) problem, and I should be happy that we got most any of this working (in a real world problem). I know that a lot of the credit for getting this running went to the language.
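
To make the "single saved object is resurrected as a single object" requirement concrete, here is a small sketch using today's standard `pickle` module. This is only a stand-in for illustration: it is not the object store described above, and the `alice`, `bob`, and `ledger` names are invented. It shows the property itself: shared references and cycles survive a save/restore round trip without being duplicated.

```python
import pickle

# Two records that point at each other, forming a cycle.
alice = {"name": "alice", "partners": []}
bob = {"name": "bob", "partners": []}
alice["partners"].append(bob)
bob["partners"].append(alice)

# The same object is referenced twice from the top-level container.
ledger = [alice, bob, alice]

restored = pickle.loads(pickle.dumps(ledger))

# The doubly referenced record comes back as one object, not two copies.
print(restored[0] is restored[2])                 # True
# The cycle is preserved: alice -> bob -> alice.
print(restored[0]["partners"][0] is restored[1])  # True
print(restored[1]["partners"][0] is restored[0])  # True
```

`pickle` does this by remembering which objects it has already written during a dump; a hand-written store has to do the same bookkeeping itself, which is where the difficulty described above comes from.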

As a last memory, I recall attending “The Second Python Workshop” around May 1995. I think it was held in a small room at NIST in Palo Alto. I had already moved on to this other startup company, Netscape Corp., but I was anxious to see some of the code that I had worked on survive. It seemed like there were about 30 folks there at any time, and the attendance list (found via Google) shows that no more than 60 folks attended in total. The language seemed interesting, but it also seemed like it was more of a hobby than a tidal wave.

Over the years since then, more and more folks have tried to tell me about this new language they had started to play with… called Python… and that it is pretty cool. I just smile and agree.

Jim
