artima - Article Discussion

Article Discussion

How To Go Slow

View Flat

Summary: Computers make life easier because they're so fast, right? Well, yes and no. Do you write efficient code? The author reveals some disconcerting inefficiencies lurking in commonly used software and development practices.

54 posts.

The ability to add new comments in this discussion is temporarily disabled.

Most recent reply: December 11, 2008 11:25 AM by Jim

Frank

Posts: 135 / Nickname: fsommers / Registered: January 19, 2002 7:24 AM

How To Go Slow

February 8, 2008 1:40 PM

In this article, the author reveals some disconcerting inefficiencies lurking in commonly used software and development practices:

http://www.artima.com/cppsource/how_to_go_slow.html

What do you think about Colvin's points about seeming inefficiencies?

Leo

Posts: 14 / Nickname: aeoo / Registered: April 12, 2006 7:02 AM

Re: How To Go Slow

February 11, 2008 3:07 PM

Garbage article. There is no new insight whatsoever. All the examples are very tired. Anyone who gives two shits about the field already knows them all. What is the point of this?

I think someone is trying to meet "my publish-o-meter quota" in order to needlessly inflate their ego. The essay must be at least 5 pages long, even if you have only 1 sentence worth of ideas to convey. Academia BS mentality.

cerebio

Posts: 1 / Nickname: cerebio / Registered: April 13, 2005 6:50 AM

Re: How To Go Slow

February 10, 2008 1:55 PM

I really enjoyed this article. It's well written (witty and clear), and achieves what its aim seems to be: shake up the experienced developers frequenting Artima that have a responsibility to teach the large crowd of slightly more ignorant software creators that performance is still something to consider. After all, we don't want the often quoted credo (attributed to Intel) of 'Intel giveth, M$ taketh away' to be applicable to our software (be it Windows, BSD, Linux or otherwise based).

It's scary sometimes that most junior software engineers these days don't even know what big-O notation is. That there are algorithms used by all those fancy libraries they are given and that those algorithms have characteristics that should be taken into account when choosing to use or not use them. And hardware and how it is controlled and utilized by software and what that actually may mean is so far from their minds that they might as well be working on an abacus for all they know ...

It is up to the older and more experienced among the developer communities to help them discover that there is more to developing software than pretty GUIs and fancy developer tool sets (nice as those are).

Okay, I'll shut up now *blush*

Nuwan

Posts: 1 / Nickname: nuwang / Registered: April 16, 2006 4:25 AM

Re: How To Go Slow

February 11, 2008 8:05 PM

> I really enjoyed this article. It's well written (witty
> and clear), and achieves what its aim seems to be: shake
> up the experienced developers frequenting Artima that have
> a responsibility to teach the large crowd of slightly more
> ignorant software creators that performance is still
> something to consider.

I would agree with this sentiment. Having a laundry list of issues like this could be a good starting point, especially since many of the more experienced developers here presumably need to advise newer developers. (or at least, point them to articles like these).

One thing I found particularly interesting was the section on thrashing. It's easy to forget low level machine specific issues like cache lines when dealing with higher level languages and I was hoping to gather some different viewpoints on this.

From one perspective, having to worry about such issues seems to indicate a certain level of failure in black box abstraction/layering mechanisms. Whether we like it or not, it appears we can't escape the implementation details of the underlying layers, and therefore are doomed to know all of them, and eventually, buckle under all that mind boggling complexity/information overload?

For example, padding a struct so it fits snugly in a cache line is good for performance, but brings in architecture specific code into your program + complicates the code. Obviously, it shouldn't be done unless the code is performance critical, but it still means we are exposed to the details of an underlying layer. What about the mantra taught in school to use black box abstractions to shield yourself from complexity? Can we really blame people for being overwhelmed with all this stuff?

Nemanja

Posts: 40 / Nickname: ntrif / Registered: June 30, 2004 1:10 AM

Re: How To Go Slow

February 12, 2008 7:57 AM

This was fun :)

One remark I have is that in general it is not a good idea to write a recursive quick sort algorithm, at least not in an imperative language such as C++. Not only it is slow, but leads to a stack overflow.

Mark

Posts: 48 / Nickname: mthornton / Registered: October 16, 2005 11:22 PM

Re: How To Go Slow

February 12, 2008 9:16 AM

> to write a recursive quick sort algorithm, at least not in
> an imperative language such as C++. Not only it is slow,
> but leads to a stack overflow.
Only if you get the implementation wrong. A reasonable implementation will provide a log(n) bound on the stack size (hint: recurse on the smaller partition).

art

Posts: 4 / Nickname: articulate / Registered: September 19, 2005 8:54 AM

Re: How To Go Slow

February 12, 2008 4:05 AM

When I try to make code that is functionally correct I look up the definition of the api's I call, consider all possible code paths and try to ensure that the correct value will always be returned.

When I try to make code fast, I measure it under certain hopefully valid scenarios and see what is taking the time. I generally reason about 'known to be slow' operations (like database operations), but overall performance is not something I try to guarantee in the way I do functional correctness.

If we want guarantees of performance, libraries would document their performance characteristics, and we would reason about what the time bounds.

I certainly think that the tools to help us analyze performance would be a lot more valuable that the tools to help with concurrent programming, which is one currently ultra trendy, and over hyped optimization technique.

Elizabeth

Posts: 7 / Nickname: ewiethoff / Registered: March 26, 2005 2:00 PM

Re: How To Go Slow

February 12, 2008 5:57 AM

> If we want guarantees of performance, libraries would
> document their performance characteristics, and we would
> reason about what the time bounds.

You're right. A lot of API's out there don't spell out time (or space) performance characteristics.

Take Ruby, for example. The RDocs imply that Array#shift is costly and Array#pop is not. Meanwhile, there's no such thing as Hash#pop, but the RDocs don't hint that Hash#shift takes a tremendous amount of time on a large Hash. I tried shifting a Hash last week. Ugh. It seems to go through the trouble of rehashing the whole table with every shift, but I wouldn't know this from the docs.

help(list) and help(dict) don't say anything about Python list and dict performance, nor does the Python Library Reference, as far as I can tell.

So which API's out there spell out performance characteristics? Is this information in docs limited to the world of C++ or something? I notice glancing at the JavaDocs for HashMap, Hashtable, and LinkedHashMap that they're reasonably informative, much more so than Ruby and Python. And I notice that the Perldocs for array pop and shift are reasonably informative.

Perhaps if API's were in the habit of spelling this stuff out, developers be more conscious of performance and less inclined to rely on Saint Moore.

Kay

Posts: 13 / Nickname: schluehk / Registered: January 20, 2005 5:46 AM

Re: How To Go Slow

February 12, 2008 11:03 AM

> help(list) and help(dict) don't say anything about Python
> list and dict performance, nor does the Python Library
> Reference, as far as I can tell.
>
> So which API's out there spell out performance
> characteristics?

API's? Hard to tell ;)

For sorting you can consult the spec:

http://svn.python.org/projects/python/trunk/Objects/listsort.txt

In this particular case people usually trust more on Tim Peters than on their own understanding. I argued with him one time on prime number generators and I wouldn't repeat it again.

art

Posts: 4 / Nickname: articulate / Registered: September 19, 2005 8:54 AM

Re: How To Go Slow

February 13, 2008 4:34 PM

I did not mean to bash library vendors for not documenting things.

What I do mean is that when I do:

root = (-b + sqrt(b*b - 4 * a * c)) / (2 * a)

I think about the fact that that if 4 * a * c is bigger than b * b my code fail. And even sometimes about precision; if things are large or small that could cause a large error or even a failure that would not happen in pure mathematics. I never think about what numbers will make sqrt slow.

I don't always expect the operation to complete in the same amount of time. Or even within any given known time bound.

Since I don't reason about the sum or the time (or memory) used by my code, there is no reason to expect that my code will run in any given bound of time.

All I do is reason about the big things (sorting was mentioned).

I feel it is like coding without reasoning about corner cases.

So why won't there be corner cases that are horribly slow?

Why don't I write unit tests that explore those cases and write assertions about the number of operations my code took?

Elizabeth

Posts: 7 / Nickname: ewiethoff / Registered: March 26, 2005 2:00 PM

Re: How To Go Slow

February 13, 2008 6:11 PM

> All I do is reason about the big things (sorting was mentioned).

It's hard to know what is big if the standard docs don't say. It's hard to know whether shifting (or popping) an element off a particular collection is big if the docs don't say.

If "shift"ing an arbitrary element off a Ruby Hash causes the entire hash table to get resized and rehashed, then shifting off all the elements one at a time is complexity O(nlogn) or O(n^2) (not sure off the top of my head). That puts what I tried to do a few days ago in the realm of "big things" like sorting. But I can't know ahead of time if the Hash#shift docs don't provide a clue.

Morgan

Posts: 37 / Nickname: miata71 / Registered: March 29, 2006 6:09 AM

Re: How To Go Slow

February 8, 2008 2:18 PM

He mentioned BubbleSort so I immediately stopped reading.

Bjarne

Posts: 48 / Nickname: bjarne / Registered: October 17, 2003 3:32 AM

Re: How To Go Slow

February 9, 2008 5:42 AM

> He mentioned BubbleSort so I immediately stopped reading.

I think you are being too cynical. This is a good article explaining basic sources of inefficiencies for people who don't want a PhD in complexity theory.

Elizabeth

Posts: 7 / Nickname: ewiethoff / Registered: March 26, 2005 2:00 PM

Re: How To Go Slow

February 9, 2008 11:15 AM

And Bubble Sort is very efficient if there are only a few items out of place. If I'm not mistaken, Python's built-in sort stabs at the data with a few different algos until it's fully sorted.

Jan

Posts: 4 / Nickname: asgard / Registered: February 9, 2008 4:33 AM

Re: How To Go Slow

February 9, 2008 11:25 AM

I believe that main cause of slow (and resource-hungry) programs today is the use of prefabricated libraries and components. These components are often too general for the purpose they are used for, and are often built from other components.

Each such layer will also add innecessary conversions of data and data structures.

Slowdown on these layers is mostly invisible during the testing, because today's systems are too powerful and have too much memory. The tests are usually designed to test for "perceived" slowness instead of theoretical expectations how much resource the component needs.

However, once you start stacking these components together, the inefficiencies will become apparent.

Also, I believe that (current) object-oriented programming paradigm is for the most part responsible for the situation (this opinion will be quite hated on Artima, I guess :)). In my opinion, object-oriented programs hide the fundamental data structures of the program in the vast networks of little objects. This prevents understanding and optimization of these data structures, thus preventing usage of optimal algorithms (the whole refactoring thing is just accommodating a different data structures to the program, so that better algorithms could be used).

There are ways out in my opinion. One is to take a look at VHDL, for instance. It is a very modular language (for hardware design), yet the final design is optimized as a whole - there are no redundant data conversions after the synthesis.

Second approach is to decouple the data structures used in the program from the actual program logic. Any program that uses a database essentially does that - the common point is the database schema, and how exactly (with what algorithms) it is implemented in the database is not relevant. The actual database implementation can then be optimized independently of the program, by choice of indexes for example. This should be done on much finer level, directly in memory. But this means to centralize data back from many little objects, which contradicts current OO paradigm.

James

Posts: 128 / Nickname: watson / Registered: September 7, 2005 3:37 AM

Re: How To Go Slow

February 10, 2008 11:52 AM

> Also, I believe that (current) object-oriented programming
> paradigm is for the most part responsible for the
> situation (this opinion will be quite hated on Artima, I
> guess :)). In my opinion, object-oriented programs hide
> the fundamental data structures of the program in the vast
> networks of little objects. This prevents understanding
> and optimization of these data structures, thus preventing
> usage of optimal algorithms (the whole refactoring thing
> is just accommodating a different data structures to the
> program, so that better algorithms could be used).

I disagree on a number of these points. I think it's a gross misunderstanding of OO. It's a common misconception that OO means every OO must have it's own internal copy of the data-structure it represents. Well designed OO is also much easier to optimize in a lot of cases because the data structures can be changed without affecting things depending on an object's public interface. Lastly I think that's the worst description of refactoring I've ever seen. In fact most refactorings are for modifying classes and object graphs without affecting the underlying structures or behaviors.

Elizabeth

Posts: 7 / Nickname: ewiethoff / Registered: March 26, 2005 2:00 PM

Re: How To Go Slow

February 9, 2008 1:14 PM

Jan wrote: "I believe that main cause of slow (and resource-hungry) programs today is the use of prefabricated libraries and components. These components are often too general for the purpose they are used for, and are often built from other components."

I hear what you're saying about prefab libraries and components. I think I've run into a situation in which the "obvious" prefabricated choice doesn't cut it.

Off and on in the past couple weeks I've worked on creating a little app for my beau. It's in JRuby for historical reasons, and it eats memory something fierce.

Well, a hash is the most obvious choice for processing the mound of data in the way it needs processing. The built-in hash structures that comes with Ruby, Python, Java, etc. store not only the "values" but also the "keys" used to access the values. (That's why you can ask these built-in hashes for a list of keys, and they'll cough them right up for you.) But, wouldn't you know, I'm not interested in ever seeing the keys again in my beau's app.

So a couple nights ago I created a fresh data structure, which is sort of a cross between an array and a hash, that doesn't bother to store the "keys" I feed it. Yes, it uses the built-in Array structure, but at least it takes less memory than the Hash. I still need to incorporate my mini-hash with the rest of the app and give it a whirl, though.

Greg

Posts: 9 / Nickname: gregc / Registered: June 23, 2004 5:24 AM

Re: How To Go Slow

February 9, 2008 3:21 PM

When all I care about is the values in awk's associative array I just use the values as the keys.

So instead of
array = x
and then
for(i=1; i<=n; ++i) x = array...
I write
++array[x]
and then
for (x in array) ...

And yes - it can be a good strategy to start with the what your language and library provide and then rewrite as needed.

James

Posts: 128 / Nickname: watson / Registered: September 7, 2005 3:37 AM

Re: How To Go Slow

February 10, 2008 0:09 PM

> Well, a hash is the most obvious choice for processing the
> mound of data in the way it needs processing. The built-in
> hash structures that comes with Ruby, Python, Java, etc.
> store not only the "values" but also the "keys" used to
> access the values. (That's why you can ask these built-in
> hashes for a list of keys, and they'll cough them right up
> for you.) But, wouldn't you know, I'm not interested in
> ever seeing the keys again in my beau's app.

If you can guarantee that your hashes are unique, then you can definitely implement a hash structure that doesn't keep references to the keys.

One thing that I've long considered a flaw in the Java Map class is that it puts far too much burden on the implementor. Often I just want an interface that has a single method: get(key) -> value.

It seems to me that this should be the base of a Map API and then there would be subclasses of this that would provide more functionality. That would make it possible to implement your structure as a standard map.

Mark

Posts: 48 / Nickname: mthornton / Registered: October 16, 2005 11:22 PM

Re: How To Go Slow

February 10, 2008 0:59 PM

There are a number of potential alternative designs for the collection classes. I sometimes wish for a separate interface for read only collections instead of just throwing an exception if a mutating method is called. The trouble is that most such alternatives greatly increase the number of interfaces and implementions.

The API we have is a compromise, where the classes/interfaces usually do rather more than we need in any one instance, but we don't have to remember so much. True this does cause more work for implementers, but most people merely use the provided classes, and for those that do implement collections the abstract collections do much of the work.

James

Posts: 128 / Nickname: watson / Registered: September 7, 2005 3:37 AM

Re: How To Go Slow

February 10, 2008 4:09 PM

> There are a number of potential alternative designs for
> the collection classes. I sometimes wish for a separate
> interface for read only collections instead of just
> throwing an exception if a mutating method is called. The
> trouble is that most such alternatives greatly increase
> the number of interfaces and implementions.

I see three map implementations (obviously the names would be better in real implementation):

interface MapA<V> {
   public V get(Object key);
}
 
interface MapB<K, V> extends MapA<V> {
   public Set<K> keySet();
}
 
interface MapC<K, V> extends MapB<K, V> {
   public put(K key, V value);
}

I'm leaving out the other utility methods of Map for simplicity.

> The API we have is a compromise, where the
> classes/interfaces usually do rather more than we need in
> any one instance, but we don't have to remember so much.
> True this does cause more work for implementers, but most
> people merely use the provided classes, and for those that
> do implement collections the abstract collections do much
> of the work.

Personally, I don't think this is too much to remember. And we wouldn't need any more implementations to have the current common reusable implementations. Each could implement MapC and implement the other two at the same time. And if my method will only look up values and never write to the map, it makes it clear in the method signature.

I do agree that most people use the provided classes but what I question is whether people do this because it's complicated to implement a Map. There are a lot of powerful approaches to programming we lose out on in Java because of the attitude that Maps have to be standalone classes. Being able to return an anonymous inner class that allows properties to be retrieved by name allows for a lot of simple reusable code. Think about how much simpler JavaBeans would be if they were an interface that returned a Map over the objects properties. You could even implement them without any hard-coding.

And the Map interface is not simpe to implement, even if you extend the AbstractMap interface. That class forces you to implement an entry set which requires implementing Map.Entry and an iterator.

James

Posts: 128 / Nickname: watson / Registered: September 7, 2005 3:37 AM

Re: How To Go Slow

February 11, 2008 6:40 AM

> interface MapA<V> {
>    public V get(Object key);
> }
> 
> interface MapB<K, V> extends MapA<V> {
>    public Set<K> keySet();
> }
> 
> interface MapC<K, V> extends MapB<K, V> {
>    public put(K key, V value);
> }

I was thinking about this and the proper interface hierarchy would be for MapC to extend MapA.

This would allow for a writable map with or without providing keys.

Cameron

Posts: 26 / Nickname: cpurdy / Registered: December 23, 2004 0:16 AM

Re: How To Go Slow

February 10, 2008 2:55 PM

> One thing that I've long considered a flaw in the Java Map
> class is that it puts far too much burden on the
> implementor. Often I just want an interface that has a
> single method: get(key) -> value.
>
> It seems to me that this should be the base of a Map API
> and then there would be subclasses of this that would
> provide more functionality.

Yup .. that's why we built our own AbstractKeyBasedMap (com.tangosol.util package), with a total of two abstract methods:

public abstract Object get(Object oKey);
protected abstract Iterator iterateKeys();

There are obvious / simple implementations for the use cases that you described.

Peace,

Cameron Purdy | Oracle
http://www.oracle.com/technology/products/coherence/index.html

Elizabeth

Posts: 7 / Nickname: ewiethoff / Registered: March 26, 2005 2:00 PM

Re: How To Go Slow

February 11, 2008 0:28 PM

>> Often I just want an interface that has a single method:
>> get(key) -> value.
> Yup .. that's why we built our own AbstractKeyBasedMap
> (com.tangosol.util package), with a total of two abstract
> methods:
> public abstract Object get(Object oKey);
> protected abstract Iterator iterateKeys();

Thanks, Cameron. Perhaps I'll download it and take a look.

Custom data structures aside... I made a happy discovery yesterday. I read that Java's GC runs in its own thread, and that gave me the idea of putting my thread to sleep. When I put to sleep for a bit the loops that generate lots of garbage, even just once in a while -- sleep(0.001) if rand(50).zero? -- my app is able to massage a much larger mound of data before thrashing for virtual memory.

The moral of this little episode: Read the fine print in the docs and books. For you might be able to help your built-in GC along without explicitly invoking it. (And explicitly invoking a full collection is usually not what you want anyway.)

Morgan

Posts: 37 / Nickname: miata71 / Registered: March 29, 2006 6:09 AM

Re: How To Go Slow

February 11, 2008 1:05 PM

Elizabeth - try Thread.yield(), which I think is slightly better than a short sleep.

If there is a possibility that your calculation thread will be interrupted, call the following method at various times in your processing:


/**
* Standard way to Check for interrupts
* @param message
* @throws InterruptedException
* @see http://www.forward.com.au/javaProgramming/HowToStopAThread.html
*/
public static void yieldForInterrupts(String message) throws InterruptedException
   {
      Thread.yield(); // let other threads have time
      if (Thread.currentThread().isInterrupted())
      {
         throw new InterruptedException(message);
      }
   }

Elizabeth

Posts: 7 / Nickname: ewiethoff / Registered: March 26, 2005 2:00 PM

Re: How To Go Slow

February 11, 2008 2:16 PM

Morgan Conrad wrote: "Elizabeth - try Thread.yield(), which I think is slightly better than a short sleep."

Thanks. This has prompted me to try Ruby's Thread.pass.

Greg Colvin wrote: "Well I'd say they enjoyed being reminded humorously of what they should know, as they related funny stories of their own, and they mostly despaired of ever reaching those who believe Saint Moore will save them."

I wonder if believing in Saint Moore is an age thing. I'm coming up on 50 and I've never believed in Saint Moore. Before computer hardware became downright disposable and could be purchased with disposable income, getting more memory, or a bigger hard drive, or a niftier graphics card, or a faster machine, or additional machines, was not an option. (Heck, I used to solder up RS-232 cables just to avoid forking out the dough for ready-made...)

Dave

Posts: 6 / Nickname: abrahams / Registered: October 6, 2004 10:03 AM

Re: How To Go Slow

February 10, 2008 2:00 PM

> I believe that main cause of slow (and resource-hungry)
> programs today is the use of prefabricated libraries and
> components. These components are often too general for the
> purpose they are used for, and are often built from other
> components.
>
> Each such layer will also add innecessary conversions of
> data and data structures.

Not necessarily. That's where generic programming shines: providing abstraction without penalty.

> However, once you start stacking these components
> together, the inefficiencies will become apparent.
>
> Also, I believe that (current) object-oriented programming
> paradigm is for the most part responsible for the
> situation

I agree with you, at least in part. Blindly treating the OO design paradigm as a desirable end in itself, or simply not having the flexibility to think differently can be a major problem. Much of the desktop software we run today has immense quantities of legacy code from the era when everything was supposed to be OO.

> In my opinion, object-oriented programs hide
> the fundamental data structures of the program in the vast
> networks of little objects. This prevents understanding
> and optimization of these data structures, thus preventing
> usage of optimal algorithms (the whole refactoring thing
> is just accommodating a different data structures to the
> program, so that better algorithms could be used).

Yes, it is interesting that programmers have spent years developing abstractions (like modules, functions, classes, etc.) to help us organize our code and manage complexity, and have mostly overlooked the abstraction of "what the program is actually doing." That's an algorithm. Again, generic programming is a paradigm that actually allows us to focus on and clarify the abstract algorithm without loss of efficiency.

> There are ways out in my opinion. One is to take a look at
> VHDL, for instance. It is a very modular language (for
> hardware design), yet the final design is optimized as a
> whole - there are no redundant data conversions after the
> synthesis.

It's comparatively easy to write such entirely declarative languages to address particular domains (DSLs), but I don't think anyone has really succeeded in making a general-purpose programming language that works that way.

> Second approach is to decouple the data structures used in
> the program from the actual program logic.

Bingo. Tease out the abstract algorithm so you can reason about its efficiency.

> Any program
> that uses a database essentially does that - the common
> point is the database schema, and how exactly (with what
> algorithms) it is implemented in the database is not
> relevant. The actual database implementation can then be
> optimized independently of the program, by choice of
> indexes for example.

Are you suggesting that programs that use databases don't have efficiency problems?

Jan

Posts: 4 / Nickname: asgard / Registered: February 9, 2008 4:33 AM

Re: How To Go Slow

February 11, 2008 10:32 PM

> > Each such layer will also add innecessary conversions
> > of data and data structures.
>
> Not necessarily. That's where generic programming shines:
> providing abstraction without penalty.

Yeah, you're right, generic programming in C++ (and also usage of non-virtual class system) allows this layer to be removed during compilation, so these abstractions give no penalty. But that's a special case in today's OO systems.

Probably it's just the whole static vs. dynamic thing. The more things you have static, you lose flexibility, but you can optimize much better.

> > There are ways out in my opinion. One is to take a look
> > at VHDL, for instance. It is a very modular language (for
> > hardware design), yet the final design is optimized as
> > a whole - there are no redundant data conversions after
> > the synthesis.
>
> It's comparatively easy to write such entirely declarative
> languages to address particular domains (DSLs), but I
> don't think anyone has really succeeded in making a
> general-purpose programming language that works that way.

The generic programming you mention is actually an example of paradigm that comes close to that in this respect.

> > Second approach is to decouple the data structures used
> > in the program from the actual program logic.
>
> Bingo. Tease out the abstract algorithm so you can reason
> about its efficiency.

Actually, I have an idea in mind based on that, below.

> > Any program
> > that uses a database essentially does that - the common
> > point is the database schema, and how exactly (with
> > what algorithms) it is implemented in the database is not
> > relevant. The actual database implementation can then
> > be optimized independently of the program, by choice of
> > indexes for example.
>
> Are you suggesting that programs that use databases don't
> have efficiency problems?

Well, they may have efficiency problems, but part of these problems can be easily solved by hinting the database (by normalizing the schema and creating the correct indexes). So you don't need to touch the program logic at all.

And that's my point. The program logic and the data structure implementation are somehow "decoupled" in programs that use databases (actually, it's still muddy if you use stored procedures, but that's another story).

Let me tell you more about my pet research project (which unfortunately, for the time being, competes with my other pet research projects).

I imagine a language where every data structure (especially the global ones, which cannot be described in terms of standard data structure) would be designed like a database scheme is usually designed. The actual implementation of the schemes would be thus completely decoupled from the program logic.

I even imagine that scheme implementation would be created by compiler (two manual passes - in first, compiler creates a test implementation, then you run tests of your program, you gather statistics about how the schemes are used, and in second pass, the compiler will use the gathered statistics as well as other user's input to create efficient implementation of the data structures).

I think this approach would have many advantages, most importantly, possibility to change the data structures used at will, depending on the change of requirements of program logic.

There are many interesting ideas in this direction. For example, compiler could implement the scheme in completely different way, and use the original scheme as a view of it, so the original scheme would be translated during compilation to the new scheme. You could even have older parts of program to reference the data structures in old way, while having newer parts to reference them in new way - like versioning of data structures.

Other idea is to have "virtual columns" in the scheme, which would be computed from values in other columns. The compiler would decide if it's worth to cache this value in the table, or if it should compute it at each use of this value (depending on how many inserts/reads you do and how difficult is to obtain the value).

So you see, there are bazillion possibilities, and also lot of interesting problems to overcome. However, it contradicts to current OOP paradigm (read: languages like Java, C#, Ruby, Groovy) in two fundamental ways:

1. Centralization of data structures - obviously, since I want to emphasize them. This contradicts having networks of small objects implementing the data structures.

2. The data structures are static for the run of program - if I want to optimize them, I need to know the type of usage ahead. This contradicts having dynamic language.

I generally find the idea of compiler that selects data structures for you depending on your actual needs and goals (speed, memory footprint) very intriguing, similar to the way that current compilers select instructions and registers for you instead of using assembler and doing it yourself.

James

Posts: 128 / Nickname: watson / Registered: September 7, 2005 3:37 AM

Re: How To Go Slow

February 12, 2008 6:22 AM

> So you see, there are bazillion possibilities, and also
> lot of interesting problems to overcome. However, it
> contradicts to current OOP paradigm (read: languages like
> Java, C#, Ruby, Groovy) in two fundamental ways:
>
> 1. Centralization of data structures - obviously, since I
> want to emphasize them. This contradicts having networks
> of small objects implementing the data structures.

As I already mentioned, this represents a misunderstanding of OO.

> 2. The data structures are static for the run of program -
> if I want to optimize them, I need to know the type of
> usage ahead. This contradicts having dynamic language.

Java and C# are static languages.

> I generally find the idea of compiler that selects data
> structures for you depending on your actual needs and
> goals (speed, memory footprint) very intriguing, similar
> to the way that current compilers select instructions and
> registers for you instead of using assembler and doing it
> yourself.

This is something that could be achieved through an OO approach. I don't know of anything that does this (I don't think runtime optimizations in the JVM or CLR like method inlining counts.) Because OO hidesthe underlying data-structures from what is using them, the data structure can often be replaced without affecting the caller.

Jan

Posts: 4 / Nickname: asgard / Registered: February 9, 2008 4:33 AM

Re: How To Go Slow

February 12, 2008 0:05 PM

> > This contradicts having networks
> > of small objects implementing the data structures.
>
> As I already mentioned, this represents a misunderstanding
> of OO.

No, you don't understand. There are several layers of data structures in any large program. Let me explain on example what I have in mind:

Imagine your data are like two tables of records, which are related. So you have a class A and B for respective records in those tables. The class A contains (among other things) string class S (points to its instance, to be precise). Then you have classes AA and BB for these tables perhaps, which again point to all records inside those tables. And finally, you have class C which points to AA and BB instances and contains some relation between the records in these two tables.

Now, my point is, even if all the algorithms in classes A, B, C, AA, BB would be optimal, there still could be some unoptimal remnants in the whole scheme. For example, it may happen that class S always stores length of the string, but I just don't need it anywhere in any of the other classes. It may happen that class A is implemented as an array of pointers to the instances, but it would be much more useful from cache standpoint, if it would just have the data from it's instances together. It may happen that class C always requires only a specific set of records from the classes AA and BB, so it would be useful, if they would contain also pointers to a subset of these records stored somewhere.

If I had just an empty paper and see the grand scheme of things from the start (all the classes), I would certainly define the structure of all data in memory very differently, so it could be fast for the problem at hand, class C.

The problem here is thus that the top layer (class C) only sees layer with classes AA and BB, this layer sees only layer with classes A and B, and so on.

There are so much possible inefficiencies from these 5 classes that it's amazing, and consider that large OO programs have hundreds of classes, and tens of such layers.

I believe that is actual reason why today's programs are so slow and memory-hungry. There is just no single simple piece of code that can be easily optimized. It's a big collection of such little things described above, accumulated over the layers.

That's why I am pointing to databases for inspiration - database schemas have only few layers. There are basic datatypes, maybe arrays of them, then records (table rows), tables and their relations. There are always 5 layers, and their structure and purpose is clear, and the optimization is (relatively to OOP) easy because of that.

I explain below why I think Java or C# cannot deal with this problem. I hope you see now why I feel there is need of centralized management of data structures, and compilers that are aware of the data structures in their all complexity.

> > 2. The data structures are static for the run of program
> -
> > if I want to optimize them, I need to know the type of
> > usage ahead. This contradicts having dynamic language.
>
> Java and C# are static languages.

Yes, they're statically typed, but they must preserve the class structure, to be able to handle overloaded classes and for reflection. It means that they cannot optimize classes away like C++ can do (if virtual methods are not used, that is). In this sense they are more dynamic, because they cannot base their optimizations on implementation of other classes. You see, it's a dilemma - either you optimize but you have to know in advance, or you take advantage of not knowing in advance, and then it's hard to optimize.

The cases that need dynamic features (like handling of unknown inherited classes and reflection) are like 1% of all cases. It's wrong to slow down everything else (by making it impossible to optimize) just because these features are sometimes useful.

Mark

Posts: 48 / Nickname: mthornton / Registered: October 16, 2005 11:22 PM

Re: How To Go Slow

February 12, 2008 1:02 PM

James

Posts: 128 / Nickname: watson / Registered: September 7, 2005 3:37 AM

Re: How To Go Slow

February 12, 2008 1:17 PM

> > > Java and C# are static languages.
> >
> > Yes, they're statically typed, but they must preserve
> the
> > class structure, to be able to handle overloaded
> classes
> > and for reflection. It means that they cannot optimize
> > classes away like C++ can do (if virtual methods are
> not
> > used, that is). In this sense they are more dynamic,
> They can and do optimise away virtual invovations if at a
> point in time there are no overriding methods. If a class
> is later loaded which overrides such a method then the
> optimisation is reversed. The JIT has complete knowledge
> of the classes actually present, and in principle can do
> any optimisation that such information makes possible.

In addition escape analysis has been implemented successfully in JVMs. This allows the VM to determine that an object reference never leaves the stack and can then 'extract' the members of the object directly onto the stack as variables. It doesn't take much imagination to see that this can be done recursively to eliminate many layers of indirection.

James

Posts: 128 / Nickname: watson / Registered: September 7, 2005 3:37 AM

Re: How To Go Slow

February 12, 2008 1:42 PM

> > > This contradicts having networks
> > > of small objects implementing the data structures.
> >
> > As I already mentioned, this represents a
> misunderstanding
> > of OO.
>
> No, you don't understand. There are several layers of data
> structures in any large program. Let me explain on example
> what I have in mind:
>
> [snip]
>
> If I had just an empty paper and see the grand scheme of
> things from the start (all the classes), I would certainly
> define the structure of all data in memory very
> differently, so it could be fast for the problem at hand,
> class C.
>
> The problem here is thus that the top layer (class C) only
> sees layer with classes AA and BB, this layer sees only
> layer with classes A and B, and so on.

This is exactly the misunderstanding I am referring to. And it's a misconception that a lot of people developing OO code believe.

Taking your example, the main error is the assumption that because AA is defined as a set of A instances, that AA must be implemented physically as an object that points to a number of instances of A. It's one possible solution but not mandated in any way.

Instead A can be implemented as an indirection over AA. For example (Java code):

class AA {
  final byte[] data;
  final int recordLength;
 
  AA(byte[] data, int recordLength) {
    this.data = data;
    this.recordLength = recordLength;
  }
 
  List<A> getAs() {
    return new AbstractList<A>() {
      public A get(final int index)
      {
        return new A() {
          public String getValue()
          {
            return new String(data, index * recordLength, recordLength);
          }
        };
      }
      
      public int size()
      {
        return data.length / recordLength;
      }
    };
  }
}
 
interface A {
  String getValue();
}

Jan

Posts: 4 / Nickname: asgard / Registered: February 9, 2008 4:33 AM

Re: How To Go Slow

February 13, 2008 10:09 PM

> Instead A can be implemented as an indirection over AA.
> For example (Java code):

Ok, so you have shown me it's possible to break layers in Java. However, I have two objections:

1. Is it still OOP? What happened to the encapsulation principle? (I know that the "correct" set of OOP principles is fuzzy and may vary from person to person.) I don't see this technique being the canonical way to program in Java, just like it's not canonical to have your program in one class, with the instance variables as globals (though it's certainly possible to program this Cobol way too).

2. How many programs out there are written like that? If the language doesn't make it easy or obvious to write programs this way, even if the technique may be superior, it won't help in practice.

In fact, I would probably rather deal with program that has networks of small objects than program of the same complexity written this way.

I don't want to argue much. I simply believe it can be done better, essentially, that compiler can select good implementation for A (or even several different implementations) without much fuss on the side of programmer. If you don't believe that, fine, at least you won't compete with me trying to figure that out. :)

P.S. Your example looks more like aspect-oriented programming to me (I don't know much about it, just a thought), having a set of big classes which are mostly accessed through interfaces that expose some part of them.

Mark

Posts: 48 / Nickname: mthornton / Registered: October 16, 2005 11:22 PM

Re: How To Go Slow

February 14, 2008 1:29 AM

> 1. Is it still OOP? What happened to the encapsulation
> principle? (I know that the "correct" set of OOP
Sometimes, if efficiency is critical, you have to compromise.

I have been experimenting with some Java code for matrix multiplication. The classical algorithm with a fully encapsulated matrix achieves 90MFLOPS, but code that is modified to take better advantage of the processor (the test matrices don't fit in the Level2 cache, and it has 4 cores) manages 6000MFLOPS. I don't think compilers for any language are yet able to automatically make all the transformations involved in getting that performance gain.

James

Posts: 128 / Nickname: watson / Registered: September 7, 2005 3:37 AM

Re: How To Go Slow

February 14, 2008 6:48 AM

> > Instead A can be implemented as an indirection over AA.
> > For example (Java code):
>
> Ok, so you have shown me it's possible to break layers in
> Java. However, I have two objections:

One caveat. OO is extremely poorly defined. Many people disagree about what is and isn't OO. Particularly, people from the SmallTalk school of OO have different ideas about OO than the C++ school of OO.

> 1. Is it still OOP? What happened to the encapsulation
> principle?

I consider this encapsulated. How is it any less encapsulated? Neither AA nor A expose their implementation to the outside. One of the fallacies (IMO) of novice OO is that all objects must be standalone. OO doesn't prescribe how you implement objects internally, it's about how they appear to the outside.

> (I know that the "correct" set of OOP
> principles is fuzzy and may vary from person to person.) I
> don't see this technique being the canonical way to
> program in Java,

Well, A lot of people don't do it. I'll give you that. But if you go and look at the source for the JDK, you'll find this kind of thing done often. One good example is that when a StringBuffer (and I imagine StringBuilder) is converted to a String, the newly created String will actually use the underlying array of the StringBuffer.

Another good example is that the HashSet class is really just the keyset of a HashMap instance. Look at HashMap and how entryset and keyset are implemented and you'll find that the actually underlying structure is an array.

> just like it's not canonical to have your
> program in one class, with the instance variables as
> globals (though it's certainly possible to program this
> Cobol way too).
>
> 2. How many programs out there are written like that? If
> the language doesn't make it easy or obvious to write
> programs this way, even if the technique may be superior,
> it won't help in practice.

This pretty much the whole point of inner classes in Java.

> In fact, I would probably rather deal with program that
> has networks of small objects than program of the same
> complexity written this way.

These are small objects. They are actually smaller than one that contains it's own data because they only contain a few pointers.

I think it's probably more that the code is foreign than it is complex. The technique allows a large monolithic structure with strong internal binding to appear to be a network of related objects. For example, instead of passing an array, index, and length to a method and hope that the method handles it properly, I can had something that is equivalent in performance but requires no understanding of the structures. The decreases complexity at the level that matters.

And really the point here is that you can start off with the kind of standard solution you describe and then if performance becomes an issue, move to a more sophisticated one. But often it's easier to do things this way. It just takes a little getting used.

> I don't want to argue much. I simply believe it can be
> done better, essentially, that compiler can select good
> implementation for A (or even several different
> implementations) without much fuss on the side of
> programmer. If you don't believe that, fine, at least you
> won't compete with me trying to figure that out. :)

I agree that things could be done better and I think you've got some really interesting and maybe even new ideas on this. I don't want to discourage you at all. My point is that you are ignoring that there is already at least one paradigm that provides a possible framework path for what you have imagined. New achievements are almost always built upon the work of others.

In a nutshell, you are talking about changing the internal data-structures of a program without changing the program. OO already has solved some, but not all of this problem. For example, when I write code to work with a List, the list provided can be array-backed or a linked list or something else entirely. My code will still do the same thing. If I understand, this is a core piece of your idea.

Mark

Posts: 48 / Nickname: mthornton / Registered: October 16, 2005 11:22 PM

Re: How To Go Slow

February 14, 2008 7:25 AM

> One good example is that when a StringBuffer (and I
> imagine StringBuilder) is converted to a String, the
> newly created String will actually use the underlying
> array of the StringBuffer.

That used to be true for StringBuffer, but I don't think it is any more. There was a long series of performance 'bugs' where each attempted fix caused a problem for someone else. StringBuilder's lack of synchronization meant it could never use the buffer sharing trick (it would have been a security loophole).

James

Posts: 128 / Nickname: watson / Registered: September 7, 2005 3:37 AM

Re: How To Go Slow

February 14, 2008 9:20 AM

> > One good example is that when a StringBuffer (and I
> > imagine StringBuilder) is converted to a String, the
> > newly created String will actually use the underlying
> > array of the StringBuffer.
>
> That used to be true for StringBuffer, but I don't think
> it is any more. There was a long series of performance
> 'bugs' where each attempted fix caused a problem for
> someone else. StringBuilder's lack of synchronization
> meant it could never use the buffer sharing trick (it
> would have been a security loophole).

You're right. I'm kind of curious as to what the issue was. It seems like a sound idea on the surface (as long as the structure is synchronized.)

What would be really great is if escape analysis could be expanded upon to include eliminating copying in situations where the original copy is never used after the copy. I'm not sure it's feasible but who knows. It would also seem that a good bit of escape analysis could be done statically but I've not seen anything suggesting that this is being done.

Mark

Posts: 48 / Nickname: mthornton / Registered: October 16, 2005 11:22 PM

Re: How To Go Slow

February 14, 2008 0:02 PM

> You're right. I'm kind of curious as to what the issue
> was. It seems like a sound idea on the surface (as long
Interaction with the sizing policy for the StringBuffer. So a 12 character String might be sharing a 4000 character array from the buffer. On the other hand aggressively reducing the buffer size caused some applications to run more slowly. So large buffers were expected by some applications, but this caused other applications to use excessive memory. Abandoning the sharing caused many applications to run a bit slower, but also eliminated the pathological bad cases (which were very bad).

James

Posts: 128 / Nickname: watson / Registered: September 7, 2005 3:37 AM

Re: How To Go Slow

February 14, 2008 0:25 PM

> > You're right. I'm kind of curious as to what the issue
> > was. It seems like a sound idea on the surface (as
> long
> Interaction with the sizing policy for the StringBuffer.
> So a 12 character String might be sharing a 4000 character
> array from the buffer. On the other hand aggressively
> reducing the buffer size caused some applications to run
> more slowly. So large buffers were expected by some
> applications, but this caused other applications to use
> excessive memory. Abandoning the sharing caused many
> applications to run a bit slower, but also eliminated the
> pathological bad cases (which were very bad).

Maybe not a good idea at the scope of the JDK but this couldn't that be solved by switching on a system property? Then those apps that had the 'pathological' cases could turn off the sharing without affecting everyone else.

Gregor

Posts: 6 / Nickname: gregor / Registered: August 3, 2005 4:18 AM

Re: How To Go Slow

February 18, 2008 11:44 PM

> > Java and C# are static languages.
>
> Yes, they're statically typed, but they must preserve the
> class structure, to be able to handle overloaded classes
> and for reflection. It means that they cannot optimize
> classes away like C++ can do (if virtual methods are not
> used, that is).
Java and C# are not alike in this resprect.

In Java, all methods are indeed virtual. The VM may inline the known methods implementations anyway, but I must still check for new method implementations (from a derived class loaded via dynamic class loading). This is called hotspot optimization.

In C#, methods are non-virtual, unless the virtual keyword is used. Given the fact that the CIL is always JITted, these methods are usually inlined.

In addition, C# has an extra construct, which is designed for performance: value types. These are stack-allocated, final/sealed classes - which guarantees that they can always be inlined where used as a field. In fact, all pointers to value types are inlined.

Morgan

Posts: 37 / Nickname: miata71 / Registered: March 29, 2006 6:09 AM

Re: How To Go Slow

February 9, 2008 3:14 PM

> > He mentioned BubbleSort so I immediately stopped
> reading.
>
> I think you are being too cynical. This is a good article

So, I reread it. Here are some pearls of wisdom..

"Looping also amounts to How To Run Hot, where heat means wasted energy and fried equipment, and cooling also wastes a lot of expensive energy."

"You still may never get out of the loop, but at least you won't be hogging a CPU the whole time."

"The easiest way to avoid this is to avoid languages with undefined behavior"

Other than resource locking and some C specific advice of how best to iterate over a 2D array, there's nothing useful there. Do youactually consider "Don't write infinite loops", "Dont use Bubble Sort" and "Don't dereference NULL" to be useful advice.

What if I wrote an article on extreme programming best practices with advice like "don't drop the monitor on your groin", "avoid pair programming with a schizophrenic psychopath" and "don't delete your entire CVS repository by mistake"? That's pretty close to his advice. If it were closer to April I'd think the article is a deliberate joke.

But it's inspired me to write one. Hey Artima, how do I get to write something for you?

Cameron

Posts: 26 / Nickname: cpurdy / Registered: December 23, 2004 0:16 AM

new guru emerges

February 9, 2008 3:30 PM

> What if I wrote an article on extreme programming best
> practices with advice like "don't drop the monitor on your
> groin", "avoid pair programming with a schizophrenic
> psychopath" and "don't delete your entire CVS repository
> by mistake"?

Obviously you are a guru .. ;-)

Peace,

Cameron Purdy | Oracle
http://www.oracle.com/technology/products/coherence/index.html

Greg

Posts: 9 / Nickname: gregc / Registered: June 23, 2004 5:24 AM

Re: How To Go Slow

February 9, 2008 3:57 PM

Well yes, of course I'm deliberately joking, so perhaps Chuck should have waited until April. But there is a serious point to my jokes, which is to get people to pay attention to performance.

Of course no one purposely loops infinite, dereferences a null pointer, chooses a quadratic algorithm over a linear one, or otherwise writes bad code. And indeed you'd think no one needs to be reminded. Nonetheless, programs too often loop infinite, crash, and exhibit non-linear degradation in performance. So I gave a few examples of how that can happen.

The upshot of the complexity section was to first check the C++ standard library,which carefully documents the complexity of its algorithms. And if it doesn't have what you need then be sure you know the complexity of the code you use or write instead. Again, this should be obvious. Nonetheless, the programs on my computer spend a lot of time chewing up resources to no apparent purpose. So I gave an example of two similar functions with much different computational complexity.

The suggestion to use languages with no undefined behavior was serious. There are a lot available that work well with C++ (e.g. Boost.Python) and you can also implement custom mini-languages in C++ to make particular tasks easier to do safely. You just need to beware of the performance tradeoffs.

And I'm rather liking your idea of an Extreme Programming Worst Practices article. If you submit it here we'd be happy to review it: http://www.artima.com/write.html

Morgan

Posts: 37 / Nickname: miata71 / Registered: March 29, 2006 6:09 AM

Re: How To Go Slow

February 9, 2008 8:15 PM

> Well yes, of course I'm deliberately joking, so perhaps
> Chuck should have waited until April.
...
> And I'm rather liking your idea of an Extreme Programming
> Worst Practices article. If you submit it here we'd be
> happy to review it: http://www.artima.com/write.html

Thanks for the link! Hopefully I'll be inspired to write such an "article". :-)

James

Posts: 128 / Nickname: watson / Registered: September 7, 2005 3:37 AM

Re: How To Go Slow

February 10, 2008 11:39 AM

> Well yes, of course I'm deliberately joking, so perhaps
> Chuck should have waited until April. But there is a
> serious point to my jokes, which is to get people to pay
> attention to performance.

I'll try to make the point that I think Morgan is making in a more constructive tone.

This article is a good one for developers don't know their asses from a holes in the ground. In my experience that's probably anywhere from 70 - 85% of developers. However, the portion of those developers that are reading Artima articles is pretty small. I think this article is inappropriate for the audience.

Probably every person reading this already knows to yield in a loop when you are polling. Everybody reading this knows that bubblesort is for boneheads. And finally everyone reading this knows it's impossible to write a Java program that doesn't crash or hang.

Greg

Posts: 9 / Nickname: gregc / Registered: June 23, 2004 5:24 AM

Re: How To Go Slow

February 11, 2008 9:18 AM

Well, James, I know from my email that the engineers I respect the most enjoyed this, so I'd say I hit my audience. And I know from how slowly some programs run that engineers who should know all this don't always put it into practice.

James

Posts: 128 / Nickname: watson / Registered: September 7, 2005 3:37 AM

Re: How To Go Slow

February 11, 2008 9:39 AM

> Well, James, I know from my email that the engineers I
> respect the most enjoyed this, so I'd say I hit my
> audience.

Did it tell them anything they didn't already know or did they think it was good for other people to read?

> And I know from how slowly some programs run
> that engineers who should know all this don't always put
> it into practice.

Do you believe telling them again (assuming they read this) will cause them to start putting these basic principles into practice?

Greg

Posts: 9 / Nickname: gregc / Registered: June 23, 2004 5:24 AM

Re: How To Go Slow

February 11, 2008 10:16 AM

Well I'd say they enjoyed being reminded humorously of what they should know, as they related funny stories of their own, and they mostly despaired of ever reaching those who believe Saint Moore will save them.

James

Posts: 128 / Nickname: watson / Registered: September 7, 2005 3:37 AM

Re: How To Go Slow

February 11, 2008 10:44 AM

> Well I'd say they enjoyed being reminded humorously of
> what they should know, as they related funny stories of
> their own, and they mostly despaired of ever reaching
> those who believe Saint Moore will save them.

Well, if your goal was to just amuse people then maybe you hit your audience. I kind of got the feeling you were attempting to educate people.

I think this is a good article for junior developers. I just don't think this is the place where you will get those eyeballs. If you are reposting it in other places, however, you might. I'm just explaining the reaction you received from people here. I am not criticizing the article. I will point out that your subtle dig at Java was pointless and needlessly partisan.

Walter

Posts: 5 / Nickname: walterb / Registered: July 22, 2004 8:48 AM

Re: How To Go Slow

February 12, 2008 11:32 AM

> Do you believe telling them again (assuming they read
> this) will cause them to start putting these basic
> principles into practice?

No matter how much I learn, I sometimes find it useful to once again go over the basic priniciples. This is especially true when dealing with a craft.

Matt

Posts: 62 / Nickname: matt / Registered: February 6, 2002 7:27 AM

Re: How To Go Slow

March 6, 2008 1:19 AM

The silly loops with

goto

s make me suspect it was a joke.

I think all the algorithm stuff missed the point and just slips back into the nerdy safty zone of premature optimization. What I mean is that it seems like a lot slow code is not because of algorithm choice, but simply bad code.

The example of a slow app that you started out with was an installer. It may be that the installer is so slow because it is doing things like checking whether a file is already there, and if it is there, whether it has some reference count in the registry (assuming Windows) from a previous install, then comparing of the version of the already installed file to the version in the compressed archive. If it is doing all these things in such a way that it decompresses the file more than once, or accesses the file on disk many times (to get its size, then its last modified date, version, etc.) it performance will suck. Ironically, it can actually be a case where code that is well-decoupled and nicely object oriented performs worse than code that may be special-casing to gather all the registry information for all files in one shot, then all the information from the archive in another.

As far as the "blame the library" comments, I bet most of the time you shouldn't blame the library. Instread but you should blame your own code that is resorting the same array a thousand times.

Matt

Posts: 62 / Nickname: matt / Registered: February 6, 2002 7:27 AM

Re: How To Go Slow

March 6, 2008 1:21 AM

Oops, I should have used the code tag, not the java tag which adds all those empty lines.

I guess I am out of practice.

Yuriy

Posts: 1 / Nickname: yuriy / Registered: May 10, 2008 7:07 AM

Re: How To Go Slow

May 10, 2008 1:08 PM

Yes, you are right. I can say more: bad programmers, bad design and virtual machines lead to the bad and slow code. By the way I have more then 7 years C/C++ experience and seems I have never used any kind of sorting algorithm in real projects.

What do you think about XML RPC over HTTP? We forgot that PC is not a text processor - it is binary device and it handles binary protocols much more efficiently. Who is interested in reading of HTTP headers? - Nobody. So why they are plain text? This example is not just one case - it is a very common practice. I saw many times that using of another approach in design can increase speed of execution up to 10 times.

And about Java, C# and others based on virtual machine languages. Most of the Java programs start so slowly that you can drink a cup of coffee while waiting. So the Moore's Law is not a law anymore, software won the battle.
What they achieve? Bad developers can write good program in Java? - Not. I think the main complexity of modern programming lies in requirements and design.

On my opinion C++ has a lack of good cross-platform GUI library. If somebody invest into that one tenth of money spent on Java or .Net I will not see any competitors to C++.

P.S. Seems sooner or later I will be Java or C# programmer because that is 90% of all jobs. :(

Jim

Posts: 1 / Nickname: jimxoch / Registered: December 10, 2008 6:00 AM

Re: How To Go Slow

December 11, 2008 11:25 AM

Nice and interesting article, however I don't really agree with the following statement:

"For example, the search algorithms in the C++ Standard are O(log(N)) when used with random-access iterators, but O(N) when used with input iterators."

What I know about the C++ search algorithms is rather different:

A) std::search() original complexity: O(N*M) [1]
Implementations with better complexity have been presented [4] but not yet adopted by the most notable STL implementations. [7,8,9]

B) std::search_n() original complexity: O(N*m) - O(N) [2]
A search_n specialization for random access iterators has been presented in the year 2004 [5] and has been soon adopted by some STL implementors. [7,8] Another STL implementor has also adopted a similar search_n specialization a little later. [9]

C) std::find_first_of() original complexity: O(N*M) [3]
A find_first_of specialization for one-byte integers has been presented in the year 2007 and has been soon adopted by one STL implementor. [8]

References:

1. The std::search STL algorithm
http://www.sgi.com/tech/stl/search.html
2. The std::search_n STL algorithm
http://www.sgi.com/tech/stl/search_n.html
3. The std::find_first_of STL algorithm
http://www.sgi.com/tech/stl/find_first_of.html

4. David R. Musser and Gor V. Nishanov, "A Fast Generic Sequence Matching Algorithm",
Rensselaer Polytechnic Institute Computer Science Technical Report 97-11 (without appendices, 1997);
full version available at: http://www.cs.rpi.edu/~musser/gp/gensearch1.pdf (revised Feb. 2, 2001).
5. Can search_n be more efficient?
http://www.codeproject.com/KB/stl/SearchN.aspx
6. find_first_of: A performance pitfall among the STL algorithms
http://www.codeproject.com/KB/stl/find_first_of.aspx

7. The STL shipped with the GCC-C++ compiler
http://gcc.gnu.org/libstdc++/
8. The STLport STL implementation
http://stlport.sourceforge.net/
9. The STL shipped with the MS-VC++ compiler
http://msdn.microsoft.com/en-us/library/ct1as7hw(VS.80).aspx

Best regards
Jim Xochellis

Homepage:
http://geocities.com/jimxoch/