The Artima Developer Community

Weblogs Forum
Associative Fold (i.e. Reduce) as a Primitive Operation

11 replies on 1 page. Most recent reply: May 3, 2006 10:09 AM by Max Lybbert

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Associative Fold (i.e. Reduce) as a Primitive Operation
Posted: Apr 29, 2006 3:59 PM
Summary
I still don't quite get it: shouldn't every new language support a primitive reduce() operation in anticipation of the brave new parallel world?
So the last I heard, reduce() is still on its way out of Python 3000, and I still don't get the rationale; something about it making code hard to read. As far as I can tell, having a reduce() operation as a primitive means that a compiler can easily parallelize the code. Or am I missing something here? Hasn't Google proven that? And it's not just Python that I am concerned about. What about Microsoft's CLI, Perl's Parrot, or Java's bytecode? They all seem to be stuck in a single-processor mode of thinking. I can't think of any good reason not to start embracing parallel operations at the lowest level.


Kay Schluehr

Posts: 302
Nickname: schluehk
Registered: Jan, 2005

Re: Associative Fold (i.e. Reduce) as a Primitive Operation Posted: Apr 29, 2006 11:37 PM
> So last I heard reduce() is still on its way out of Python
> 3000. I still don't get the rationale. Something about it
> making code hard to read. As far as I can tell, having a
> reduce() operation as a primitive means that a compiler
> can easily parallelize the code. Or am I missing something
> here?

Aren't you confusing the requirements of reduce() (or fold()) with those of map()? While map() is obviously parallelizable whenever the underlying container can be distributed across several processors, reduce() is a recursively (or iteratively) defined functional analog of a for-loop, and therefore necessarily sequential.
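To make the distinction concrete, here is a small sketch (in Python, since the thread started with Python's reduce()) showing that reduce() is just the functional spelling of an accumulating for-loop, evaluated strictly left to right. The helper name fold_left is mine, not part of any library:

```python
from functools import reduce

def fold_left(f, seq, init):
    """A for-loop spelling of reduce(): strictly left-to-right."""
    acc = init
    for x in seq:
        acc = f(acc, x)
    return acc

data = [1, 2, 3, 4]

# Both evaluate as ((((0 - 1) - 2) - 3) - 4), i.e. -10.
assert fold_left(lambda a, b: a - b, data, 0) == \
       reduce(lambda a, b: a - b, data, 0) == -10
```

Read naively, every application of f depends on the previous accumulator, which is exactly the sequential data dependence Kay describes.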

Tim LS

Posts: 37
Nickname: parchandri
Registered: Jul, 2005

Re: Associative Fold (i.e. Reduce) as a Primitive Operation Posted: Apr 30, 2006 8:53 AM
Here is my understanding. :)

Basically it comes down to this: it is easier to grok a for loop than to grok the new concepts of 'associative operation' and 'reduce'.

It also lets people write wrong code they can't debug. As you say, the operation has to be associative. That's fine for addition, multiplication, and probably a lot of things, but if you forget that and use it with a non-associative operation, the compiler won't check it, the results will depend on how the work was parallelized, and the failure will be opaque to the user.
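A quick illustration of that failure mode, using subtraction (which is not associative). The two-worker split below is a hand-simulated stand-in for whatever a parallelizing runtime might actually do:

```python
from functools import reduce
import operator

data = [8, 4, 2, 1]

# Sequential left fold: ((8 - 4) - 2) - 1 == 1
sequential = reduce(operator.sub, data)

# A two-worker split that silently assumes associativity:
# each half is folded independently, then the partials are combined.
# (8 - 4) - (2 - 1) == 3
half = len(data) // 2
parallel = operator.sub(reduce(operator.sub, data[:half]),
                        reduce(operator.sub, data[half:]))

# Subtraction is not associative, so the two answers disagree,
# and which one you get depends on how the work was split.
assert sequential == 1 and parallel == 3
```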

Another problem is that it's not so easy to optimise in the general case. Dealing with objects, you could spawn a worker thread to process the back half of the list separately, but four times out of five there might be a penalty, or no benefit, from that parallelization.

High-performance code is not the first goal of Python, either. If you're doing something really computationally intensive in Python, you're probably better off writing it in C and wrapping it in Python anyway, because of the interpreter overhead.

Having two ways to do things makes code harder to read, too. Your codebase can become inconsistent due to personal preference, or perhaps because reduce performs faster for a particular case.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Associative Fold (i.e. Reduce) as a Primitive Operation Posted: Apr 30, 2006 9:09 AM
> Don't you confuse the requirements of reduce() ( or fold()
> ) with those of map()?

Someone at Lambda-the-Ultimate pointed out that I am confused with aggregate functions: http://lambda-the-ultimate.org/node/1438

> While map() shall obviously be
> parallelizable if the underlying container could be
> distributed across several processors, reduce() is a
> recursively ( or iteratively ) defined functional analog
> of a for-loop and necessarily sequential.

Some uses I see of reduce can be parallelized (for instance, the sum or product of a set). The only requirement for parallelizing reduce seems to be that the function being applied is associative.
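A sketch of that idea in Python: fold each chunk independently (the step that could run on separate processors), then fold the partial results. The helper chunked_reduce is hypothetical, and the "workers" here are simulated sequentially; the point is only that associativity makes the regrouping safe:

```python
from functools import reduce
import operator

def chunked_reduce(f, seq, n_chunks):
    """Fold each chunk independently (as separate workers could),
    then fold the partial results. Correct only when f is associative."""
    size = max(1, len(seq) // n_chunks)
    chunks = [seq[i:i + size] for i in range(0, len(seq), size)]
    partials = [reduce(f, c) for c in chunks]   # the parallelizable step
    return reduce(f, partials)                  # the combine step

data = list(range(1, 101))

# Addition and multiplication are associative, so any chunking
# gives the same answer as the strictly sequential fold.
assert chunked_reduce(operator.add, data, 4) == sum(data)        # 5050
assert chunked_reduce(operator.mul, data[:10], 3) == 3628800     # 10!
```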

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Associative Fold (i.e. Reduce) as a Primitive Operation Posted: Apr 30, 2006 9:16 AM
> Basically it comes down to this: It is easier to grok a
> for loop than grok the new concepts of 'associative
> operation' and 'reduce'.

I have to agree with you.

The only issue I have is that some algorithms, like sum, aren't really appropriate for a loop: you've told the compiler that order is important when in fact it isn't.

Your point about error checking is a good one! I will have to look into possible solutions, for example compile-time detection of the associativity of functions, or at least user-supplied associativity annotations for code blocks.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Associative Fold (i.e. Reduce) as a Primitive Operation Posted: Apr 30, 2006 10:08 AM
I have done some more research, and it seems that in most cases the execution of "reduce" is strictly ordered (essentially a "foldl" operation).

What I seem to be looking for is a non-deterministic fold (e.g. "ndfold").

In the Google MapReduce framework, it seems that the effect of an "ndfold" is accomplished by passing a combine function to the framework ( http://labs.google.com/papers/mapreduce-osdi04.pdf ).
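For what it's worth, here is a toy word-count sketch of that map-plus-combine shape in Python; it illustrates the idea, not Google's actual API, and all the function names below are mine. Because merging per-word counts is associative, the partial results from different workers can be combined in any order:

```python
from collections import defaultdict

def map_phase(doc):
    """Map step: each document emits (word, 1) pairs."""
    return [(word, 1) for word in doc.split()]

def count_words(pairs):
    """Local reduce: sum up the pairs a single worker has seen."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

def combine(counts_a, counts_b):
    """Combine step: merge two partial count tables.
    Associative, so workers' results can be merged in any order."""
    merged = defaultdict(int)
    for counts in (counts_a, counts_b):
        for word, n in counts.items():
            merged[word] += n
    return dict(merged)

docs = ["the cat", "the dog", "the cat sat"]

# Two simulated "workers" each handle part of the corpus.
worker1 = count_words([p for d in docs[:2] for p in map_phase(d)])
worker2 = count_words([p for d in docs[2:] for p in map_phase(d)])

total = combine(worker1, worker2)
assert total == {"the": 3, "cat": 2, "dog": 1, "sat": 1}
```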

Tim LS

Posts: 37
Nickname: parchandri
Registered: Jul, 2005

Re: Associative Fold (i.e. Reduce) as a Primitive Operation Posted: May 1, 2006 1:37 AM
> I have done somre more research and it seems that in most
> cases the execution of "reduce" is strictly ordered
> (essentially a "foldl" operation).

Yes, that's logical, since it happens on a single processor node.

> In the Google MapReduce framework it seems that the effect
> of a "ndfold" is accomplished by passing a combine
> function to the framework (
> http://labs.google.com/papers/mapreduce-osdi04.pdf )

Yes, you could do it that way. The idea of a combine function is clever in the usual cases: it makes it simple to divide up the parallelisable 'reduce' work the same way as the 'map' steps, and it avoids excessive passing of data. The only downside is having to specify how it's done, but that seems a very small downside.

Max Lybbert

Posts: 314
Nickname: mlybbert
Registered: Apr, 2005

Re: Associative Fold (i.e. Reduce) as a Primitive Operation Posted: May 1, 2006 2:03 PM
/* As far as I can tell, having a reduce() operation as a primitive means that a compiler can easily parallelize the code. ... What about Microsoft's CLI or Perl's Parrot or Java's bytecode? They all seem to be stuck in the single processor mode of thinking.
*/

The former head of Parrot development is currently working on such a VM (http://www.sidhe.org/~dan/blog/archives/cat_musings_on_vm_design.html before he named it, and http://www.sidhe.org/~dan/blog/archives/cat_tornado.html after he named it; you'll probably want to read the oldest posts first). While I slowly work on YARD, Heron, and HeronScript issues, I've considered using this VM -- except the VM's not available yet. Aargh!

Anyhow, the Perl community is moving toward a more functional way of writing code when it makes sense to do so. Perl's grep operator shows up in places where the UNIX version of grep doesn't make sense, but it's the Perl way to do this.

Tim LS

Posts: 37
Nickname: parchandri
Registered: Jul, 2005

Re: Associative Fold (i.e. Reduce) as a Primitive Operation Posted: May 2, 2006 2:34 AM
Max: Reading that you are working on YARD issues -- I've been playing with YARD today, found it quite nice -- are you openly contributing to YARD? If so, where/how?

Max Lybbert

Posts: 314
Nickname: mlybbert
Registered: Apr, 2005

Progress report on YARD Posted: May 2, 2006 11:18 AM
I started working on YARD in order to port it to Linux. CDiggins wrote YARD in a mostly platform-independent way, aside from the input routines, which use native interfaces (apparently for efficiency reasons). I spent a lot of time considering how much leeway I had on that one (I seriously considered a templated class that would default to scanf but could instead use mmap() or something else, according to your data-input needs). I've gotten over that, and I've decided to pick a good default and move on.

After getting my hands dirty with that exercise, I started writing a HOWTO. I'm writing that HOWTO in a Perl POD-like language ( http://perldoc.perl.org/perlpod.html ) that YARD will convert to Boost.Book ( http://www.boost.org/doc/html/boostbook.html ). That's been stalled because, frankly, the POD-Boost.Book converter is my first YARD project and I'm learning the library's API. The HOWTO, btw, basically shows how to use YARD by building a POD to Boost.Book converter. :-)

Anyhow, I've been meaning to knock out some walls on the old sourceforge site ( https://sourceforge.net/projects/yard-parser ), but I haven't gotten around to it. I did put some stuff into the CVS server, but it's nothing amazing ( http://cvs.sourceforge.net/viewcvs.py/yard-parser/yard/ ). I have quite a few ideas to implement, but I haven't prototyped them yet so they are still just ideas.

If you're interested in contributing, acting as a sounding board regarding documentation, writing a feature request list, reporting bugs, or doing anything else, please do. You can join the sourceforge project officially if you want, but there's no real need to do so. But if more people are involved, I'm more likely to feel guilty about letting them down, which really motivates me.

Achilleas Margaritis

Posts: 674
Nickname: achilleas
Registered: Feb, 2005

Re: Progress report on YARD Posted: May 3, 2006 2:22 AM
I noticed that the YARD operator 'seq' is a binary operator, so declaring big sequences becomes problematic due to nesting. For example, sequencing ten elements is:

seq<A, seq<B, seq<C, seq<D, seq<E, seq<F, seq<G, seq<H, seq<I, J>>>>>>>>>

I suggest that YARD's binary operations be converted to operators with multiple parameters that default to some class which statically evaluates to true, thus allowing the above sequence to be expressed like this:

seq<A, B, C, D, E, F, G, H, I, J>

The class 'seq' could be declared like this:

template <class T1, class T2, class T3 = True, class T4 = True, ... , class T10 = True> class seq;

and the 'parse' function could use operator && so that the calls which trivially return true are removed by the compiler when optimizations are turned on:

static bool parse() {
    return T1::parse() && T2::parse() && T3::parse() && ... && T10::parse();
}

The same can be applied to the choice operator.

Max Lybbert

Posts: 314
Nickname: mlybbert
Registered: Apr, 2005

Re: Progress report on YARD Posted: May 3, 2006 10:09 AM
> I noticed that the YARD operator 'seq' is a binary
> operator. Declaring big sequences becomes problematic due
> to nesting.
> ...
> I suggest ...

I'm not trying to be nasty by snipping all that, but anybody interested can read the original post.

Overall, I like the proposal. Writing YARD templates is a lot like writing Lisp (which isn't necessarily a bad thing, but it can be annoying to keep track of nesting levels).

Or, as the Perl guys say, irregular verbs are always the verbs used most often (in natural languages, at least; consider "to be" in English, Spanish, French, etc.). It makes sense to recognize that this is a common case, and to make a special case for it (if that wording makes sense).

Copyright © 1996-2019 Artima, Inc. All Rights Reserved.