The Artima Developer Community
Sponsored Link

Weblogs Forum
Language Purity and Dirty and Clean Functions

46 replies on 4 pages. Most recent reply: Jul 11, 2006 5:10 AM by Emil Cristian Alexandrescu

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 46 replies on 4 pages [ « | 1 2 3 4 | » ]
Isaac Gouy

Posts: 527
Nickname: igouy
Registered: Jul, 2003

Re: Language Purity and Dirty and Clean Functions Posted: Jun 28, 2006 10:47 AM
Reply to this message Reply
Advertisement
> This is a very strange article.
I know - unfortunately I seem to have lost the reply I received from the author when I raised those same questions.

Jules Jacobs

Posts: 119
Nickname: jules2
Registered: Mar, 2006

Re: Language Purity and Dirty and Clean Functions Posted: Jun 28, 2006 11:30 AM
Reply to this message Reply
> But it is easy to optimize those away even on an
> imperative language, provided that F and G do not have
> assignment statements. A compiler could infer the 'pure'
> attribute when compiling a function, save it along the
> symbol file and later use it for optimizations.

Certainly not easy. Possible, but in a "pure" language, it's trivial.

Howard Lovatt

Posts: 321
Nickname: hlovatt
Registered: Mar, 2003

Re: Language Purity and Dirty and Clean Functions Posted: Jun 28, 2006 8:09 PM
Reply to this message Reply
Some posts in this forum have stated that the reason current functional languages are slower than imperative languages is due to current hardware not suiting them and/or the effort put into writing compiler for functional languages is less than for their imperative counterparts. I think the problem is more fundamental, it is due to the necessary allocation of data structures in a purely functional language.

The comparison between the imperative version:
  final int a[] = { 1, 2, 3 };
  for ( i = 0; i < a.length; i++ ) a[ i ] = f( a[ i ] );
  for ( i = 0; i < a.length; i++ ) a[ i ] = g( a[ i ] );

and the optimized functional version:

a = [ 1, 2, 3 ]
map( a, g . f )

as given in an earlier post is slightly misleading. The correct functional version is:

a = [ 1, 2, 3 ]
b = map( a, g . f )

The difference is that map returns a new list, b, and even though the optimized functional version does fewer loops over the data (one instead of two) the cost of allocating the new list is in general higher than looping over the data twice. Note: the imperative version typically uses a simple array for storage and the functional version uses a linked list.

The reason for using a linked list, or similar, is because in a pure functional language you cannot allocate an empty array of the right size and fill it (that's imperative). What you do is allocate a list of one element, for the first call to f . g and keep extending it by linking to the previous list for each subsequent call. Therefore the code for the functional version is something like:
interface IntF { int eval( int a ); }
 
class IntList {
  final int head;
  final List tail;
  List( final int head, final IntList tail ) {
    this.head = head;
    this.tail = tail;
  }
  IntList map( final IntF f ) {
    return tail == null ?
           new IntList( f.eval( head ), null ) :
           new IntList( f.eval( head ), tail.map( f ) );
  }
}

Note:

1. The map[code] method allocates its result list in stages due to the recursive call

2. The stack space is large because of the recursive call unless, as many functional compilers do, the recursion is translated into the equivalent loop.

Also in imperative languages like C++, via template meta programming, and Fortran, for built in fnctions, the optimization of joining the functions together and hence only looping over the data once is actually done by modern compilers. Therefore the imperative language neither loops over the data twice nor has an expensive allocation and hence in practice is quicker.

This slowness of functional languages is nothing to do with current machine hardware, the problem is the necessary data structure. You can minimize this problem by using a companion Value semantic type. See:

http://pec.dev.java.net/nonav/compile/javadoc/pec/compile/immutable/package-summary.html
http://pec.dev.java.net/nonav/compile/javadoc/pec/compile/value/package-summary.html
http://pec.dev.java.net/nonav/compile/javadoc/pec/compile/immutable/ImmutableValueConversions.html

Isaac Gouy

Posts: 527
Nickname: igouy
Registered: Jul, 2003

Re: Language Purity and Dirty and Clean Functions Posted: Jun 28, 2006 9:27 PM
Reply to this message Reply
Howard Lovatt wrote
-snip-
> The reason for using a linked list, or similar, is because
> in a pure functional language you cannot allocate an empty
> array of the right size and fill it (that's imperative).
-snip-

Here's an example of a functional language allocating empty arrays and filling them
http://shootout.alioth.debian.org/gp4/benchmark.php?test=nbody&lang=clean&id=0

Down at the bottom of the source code, this line creates a tuple of 7 arrays each of which has 5 elements initialized to 0.0
bodies = (a,a,a,a,a,a,a) with a=>createArray nbodies 0.0

Arrays are discussed on page 66 of "Functional Programming in Clean" http://www.cs.ru.nl/~clean/contents/Clean_Book/clean_book.html

and page 38 of the language report
ftp://ftp.cs.kun.nl/pub/Clean/Clean20/doc/CleanRep2.0.pdf

The preface to the language report states "CLEAN is a practical applicable general-purpose lazy pure functional programming language suited for the development of real world applications."

Howard Lovatt

Posts: 321
Nickname: hlovatt
Registered: Mar, 2003

Re: Language Purity and Dirty and Clean Functions Posted: Jun 28, 2006 10:34 PM
Reply to this message Reply
@Isaac Gouy

Sure, there is lots of work on compiler optimizations that do imperative stuff for functional languages. I am not so sure it is a good idea for the magic to be so hidden. I favour letting the programmer write imperative stuff when they need to, but encouraging them to write functional stuff most of the time.

O'Caml is an example of a programming language that mixes functional and imperative but favours functional. Also the code produced by O'Caml is normally very fast. Another example would be Fortress, a language from Sun that runs on a modified JVM! This uses the pure functional stuff for autmatic concurrency as well as for optimization.

I am not against functional languages, in fact I like them! The point I was making was that optimizing them is very hard, not easy like you might imagine.

Isaac Gouy

Posts: 527
Nickname: igouy
Registered: Jul, 2003

Re: Language Purity and Dirty and Clean Functions Posted: Jun 28, 2006 11:15 PM
Reply to this message Reply
> Sure, there is lots of work on compiler optimizations that
> do imperative stuff for functional languages.

afaict a difference between languages like Clean & Haskell and the MLs, is that Clean & Haskell claim to preserve referential transparency and the MLs don't. (Hopefully someone will correct me.)

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Language Purity and Dirty and Clean Functions Posted: Jun 29, 2006 5:53 AM
Reply to this message Reply
> Here's an example of a functional language allocating
> empty arrays and filling them

Isn't that considered a side-effect?

Isaac Gouy

Posts: 527
Nickname: igouy
Registered: Jul, 2003

Re: Language Purity and Dirty and Clean Functions Posted: Jun 29, 2006 8:13 AM
Reply to this message Reply
> > Here's an example of a functional language allocating
> > empty arrays and filling them
>
> Isn't that considered a side-effect?

afaict it's considered a side-effect which doesn't effect the "purely functional" nature of the language, the reasoning for arrays is the same as the reasoning for files -

"Assume that we are able to guarantee that the reference count of the file argument of fwritec is always exactly one. We say that such an argument is unique. Now, when we apply fwritec to such a unique file we can observe the following. Semantically we should produce a new file. But we know that no other expression can refer to the old file: only fwritec has a reference to it. So, why not reuse the old file passed as argument to fwritec to construct the new file? In other words: when old file is unique it can simply be updated destructively by fwritec to produce the new file in the intended efficient and safe way."

Section 4.3 Uniqueness types, "Functional Programming in Clean" http://www.cs.ru.nl/~clean/contents/Clean_Book/clean_book.html

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Language Purity and Dirty and Clean Functions Posted: Jun 29, 2006 8:23 AM
Reply to this message Reply
> > > Here's an example of a functional language allocating
> > > empty arrays and filling them
> >
> > Isn't that considered a side-effect?
>
> afaict it's considered a side-effect which doesn't effect
> the "purely functional" nature of the language, the
> reasoning for arrays is the same as the reasoning for
> files -

It doesn't? You can't arbitrarily reorder the statements if one fills an array created by another. Clearly the one that creates the array must come first.

It kind of seems like you are saying that pure functions don't have side-effects except when it's convienient for them to have side-effects. It also seems like a very primitive form of what escape analysis does in the (under-development) JVM.

Isaac Gouy

Posts: 527
Nickname: igouy
Registered: Jul, 2003

Re: Language Purity and Dirty and Clean Functions Posted: Jun 29, 2006 9:18 AM
Reply to this message Reply
James Watson wrote
> You can't arbitrarily reorder the statements
> if one fills an array created by another. Clearly the one
> that creates the array must come first.

afaik we can't /arbitrarily/ reorder function application.
toString (sin 0.5) is OK
sin (toString 0.5) isn't


> It kind of seems like you are saying that pure functions
> don't have side-effects except when it's convienient for
> them to have side-effects.

What I understand the folk who developed uniqueness typing to be saying is that pure functions don't have side-effects except when it's /safe/ for them to have side-effects - when having side-effects doesn't destroy referential transparency in the program.


> It also seems like a very primitive form of what escape
> analysis does in the (under-development) JVM.

I don't know.

Andrew Shuttlewood

Posts: 2
Nickname: andys
Registered: Jun, 2006

Re: Language Purity and Dirty and Clean Functions Posted: Jun 29, 2006 9:21 AM
Reply to this message Reply
I think that a binary "Pure/NotPure" concept is perhaps a little too simplistic.

For example, a lot of Haskell relies upon local state that is then encapsulated in local effects - which allows you to implement reasonable quicksort (amongst other things) by using state when you have to.

To the person who talked about the cost of creating new lists every time that you chain together maps/filters, the technology used to avoid this is called deforestation. Also, in haskell, a "list" doesn't actually have to be a list, the lazy evaluation could easily lead to it behaving like an iterator.

The monad concept is just basically an implementation of effect systems, maybe you need to implement something more sophisticated? (The Haskell approach has the ST monad to mess with stuff in)

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Language Purity and Dirty and Clean Functions Posted: Jun 29, 2006 10:18 AM
Reply to this message Reply
> What I understand the folk who developed uniqueness typing
> to be saying is that pure functions don't have
> side-effects except when it's /safe/ for them to have
> side-effects - when having side-effects doesn't destroy
> referential transparency in the program.
>
>
> > It also seems like a very primitive form of what escape
>
> > analysis does in the (under-development) JVM.
>
> I don't know.

The main application of escape-analysis in a JVM is using stack allocation instead of heap alocation. But, the reason you can allocate on the stack is because nothing else will ever use the Objects. In other words putting them on the stack makes them unreachable to any other part of the system and can be done safely because the escape analysis has determined that there is no other access.

To me this sounds a lot like what you are talking about. But the thing about ecscape analysis is that it is recursive. Objects can create Objects that create Objects... but if none of them are 'escape' the whole allocation chain can be placed on the stack (actually in a lot of cases the members of the Object can be 'lifted' directly onto the stack), synchronization ignored and any other optimization that can be applied in such a situation.

Is it correct that this is basically equivalent to what you are referring to or am I missing something? In other words, couldn't it also be said that esacpe-analysis is determining whether a method has side-effects?

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Language Purity and Dirty and Clean Functions Posted: Jun 29, 2006 12:21 PM
Reply to this message Reply
> Is it correct that this is basically equivalent to what
> you are referring to or am I missing something? In other
> words, couldn't it also be said that esacpe-analysis is
> determining whether a method has side-effects?

Even though they are not precisely same thing, as far as I can tell, the same kind of analysis could be used for determining function purity.

So I apologize to not repsonding to everyone's comments. I am learning a lot here, and I appreciate the insightful discussion. I have attempted to clarify my thoughts on the subject, and incorporate other people's contributions on my latest blog entry on the topic of purity at: http://www.artima.com/weblogs/viewpost.jsp?thread=166178

Howard Lovatt

Posts: 321
Nickname: hlovatt
Registered: Mar, 2003

Re: Language Purity and Dirty and Clean Functions Posted: Jun 30, 2006 3:09 AM
Reply to this message Reply
> To the person who talked about the cost of creating new
> lists every time that you chain together maps/filters, the
> technology used to avoid this is called deforestation.
> Also, in haskell, a "list" doesn't actually have to be a
> list, the lazy evaluation could easily lead to it behaving
> like an iterator.

I am aware of all these optimizations - the point I am making is that optimizing pure functional languages is hard - not easy!

> The monad concept is just basically an implementation of
> effect systems, maybe you need to implement something more
> sophisticated? (The Haskell approach has the ST monad to
> mess with stuff in)

Actually Monads turn out to be insufficient and the latest generation of Haskel has Arrows. The problem with Monads is that you can't work out when stuff isn't needed anymore. The classic example is a parser. Using Monads you can never deallocate any of the input incase the parser back tracks. With Arrows you can say I have finished with this and the system can garbage collect stuff. This is another area where pure languages have proved difficult to optimize - space usage.

Howard Lovatt

Posts: 321
Nickname: hlovatt
Registered: Mar, 2003

Re: Language Purity and Dirty and Clean Functions Posted: Jun 30, 2006 3:18 AM
Reply to this message Reply
> I have attempted to clarify my thoughts on the
> subject, and incorporate other people's contributions on
> my latest blog entry on the topic of purity at:
> http://www.artima.com/weblogs/viewpost.jsp?thread=166178

This seems to be the wrong URL.

With regard to your original post about a stack based language a pure function in RPN is one which:

1. Takes its inputs off the stack

2 Puts its return values on the stack

3. Without calling a non-pure function

4. Whose arguments (including the hidden this) and any variables it references are immutable

This definition is only true assuming that all variables are heap allocated and the stack only contains references. If you define the stack frame as a whole as the input and as the output then by extension the whole of the stack frame must contain immutable values on input, during computation, and on output.

Flat View: This topic has 46 replies on 4 pages [ « | 1  2  3  4 | » ]
Topic: Language Purity and Dirty and Clean Functions Previous Topic   Next Topic Topic: Recursion in a Stack Based Language

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use