The Artima Developer Community
Sponsored Link

Weblogs Forum
Pure and Impure Functions

14 replies on 1 page. Most recent reply: Jul 10, 2006 12:51 PM by Emil Cristian Alexandrescu

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 14 replies on 1 page
Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Pure and Impure Functions (View in Weblogs)
Posted: Jun 29, 2006 12:18 PM
Reply to this message Reply
Summary
I am examing the meaning of purity with regards to functions in an attempt to close the gap between imperative and functional languages.
Advertisement

Previously I was uing the terms "clean" and "dirty" but the terms "pure" and "impure" are perhaps more standard and understandable. Here I attempt to clarify the meaning of purity and impurity, and how I am dealing with it.

A pure function is one which has no side-effects and does not depend on any state beyond its local scope. This means that it can be replaced with any other pure function which returns the same result given the same input. This property is often referred to as referential transparency.

Consider the following psuedo-code:

  def pure_f(x) = { return x + 1; }
  def impure_f(x) = { Console.WriteLine(x); return x + 1; }
Now consider the following function:
  def pure_g(list) = { return map(list, pure_f); }
  def impure_g(list) = { return map(list, impure_f); }
Note: A map function generates a list from another list by applying a function to each member of the original list.
The conundrum is that if we don't distinguish between the uses of map in these contexts then we are stuck having to implement map in an order-dependant way.

I am trying to bridge the gap in the Cat language. My solution is to use purity labeling which is similar to const labeling in C++. It has some parallels to Haskell monads, Clean uniqueness and escape analysis.

In Cat there are only functions. Functions all share a global stack. They take their parameters from the top of the shared stack, and return their results on the top of the shared stack. Local variables are emulated by simply pushing and popping values from the top of the stack. The local scope of a function is the part of the stack above, and including, the lowest value it observes or manipulates on the shared stack.

A Cat function is pure iff takes a constant number of parameters from the top of the stack, it returns a constant number of results to the top of the stack, and it has no side-effects. Some of the primitives of Cat are pure, while others aren't. It is easy enough, for the compiler to identify an impure function. If a function contains an impure function, then it is impure.

Some cat primitive functions (such as map, ifthen, or whiledo) operate on other functions, in these cases there needs to be overloaded versions of the functions: a pure version and an impure version. Returning to the map problem, the proposed Cat solution, is that depending on the type of the second parameter (i.e. is it a pure function or an impure function) then either a pure_map or impure_map is chosen by the compiler. The pure version of map is order independant, and as such is easily optimized, and easily parallelized (which supports my earlier post ).

Postscript

As Jules Jacobs pointed out, some impure functions, can become pure when they are passed constant values.


Jules Jacobs

Posts: 119
Nickname: jules2
Registered: Mar, 2006

Re: Pure and Impure Functions Posted: Jun 29, 2006 2:10 PM
Reply to this message Reply
It's not nice, but I don't think it's pureness is simple in Cat.

For example, there are two functions: startlist and endlist. startlist pushes a unique value on the stack, say #a. endlist builds a list from the values on the stack until it encounters #a. The endlist function is not pure, because it pops a variable number of things.

Now we write this function:

f = startlist 1 2 3 endlist

This function is not pure, because it contains an impure fucntion: endlist.

F can also be written like this:

f = [1 2 3]

This version *is* pure.

Now look at the function +:

2 3 + == 5

This function is obviously pure. However, if you write this function:

g = [+] do-times

assuming that do-times executes the quoted program n times:

1 2 3 4 3 g == 10

g contains no impure functions, but g itself isn't pure.

So + isn't pure, because + pops 2 items, and pushes 1 item back. Note that in other languages, like Haskell, + *is* pure. This could be a big problem for Cat.

Tim LS

Posts: 37
Nickname: parchandri
Registered: Jul, 2005

Re: Pure and Impure Functions Posted: Jun 29, 2006 3:02 PM
Reply to this message Reply
I have just a couple small quibbles. A function doesn't have to have side-effects to be impure. For instance, a C function that reads from non-constant global variables. Because the function isn't dependent only on the function arguments, it might be considered impure.

How would you consider a function which has side effects but reverses them before it finishes, e.g. acquiring a lock, and releasing it later?

Also, a more general definition of side-effect probably wouldn't refer to stack frames. Talking about stack frames becomes quite confusing if Cat is involved.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Pure and Impure Functions Posted: Jun 29, 2006 3:19 PM
Reply to this message Reply
Everything up to here I follow.

...
> This function is obviously pure. However, if you write
> this function:
>
> g = [+] do-times
>
> assuming that do-times executes the quoted program n
> times:

I assume n is a value on the stack? According to my rules, g would be impure.

> 1 2 3 4 3 g == 10
>
> g contains no impure functions, but g itself isn't pure.

However, "3 g" itself is pure. It always consumes four parameters and returns one result. So there is definitely a flaw in my rules.

> So + isn't pure, because + pops 2 items, and pushes 1 item
> back.

+ is always pure because it always consumes a constant number of parameters (2) and returns a constant number of results (1).

> Note that in other languages, like Haskell, + *is*
> pure. This could be a big problem for Cat.

I think we agree that there is a problem, but disagree where. Hmmm....

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Pure and Impure Functions Posted: Jun 29, 2006 3:26 PM
Reply to this message Reply
> I have just a couple small quibbles. A function doesn't
> have to have side-effects to be impure. For instance, a C
> function that reads from non-constant global variables.
> Because the function isn't dependent only on the function
> arguments, it might be considered impure.

Yes, you are definitely correct. So simple observation of state is enough to render a function impure. I think I'll update my post to reflect this.

> How would you consider a function which has side effects
> but reverses them before it finishes, e.g. acquiring a
> lock, and releasing it later?

I would consider it impure. This would lose referential transparency. Shifting these kinds of functions around would result in creating a potentially unstable environment. So order dependency would remain.

> Also, a more general definition of side-effect probably
> wouldn't refer to stack frames. Talking about stack frames
> becomes quite confusing if Cat is involved.

Good point. In Cat I consider the stack frame to start at the lowest point where a function inspects values. So a binary function, such as asdd, inspects or manipulates, only the top two stack items. That is its stack frame. In Cat this is ony an abstract notion, nothing really enforces it.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Pure and Impure Functions Posted: Jun 29, 2006 3:35 PM
Reply to this message Reply
> 1 2 3 4 3 g == 10
>
> g contains no impure functions, but g itself isn't pure.

This could probably be discovered by a smart compiler, uncover opportunities for purification. That is taking an impure function and making it pure again.

Kay Schluehr

Posts: 302
Nickname: schluehk
Registered: Jan, 2005

Re: Pure and Impure Functions Posted: Jun 30, 2006 12:31 AM
Reply to this message Reply
> I have just a couple small quibbles. A function doesn't
> have to have side-effects to be impure. For instance, a C
> function that reads from non-constant global variables.
> Because the function isn't dependent only on the function
> arguments, it might be considered impure.

Yes. That's interesting because it can be trivially resolved but only if the global variable is threaded as a parameter through other functions that don't act on it. Of course OO encourages side-effects i.e. methods acting on member variables. Since objects but not functions are the units of OO programs "referential transparency" in the OO context could only mean that two instances a, b of a class A, created with the same set of constructor parameters will always behave the same for same input sequences of applied to the same methods. I stay sceptical about the attempts to unify FP and OO and resolving the "conceptual gap" by turning back to functions in order to preserve good design.

Jules Jacobs

Posts: 119
Nickname: jules2
Registered: Mar, 2006

Re: Pure and Impure Functions Posted: Jun 30, 2006 8:59 AM
Reply to this message Reply
> I assume n is a value on the stack? According to my rules,
> g would be impure.

Yes, n is a value on the stack. Maybe I don't understand your rules, but I think the program [+] is pure, and do-times is pure too (because i is pure). So [+] do-times would be pure too?

Could you explain your rules, I don't think I understand them...

Jules Jacobs

Posts: 119
Nickname: jules2
Registered: Mar, 2006

Re: Pure and Impure Functions Posted: Jun 30, 2006 9:49 AM
Reply to this message Reply
f = 2 f

This function puts an infinite number of 2's on the stack. Is this function pure? Rule: a function is pure it all functions it calls are pure. 2 is pure, but is f? You don't know.

Also, is i pure?

"[a-pure-func] i" is pure.
"[an-impure-func] i" is not pure.
"[an-impure-func]" is pure too? (as long as you don't de-quote it)

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Pure and Impure Functions Posted: Jun 30, 2006 11:29 AM
Reply to this message Reply
> > I assume n is a value on the stack? According to my
> rules,
> > g would be impure.
>
> Yes, n is a value on the stack. Maybe I don't understand
> your rules, but I think the program [+] is pure, and
> do-times is pure too (because i is pure). So [+] do-times
> would be pure too?
>
> Could you explain your rules, I don't think I understand
> them...

Pure functions consume a constant number of elements, produce a constant number of elements, and have no side-effects.

do_times is impure because as a function it potentially consumes a non-constant number of parameters, and produces a non-constant number of results.

Does this clarify things?

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Pure and Impure Functions Posted: Jun 30, 2006 11:34 AM
Reply to this message Reply
> f = 2 f
>
> This function puts an infinite number of 2's on the stack.
> Is this function pure? Rule: a function is pure it all
> functions it calls are pure. 2 is pure, but is f? You
> don't know.

f is impure because it doesn't produce a constant number of values.

> Also, is i pure?

That depends on whether it is provided a pure function or impure function to execute.

> "[a-pure-func] i" is pure.

Yes.

> "[an-impure-func] i" is not pure.

Yes.

> "[an-impure-func]" is pure too? (as long as you don't
> de-quote it)

Yes.

Jules Jacobs

Posts: 119
Nickname: jules2
Registered: Mar, 2006

Re: Pure and Impure Functions Posted: Jun 30, 2006 12:00 PM
Reply to this message Reply
> f is impure because it doesn't produce a constant
> number of values.

I meant, according to your rules. I don't think you can create rules to check if a function is pure or not. You can reason about it, but you can't be sure that certain functions are pure or impure (a solution would be to treat these functions as impure unless you flag them as pure).

For example:


function f(n)
if true
pure_func
else
impure_func
end
end


This function is pure.


function f(n)
if is_sum_of_two_primes_or_odd(n)
pure_func
else
impure_func
end
end


Is this function pure? If is_sum_of_two_primes_or_odd(n) returns true for every n, it's pure. However, mathematicians haven't been able too prove that yet.

http://en.wikipedia.org/wiki/Goldbach%27s_conjecture

Let's assume that it's true, is_sum_of_two_primes_or_odd(n) returns true, n doesn't matter. Is the compiler able to figure out that it's true? The compiler probably can't prove that this theorem is correct, so it labels f as impure, which is incorrect. This is an extreme case, but I'm sure there are many other situations where a compiler can't figure out whether a function is pure or not.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Pure and Impure Functions Posted: Jun 30, 2006 1:16 PM
Reply to this message Reply
> I meant, according to your rules. I don't think you can
> create rules to check if a function is pure or not. You
> can reason about it, but you can't be sure that certain
> functions are pure or impure (a solution would be to treat
> these functions as impure unless you flag them as pure).

I fully agree.

Since you can still easily identify the majority of definitively pure functions, you can optimize and rewrite those programs with ease.

Achilleas Margaritis

Posts: 674
Nickname: achilleas
Registered: Feb, 2005

Re: Pure and Impure Functions Posted: Jul 3, 2006 1:06 AM
Reply to this message Reply
> <p/>
> Previously I was uing the terms "clean" and "dirty" but
> the terms
> "pure" and "impure" are perhaps more standard and
> understandable.
> Here I attempt to clarify the meaning of purity and
> impurity,
> and how I am dealing with it.
> <p/>
> A pure function is one which has no side-effects and does
> not depend on any state beyond its local scope. This means
> that it can be replaced
> with any other pure function which returns the same result
> given the same input.
> This property is often referred to as referential
> transparency.
> <p/>
> Consider the following psuedo-code:
>
> <pre>
> def pure_f(x) = { return x + 1; }
> def impure_f(x) = { Console.WriteLine(x); return x + 1;
> 1; }
> </pre>
>
> Now consider the following function:
>
> <pre>
> def pure_g(list) = { return map(list, pure_f); }
> def impure_g(list) = { return map(list, impure_f); }
> </pre>
>
> <blockquote>
> Note: A map function generates a list from another list
> st by applying a function
> to each member of the original list.
> </blockquote>
>
> The conundrum is that if we don't distinguish between the
> uses of map in these contexts then we
> are stuck having to implement map in an order-dependant
> way.
> <p/>
> I am trying to bridge the gap in the Cat language. My
> solution is to use purity labeling which is
> similar to const labeling in C++. It has some parallels to
> Haskell monads, Clean uniqueness
> and escape analysis.
> <p/>
> In Cat there are only functions. Functions all share a
> global stack. They take their parameters from
> the top of the shared stack, and return their results on
> the top of the shared stack. Local variables
> are emulated by simply pushing and popping values from the
> top of the stack. The local scope of a function is the
> part of the stack above, and including, the lowest value
> it observes or manipulates on the shared stack.
> <p/>
> A Cat function is pure iff takes a constant number of
> parameters from the top of the stack, it returns a
> constant
> number of results to the top of the stack, and it has no
> side-effects. Some of the primitives of Cat are
> pure, while others aren't. It is easy enough, for the
> compiler to identify an impure function. If a function
> contains an impure function, then it is impure.
> <p/>
> Some cat primitive functions (such as map, ifthen, or
> whiledo) operate on other functions, in these cases there
> needs to be overloaded versions of the functions: a pure
> version and an impure version.
> Returning to the map problem,
> the proposed Cat solution, is that depending on the type
> of the second
> parameter (i.e. is it a pure function or an impure
> function) then either a pure_map or impure_map is chosen
> by the compiler. The pure version of map is order
> independant, and as such is easily optimized,
> and easily parallelized (which supports my <a
> href="http://www.artima.com/weblogs/viewpost.jsp?thread=166
> 178">earlier post</a> ).
> <p>
>
> <h3>Postscript</h3>
>
> As Jules Jacobs pointed out, some impure functions, can
> become pure when they are passed constant values.

Why don't you make the compiler infer the purity attribute? it would be great if this task could be automated.

Emil Cristian Alexandrescu

Posts: 13
Nickname: mkcoos
Registered: Jul, 2006

Re: Pure and Impure Functions Posted: Jul 10, 2006 12:51 PM
Reply to this message Reply
There is clean functional programming languages. Ones that, on certain circumstances, allow impure functions yet stays garbage-free and have no side effects.
I am talking about languages that disallows the change of "pasts". (pasts and futures are pieces in continuations)

For a functional language having a stack-based execution style the impure features can act on the arguments stack's top ONLY. Since changing just the top of that stack doesn't impact past results, it makes no difference between impure and pure features.

I repeat: impure features is allowed to change the top of the stack only. But in the case of a stack-based executed functional language it's that stack's top the only place changes can occur!

(I bet you didn't expect that; but it was that the main reason I was so interrested in Cat!)

Flat View: This topic has 14 replies on 1 page
Topic: Pure and Impure Functions Previous Topic   Next Topic Topic: Switching to Artima

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use