This post originated from an RSS feed registered with Ruby Buzz
by Eigen Class.
Original Post: When immutable data is faster: write barrier costs
Feed Title: Eigenclass
Feed URL: http://feeds.feedburner.com/eigenclass
Feed Description: Ruby stuff --- trying to stay away from triviality.
This example shows that in some cases immutable data can be faster than mutable state. One reason is that language implementations with incremental, generational or concurrent GCs (a few combine all three properties at once, like HotSpot's; OCaml's is only generational and incremental) typically use write barriers which add overhead to each heap pointer store (read barriers are normally more expensive). In particular, generational GCs use write barriers to maintain the remembered set: a list of references from an older generation to a younger one.
The synthetic benchmark operates on "simple records intended to emulate typical business entities":
module P0 = struct
type person = { name: string; age: int; balance: float }
end
module P1 = struct
type person = { name: string; mutable age: int; mutable balance: float }
end
The balance field is updated repeatedly, by creating a new record in the immutable case, and by changing the field directly in the mutable one. floats are generally boxed in OCaml (the main exceptions are float arrays and records holding only floats), so a person record takes 6 words on x86-64 (7 on x86, since the double value takes 8 bytes). In the first case, each iteration allocates 48(28) bytes; in the second one, only 16(12) are allocated, but the field update incurs into the write barrier overhead: a call to the caml_modify routine, that checks whether a pointer to a value in the younger generation is being stored in a block residing in the old (major) heap. As it turns out, the extra allocation is faster than the write barrier:
The example can be modified to illustrate clearly the effect of the write barrier, and when the latter is needed at all.
If the balance field is changed to an int, the stored value is known not to reside in the minor heap (integers, bools, chars and constant constructors are immediate), so the write barrier is not needed, and no call to caml_modify will be generated. Moreover, there will be no allocation at all in the mutable case, since there are no boxed floats involved anymore, so we can compare the cost of allocating a 4-word record to that of updating a single field: