Ruby Buzz Forum - Finite maps galore: imperative code strikes back

Articles |
News |
Weblogs |
Books |
Forums

Artima Forums | Articles | Weblogs | Java Answers | News

Sponsored Link •

Ruby Buzz Forum
Finite maps galore: imperative code strikes back

0 replies on 1 page.

Welcome Guest
Sign In

Back to Topic List

Reply to this Topic

Search Forum

Threaded View


Previous Topic		Next Topic

Flat View: This topic has 0 replies on 1 page

Eigen Class

Posts: 358
Nickname: eigenclass
Registered: Oct, 2005

Eigenclass is a hardcore Ruby blog.

Finite maps galore: imperative code strikes back

Posted: Jun 19, 2009 10:26 AM

This post originated from an RSS feed registered with Ruby Buzz by Eigen Class.
Original Post: Finite maps galore: imperative code strikes back Feed Title: Eigenclass Feed URL: http://feeds.feedburner.com/eigenclass Feed Description: Ruby stuff --- trying to stay away from triviality.	Latest Ruby Buzz Posts Latest Ruby Buzz Posts by Eigen Class Latest Posts From Eigenclass

Matías Giovannini has been implementing purely functional data structures based on Bentley and Sedgewick's ternary search trees (TST) and pitching them against OCaml's imperative hash table (Hashtbl). In the interval between the initial results and his showing his first version (called Trie_map from now on), I made a functional TST ("Ternary") which compares favorably to both Trie_map and the revised implementation ("Trie_map_mod").

I also resuscitated a hash table implementation of mine that uses open addressing with double hashing (in contrast to Hashtbl's external chaining) and exhibits vastly improved performance.

I'll show the benchmark results first, to be followed by some code and analysis.

Benchmarks

I benchmark the following finite map implementations:

imperative:
- Fasthashtbl: hash table with open addressing and double hashing
- Hashtbl: the hash table from INRIA's stdlib (external chaining)
- Hashtbl_mod: Hashtbl with more aggressive resizing (lower load factor)
- Hashtbl_hval: Hashtbl_mod with caching of the hash value
functional:
- Trie_map: TST with coalesced constructor for nodes with and without values
- Trie_map_mod: TST with different constructors for leaves and inner nodes with/without a value
- Ternary: TST with separate constructor for nodes with and without values but no leaf constructor
- Map: the Map implementation from INRIA's stdlib (AVL tree)

All the finite maps are compared by

building a map with 98568 English words, which are added in (a) randomized and (b) lexicographic order
measuring the minimal lookup cost with constant lookups (so that everything is in the L1 cache)
timing successful lookups by looking for the 98568 words
1. in the same (random) order the elements were added
2. in randomized order
3. in lexicographic order
4. in reverse lexicographic order
... against the maps built in randomized and lexicographic order
timing searches against the previously generated maps (randomized and lexicographic order) using
1. a larger set of English words (217625, 45.3% hit rate)
2. a set of Spanish words (86016 words, 1.3% hit rate)
... in randomized and lexicographic order

For the sake of exhaustiveness, I measure the iteration and functor overhead and detract it from the lookup timings. The total size of the structures (including the size of the strings) is also measured. All the measurements are repeated 20 times, and the best time is retained. The benchmarks are performed in a single program execution to achieve steady state (major heap resized to required size). The set of words used in the map is large enough to ensure the structure (taking 5.7MB to over 12MB) doesn't fit in the 2MB L2 cache.

The full results can be found here (also includes hash table benchmark with integer keys).

This table shows the results for a map built in randomized order (refer to the full output for those in lexicographic order). Speed is measured in ops/sec, and size in bytes ("Find (hit, orig)" means that the lookups are performed in the same random order as the insertions).

Insertion (randomized order) and lookup

Data structure	Size	Load factor	Insertion (rand)	Find (constant)	Find (hit, rand)	Find (hit, orig)	Find (hit, sorted)	Find (45.3%, rand)	Find (1.3%, rand)	Find (1.3% sorted)
FastHashtbl	7325544	0.376	1509558	16701222	2669067	3554728	3958575	2347248	3243197	5133702
Hashtbl	5752664		1105131	5617227	2136562	2296294	2791035	1623434	1731465	2229713
Hashtbl_mod	7325528	0.376	639994	7373058	2711193	2963744	3893034	2249186	2926475	4403349
Hashtbl_hval	8114072	0.376	675892	10457458	2769387	3056772	4153404	2435018	3436621	5542555
Ternary	11894440		179599	14739138	1484213	1556592	4576735	1567019	2419841	6466320
Trie_map			99834	11870073	985476	1024257	3022102	1085412	1797166	5763597
Trie_map_mod	12417768		164725	8476101	1284442	1340733	3246878	1417483	2151898	4811496
Map	6805440		167046	1168963	674209	671368	1026645	556788	633682	986727

Implementation notes

The source code for all the containers and the benchmarks can be obtained here.

I modified Hashtbl first by changing its resizing policy so that the load factor is the same as in Fasthashtbl (Hashtbl_mod), and second by caching the hash value in the nodes (Hashtbl_hval), which makes successful lookups faster when there are collisions (since the key comparison can be bypassed when the hash values differ) and helps a lot with unsuccessful ones.

Fasthashtbl, Hashtbl_mod and Hashtbl_hval only allow the load factors to grow to 50% before resizing the underlying array to twice its former size; the average load factor is thus 37.5%. Hashtbl, on the other hand, allows it to grow to 2.0, which, theory predicts, would require 13.4 probes per unsuccessful search, and 4.6 per successful one (Knuth, TAOCP Vol 3, Section 6.4, page 524). For a 37.5% load factor, Hashtbl, Hashtbl_mod and Hashtbl_hval will require 1.1 and 1.2, and Fasthashtbl (which uses open addressing with double hashing) will need 1.6 and 1.25 --- double hashing is about as good for successful lookups but worse at unsuccessful ones.

Trie_map_mod and Ternary are very similar: the only differences are that the latter doesn't have a separate Leaf node type, and its code has undergone a few manual optimizations explained below.

Benchmarks

Insertion (randomized order) and lookup

Implementation notes

Some observations

Functional vs. imperative


	Web Artima.com