The Artima Developer Community
Sponsored Link

Weblogs Forum
A Faster Growable Array ( Indexable Stack ) for C++

18 replies on 2 pages. Most recent reply: Nov 17, 2005 8:08 PM by Christopher Diggins

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 18 replies on 2 pages [ 1 2 | » ]
Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

A Faster Growable Array ( Indexable Stack ) for C++ (View in Weblogs)
Posted: Nov 16, 2005 3:48 PM
Reply to this message Reply
Summary
I have just developed a growable array class as part of the OOTL, which outperforms std::vector on calls to push_back() by a factor of 2.5 on my installation of GCC.
Advertisement
The std::vector class from the STL is an example of a very useful data type: the growable array. The problem with most implementations of std::vector when used as indexable stacks is that when they reach the reserved memory capacity, a new block is allocated, and all of the data is copied. Or something like that. Correct me if I've made mistakes, I promise not to bite.

An alternate implementation is to create a reversed linked list of buffers which have exponentially increasing size. This allows you to grow the array without reallocation, and can save you an enormous amount of time. The results from my tests, which you can download at OOTL.org, using GCC 3.4 are as follows:

benchmarking : grow_vector(v, 42, count)
1719 mseconds
benchmarking : grow_array(a, 42, count)
641 mseconds
benchmarking : grow_vector(v, "hello", count / 10)
6077 mseconds
benchmarking : grow_array(a, "hello", count / 10)
4406 mseconds

I found that iteration over the list was just as fast as for a std::vector, by using the for_each member function (the growable_array is a model of the Iterable concept).

This implementation also retains the characteristics of average O(1) individual element access. Half of the elements are in the first list node, one quarter are in the next, and so on. In the worst case -- which is accessing the first element ironically enough -- the complexity is O(Log N). That entails a traversal of the linked list of buffers.

Anyway the code is public domain, so have fun with it!


Bob Dobalina

Posts: 16
Nickname: hiredgoon
Registered: Apr, 2005

Re: A Faster Growable Array ( Indexable Stack ) for C++ Posted: Nov 16, 2005 5:42 PM
Reply to this message Reply
>> This implementation also retains the characteristics of average O(1) individual element access.

Mode average maybe, but not mean.

If C++ new[] = malloc, would a new[] equivalent of realloc() solve the vector performance problem?

Why isn't there a realloc new[]?

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: A Faster Growable Array ( Indexable Stack ) for C++ Posted: Nov 16, 2005 6:30 PM
Reply to this message Reply
> >> This implementation also retains the characteristics of
> average O(1) individual element access.
>
> Mode average maybe, but not mean.

No it is in fact the mean. Ignoring constants, half of all element take one operation to access, one quarter of all items take two operations, one eigth take three operations, etc.

n + n/2 + n/4 + n/8 + ... + n/2^i
------------------------------
i

converges at 2. Which is O(1).

> If C++ new[] = malloc, would a new[] equivalent of
> realloc() solve the vector performance problem?

The problem with using realloc is that it moves the binary representations of the object, and not all objects support that (e.g. objects which contain pointers).

> Why isn't there a realloc new[]?

Good question. May be because it wouldn't be type-safe, but that hasn't stopped them from putting in other non-type-safe features (like placement new).

Tanton Gibbs

Posts: 20
Nickname: tanton
Registered: Aug, 2005

Re: A Faster Growable Array ( Indexable Stack ) for C++ Posted: Nov 16, 2005 7:47 PM
Reply to this message Reply
I think what you recreated is the equivalent of a std::deque. What differences, if any, does your implementation offer from a deque. I would be interested in seeing a performance test between your implementation and std::deque. As you know, the std::vector is implemented as a contiguous array in order to satisfy the standard, so besting it by ignoring that rule is almost unfair :-) Regardless, I think a deque more closely models what you describe and look forward to seeing those results.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: A Faster Growable Array ( Indexable Stack ) for C++ Posted: Nov 16, 2005 9:01 PM
Reply to this message Reply
> I think what you recreated is the equivalent of a
> std::deque. What differences, if any, does your
> implementation offer from a deque.

It's really a stack, there is no front end insertion at all.

> I would be interested
> in seeing a performance test between your implementation
> and std::deque.

See below:

> As you know, the std::vector is
> implemented as a contiguous array in order to satisfy the
> standard,

I was not aware that a std::vector had to be implemented in contiguous memory according to the standard. That seems counter to much of the standard which seems designed specifically to support non-contiguous memory models, but it is too late for me to go and look it up right now.

> so besting it by ignoring that rule is almost
> unfair :-) Regardless, I think a deque more closely
> models what you describe and look forward to seeing those
> results.

In the below benchmarks v is a std::vector and d is a std::deque. I am using GCC 3.4 and STLPort 4.6.2 on windows.


benchmarking : grow_vector(v, 42, count)
1030 mseconds
benchmarking : grow_vector(d, 42, count)
514 mseconds
benchmarking : grow_array(a, 42, count)
453 mseconds
benchmarking : grow_vector(v, "hello", count / 10)
4437 mseconds
benchmarking : grow_vector(d, "hello", count / 10)
3483 mseconds
benchmarking : grow_array(a, "hello", count / 10)
3437 mseconds
benchmarking : test_vector_indexing(v)
577 mseconds
benchmarking : test_vector_indexing(d)
5937 mseconds
benchmarking : test_array_indexing(a)
1077 mseconds
benchmarking : test_vector_iteration(v)
2437 mseconds
benchmarking : test_vector_iteration(d)
2671 mseconds
benchmarking : test_array_iteration(a)
2405 mseconds


So the STLPort 4.6.2 deque implementation, apart from providing front-end insertion, provides slightly slower iteration and back-end insertion (<10%), but much much slower indexing (over 5x).

Kannan Goundan

Posts: 18
Nickname: cakoose
Registered: Nov, 2005

Re: A Faster Growable Array ( Indexable Stack ) for C++ Posted: Nov 16, 2005 11:48 PM
Reply to this message Reply
So vector uses the amortization trick on "push_back()" (assumming it doubles each time), while you use it on "operator[]". Neat trick.

Is there a situation/algorithm appends elements as many times (or more times) as it indexes into an array?

Nathan Moore

Posts: 2
Nickname: luthe
Registered: Nov, 2005

Re: A Faster Growable Array ( Indexable Stack ) for C++ Posted: Nov 17, 2005 6:14 AM
Reply to this message Reply
There is no realloc new because most of the time you are allocating objects. The concept of growing an object makes no sense.

You could always write your own set of allocators (overload new,delete) as raw wrappers around malloc, free, and then you could provide a new realloc

Nathan Moore

Posts: 2
Nickname: luthe
Registered: Nov, 2005

Re: A Faster Growable Array ( Indexable Stack ) for C++ Posted: Nov 17, 2005 6:18 AM
Reply to this message Reply
Yes a std::vector has to be contigous. Much like the c_str method in std::string this is for c interop. Allowing you to write stuff like &(v[0]) to get the start of the array to pass to a c function. I believe this is mentioned in Stroustrup's book.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: A Faster Growable Array ( Indexable Stack ) for C++ Posted: Nov 17, 2005 7:25 AM
Reply to this message Reply
> So vector uses the amortization trick on "push_back()"
> (assumming it doubles each time), while you use it on
> "operator[]". Neat trick.

Thanks. I want to point out though that the worst case for an indexing operation in growable array is O(Log(N)) whereas the worst case for a push_back operation is O(N).

> Is there a situation/algorithm appends elements as many
> times (or more times) as it indexes into an array?

One very common example is string concatenation.

By far, the most common scenario for indexing is iterating over each element in sequence, which is handled just as fast as a vector by the growable array. So the question would be more meaningful if you added the clause: "and out of sequence" to your question.

Tanton Gibbs

Posts: 20
Nickname: tanton
Registered: Aug, 2005

Re: A Faster Growable Array ( Indexable Stack ) for C++ Posted: Nov 17, 2005 7:53 AM
Reply to this message Reply
Wow, those speed results for deque are really painful! Did that include optimizations turned on? If so, Ouch!

The standard specifies that &v[0] for a vector must give a contiguous array that can be used by a function expecting a C-style array. While theoretically that doesn't necessarily require a contiguous implementation, practically it does.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: A Faster Growable Array ( Indexable Stack ) for C++ Posted: Nov 17, 2005 8:04 AM
Reply to this message Reply
> Yes a std::vector has to be contigous. Much like the c_str
> method in std::string this is for c interop. Allowing you
> to write stuff like &(v[0]) to get the start of the array
> to pass to a c function. I believe this is mentioned in
> Stroustrup's book.

That is ture for std::valarray, but there is no mention of that requirement for std::vector in either TCPL or the Standard. Here is, what I believe to be, the relevant section of the standard:


23.2.4 Template class vector
...
A vector satisfies all of the requirements of a container and of a reversible container (given in two tables
in 23.1) and of a sequence, including most of the optional sequence requirements (23.1.1). The exceptions
are the push_front and pop_front member functions, which are not provided. Descriptions are provided
here only for operations on vector that are not described in one of these tables or for operations
where there is additional semantic information.
...


And then it goes on, and only covers:

23.2.4.1 vector constructors, copy, and assignment
23.2.4.2 vector capacity
23.2.4.3 vector modifiers
23.2.4.4 vector specialized algorithms

There are no mentions here of special requirements on operator[] or on internal memory

Code using &vector[index] is accepted happily by any compiler. The problem is that most vectors use pointers as their iterators, so the compiler can't recognize the incorrect usage of the sequence. This idiom you describe is most likely widespread despite being technically incorrect.

Ironically this is a case where the tail wags the dog: if someone were to make the growable_array conformant to the STL requirements for std::vector, it would break so much preexisting code as it to render it unusable.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: A Faster Growable Array ( Indexable Stack ) for C++ Posted: Nov 17, 2005 8:19 AM
Reply to this message Reply
> Wow, those speed results for deque are really painful!
> Did that include optimizations turned on?

Yes.

> If so, Ouch!

To be fair that is just the comparison against the STLPort implementation. I've posted the code in the public domain for anyone to play with at http://www.ootl.org

Deque implementations vary widely from what I can see (at least between Visual C++ and GCC STLPort). However, the reverse exponential list approach I propose, may very well be completely novel, but such a boisterous claim is hard to verify. I just don't have the patience to pour through journals. Perhaps I should just make the claim, and let someone prove me wrong. ;-)

> The standard specifies that &v[0] for a vector must give a
> contiguous array that can be used by a function expecting
> a C-style array.

Where in the standard? As I mentioned I've searched and I can't find it.

> While theoretically that doesn't
> necessarily require a contiguous implementation,
> practically it does.

If the requirement you mention is true, then this conclusion is inescapable.

Tanton Gibbs

Posts: 20
Nickname: tanton
Registered: Aug, 2005

Re: A Faster Growable Array ( Indexable Stack ) for C++ Posted: Nov 17, 2005 9:17 AM
Reply to this message Reply
> > The standard specifies that &v[0] for a vector must give
> a
> > contiguous array that can be used by a function
> expecting
> > a C-style array.
>
> Where in the standard? As I mentioned I've searched and I
> can't find it.

It was actually in an approved defect report. You can look here for more information: http://swsw.motovilov.net/weblogs/swsw/archives/2005/08/must_elements_o.html

which points to #69 in the lwg defect report page: http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: A Faster Growable Array ( Indexable Stack ) for C++ Posted: Nov 17, 2005 9:32 AM
Reply to this message Reply
Thanks for pointing that out! Saves me some embarassment.

Max Lybbert

Posts: 314
Nickname: mlybbert
Registered: Apr, 2005

I'm glad I didn't answer last night Posted: Nov 17, 2005 10:05 AM
Reply to this message Reply
I learned something about the standard today (from this discussion), so the draft response I was working on last night was incorrect.

However, I recognize that while random access in this Growable Array is O(1), it does steps similar to std::vector's at() method (which is O(1), but expected to be slower than operator[] O(1)). I'm kind of curious how this container compares to std::vector's random access methods.

Flat View: This topic has 18 replies on 2 pages [ 1  2 | » ]
Topic: Rich Web Apps Beyond the Browser: Macromedia Central Previous Topic   Next Topic Topic: C++ Cookbook from O'Reilly

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use