The Artima Developer Community
The C++ Source
Reading Unix-style Directories via STL-compliant Sequences
by Matthew Wilson
June 21, 2004

Listing 4 shows the definition of the readdir_sequence::const_iterator class, which is where all the action happens. Because the opendir() API affords only single-pass traversal, only the Input Iterator concept [6] is supported. Copy construction and copy assignment of iterators are supported by means of a reference-counted shared handle, of type readdir_sequence::const_iterator::rds_shared_handle [7]. Multiple concurrent enumerations may be conducted by calling begin() more than once, but since we're dealing with file systems, whose contents may change at any time as a result of other processes, ranges obtained from separate calls to begin() may not contain the same elements.
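
To make the shared-handle idea concrete, here is a minimal sketch of how a reference-counted handle can give a single-pass directory iterator its copy semantics. It is not the code from Listing 4: shared_dir_handle and dir_iterator are hypothetical names standing in for rds_shared_handle and const_iterator, there is no filtering, and the count is a plain integer, so, as per note 7, it is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <dirent.h>
#include <string>

// Hypothetical stand-in for rds_shared_handle: owns the DIR* and closes it
// when the last iterator that refers to it goes away.
struct shared_dir_handle
{
    DIR*        dir;
    std::size_t refs;   // plain integer, so not thread-safe

    explicit shared_dir_handle(DIR* d) : dir(d), refs(1) {}
    void add_ref() { ++refs; }
    void release()
    {
        assert(refs > 0);
        if (0 == --refs)
        {
            ::closedir(dir);
            delete this;
        }
    }
};

// Hypothetical single-pass iterator. Copies share one directory stream, so
// advancing any copy advances the stream for all of them (Input Iterator).
class dir_iterator
{
public:
    dir_iterator() : m_handle(NULL), m_entry(NULL) {}   // the end() state
    explicit dir_iterator(char const* path) : m_handle(NULL), m_entry(NULL)
    {
        if (DIR* d = ::opendir(path))
        {
            m_handle = new shared_dir_handle(d);
            ++*this;                                    // fetch the first entry
        }
    }
    dir_iterator(dir_iterator const& rhs)
        : m_handle(rhs.m_handle), m_entry(rhs.m_entry)
    {
        if (m_handle) m_handle->add_ref();
    }
    dir_iterator& operator=(dir_iterator const& rhs)
    {
        if (rhs.m_handle) rhs.m_handle->add_ref();      // add first: safe for self-assignment
        if (m_handle) m_handle->release();
        m_handle = rhs.m_handle;
        m_entry  = rhs.m_entry;
        return *this;
    }
    ~dir_iterator() { if (m_handle) m_handle->release(); }

    char const* operator*() const
    {
        assert(NULL != m_entry);
        return m_entry->d_name;
    }
    dir_iterator& operator++()
    {
        assert(NULL != m_handle);
        m_entry = ::readdir(m_handle->dir);             // NULL once the stream is exhausted
        return *this;
    }
    // Only comparison against the end() state is meaningful for this sketch.
    bool operator==(dir_iterator const& rhs) const { return m_entry == rhs.m_entry; }
    bool operator!=(dir_iterator const& rhs) const { return !(*this == rhs); }

private:
    shared_dir_handle* m_handle;
    struct dirent*     m_entry;
};
```

With this in place, a loop such as for (dir_iterator it(path); it != dir_iterator(); ++it) visits each entry once, and passing the iterator by value to an algorithm neither duplicates nor prematurely closes the underlying DIR*.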

The opendir() API returns all the entries in a particular directory, irrespective of whether they are files or directories, including the dots directories, "." and "..". Specifying files without directories, or vice versa, in the second argument to the readdir_sequence constructor causes only the matching entries to be returned, which is a nice convenience. If neither is specified, the sequence returns both, because this conforms to the API's behaviour and also because it is more efficient, as you'll see shortly. Since most directory enumeration (in my experience, at least) is not concerned with the dots directories, they are elided from the enumeration range by default. Specifying the readdir_sequence::includeDots flag causes them to be included in the range.
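
The defaulting rule just described can be sketched as a small flag-normalisation step. The flag names follow the article, but the numeric values and the helper functions normalise_flags and needs_stat are assumptions made for illustration; they are not taken from the STLSoft source.

```cpp
#include <cassert>

// Hypothetical values; this page names the flags but does not give their values.
enum rds_flags
{
    includeDots = 0x01,
    directories = 0x02,
    files       = 0x04
};

// If the caller asked for neither files nor directories, return both,
// which matches what the underlying API does anyway.
inline int normalise_flags(int flags)
{
    assert(0 == (flags & ~(includeDots | directories | files)));   // no unknown bits
    if (0 == (flags & (files | directories)))
    {
        flags |= files | directories;
    }
    return flags;
}

// An entry's type only has to be inspected when exactly one of the two
// types was requested; with both set, every entry is acceptable.
inline bool needs_stat(int flags)
{
    return (normalise_flags(flags) & (files | directories)) != (files | directories);
}
```

Under these assumptions, a sequence constructed with no type flags never needs to examine an entry's type, which is the efficiency argument made above.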

Entry filtering is performed within operator++(). Detection of files and/or directories is done by calling stat(), but this is only called when just one type or the other is to be returned; hence no unnecessary costs ensue [8]. If an entry does not match, the for loop continues and the next entry is retrieved. Dot elision is similarly done by detecting whether the entry is "." or ".." [9]. When readdir() returns NULL the enumeration is complete, and the iterator enters a state in which it compares equal to the iterator returned by end(), terminating the client iteration loop (assuming it's written correctly!).
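
The filtering loop can be sketched as a free function. This is an illustration under stated assumptions rather than the body of operator++() from Listing 4: the flag values, is_dots and next_matching_entry are hypothetical, "files" is taken to mean regular files, and entries for which stat() fails are simply skipped.

```cpp
#include <cassert>
#include <cstring>
#include <dirent.h>
#include <string>
#include <sys/stat.h>

// Hypothetical flag values, for illustration only.
enum { includeDots = 0x01, directories = 0x02, files = 0x04 };

// Exact comparison, so that names such as "..any_file" are not elided (see note 9).
inline bool is_dots(char const* name)
{
    return 0 == std::strcmp(name, ".") || 0 == std::strcmp(name, "..");
}

// Returns the next entry that satisfies flags, or NULL when readdir() is
// exhausted. Assumes at least one of files/directories is set in flags.
inline struct dirent* next_matching_entry(DIR* dir, std::string const& directory, int flags)
{
    assert(NULL != dir);
    for (;;)
    {
        struct dirent* entry = ::readdir(dir);
        if (NULL == entry)
        {
            return NULL;                    // enumeration complete
        }
        if (0 == (flags & includeDots) && is_dots(entry->d_name))
        {
            continue;                       // dot elision
        }
        if ((flags & (files | directories)) != (files | directories))
        {
            // Only one type is wanted, so the cost of stat() is justified.
            struct stat st;
            std::string const path = directory + '/' + entry->d_name;
            if (0 != ::stat(path.c_str(), &st))
            {
                continue;                   // cannot classify: skip (an assumption)
            }
            if ((flags & directories) && !S_ISDIR(st.st_mode))
            {
                continue;
            }
            if ((flags & files) && !S_ISREG(st.st_mode))
            {
                continue;
            }
        }
        return entry;                       // match: leave the loop
    }
}
```

A NULL return corresponds to the point at which the iterator would take on the state that compares equal to end().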

In my next article, I'll describe the mapping of the UNIX glob() API, which supports a more refined STL Iterator model [10] and presents a number of different challenges to providing a simple and efficient sequence class. If you're interested, the STLSoft libraries are available for download.

Acknowledgements

Thanks to Jeremy Siek for wielding the scythe without mercy, and helping me dramatically improve on the first draft.

Notes

  1. Boost is an open-source organisation whose focus is the development of libraries that integrate with the C++ Standard Library, and is located at. It has thousands of members, including many of the top names in the C++ software community.
  2. STLSoft is an open-source organisation whose focus is the development of robust, lightweight, cross-platform STL-compatible software. It has fewer members than Boost.
  3. This is similar to the omission of operator[] from std::list.
  4. W. Richard Stevens, Advanced Programming In The UNIX Environment, Addison-Wesley, 1993.
  5. I prefer this string since I can use it without linking in a load of stuff from the standard library in contexts where I'm producing very small executable programs/dynamic-libraries (e.g. http://shellext.com/). It's also got a 32-bit footprint (on 32-bit systems), which is nice when you've got lots of empty ones, and affords its user predictable and consistent behaviour (i.e. no guessing whether it's got mad COW disease). It doesn't have all the unnecessary kitchen sink methods found in std::basic_string, since you can use algorithms for them. It is compatible with the IOStreams, while having no knowledge of them whatsoever. But I'm not given to attempting to sell its use to others; we can settle on it being an internal STLSoft implementation component.
  6. http://www.sgi.com/tech/stl/InputIterator.html
  7. As with the rest of the STL, the readdir_sequence is not written to be thread-safe, so rds_shared_handle does not use atomic integer operations. If you want thread-safety, you must handle that in client code.
  8. http://www.comeaucomputing.com/faqs/genfaq.html#whatcando
  9. The original version also incorrectly removed any other entry beginning with "..", e.g. "..any_file", which understandably made a couple of users unhappy.
  10. http://www.sgi.com/tech/stl/Iterators.html

About the Author

Matthew Wilson is a software development consultant for Synesis Software, and creator of the STLSoft libraries. He is author of the forthcoming book Imperfect C++ (to be published Sept 2004 by Addison-Wesley), and is currently working on his next two books, one of which is not about C++. Matthew can be contacted via http://imperfectcplusplus.com/.
