This article shows how to easily process UNIX-style directory entries as STL sequences.
Copyright © 2004, Matthew Wilson
The C++ community is gradually moving towards the STL paradigm (containers, iterators, function objects, algorithms and adaptors), which provides great advantages in terms of commonality of expression, reduction in developer effort, and greater robustness, maintainability and reuse. However, apart from the standard library and one or two exceptions (Boost, STLSoft), there is relatively little STL-compliant code, in part because of the complexities involved in its development.
In this article I'll demonstrate how one kind of enumeration, the UNIX opendir()/readdir() API, can be mapped into an STL-compliant sequence-like class providing Input Iterators. I examine the implementation of readdir_sequence from UNIXSTL, the UNIX-specific sub-project of STLSoft.
Before getting into the details, consider the advantages of an STL-compliant approach. Let's say you need to load the names of the entries in /home/matty/ into a vector of strings. Using the raw opendir() API, your code might look something like that shown in Listing 1.
It's not an enormous amount of code, to be sure, but it's still quite a bit for something so conceptually straightforward. Let's look at the readdir_sequence-based version, in Listing 2. Comparing the two code snippets, it's clear that readdir_sequence represents a big win over the version using the opendir() API. There is less code, and the release of the search handle, via closedir(), is handled automatically through Resource Acquisition Is Initialisation, which improves robustness. Indeed, the first example is not exception-safe, since the std::string constructor and the std::vector<>::push_back() method may throw exceptions; the second version is exception-safe.
The one flaw in the design is that the value_type is
dirent const*, which means you have to explicitly enumerate the
entries, rather than use algorithms or iterator-based constructors (see Sidebar).
Let's now look at how it's implemented. Listing 3 shows the definition of the readdir_sequence class. It provides begin() and end() methods, which return iterators of member type const_iterator, since only non-mutating access to the entries is provided. Hence, readdir_sequence provides a read-only view on a directory.
It also provides an empty() method, which tests whether begin() equals end(). I've deliberately omitted a size() member, since the size could only be obtained by conducting an enumeration over the range, which is a costly operation. Not providing this method is an unequivocal documentation of this fact. If one really wants to calculate the range size, it can be done by enumerating the range explicitly, for example with std::distance().
For convenience to client code, a get_directory() method is provided, which returns the search directory, having ensured, in the constructor, that it has a trailing path name separator ('/'). This means that if you need to express the returned values in absolute form, you can do so simply, as in:
dirNames.push_back(dir.get_directory() + (*b)->d_name);
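The trailing-separator guarantee is what makes that plain concatenation safe. A minimal sketch of such a normalisation step, using a hypothetical free function ensure_trailing_separator (the library performs the equivalent work inside its constructor, and its actual code is not shown here):

```cpp
#include <string>

// Hypothetical helper: returns `directory` ending in '/'.
// An empty string is returned unchanged, so it is not silently turned into "/".
std::string ensure_trailing_separator(std::string directory)
{
    if (!directory.empty() && '/' != directory[directory.size() - 1])
    {
        directory += '/';
    }
    return directory;
}
```

With this in place, ensure_trailing_separator("/home/matty") + "docs" yields "/home/matty/docs" whether or not the caller supplied the final '/'.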
The two parameters to the constructor are the directory to search and the flags, which control the search. I'll look at these shortly.
Since all the member variables of this class are fully-fledged value types, there is no need to proscribe or provide explicit implementations of either the copy constructor or the copy assignment operator. The compiler-provided ones will work quite nicely and safely.
Another notable aspect to the class is that the type of the string member, m_directory, depends on whether the symbol PATH_MAX is defined. If it is, then the operating environment has a fixed maximum path limit, and the class uses the STLSoft string basic_static_string, which has an internal array of the given size, so there's no memory allocation involved. If PATH_MAX is not defined, then the operating environment does not have a fixed maximum path limit, so the STLSoft string basic_simple_string is used instead. It's a small efficiency, certainly, but I'm funny like that.
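The selection technique can be sketched with the preprocessor alone. The types below are stand-ins, not the STLSoft classes: fixed_path is a hypothetical fixed-capacity string, std::string plays the role of the heap-allocating alternative, and path_string is an assumed name for the chosen type.

```cpp
#include <limits.h>   // defines PATH_MAX on platforms with a fixed path limit

#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

#ifdef PATH_MAX

// Hypothetical fixed-capacity string: the storage lives inside the object,
// so constructing one performs no memory allocation.
class fixed_path
{
public:
    explicit fixed_path(char const* s)
        : m_len(std::strlen(s))
    {
        if (m_len > static_cast<std::size_t>(PATH_MAX))
        {
            throw std::length_error("path exceeds PATH_MAX");
        }
        std::memcpy(m_buf, s, m_len + 1); // copy including the terminating NUL
    }

    char const* c_str() const { return m_buf; }
    std::size_t size() const  { return m_len; }

private:
    char        m_buf[PATH_MAX + 1];
    std::size_t m_len;
};

typedef fixed_path  path_string;   // fixed limit known: no allocation needed

#else

typedef std::string path_string;   // no fixed limit: fall back to a growable string

#endif
```

Client code written against the shared subset of the interface (construction from a C string, c_str(), size()) compiles unchanged with either choice.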
A further saving comes from the use of the member holding the directory. Since the directory is constant for the life of the sequence/iterator, and its contents are not revealed directly outside the iterator instance, you are able to reuse it for constructing the full path names of the entries, in order to call stat() on them. The m_dirLen member remembers the original length of the directory alone, so it can be truncated to that length (which includes the path name separator) ready for each entry name to be appended to it.
The opendir() API provides a two-step process to directory enumeration (in contrast to, say, Win32's FindFirstFile() API), so the sequence class starts the enumeration, in its begin() method, and hands the DIR pointer over to the iterator to walk through the matched items.