The Artima Developer Community

The C++ Source
Wild-card Searches of UNIX Directories with Random-Access Iterators
by Matthew Wilson
September 13, 2004

<<  Page 2 of 3  >>

glob()-ing using unixstl::glob_sequence

That's a lot of work for something conceptually simple. Now contrast that with the sublime simplicity of the functionally equivalent form using the UNIXSTL [2] glob_sequence class [3], as used in the following program.

// enumwithglobsequence.cpp: Enumerating sub-directories using unixstl::glob_sequence
#include <unixstl_glob_sequence.h>  // unixstl::glob_sequence

#include <algorithm>                // std::copy
#include <iterator>                 // std::ostream_iterator
#include <iostream>                 // std::cout, std::endl
#include <string>                   // std::string
#include <vector>                   // std::vector

using std::copy;
using std::cout;
using std::endl;
using std::ostream_iterator;
using std::string;
using std::vector;
using unixstl::glob_sequence;

const char HOME[]     = "/home/matty/";
const char PATTERN[]  = ".*";

int main()
{
  glob_sequence  dir(HOME, PATTERN, glob_sequence::files);
  vector<string> dotNames(dir.begin(), dir.end());

  cout << "Dumping . files in " << HOME << endl;
  copy(dotNames.begin(), dotNames.end(), ostream_iterator<string>(cout, "\n"));
  return 0;
}

I think you'll agree that there's a considerable saving in lines-of-code, about 25:2. Because the value_type of glob_sequence (see globsequence.h) is char const*, for which std::string provides a conversion constructor [4], we can pass the iterators from its begin() and end() methods to the constructor of the std::vector<std::string> instance dotNames.
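
The conversion described above does not depend on glob_sequence itself: any range of char const* elements can be handed to the iterator-range constructor of std::vector<std::string>. The following is a minimal sketch of that mechanism only. The helper names to_strings and sample_names are hypothetical, and the array of names merely stands in for the entries a glob_sequence would yield.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of UNIXSTL): copies any range whose
// elements are convertible to std::string into a vector of strings.
template <typename I>
std::vector<std::string> to_strings(I first, I last)
{
  // The iterator-range constructor builds each element via
  // std::string(char const*), so no explicit loop is required.
  return std::vector<std::string>(first, last);
}

// Stand-in data: plain C-strings, as a char const* value_type would give.
inline std::vector<std::string> sample_names()
{
  char const* const entries[] = { ".bashrc", ".profile", ".vimrc" };

  return to_strings(entries, entries + 3);
}
```

Calling sample_names() yields a vector of three std::string objects, each owning a copy of its characters, so the strings remain valid after the original sequence has gone away.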

It's Not Just About Dropping the SLOCs

The advantages afforded by using glob_sequence are more than just a reduction in the quantity of client code; the other issues of concern are also addressed. Automatic resource deallocation and exception safety are provided by glob_sequence's destructor calling globfree(): classic Resource Acquisition Is Initialisation [4]. Because we are using RAII, a failure code from glob() results in an instance of glob_sequence_exception being thrown from the constructor of glob_sequence, rather than requiring users to test the "validity" of the sequence instance after construction. Here is the class definition for glob_sequence_exception:

// globsequenceexception.h: Declaration of the unixstl::glob_sequence_exception class
// Includes (as shown in globsequence.h)

class glob_sequence_exception
  : public std::exception
{
/// Types
public:
  typedef std::exception          parent_class_type;
  typedef glob_sequence_exception class_type;

/// Construction
public:
  ss_explicit_k glob_sequence_exception(int globStatus, int errno_)
    : m_globStatus(globStatus)
    , m_errno(errno_)
  {}

/// Accessors
public:
  char const  *what() const /* throw() */
  {
    return "glob_sequence failure";
  }
  int get_globstatus() const
  {
    return m_globStatus;
  }
  int get_errno() const
  {
    return m_errno;
  }

// Members
private:
  int const  m_globStatus;
  int const  m_errno;

// Not to be implemented
private:
  class_type &operator =(class_type const &);
};
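
To make the RAII arrangement concrete, here is a minimal sketch of a wrapper that calls glob() in its constructor and globfree() in its destructor. It is not the UNIXSTL implementation: the class name scoped_glob is hypothetical, it throws std::runtime_error rather than glob_sequence_exception, and it assumes that every non-zero return from glob(), including GLOB_NOMATCH, should be reported as a failure, which the real class may or may not do.

```cpp
#include <glob.h>     // POSIX glob(), globfree(), glob_t

#include <cstddef>    // std::size_t, NULL
#include <cstring>    // std::memset
#include <stdexcept>  // std::runtime_error
#include <string>     // std::string

// Hypothetical minimal RAII wrapper over glob(); an illustration only.
class scoped_glob
{
public:
  explicit scoped_glob(char const* pattern, int flags = 0)
  {
    std::memset(&m_gl, 0, sizeof(m_gl));

    int const status = ::glob(pattern, flags, NULL, &m_gl);

    if (0 != status)
    {
      // The destructor does not run when a constructor throws, so any
      // partial results must be released here.
      ::globfree(&m_gl);

      throw std::runtime_error("glob() failed for: " + std::string(pattern));
    }
  }
  ~scoped_glob()
  {
    ::globfree(&m_gl); // released on every exit path, normal or exceptional
  }

  // Pointers into gl_pathv already behave as random-access iterators.
  char const* const* begin() const { return m_gl.gl_pathv; }
  char const* const* end() const   { return m_gl.gl_pathv + m_gl.gl_pathc; }
  std::size_t        size() const  { return m_gl.gl_pathc; }

private:
  scoped_glob(scoped_glob const&);            // not to be implemented
  scoped_glob& operator =(scoped_glob const&); // not to be implemented

  glob_t m_gl;
};
```

Client code can construct a scoped_glob inside a try block and iterate from begin() to end() without ever testing a validity flag; if construction succeeds, the results are usable, and they are freed when the object goes out of scope.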

Just as with readdir_sequence, the filtering of files and/or directories, and the elision of dots directories, are selected by specifying the appropriate flags to the glob_sequence constructor. You may select files, or directories (with or without dots directories), or both. Furthermore, there are a number of other flags that offer even more power than is provided by readdir_sequence. The noSort flag causes GLOB_NOSORT to be passed to the underlying call to glob(), which prevents it from sorting the results. One can presume that this will result in faster searching on at least some implementations, so this flag is included in the flags parameter's default value. The markDirs flag causes GLOB_MARK to be passed to glob(), which appends a trailing path name separator ('/') to every directory entry. This saves you the effort of having to check for the separator, and add one if it is missing, before appending file names or other search patterns.
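
The relationship between the class's flags and those of glob() might be sketched as follows. The enumerator values, the search_flags type and the three function names are assumptions made for illustration; only GLOB_NOSORT and GLOB_MARK are real POSIX constants. The two path-joining helpers show the check that GLOB_MARK makes unnecessary.

```cpp
#include <glob.h>  // GLOB_NOSORT, GLOB_MARK

#include <string>

// Hypothetical flag values; the real glob_sequence enumerators may differ.
enum search_flags
{
  noSort   = 0x01,
  markDirs = 0x02
};

// Translates the wrapper-level flags into the flags passed to glob().
inline int to_glob_flags(unsigned flags)
{
  int globFlags = 0;

  if (flags & noSort)   { globFlags |= GLOB_NOSORT; }
  if (flags & markDirs) { globFlags |= GLOB_MARK; }

  return globFlags;
}

// Without GLOB_MARK: the caller must check for a trailing separator.
inline std::string child_of(std::string const& dir, std::string const& name)
{
  if (!dir.empty() && '/' != dir[dir.size() - 1])
  {
    return dir + '/' + name;
  }
  return dir + name;
}

// With GLOB_MARK: a directory entry already ends in '/', so plain
// concatenation is enough.
inline std::string child_of_marked(std::string const& markedDir,
                                   std::string const& name)
{
  return markedDir + name;
}
```

Given the entry "/usr/" from a marked search, child_of_marked("/usr/", "include") produces "/usr/include" directly, whereas an unmarked "/usr" needs the extra test in child_of().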

Finally, the absolutePath flag causes the search directory, and therefore all the results, to be evaluated as an absolute path. In other words, if the working directory is "/usr/include", and the search directory is given as "..", then the entries will be rooted at "/usr/", rather than at "../". This functionality is provided by the class itself (see globsequencemethods.h), rather than being part of glob()'s otherwise impressively rich option set.
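
One way to obtain an absolute search directory before calling glob() is POSIX realpath(), as in the sketch below. This is an assumption about how such a feature could be built, not a description of what globsequencemethods.h does: the function name absolute_directory is hypothetical, and realpath() also resolves symbolic links, which the real class may not do. Passing NULL as the second argument relies on POSIX.1-2008 behaviour.

```cpp
#include <stdlib.h>   // realpath() (POSIX)

#include <cstdlib>    // std::free
#include <stdexcept>  // std::runtime_error
#include <string>     // std::string

// Hypothetical helper: returns dir as an absolute path with a trailing '/'.
inline std::string absolute_directory(char const* dir)
{
  // With a NULL buffer, realpath() allocates the result with malloc().
  char* const resolved = ::realpath(dir, NULL);

  if (NULL == resolved)
  {
    throw std::runtime_error("cannot resolve: " + std::string(dir));
  }

  std::string result;

  try
  {
    result = resolved;
  }
  catch (...)
  {
    std::free(resolved); // do not leak if the copy throws
    throw;
  }
  std::free(resolved);

  // Ensure a trailing separator, so a pattern can be appended directly.
  if (result.empty() || '/' != result[result.size() - 1])
  {
    result += '/';
  }

  return result;
}
```

With the working directory set to "/usr/include", and assuming no symbolic links are involved, absolute_directory("..") would give "/usr/", to which a pattern such as "*.h" can then be appended before the call to glob().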
