The Artima Developer Community
Yet Another Recursive Descent Parser for C++

13 replies on 1 page. Most recent reply: Dec 21, 2004 10:44 AM by Christopher Diggins

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Yet Another Recursive Descent Parser for C++
Posted: Dec 15, 2004 10:23 PM
Summary
I just finished the first version of the YARD (Yet Another Recursive Descent) Parser for C++, and posted it to CodeProject.com. The YARD parser is a simple regular expression parsing toolset which can operate on generic data.
The YARD (Yet Another Recursive Descent) parser is now posted online at CodeProject.com. The article describes how to use the YARD parser as a regular expression string tokenizer.

The YARD parser is a very simple regular expression pattern matcher that can match regular expressions in arbitrary data. It is designed as a simpler alternative to external tools like Bison and Flex, and is even simpler to use than the Boost Spirit library. Unlike most RD parsers, it uses template meta-functions to describe the grammar productions, which are combined using regular expression operators. It isn't as complicated as it might seem. The production rules for an identifier in Heron or C++ are expressed as:

  typedef re_or<MatchLetter, MatchChar<'_'> > MatchIdentFirstChar;
  typedef re_or<MatchIdentFirstChar, MatchNumber> MatchIdentOtherChar;
  typedef re_and<MatchIdentFirstChar, re_star<MatchIdentOtherChar> > MatchIdent;
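
To make the idea concrete, here is a minimal sketch of how combinators with these names could be implemented. It is not YARD's actual source. The iterator type, the `match()` signature, the rule that a failed match leaves the position untouched, the meaning of `MatchNumber` (one decimal digit), and the helpers `ident_length` and `check_ident_examples` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical sketch, not YARD's real API. Every rule is a type with a
// static match() that advances `pos` on success and leaves it alone on failure.
typedef std::string::const_iterator Iter;

// Terminal: matches exactly the character C.
template<char C>
struct MatchChar {
    static bool match(Iter& pos, Iter end) {
        if (pos == end || *pos != C) return false;
        ++pos;
        return true;
    }
};

// Terminal: matches one ASCII letter.
struct MatchLetter {
    static bool match(Iter& pos, Iter end) {
        if (pos == end || !std::isalpha(static_cast<unsigned char>(*pos))) return false;
        ++pos;
        return true;
    }
};

// Terminal: matches one decimal digit (assumed meaning of MatchNumber).
struct MatchNumber {
    static bool match(Iter& pos, Iter end) {
        if (pos == end || !std::isdigit(static_cast<unsigned char>(*pos))) return false;
        ++pos;
        return true;
    }
};

// Alternation: try A first, then B.
template<typename A, typename B>
struct re_or {
    static bool match(Iter& pos, Iter end) {
        return A::match(pos, end) || B::match(pos, end);
    }
};

// Sequence: A followed by B, backtracking to the start if either fails.
template<typename A, typename B>
struct re_and {
    static bool match(Iter& pos, Iter end) {
        Iter start = pos;
        if (A::match(pos, end) && B::match(pos, end)) return true;
        pos = start;
        return false;
    }
};

// Repetition: zero or more A. Always succeeds.
template<typename A>
struct re_star {
    static bool match(Iter& pos, Iter end) {
        for (;;) {
            Iter before = pos;
            // Stop when A fails or matches without consuming anything.
            if (!A::match(pos, end) || pos == before) break;
        }
        return true;
    }
};

// The identifier rules from the post, unchanged.
typedef re_or<MatchLetter, MatchChar<'_'> > MatchIdentFirstChar;
typedef re_or<MatchIdentFirstChar, MatchNumber> MatchIdentOtherChar;
typedef re_and<MatchIdentFirstChar, re_star<MatchIdentOtherChar> > MatchIdent;

// Hypothetical helper: length of the identifier at the start of `s` (0 if none).
inline std::size_t ident_length(const std::string& s) {
    Iter pos = s.begin();
    MatchIdent::match(pos, s.end());
    return static_cast<std::size_t>(pos - s.begin());
}

// Usage example.
inline void check_ident_examples() {
    assert(ident_length("_foo42 = 1") == 6);  // stops at the space
    assert(ident_length("9lives") == 0);      // a digit cannot start an identifier
}
```

Because each rule is a type and each `match()` is a static function, the whole grammar is resolved at compile time; there are no rule objects to construct at run time in this sketch.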

I am currently working on an XML parser using YARD. The YARD library will be the core parsing engine for the next version of HeronFront.


Florian Heidenreich

Posts: 2
Nickname: ganymed
Registered: Dec, 2004

Re: Yet Another Recursive Descent Parser for C++ Posted: Dec 16, 2004 1:57 AM
Thank you very much for the interesting article (I came here from CodeProject.com).
I really enjoy reading your code and learn a lot from it!

Best regards,
~ Florian Heidenreich

John D. Mitchell

Posts: 244
Nickname: johnm
Registered: Apr, 2003

Re: Yet Another Recursive Descent Parser for C++ Posted: Dec 16, 2004 8:11 AM
> The YARD parser is a very simple regular expression
> pattern matcher which can match regular expression in
> arbitrary data.

So is YARD an actual parser generator or is it just a lexer generator? From your wording and the code snippet, it looks like just a lexer generator.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Yet Another Recursive Descent Parser for C++ Posted: Dec 16, 2004 5:57 PM
> Thank you very much for the interesting article (I came
> here from CodeProject.com).
> I really enjoy reading your code and learn a lot from it!
>
> Best regards,
> ~ Florian Heidenreich

Thank you very much for the kind words!

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Yet Another Recursive Descent Parser for C++ Posted: Dec 16, 2004 6:08 PM
> So is YARD an actual parser generator or is it just
> a lexer generator? From your wording and the code
> snippet, it looks like just a lexer generator.

YARD is a set of meta-functions for building either parsers or lexers. You write the grammar with these meta-functions much as you would write a BNF specification.

The article at CodeProject shows only how to use it as a lexer. I will show how to build an entire XML parser in the next article. Here is a preview of the rules that the XML parser will use:


  struct Misc : public
    re_or3<
      Comment,
      PI,
      S
    >
  { };

  struct prolog : public
    re_and3<
      re_opt<XMLDecl>,
      re_star<Misc>,
      re_opt<
        re_and<
          doctypedecl,
          re_star<Misc>
        >
      >
    >
  { };

  struct document : public
    re_and3<
      prolog,
      element,
      re_star<Misc>
    >
  { };
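
For readers who want to see how named rules in this style might fit together, here is a small sketch. The names `re_opt`, `re_or3` and `re_and3` come from the preview above, but their implementations here are assumptions, not YARD's actual code. The toy grammar (`Sign`, `Digits`, `Fraction`, `Number`), the terminal `MatchDigit`, and the helpers `matches_all` and `check_number_examples` are hypothetical and stand in for the XML rules, which depend on productions not shown in this thread.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical sketch: a failed match() must leave `pos` unchanged.
typedef std::string::const_iterator Iter;

template<char C>
struct MatchChar {
    static bool match(Iter& pos, Iter end) {
        if (pos == end || *pos != C) return false;
        ++pos;
        return true;
    }
};

struct MatchDigit {
    static bool match(Iter& pos, Iter end) {
        if (pos == end || !std::isdigit(static_cast<unsigned char>(*pos))) return false;
        ++pos;
        return true;
    }
};

template<typename A, typename B>
struct re_or {
    static bool match(Iter& pos, Iter end) {
        return A::match(pos, end) || B::match(pos, end);
    }
};

template<typename A, typename B>
struct re_and {
    static bool match(Iter& pos, Iter end) {
        Iter start = pos;
        if (A::match(pos, end) && B::match(pos, end)) return true;
        pos = start;  // backtrack
        return false;
    }
};

template<typename A>
struct re_star {
    static bool match(Iter& pos, Iter end) {
        for (;;) {
            Iter before = pos;
            if (!A::match(pos, end) || pos == before) break;
        }
        return true;
    }
};

// Optional: consumes A if it is present, and succeeds either way.
template<typename A>
struct re_opt {
    static bool match(Iter& pos, Iter end) {
        A::match(pos, end);
        return true;
    }
};

// Three-way forms, assumed here to be plain nestings of the binary ones.
template<typename A, typename B, typename C>
struct re_or3 : public re_or<A, re_or<B, C> > { };

template<typename A, typename B, typename C>
struct re_and3 : public re_and<A, re_and<B, C> > { };

// Named rules written in the same struct-inheritance style as the preview.
struct Sign : public re_or<MatchChar<'+'>, MatchChar<'-'> > { };
struct Digits : public re_and<MatchDigit, re_star<MatchDigit> > { };
struct Fraction : public re_and<MatchChar<'.'>, Digits> { };
struct Number : public re_and3<re_opt<Sign>, Digits, re_opt<Fraction> > { };

// Hypothetical helper: true if Rule matches the entire string.
template<typename Rule>
bool matches_all(const std::string& s) {
    Iter pos = s.begin();
    return Rule::match(pos, s.end()) && pos == s.end();
}

// Usage example.
inline void check_number_examples() {
    assert(matches_all<Number>("-12.5"));
    assert(!matches_all<Number>("12."));  // a fraction needs digits after the dot
}
```

Deriving a `struct` from the combinator expression, rather than using a `typedef`, gives each production its own name as a distinct type, which keeps compiler diagnostics readable and allows rules to be forward-declared.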


Have I explained myself a bit better?

John D. Mitchell

Posts: 244
Nickname: johnm
Registered: Apr, 2003

Re: Yet Another Recursive Descent Parser for C++ Posted: Dec 16, 2004 7:36 PM
(A) So YARD isn't actually a generator, it's just a stylistic approach to writing regular expression based recognizers. Or am I still missing something?

(B) Why use such a simplistic, manual approach when there are good tools out there?

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Yet Another Recursive Descent Parser for C++ Posted: Dec 16, 2004 9:30 PM
> (A) So YARD isn't actually a generator, it's just a
> stylistic approach to writing regular expression based
> recognizers. Or am I still missing something?

No, it isn't a code generator. I wouldn't call it "just a stylistic approach", unless you would consider the Boost Spirit library to be such a thing. YARD is designed as a simple alternative to Spirit.

> (B) Why use such a simplistic, manual approach when there
> are good tools out there?

Code-generation tools have lots of well-known problems, such as long development cycles. They are also hard to maintain and to modify beyond their original purposes. I designed YARD so that I could have a very simple, easy-to-adapt code base that could be reused for a multitude of tasks. YARD is also very compact.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: Yet Another Recursive Descent Parser for C++ Posted: Dec 16, 2004 9:32 PM
I have just made YARD available for download as a zip without requiring registration on my web site at: http://www.cdiggins.com/yard.zip

Glen Ritchie

Posts: 12
Nickname: gmanndsu
Registered: Dec, 2004

Re: YetAnoRecDes (YETORED) Parser for C++ Posted: Dec 20, 2004 5:36 PM
Interesting.

What good tools out there would you suggest? Besides Spirit?

Would you use Haskel? LISP? ANTLR?

Or why not just use a word processor with search and replace?

Glen

Glen Ritchie

Posts: 12
Nickname: gmanndsu
Registered: Dec, 2004

Re: YetAnoRecDes (YETORED) Parser for C++ Posted: Dec 20, 2004 5:36 PM
Whoops, sorry. Yes, I misspelled Haskell.

Sorry.

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: YetAnoRecDes (YETORED) Parser for C++ Posted: Dec 20, 2004 6:07 PM
> What good tools out there would you suggest? Besides
> Spirit?

Good tools for what?

> Would you use Haskel? LISP? ANTLR?

For writing recursive descent parsers? I am only interested in tools for parsing context-free grammars in C++, without using separate code-generation tools. I don't know about many other tools for doing that.
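
As an illustration of parsing a context-free grammar in plain C++ with no separate code-generation step, here is a minimal sketch that recognizes balanced parentheses, a language no regular expression can describe. The combinators follow the naming used earlier in this thread, but their implementations are assumptions rather than YARD's actual code, and `Group`, `Parens`, `is_balanced` and `check_paren_examples` are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: a failed match() must leave `pos` unchanged.
typedef std::string::const_iterator Iter;

template<char C>
struct MatchChar {
    static bool match(Iter& pos, Iter end) {
        if (pos == end || *pos != C) return false;
        ++pos;
        return true;
    }
};

template<typename A, typename B>
struct re_and {
    static bool match(Iter& pos, Iter end) {
        Iter start = pos;
        if (A::match(pos, end) && B::match(pos, end)) return true;
        pos = start;  // backtrack
        return false;
    }
};

template<typename A>
struct re_star {
    static bool match(Iter& pos, Iter end) {
        for (;;) {
            Iter before = pos;
            if (!A::match(pos, end) || pos == before) break;
        }
        return true;
    }
};

// Grammar:
//   Parens ::= Group*
//   Group  ::= '(' Parens ')'
// The forward declaration lets Group refer to Parens before it is defined.
// That self-reference is what takes the grammar beyond regular expressions.
struct Parens;

struct Group : public
    re_and<MatchChar<'('>, re_and<Parens, MatchChar<')'> > >
{ };

struct Parens : public re_star<Group> { };

// Hypothetical helper: true if `s` consists only of balanced parentheses.
inline bool is_balanced(const std::string& s) {
    Iter pos = s.begin();
    return Parens::match(pos, s.end()) && pos == s.end();
}

// Usage example.
inline void check_paren_examples() {
    assert(is_balanced("(()())"));
    assert(!is_balanced("(()"));
}
```

The recursion works because the body of each `match()` is only instantiated when it is first called, by which point both `Group` and `Parens` are complete types.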

> Or why not just use a word processor with search and
> replace?

To do what exactly? Count words? I think perhaps you don't realize that the YARD parser is a tool for matching context-free grammars (CFGs). A string tokenizer is simply one example of what we can do with the YARD parser, and counting words is one application of a string tokenizer.
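
The chain from grammar rule to tokenizer to word count can be sketched roughly as follows. This is an illustration under stated assumptions, not the tokenizer from the CodeProject article: the combinator implementations are guesses, and `MatchWord`, `tokenize`, `count_words` and `check_word_examples` are hypothetical names. A "word" is assumed to be a run of ASCII letters.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: a failed match() must leave `pos` unchanged.
typedef std::string::const_iterator Iter;

struct MatchLetter {
    static bool match(Iter& pos, Iter end) {
        if (pos == end || !std::isalpha(static_cast<unsigned char>(*pos))) return false;
        ++pos;
        return true;
    }
};

template<typename A, typename B>
struct re_and {
    static bool match(Iter& pos, Iter end) {
        Iter start = pos;
        if (A::match(pos, end) && B::match(pos, end)) return true;
        pos = start;  // backtrack
        return false;
    }
};

template<typename A>
struct re_star {
    static bool match(Iter& pos, Iter end) {
        for (;;) {
            Iter before = pos;
            if (!A::match(pos, end) || pos == before) break;
        }
        return true;
    }
};

// Word ::= Letter Letter*
struct MatchWord : public re_and<MatchLetter, re_star<MatchLetter> > { };

// Collects every non-empty match of Rule in `text`, skipping one character
// at a time wherever Rule does not match.
template<typename Rule>
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    Iter pos = text.begin();
    const Iter end = text.end();
    while (pos != end) {
        Iter start = pos;
        if (Rule::match(pos, end) && pos != start) {
            tokens.push_back(std::string(start, pos));
        } else {
            pos = start;
            ++pos;  // this character cannot start a token
        }
    }
    return tokens;
}

// Counting words is then just counting tokens.
inline std::size_t count_words(const std::string& text) {
    return tokenize<MatchWord>(text).size();
}

// Usage example.
inline void check_word_examples() {
    assert(count_words("the quick, brown fox") == 4);
    assert(count_words("") == 0);
}
```

Swapping `MatchWord` for a different rule type changes what counts as a token without touching the loop, which is the sense in which the tokenizer is only one application of the matcher.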

Glen Ritchie

Posts: 12
Nickname: gmanndsu
Registered: Dec, 2004

Re: (YETORED) Posted: Dec 20, 2004 6:19 PM
Sorry Chris,

These two replies were directed at "John D. Mitchell."

Oh yes, my word processor remark. Just a silly thought. But hey, why not implement a context-free grammar with WordPerfect? (Yes, WP is that smart.) Perl could, though.

Glen

John D. Mitchell

Posts: 244
Nickname: johnm
Registered: Apr, 2003

Re: YetAnoRecDes (YETORED) Parser for C++ Posted: Dec 20, 2004 9:00 PM
> What good tools out there would you suggest? Besides
> Spirit?
>
> Would you use Haskel? LISP? ANTLR?
>
> Or why not just use a word processor with search and
> replace?

Depends on what you're wanting to be able to do and in which language you wish to work.

For serious translators, ANTLR is a good choice (though I'm biased).

For building systems bottom up in a constructivist, linguistic manner, then a language like Lisp with hygienic macros has some serious benefits.

In the bigger picture, it's all about the language:
http://www.artima.com/weblogs/viewpost.jsp?thread=81574

Christopher Diggins

Posts: 1215
Nickname: cdiggins
Registered: Feb, 2004

Re: YARD Parser for C++ Posted: Dec 21, 2004 10:44 AM
> For building systems bottom up in a constructivist,
> linguistic manner, then a language like Lisp with hygienic
> macros has some serious benefits.

Template metaprogramming in C++ is, in and of itself, also an efficient functional programming language, and it is very well suited for recursive descent parsing.
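
For anyone who hasn't seen templates used this way, here is a tiny generic sketch of the functional style: recursion in place of loops, specialization in place of pattern matching, and immutable compile-time values. It is unrelated to YARD's source, and all of the names (`Factorial`, `Nil`, `Cons`, `Sum`, `OneTwoThree`, `check_meta_examples`) are made up for the example.

```cpp
#include <cassert>

// Recursive case: N! = N * (N - 1)!
template<unsigned N>
struct Factorial {
    static const unsigned long value = N * Factorial<N - 1>::value;
};

// Base case, selected by specialization much like a pattern match on 0.
template<>
struct Factorial<0> {
    static const unsigned long value = 1;
};

// A compile-time list of integers, built Lisp-style from Cons cells.
struct Nil { };

template<int Head, typename Tail>
struct Cons { };

// Sum folds over the list; it is declared first and defined per pattern.
template<typename List>
struct Sum;

template<>
struct Sum<Nil> {
    static const int value = 0;
};

template<int Head, typename Tail>
struct Sum<Cons<Head, Tail> > {
    static const int value = Head + Sum<Tail>::value;
};

typedef Cons<1, Cons<2, Cons<3, Nil> > > OneTwoThree;

// Usage example: both values are computed entirely by the compiler.
inline void check_meta_examples() {
    assert(Factorial<5>::value == 120);
    assert(Sum<OneTwoThree>::value == 6);
}
```

A grammar rule built from combinators has the same shape: a nested type expression that the compiler walks recursively, which is why the approach lines up so naturally with recursive descent.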

I have posted a new version of the YARD parser on my web site ( http://www.cdiggins.com/yard.zip ), which contains an XML grammar and XML parser. I have described the new YARD parser in great depth in a new article at CodeProject.com ( http://www.codeproject.com/useritems/yard-xml-parser.asp ).


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use