Flex and Bison in C++: An Example

Flex and Bison are used together to create parsers, usually for programming-related tasks like parsing source files or SQL statements. While not as easy on the programmer as newer libraries like Spirit or ANTLR, they are more efficient since they generate raw C code that can be compiled into your application. They are also capable of outputting C++ code, but clear documentation and examples demonstrating this are scarce.

What I present here are my efforts over the last couple of days towards creating a program that uses the two together in C++ mode. The result is trivial – numbers are recognized and divided by five – since the focus of this program is to demonstrate using Flex and Bison. If you do not know Flex and Bison, I recommend Lex & Yacc by Tom Niemann (Flex is a clone of Lex, Bison is a clone of Yacc).

Download the example source code here.

This is an example, not a tutorial – I’m not going to take you through the code line by line. I do have some notes for you, though:

  • waffleshop.y
    • The require line tells Bison that this is meant for Bison 2.4.1. While it might work on other versions of Bison, the C++ code generated is ‘experimental’ and subject to change between versions.
    • The skeleton line specifies an alternate template to use – lalr1.cc is the C++ version.
    • The parse-param options tell Bison that we want the class to have an additional member variable – the scanner. Since Bison calls the scanner to get a token, the Bison class needs to have a reference to the scanner. The lex-param option tells Bison that when it calls yylex, it should pass the scanner as an additional argument. Our implementation of yylex invokes the passed scanner and returns the result.
    • There are two %code blocks. The first is ‘%code requires’: the code inside this block is put in the generated header file as well as the generated source file. The other %code block contains code that only goes in the source file, and since we don’t want anything else calling the yylex global function, we make it static to the file.
  • WaffleshopParser.h
    • This is a convenience class that bundles the parser and the scanner together; it is good Object-Oriented design to do this, and is recommended in the Bison documentation.
  • WaffleshopScanner.h
    • FlexLexer.h is provided by Flex, and defines the base class that generated Flex scanner classes inherit from. The preprocessor directive surrounding it is necessary because FlexLexer.h is a mess (it says so right in the Flex generated code on line 16).
    • The yylex function has to be overloaded because Bison passes a pointer to the yylval variable. It would be nice if we could just specify this in YY_DECL, but the base Flex class declares yylex with no arguments as pure virtual, forcing us to implement it anyway; I just went ahead and used it.
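The wiring described in the notes above can be sketched in plain C++ without running Flex or Bison at all. This is only an illustration of the pattern, not the code from the download: Scanner, Parser, Driver, SemanticValue and the token values are hypothetical stand-ins for the generated classes, and the "grammar" is reduced to the divide-by-five behaviour of the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for the semantic value Bison hands to yylex.
struct SemanticValue { int number = 0; };

// Token codes; 0 means end of input. The INTEGER value is arbitrary here.
enum Token { END = 0, INTEGER = 258 };

// Hypothetical stand-in for the Flex-generated scanner class.
class Scanner {
public:
    explicit Scanner(std::vector<int> numbers) : numbers_(std::move(numbers)) {}

    // Overload taking a pointer to the semantic value, mirroring the
    // yylex overload described in the WaffleshopScanner.h note.
    int yylex(SemanticValue* yylval) {
        assert(yylval != nullptr);
        if (next_ >= numbers_.size()) return END;
        yylval->number = numbers_[next_++];
        return INTEGER;
    }

private:
    std::vector<int> numbers_;
    std::size_t next_ = 0;
};

// File-local forwarding function, like the static yylex in the %code
// block: the parser calls it, and it invokes the scanner it was handed.
static int yylex(SemanticValue* yylval, Scanner& scanner) {
    return scanner.yylex(yylval);
}

// Hypothetical stand-in for the Bison-generated parser. The reference
// member plays the role of the extra member that parse-param adds.
class Parser {
public:
    explicit Parser(Scanner& scanner) : scanner_(scanner) {}

    // Pulls tokens until end of input; each INTEGER is divided by five.
    std::vector<int> parse() {
        std::vector<int> results;
        SemanticValue value;
        while (yylex(&value, scanner_) == INTEGER) {
            results.push_back(value.number / 5);
        }
        return results;
    }

private:
    Scanner& scanner_;
};

// Convenience class bundling the scanner and the parser together.
class Driver {
public:
    explicit Driver(std::vector<int> numbers)
        : scanner_(std::move(numbers)), parser_(scanner_) {}

    std::vector<int> parse() { return parser_.parse(); }

private:
    Scanner scanner_;  // declared first so it exists before parser_ binds to it
    Parser parser_;
};

}  // namespace sketch
```

Constructing `sketch::Driver` with the numbers 10, 5 and 200 and calling `parse()` yields 2, 1 and 40. The member order in Driver matters: the scanner has to be constructed before the parser takes a reference to it.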

I used the following resources to construct this example:

Please post any suggestions or questions – I am no expert on either Flex or Bison and am always interested in improving my code!

Update: After you have gleaned all you can from this example, check out this followup about creating an INI file parser using Flex and Bison in C++.


18 Comments.

  1. Hey, thanks. I’ve never seen these tools used in C++ mode, this actually makes me more inclined to use them. I hated how crusty the code output of Bison felt, but this alleviates most problems with the C code.

  2. Well, I don’t agree with your statement that the C lexers and parsers are not thread safe; you can generate reentrant parsers fine. Flex is an implementation of the Lex language for generating a lexer, and Bison is an implementation of Yacc, each with many more features. Your lexer and parser are not portable to anything other than Flex and Bison, and neither is the C++ code, since you use #pragma once. http://en.wikipedia.org/wiki/Pragma_once

    Really, I find this to be syntactic sugar for calling yyparse, and you haven’t got enough features to set input sources via get/set; fully abstracted C++ gives rise to the pattern of not touching the implementation code and using member access on the object instead. Though really, I don’t see the benefit in using C++ over C other than the STL.

    • I was not trying to say C scanners / parsers could not be made thread safe – all I meant was that C++ scanners / parsers don’t require any additional configuration to make them so. I’ll reword the paragraph to make this clearer.

      The example is intentionally sparse in features to make it simple and easy to follow / understand. For a more feature rich (and more complex) example, check out Timo Bingmann’s example.

      As far as my example not being portable to anything besides Flex and Bison, the title of the post is ‘Flex and Bison in C++…’ :P

      • I ended up removing the offending paragraph completely – it really didn’t add anything to the post and gave people the wrong impression about the differences between the C skeleton and C++ skeleton. If you are playing with tools like Flex and Bison you probably know enough to decide for yourself whether you should use C or C++, so the pros and cons of this argument are left for the reader to decide.

  3. I am quite new to this, but have just tried to compile the files in Windows,
    ===
    bison waffleshop.y
    flex waffleshop.l
    g++ lex.yy.cc Main.cpp waffleshop.tab.c -o waffleshop
    ===
    and failed with a lot of errors output.

    Does this work for windows? Some of the errors are listed for your reference:
    ===
    In file included from lex.yy.cc:238:
    /usr/include/FlexLexer.h:130: error: expected unqualified-id before numeric constant
    lex.yy.cc: In member function `virtual int Waffleshop::FlexScanner::yylex()’:
    lex.yy.cc:481: error: `cin’ undeclared (first use this function)
    lex.yy.cc:481: error: (Each undeclared identifier is reported only once for each function it appears in.)
    lex.yy.cc:484: error: `cout’ undeclared (first use this function)
    lex.yy.cc:486: error: `yy_current_buffer’ undeclared (first use this function)
    lex.yy.cc: At global scope:
    lex.yy.cc:698: error: `ostream’ has not been declared
    lex.yy.cc:699: error: ISO C++ forbids declaration of `arg_yyout’ with no type
    lex.yy.cc:699: error: prototype for `yyFlexLexer::yyFlexLexer(istream*, int*)’ does not match any in class `yyFlexLexer’
    /usr/include/FlexLexer.h:112: error: candidates are: yyFlexLexer::yyFlexLexer(const yyFlexLexer&)
    /usr/include/FlexLexer.h:116: error: yyFlexLexer::yyFlexLexer(std::istream*, std::ostream*)
    lex.yy.cc: In constructor `yyFlexLexer::yyFlexLexer(istream*, int*)’:
    lex.yy.cc:700: error: cannot convert `istream*’ to `std::istream*’ in assignment
    lex.yy.cc:701: error: cannot convert `int*’ to `std::ostream*’ in assignment
    lex.yy.cc:718: error: `yy_current_buffer’ undeclared (first use this function)
    lex.yy.cc: In destructor `virtual yyFlexLexer::~yyFlexLexer()’:
    lex.yy.cc:730: error: `yy_current_buffer’ undeclared (first use this function)

    ===

    Could you please give me some advice?

  4. Sorry, I need to describe my problem more.

    bison waffleshop.y
    flex waffleshop.l

    work properly with no error output, but next

    g++ lex.yy.cc Main.cpp waffleshop.tab.c -o waffleshop

    returns quite a lot of error reports.

  5. Does this solution work with the Visual Studio environment? Or do we need a g++-type compiler only?

  6. Micah Villmow

    Robert,
    What is the code license on your examples?
    Thanks,

    • The WTFPL sounds good to me; if you would prefer a different license let me know. Leaving a post saying ‘this is useful, thanks’ would be great, but there are absolutely no restrictions on what you do with the code.

  7. hi.
    now i tested your example code on mingw.

    here g++ (mingw) compile method.

    > g++ lex.yy.cc Main.cpp waffleshop.tab.c -o waffleshop
    > returns quite a lot error reports.

    modify FlexLexer.h and lex.yy.cc files like below. ( – remove + add)

    * FlexLexer.h line 47
    -#include
    +#include
    +using std::istream;
    +using std::ostream;

    * lex.yy.cc line 23~25
    -#include
    -class istream;
    -#include
    +#include
    +#include
    +#include
    +using namespace std ;

    * result
    C:\ex\parser>waffleshop
    10
    INTEGER / 5 = 2
    HI
    5
    INTEGER / 5 = 1
    200
    INTEGER / 5 = 40
    ^C

    ;-)

  8. hmm. skip special characters.

    * FlexLexer.h line 47
    -#include iostream.h
    +#include iostream
    +using std::istream;
    +using std::ostream;

    * lex.yy.cc line 23~25
    -#include stdlib.h
    -class istream;
    -#include unistd.h
    +#include stdio.h
    +#include stdlib.h
    +#include iostream
    +using namespace std ;

  9. Robert have you got any answers for jung? I am also struggling with the errors.
    Thanks
    Daniel

  10. I mean answers for John and Jung concerning compiling your example in windows with g++ or with visual c++ 6.0 dev studio (not .NET)

  11. Hi man! Thanks for this piece of code. It’s been really helpful.

  12. Hi! Thanks for the code. It is really helping me.

    How can I change the yyin in this case? In the regular FLEX/BISON I just assign a FILE to yyin in the main. I don’t know what to do in this case.

    Thanks in advance

  13. A huge thanks for the example. All around were only snippets, so your tarball was very welcome.

    For information, it took me around 6 hours to convert a simple C flex-bison to C++.

  14. Great work man! First example I could find that uses C++ and compiles fine.

    Surely I’ll write a post in my blog and cite yours.
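On the question in comment 12 about changing yyin: in Flex's C++ mode the scanner reads from a std::istream rather than a FILE*. The yyFlexLexer base class accepts a std::istream* in its constructor and provides switch_streams() to change the input later. Whether the example's scanner class forwards a stream through its own constructor is not shown in the post, so the sketch below uses a hypothetical stand-in, StreamScanner, that only mimics that interface and compiles without Flex.

```cpp
#include <istream>
#include <sstream>
#include <vector>

namespace input_sketch {

// Hypothetical stand-in for a yyFlexLexer-derived scanner. It only
// imitates the "read from a std::istream" part of the interface.
class StreamScanner {
public:
    explicit StreamScanner(std::istream* in) : in_(in) {}

    // Same idea as yyFlexLexer::switch_streams, reduced to the input side.
    void switch_streams(std::istream* new_in) { in_ = new_in; }

    // Reads the next integer; returns false when no more can be read.
    bool next(int& value) { return static_cast<bool>(*in_ >> value); }

private:
    std::istream* in_;
};

// Reads every integer from the given stream and divides it by five.
std::vector<int> divide_all_by_five(std::istream& input) {
    StreamScanner scanner(&input);
    std::vector<int> results;
    int value = 0;
    while (scanner.next(value)) {
        results.push_back(value / 5);
    }
    return results;
}

}  // namespace input_sketch
```

Any std::istream works as the source, so a file can be read by opening a std::ifstream and passing it in place of the string stream or std::cin.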
