
Parsing!

A project log for CDL - Circuit Description Language

An improved hardware description language.

Reed Foster, 01/30/2018 at 17:46

After a little bit of work and testing, my lexer was complete and I was ready to work on generating a syntax tree from the tokenized source code. This is where the language grammar really starts to come in. With the lexer, all I was doing was matching strings against ones that I knew were special and creating a Token object from each string. Each Token object contains the string from the source code as well as another string that specifies the type of token (identifier, operator, keyword, etc.). With the lexer, context didn't matter, whereas the parser has to follow the grammar's syntax rules as it works through the token stream.

Using my definition of the grammar, I created a parsing method for each nonterminal. Each method "eats" tokens to parse the terminals in that nonterminal's definition and calls the parsing methods for any nonterminals included in it. This process was methodical enough that I actually considered automating it. I dismissed the idea, however, after I realized that it'd be somewhat involved, and at this point in my project, I don't want to spend time on an endeavor that could end up being rather complicated. (I realize that there are tools such as lex and yacc that I could use, but that's not the point; I feel as though I'll learn more by doing everything myself rather than using libraries.)

Anyhow, I eventually completed the parser (after a couple hours of bug-chasing). In order to actually verify the parser output (the syntax tree), I had to write the framework for my code generator (the second stage of compilation) so that it could traverse the syntax tree. For this, I copied the technique I found on this blog for recursive traversal of the tree.
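To make the "one parsing method per nonterminal" idea concrete, here is a minimal sketch in Python. It is not the actual CDL parser: the two grammar rules, the `Node` class, the `"separator"` token type, and every method name are assumptions made up for illustration. Only the Token layout (source string plus a type string) and the "eat" step come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    text: str   # the string taken from the source code
    kind: str   # token type: "keyword", "identifier", "operator", ...


@dataclass
class Node:
    name: str                                     # nonterminal name or token text
    children: list = field(default_factory=list)  # sub-nodes, in source order


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def eat(self, kind, text=None):
        """Consume one terminal, or fail if the next token does not match."""
        if self.pos >= len(self.tokens):
            raise SyntaxError(f"expected {text or kind}, found end of input")
        tok = self.tokens[self.pos]
        if tok.kind != kind or (text is not None and tok.text != text):
            raise SyntaxError(f"expected {text or kind}, found {tok.text!r}")
        self.pos += 1
        return Node(tok.text)

    # Assumed rule: signal_decl := "signal" type identifier ";"
    def parse_signal_decl(self):
        node = Node("signal_decl")
        node.children.append(self.eat("keyword", "signal"))  # terminal
        node.children.append(self.parse_type())              # nonterminal
        node.children.append(self.eat("identifier"))         # terminal
        self.eat("separator", ";")                           # checked, not stored
        return node

    # Assumed rule: type := "int" | "bool" | "vec"
    def parse_type(self):
        tok = self.tokens[self.pos] if self.pos < len(self.tokens) else None
        if tok is None or tok.text not in ("int", "bool", "vec"):
            raise SyntaxError("expected a type name")
        return Node("type", [self.eat("keyword")])


def dump(node, depth=0):
    """Recursively walk the tree, returning one indented line per node."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


tokens = [
    Token("signal", "keyword"),
    Token("vec", "keyword"),
    Token("foo", "identifier"),
    Token(";", "separator"),
]
tree = Parser(tokens).parse_signal_decl()
print("\n".join(dump(tree)))
```

Each method mirrors its grammar rule line by line, which is what makes writing them feel mechanical. The `dump` function is a stand-in for the recursive traversal a code generator would perform. Here it only prints the tree so the parser output can be inspected.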

Here's a screenshot of a tree generated by my lexer and parser when given the following sample source code.

component CompName
{
    int genericint;
    bool genericbool;
    vec genericvec;

    port
    {
        input int inputint;
        input vec inputvec;
        output bool outputbool;
    }

    arch
    {
        signal vec foo;
        foo <= fox < banana;
        CompType compinst = new CompType(lol = 3, foo = 5, banana = x"4");
    }
} 
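For reference, the lexing step on one line of this sample might look roughly like the Python sketch below. The keyword set, the token type names, and the regular expression are guesses based only on the sample above, not the real CDL lexer.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    text: str   # the string taken from the source code
    kind: str   # token type, e.g. "identifier", "operator", "keyword"


# Assumed keyword set, collected from the sample source only.
KEYWORDS = {"component", "port", "arch", "signal", "input", "output",
            "int", "bool", "vec", "new"}

# One named group per assumed token type; the first alternative that matches wins.
TOKEN_RE = re.compile(r'''
    (?P<literal>x"[0-9A-Fa-f]*"|\d+)
  | (?P<identifier>[A-Za-z_]\w*)
  | (?P<operator><=|[<>=])
  | (?P<separator>[{}();,])
  | (?P<space>\s+)
''', re.VERBOSE)


def tokenize(source):
    """Split source text into Token objects, ignoring whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        kind = match.lastgroup
        if kind == "space":
            continue
        text = match.group()
        # Keywords look like identifiers, so reclassify them after matching.
        if kind == "identifier" and text in KEYWORDS:
            kind = "keyword"
        tokens.append(Token(text, kind))
    return tokens


for tok in tokenize("foo <= fox < banana;"):
    print(tok.kind, tok.text)
```

No context is needed at this stage: `<=` is tagged as an operator without the lexer knowing what role it plays in the statement. That decision is left to the parser.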
