Production Rule
A production rule defines a syntax element that is composed of a series of in-order references to tokens and other productions. Unlike tokens however, they can contain recursion into other productions that might potentially reference themselves. Because of this, the fully expanded form of a DFA, used by tokens, is not possible. The reason for this limitation is due to the finite nature of computers and the relative infinite memory requirement necessary to fully expand a self-recursive production.

The method used to alleviate this limitation is fairly straight forward: a representation of the DFA which expands its states only as needed, which results in an automation that is a bit slower to initialize, but finite enough to be represented on today's machines. The state discovery only needs to occur once per unique path, while the same file might not be parsed twice, two portions of a file might use the same states and therefore the state discovery is skipped in those instances.

So far, the automation handles normal recursion, and the three known types of left recursion: direct, indirect and hidden.

Literal strings within production rules are replaced with token references upon compilation (a process I call deliteralization). The compiler scans for an element of an enumerator-type token which matches the source literal. If none exists, one is created in a token named __EXTRACTIONS. Since its name is unfriendly, it's best to define the literal somewhere else so when the productions are deliteralized, it is easier to refer to.

Syntax

ProductionName ::= "Terminal":CapturedLiteralName; Token:CapturedTokenName; |
    SubRule:CapturedRuleName;


For productions, ::=, is the specifier after the production's name that determines it being a production, versus a token or an error. '>' immediately following '::=' is reserved for simple alternation rules. Its intent is to help simplify the resulted AST:

AlternationGrouping ::=>
    A1 |
    A2 |
    A3;


In the above example, A1, A2, and A3 are all marked with the interface named IAlternationGrouping.

Templates

Templates are a special kind of production which allow named elements throughout the production to be replaced. A template with no conditions is simply imported on the use-site.

Here's an example of a template used to represent operator precedence:
//Error ::= Identifier Operators.Eq String Operators.Comma HexNumber Operators.EntryTerminal;
InvalidAssociation = "Invalid association '{0}'.  Expected: Left or Right", 0x100;

/* *
 * @ before strings means case insensitive.
 * */
PrecedenceAssociation := @"Left":Left; | @"Right":Right;;

/* *
 * @"Expect", allows you to define a slight requirement for an input.
 * Special cases are: Token, Rule, or TokenOrRule.
 * Token can be a full token or a single part of a token.
 * '+' means that it's an array, multiple levels of '+' are expected in groups.
 * The first '+' requires that all following inputs are also '+' based. 
 * this ensures that indexing can be done.
 * */
PrecedenceHelper<
    Current:Expect=Rule;+, 
    OperatorSeries:Expect=Token;+, 
    Next+, 
    Association:Expect=PrecedenceAssociation;+> ::=
    
    #if Association == Left
        #ifndef Current
            //Define the entity associated.
            #define Current = Current:Left; OperatorSeries:Operator; Next:Right; | Next:Right;;
        #else
            #if Index == 0
                //Return, this becomes part of the resulted 
                //grammar of the calling rule.
                #return Current:Left; OperatorSeries:Operator; Next:Right; | Next:Right;;
            #else
                //Don't return this, add it to the already defined entity.
                #addrule Current, Current:Left; OperatorSeries:Operator; Next:Right; | Next:Right;;
            #endif
        #endif 
    #elif Association == Right
        #ifndef Current
            //Define the entity associated.
            #define Current = Next:Left; OperatorSeries:Operator; Current:Right; | Next:Left;;
        #else
            #if Index == 0
                //Return, this becomes part of the resulted 
                //grammar of the calling rule.
                #return Next:Left; OperatorSeries:Operator; Current:Right; | Next:Left;;
            #else
                //Don't return this, add it to the already defined entity.
                #addrule Current, Next:Left; OperatorSeries:Operator; Current:Right; | Next:Left;;
            #endif
        #endif
    #else
        #throw InvalidAssociation, Association;
    #endif;


An example usage of the above template:
Expression+ ::=> 
    NullCoalescingExpression |
    /* *
     * Templates must exist as a part of a rule, 
     * thus their use in such a manner is oft a liklihood.
     * Null rules exist of simply a '()' (which is what this
     * reduces to) are removed.
     * */
    PrecedenceHelper<
        NullCoalescingExpression, "??", AssignmentExpression, Left,
        AssignmentExpression, ('=' | "*=" | "/=" | "%=" |"+=" | "-=" | "<<=" | ">>=" | "&=" | "^=" | "|="), ConditionalExpression, Right,
        LogicalOrExpression, "||", LogicalAndExpression, Left,
        LogicalAndExpression, "&&", BitwiseOrExpression, Left,
        BitwiseOrExpression, '|', BitwiseExclusiveOrExpression, Left,
        BitwiseExclusiveOrExpression, '^', BitwiseAndExpression, Left,
        BitwiseAndExpression, '&', IdentityCheckExpression, Left,
        IdentityCheckExpression, ("==" | "!="), RelationalExpression, Left,
        RelationalExpression, ('<' | "<=" | '>' | ">="), ShiftLevelExpression, Left,
        ShiftLevelExpression, ("<<" | ">>"), AddOrSubtLevelExpression, Left,
        AddOrSubtLevelExpression, ('+' |  '-'), MulDivLevelExpression, Left,
        MulDivLevelExpression, ('*' | '/' | '%'), UnaryOperatorExpression, Left>;

Last edited Aug 19, 2009 at 7:17 PM by AlexanderMorou, version 3

Comments

No comments yet.