Update: July 19, 2015:

Well after a lot more testing I've found:

  1. Left Recursive Rules parse well
  2. Left Recursive Rules predict poorly

Here's the basic scenario:

A ::=> B | C | D;
B ::= A 'b';
C ::= 'c'
D ::= A 'd';
Quark ::= A 'a' | B | C | D;

If we have the following string: cbdbdbdbdbdbda

Predict on Quark says: If you have 'c' followed by 'b', Parse B, 'c' followed by 'd' parse D, 'c' followed by 'a' Parse A Match A.

A person looking at this immediately says: That's wrong!   You have to predict 'c', then predict 'b', 'd', until you hit something else, or an 'a' to make the right decision.  I'm evaluating some left recursive short circuits that might've overly restricted the predictions.  Time will tell.

Edit:

This mostly affects situations where what's next needs to make a decision on a potentially left-recursive rule, examples are: NullableType (def: Type '?' wherein NullableType is a Type,) ArrayType (NonArrayType ArrayRanks, wherein NonArrayType is left-recursive on Type due to NullableType)

Update: July 12, 2015:

Posted a Pre-Pre-Pre-Alpha preview.  Intended to showcase current progress.   Current task is Error Recovery and reporting (the release does not have it, I'm just beginning to write it in.)

Limitations

The parser was initially started prior to the author's awareness of multi-state parsers. Should you need something like a pre-processor, you'll likely have to write it as two separate parsers, or write your language to be tolerant of pre-processors (like the OILexer sample grammar)

Status

As of commit 111564: Project appears able to produce a parser capable of parsing the .oilexer grammar description language files.  This requires the commit on Abstraction 39856, download both, unblock their contents.  In the folder you want to deploy, create an 'Oilexer' folder, and an 'Abstraction' folder. Unzip Oilexer's code in Oilexer and Abstraction in the Abstraction folder (don't include the commit# and do not nest in a sub-folder that Explorer tries to do by default.)

There is currently no error-recovery/tracking.  This is the current task.  Further, languages which introduce follow ambiguities while in the middle of a reduction, within a prediction, have a high chance of predicting incorrectly if the symbol is required by the actual calling rule.  This is the task after the current error recovery/reporting.

Please report any issues you have in the Discussions, though I won't be able to help with build issues as I've tested downloading and building myself and it appears to build fine.  If you are missing any files, please make note of which files.  AllenCopeland.pfx intentionally removed due to it representing my digital signature.

I expect this to be buggy, I can only throw so much at it on my own.  I haven't had time to construct a fully fledged language out of this, though a word of caution: the Tuples{5}.oilexer in the Oilexer Grammar is an example of how to provide syntax validation through specifying exactly what possible combinations are valid (versus allowing a repetition over the series and reporting an error on duplicates.)  Using tuples greater than 5 can yield an exponential state explosion that might yield an indeterminate compilation (never finishes.)

Known Issues

During LL(*) unbound lookahead calculation, when one of the stacks of work is emptied, OILexer has a 10% chance to hang, +/- a few percent based on the timing of the host system.  I need to investigate the monitor lock order to see if I'm entering them in an inconsistent manner.

Last edited Jul 19, 2015 at 7:28 AM by AlexanderMorou, version 30