Class: LL1

page. LL1

Tests and wraps a LL(1) grammar for top-down parsing. An object of this class supports one call to a parsing algorithm at any one time. The object is serially reusable.

page.LL1#parser is a recursive parsing algorithm for page.BNF and page.EBNF grammars. page.LL1#parse is a stack-based parsing algorithm for page.BNF grammars (which can be derived from page.EBNF grammars). The algorithms have different strategies for error recovery.

Either algorithm can pull input from a callback function. Successive pieces of input can be pushed to successive calls of page.LL1#parse.

A BNF grammar undergoes the following tests:

node check action
page.BNF delegate to all non-terminals.
page.BNF.NT first of alternative right-hand sides in rules should be pairwise disjoint, first and follow should not overlap if empty input is acceptable.
page.BNF.T nothing to do.

If a non-terminal is defined with left recursion, it will have ambiguous alternatives. Such a grammar should not be used here.

An EBNF grammar could be checked by first converting to BNF. However, the following tests relate errors better to the original grammar:

node check action
page.EBNF delegate to rules' right-hand sides.
page.EBNF.NT first and follow should not overlap if it accepts empty input.
page.BNF.T nothing to do.
page.EBNF.Alt first and follow should not overlap if it accepts empty input; first of choices should be pairwise disjoint; delegate to choices.
page.EBNF.Lst first and follow should not overlap; first of body and delimiter should be pairwise disjoint if either accepts empty input; delegate to body and delimiter.
page.EBNF.Rep first and follow should not overlap; delegate to body.
page.EBNF.Seq first and follow should not overlap if it accepts empty input; delegate to terms.

new LL1(_grammar [, _observer])

Wraps a suitable grammar for top-down parsing; checks if a grammar is suitable, i.e., LL(1).

Parameters:
Name Type Argument Description
_grammar page.BNF | page.EBNF

to be used for parsing; it should not be left-recursive. Either grammar is checked whether it is LL(1).

_observer Observer <optional>

to observe recognition.

Properties:
Name Type Description
bnf page.BNF

BNF grammar to control recognition.

ebnf page.EBNF

EBNF grammar to control recognition, if available.

observer Observer

to observe recognition.

greedy number

number of first/follow overlaps, would be resolved in favor of first.

ambiguous number

number of first overlaps in alternatives of the same non-terminal, would be resolved in favor of the first alternative but usually makes the grammar unsuitable because it can indicate left recursion.

errors number

counted by page.error method, reset just before parsing is started.

states Array.<{rule: page.BNF.Rule, position: number}>

currently active rules, exists only while page.LL1#parse is used.

values Array.<Array.<object>>

currently active value lists, exists only while page.LL1#reflect is used.

Source:

Methods


dump()

Displays errors, overlaps, and active rules and delegates to the grammar.

Source:
Returns:
Type
string

parse(_input [, _observer])

Checks top-down if input conforms to a BNF grammar. For an EBNF grammar this method uses the corresponding BNF grammar. This method can be called repeatedly with portions of input.

node parse action
page.BNF.NT compare next input and first to visit the first possible rule for the non-terminal and return the result. It is an error if no rule can be visited and the non-terminal cannot accept no input.
page.BNF.Rule send a 'rule' message to the observer if any; visit the descendants; send a 'reduce' message. It is an error if the observer returns a string.
page.BNF.T send a 'shift' message to the observer if any and advance in input. It is an error if the observer returns a string.

The method uses a stack of active rules and attempts error recovery. If an input symbol cannot be matched, the parser function will search the rest of the current and all outer active rules for a possible match with the input symbol. If there is no match, the input symbol is discarded and the search is tried again until an irrecoverable error is reported at end of input.

The method is greedy, i.e., it will match input to a rule as long as possible; therefore, some types of ambiguous grammars are still suitable for parsing.

Create parsers and scanner
var parser = new page.LL1(page.BNF.from(' ... grammar described in BNF ...'));
var parser = new page.LL1(page.EBNF.from(' ... grammar described in EBNF ...'));
var scanner = parser.tokenizer(/ skip pattern /);
Parse -- all input
parser.parse(scanner('... input ...').concat([ null ]));
Parse -- push tuples
var result = parser.parse(scanner('... start of input ...'));
while (result === true)
  result = parser.parse(scanner('... more input ...')); // eventually null input
Parse a stream -- pull tuples
function stream () { return scanner('... start/more input/null ...'); }
parser.parse(stream);
Trace
parser.parse(stream, page.trace());
Collect token values
var result = parser.parse(stream, parser.reflect());
print(result); // array with token values
Create observer for traced building
var environment = ... something useful for the builder ...;
var builder = {
  ruleName: // matched by reflection
    function (_values, _environment, _tuple, _sender) {
      return ... anything
      throw string for error message
      throw Error with message to abort
    }
    ...
};
var observer = page.trace(parser.reflect(environment, builder));
Build
var result = parser.parse(stream, observer);
print(result); // array with builder value(s)
Parameters:
Name Type Argument Description
_input Array.<page.Tuple> | Tokenizer

either the next portion of input and null or an empty array will match $eof, or a Tokenizer which will be called without arguments and each time must return the next portion of input, null, or an empty array.

_observer Observer <optional>

to observe recognition; cannot be specified once parsing is in progress.

Source:
See:
Throws:

from the observer to abort parsing.

Type
Error
Returns:

true to request another input array, or the top-level value (which could be null) returned by the observer, if any, when the end of input is reached.

Type
boolean | object

parser(_input [, _observer])

Checks top-down if input conforms to an EBNF or BNF grammar.

Starting with the start symbol, this method recursively visits the grammar. A node will only be visited if the next input symbol is in the set of first terminals of the node.

node parse action
page.EBNF.NT delegate to the corresponding rule.
page.EBNF.Rule send a 'rule' message to the observer if any; visit the descendant; send a 'reduce' message. It is an error if the observer returns a string.
page.EBNF.Alt visit the first possible alternative and return the result.
page.EBNF.Lst visit the body and then delimiter and body as often as possible and allowed; return an error if there are not enough iterations, or return the last result or the first string returned by the descendant.
page.EBNF.Rep visit the descendant as often as possible and allowed; return an error if there are not enough iterations, or return the last result or the first string returned by the descendant.
page.EBNF.Seq visit the descendants; return an error if a descendant cannot be visited, or return null or the first string returned by a descendant.
page.BNF.T send a 'shift' message to the observer if any and advance in input. It is an error if the observer returns a string.

The method employs recursion, rather than a rule stack, and cannot perform deep error recovery.

The method is greedy, i.e., it will match input to a rule as long as possible; therefore, some types of ambiguous grammars are still suitable for parsing.

Create parsers and scanner
var parser = new page.LL1(page.BNF.from(' ... grammar described in BNF ...'));
var parser = new page.LL1(page.EBNF.from(' ... grammar described in EBNF ...'));
var scanner = parser.tokenizer(/ skip pattern /);
Parse string
parser.parser(scanner('... input ...'));
Parse a stream — pull tuples
function stream () { return scanner('... start/more input/null ...'); }
parser.parser(stream);
Trace
parser.parser(stream, page.trace());
Collect token values
var result = parser.parser(stream, parser.reflect());
print(result); // array with token values
Create observer for traced building
var environment = ... something useful for the builder ...;
  ruleName: // matched by reflection
    function (_values, _environment, _tuple, _sender) {
      return ... anything
      throw string for error message
      throw Error with message to abort
    }
    ...
};
var observer = page.trace(parser.reflect(environment, builder));
Build
var result = parser.parser(stream, observer);
print(result); // array with builder value(s)
Parameters:
Name Type Argument Description
_input Array.<page.Tuple> | Tokenizer

either the complete input or a Tokenizer which will be called without arguments and each time must return the next portion of input, null, or an empty array.

_observer Observer <optional>

to observe recognition.

Source:
See:
Throws:

from the observer to abort parsing.

Type
Error
Returns:

the top-level value (which could be null) returned by the observer, if any, when the end of input is reached.

Type
object

reflect( [_env] [, _obj])

Creates a function which can observe page.LL1#parse or page.LL1#parser.

The observer function collects a list of all token values and gives each rule a chance to structure the list of values collected for the rule. It maintains a stack of value lists as follows:

action effect
'rule' pushes the stack with an empty value list and returns null.
'shift' ignores a current literal or pushes the current token's associated string value onto the top-level value list and returns null.
'reduce' pops the top-level value list, interacts with _obj if possible, or concatenates the list to the next level value list and returns that.
'error' returns the error message.

For 'reduce', if a property with the rule's non-terminal's name can be found through _obj, it should be a Builder function which is then called with _env and the popped top-level value list which contains the values collected for the rule in order. If the result is an array it is concatenated to the next level value list; otherwise the result is pushed onto the next level value list. This new list is returned by the observer.

The name of the non-terminal is searched in the properties of _obj or by inheritance, i.e., if one property name is {} it's associated value is searched recursively.

Parameters:
Name Type Argument Description
_env object <optional>

environment, passed through to the functions found through the properties in _obj.

_obj object <optional>

collection of Builder functions, i.e., semantic actions.

Source:
See:
Returns:

a function to observe a parser.

Type
Observer

tokenizer( [_skip])

Delegates to .bnf to create a function which tokenizes a string.

Parameters:
Name Type Argument Description
_skip string | RegExp <optional>

a pattern to define ignorable character sequences; this pattern must not accept empty input and it must use (:? ) rather than ( ) for grouping; the default is to skip white space.

Source:
See:
Returns:

a function which takes a string and returns a list of page.Tuple elements. The list contains one tuple with the error token for each character which is neither ignorable nor a literal or part of a token. The function is based on a regular expression which can be displayed as a member .pattern.

Type
Tokenizer

toString()

Displays number of errors and overlaps, if any.

Source:
Returns:
Type
string