Problem description:

I use the following Java code to instantiate a parser generated with ANTLR.

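package foo;

import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CharStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RecognitionException;

public class Test1 {
    public static void main(String[] args) throws RecognitionException {
        // Note the trailing space in the input.
        CharStream stream = new ANTLRStringStream("foo ");
        BugLexer lexer = new BugLexer(stream);
        CommonTokenStream tokenStream = new CommonTokenStream(lexer);
        BugParser parser = new BugParser(tokenStream);
        // Invoke the start rule of the grammar.
        parser.specification();
    }
}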

My grammar:

grammar Bug;

options {
  language = Java;
}

@header {
  package foo;
}

@lexer::header {
  package foo;
}

specification
 : 'foo' EOF
 ;

WS
 : (' ' | '\t' | '\n' | '\r')+ {$channel = HIDDEN;}
 ;

SCOLON
 : (~ ';')+
 ;

And the error I get:

line 1:0 mismatched input 'foo ' expecting 'foo'

I would expect the space in the input to be ignored, but it is not. The ANTLR interpreter in Eclipse accepts the input, so I suppose my Java code is wrong somehow, but I just don't see it.

Note: If I remove the SCOLON rule, the error does not occur for this input.

Answer:

ANTLR's lexer tries to match as much as possible for each token. Therefore "foo " is tokenized as a single SCOLON token, not as a 'foo' token followed by a WS token.

Note that your SCOLON rule:

SCOLON
 : (~ ';')+
 ;

suggests by its name that it matches just a single semicolon, but in fact it matches one or more characters other than a semicolon. Perhaps it should have been this instead:

SCOLON
 : ';'
 ;

?

EDIT

Heinrich Ody wrote:

I somehow thought there is a priority (given by order of declaration) on which token ANTLR attempts to match the input. Thanks for your response.

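That is partly correct: declaration order only breaks ties. Whenever two (or more) rules match the same number of characters, the rule defined first "wins". But if a rule defined later matches more characters than the earlier ones, the later rule "wins" regardless of order. That is what happens here: 'foo' matches three characters of "foo ", while SCOLON matches all four.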
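
The longest-match rule with tie-breaking by declaration order can be illustrated with a small stand-alone sketch. This is not ANTLR's actual implementation; it is a simplified simulation using java.util.regex, and the class name LongestMatchDemo, the helper methods, and the regular expressions standing in for the three lexer rules are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: simulates "longest match wins, earliest rule wins ties".
public class LongestMatchDemo {

    // Rules in declaration order, approximating the grammar: 'foo', WS, SCOLON.
    static Map<String, Pattern> rules(boolean withScolon) {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("'foo'", Pattern.compile("foo"));
        rules.put("WS", Pattern.compile("[ \\t\\n\\r]+"));
        if (withScolon) {
            rules.put("SCOLON", Pattern.compile("[^;]+"));
        }
        return rules;
    }

    static List<String> tokenize(String input, Map<String, Pattern> rules) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestRule = null;
            int bestLen = 0;
            for (Map.Entry<String, Pattern> e : rules.entrySet()) {
                Matcher m = e.getValue().matcher(input);
                m.region(pos, input.length());
                // Strictly greater: on a tie, the earlier rule is kept.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestLen = m.end() - pos;
                    bestRule = e.getKey();
                }
            }
            if (bestRule == null) {
                throw new IllegalArgumentException("no rule matches at index " + pos);
            }
            tokens.add(bestRule + "[" + input.substring(pos, pos + bestLen) + "]");
            pos += bestLen;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("foo ", rules(true)));  // [SCOLON[foo ]]
        System.out.println(tokenize("foo ", rules(false))); // ['foo'[foo], WS[ ]]
        System.out.println(tokenize("foo", rules(true)));   // ['foo'[foo]]
    }
}
```

With SCOLON present, the input "foo " becomes one SCOLON token, which is why the parser reports a mismatch. Without the trailing space, 'foo' and SCOLON both match three characters and the earlier rule is chosen.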
