Lexer issues, in particular: Java backends do not accept/require whitespace between consecutive tokens #322

The parser generated from the following grammar should accept the input `⟦ ab c`.
```
Whatever. Main ::= Uni Foo Bar;

token Uni '⟦' ;
token Foo letter letter;
token Bar (char - 'a');
```
This is the situation in the different backends:
- [x] Haskell: yes
- [ ] OCaml: `ocamllex` rejects the generated lexer definition with the error
  ```
  File "Lextest.mll", line 42, character 11: illegal escape sequence \1.
  ```
- [ ] C: parsing fails with `error: 1,1: syntax error at ?`
- [ ] CPP: parsing fails with `Parse error on line 1`
- [ ] Java: parsing fails with
  ```
  Syntax Error, trying to recover and continue parse... for input symbol "" spanning from unknown:-1/-1(-1) to unknown:-1/-1(-1)
  At line -1, near "ab c" :
     Unrecoverable Syntax Error
  ```
- [ ] Java/ANTLR: parsing fails with
  ```
  line 1:1 extraneous input ' ' expecting Foo
  At line 1, column 1 :
     extraneous input ' ' expecting Foo
  ```

Instead, the parsers generated by the Java backends accept the input without the spaces: `⟦abc`.
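
For reference, the expected tokenization can be sketched with a small hand-written lexer. This is only an illustration, not code produced by any backend: the class `ExpectedLexer` and its helper methods are hypothetical names, `Character.isLetter` is a broader stand-in for the grammar's `letter` class, and the sketch assumes that whitespace is skipped before any token rule is tried (otherwise `Bar`, being `char - 'a'`, would also match a space) and that ties between equally long matches go to the token declared first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the expected lexing behaviour for the grammar above.
class ExpectedLexer {
    // Each matcher returns the length of its match at position i, or 0 if it does not match.
    static int matchUni(String s, int i) {
        return s.charAt(i) == '\u27E6' ? 1 : 0;            // token Uni: the character U+27E6
    }

    static int matchFoo(String s, int i) {                  // token Foo: letter letter
        return i + 1 < s.length()
                && Character.isLetter(s.charAt(i))          // assumption: stand-in for `letter`
                && Character.isLetter(s.charAt(i + 1)) ? 2 : 0;
    }

    static int matchBar(String s, int i) {
        return s.charAt(i) != 'a' ? 1 : 0;                  // token Bar: char - 'a'
    }

    static List<String> tokenize(String input) {
        String[] names = {"Uni", "Foo", "Bar"};             // declaration order of the tokens
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            // Assumption: whitespace separates tokens and is skipped before any token rule.
            if (Character.isWhitespace(input.charAt(i))) {
                i++;
                continue;
            }
            int[] lens = {matchUni(input, i), matchFoo(input, i), matchBar(input, i)};
            int best = 0;
            for (int k = 1; k < lens.length; k++) {
                if (lens[k] > lens[best]) best = k;         // longest match; earlier rule wins ties
            }
            if (lens[best] == 0) {
                throw new IllegalArgumentException("no token matches at offset " + i);
            }
            tokens.add(names[best]);
            i += lens[best];
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("\u27E6 ab c"));        // input with spaces
        System.out.println(tokenize("\u27E6abc"));          // input without spaces
    }
}
```

Under these assumptions both `⟦ ab c` and `⟦abc` yield the token sequence `Uni`, `Foo`, `Bar`, so a parser for `Main` would accept either input.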
