| | | |
|---|---|---|
| author | Jo-Philipp Wich <jo@mein.io> | 2022-10-05 21:34:59 +0200 |
| committer | Jo-Philipp Wich <jo@mein.io> | 2022-10-05 23:01:05 +0200 |
| commit | 21ace5e5c7c98271d78ff9cdf2b61e3ac70704d8 | |
| tree | bc2f084916c574acb4c6e10ddf5146a796c4b7f7 /tests/custom | |
| parent | f8e00b4b83dad76e183b8293870cfe3110f1fa94 | |
lexer: fixes for regex literal parsing
- Ensure that regexp extension escapes are consistently handled;
substitute `\d`, `\D`, `\s`, `\S`, `\w` and `\W` with `[[:digit:]]`,
`[^[:digit:]]`, `[[:space:]]`, `[^[:space:]]`, `[[:alnum:]_]` and
`[^[:alnum:]_]` character classes respectively since not all POSIX
regexp implementations implement all of those extensions
- Preserve `\b`, `\B`, `\<` and `\>` boundary matches
Fixes: a45f2a3 ("lexer: improve regex literal handling")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
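
The substitution rule in the commit message can be illustrated with a small standalone sketch. This is not the actual ucode lexer code: the names `class_for`, `put` and `rewrite_regex_escapes` are hypothetical, and the handling inside bracket sets (turning `\b` into a backspace byte and dropping the backslash from the other listed escapes) is inferred from the expected test output rather than taken from the implementation. As a simplification, any other backslash sequence is copied unchanged, and POSIX corner cases such as a literal `]` directly after `[` or a nested `[:class:]` in the input are not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Map an extension escape letter to a POSIX bracket expression, or NULL. */
static const char *class_for(char c)
{
	switch (c) {
	case 'd': return "[[:digit:]]";
	case 'D': return "[^[:digit:]]";
	case 's': return "[[:space:]]";
	case 'S': return "[^[:space:]]";
	case 'w': return "[[:alnum:]_]";
	case 'W': return "[^[:alnum:]_]";
	default:  return NULL;
	}
}

/* Append n bytes to dst and keep it NUL-terminated; false on overflow. */
static bool put(char *dst, size_t cap, size_t *len, const char *s, size_t n)
{
	if (*len + n >= cap)
		return false;

	memcpy(dst + *len, s, n);
	*len += n;
	dst[*len] = '\0';
	return true;
}

/* Hypothetical helper: rewrite extension escapes in a regex source string. */
bool rewrite_regex_escapes(const char *src, char *dst, size_t cap)
{
	size_t len = 0;
	bool in_set = false, ok;

	assert(src != NULL && dst != NULL && cap > 0);
	dst[0] = '\0';

	for (const char *p = src; *p; p++) {
		if (*p != '\\' || p[1] == '\0') {
			/* Plain character: track whether we are inside [...] */
			if (*p == '[' && !in_set)
				in_set = true;
			else if (*p == ']' && in_set)
				in_set = false;

			ok = put(dst, cap, &len, p, 1);
		}
		else {
			const char *cls = class_for(*++p); /* p is now at the escaped char */

			if (!in_set && cls)
				ok = put(dst, cap, &len, cls, strlen(cls));
			else if (in_set && *p == 'b')
				ok = put(dst, cap, &len, "\b", 1);  /* backspace byte */
			else if (in_set && (cls || strchr("B<>", *p)))
				ok = put(dst, cap, &len, p, 1);     /* drop the backslash */
			else
				ok = put(dst, cap, &len, p - 1, 2); /* keep \b, \B, \<, \> etc. */
		}

		if (!ok)
			return false;
	}

	return true;
}
```

Under these assumptions, calling `rewrite_regex_escapes(" \\d \\D [\\d \\D] ", out, sizeof(out))` leaves ` [[:digit:]] [^[:digit:]] [d D] ` in `out`, which mirrors the third line of the expected test output.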
Diffstat (limited to 'tests/custom')
-rw-r--r-- | tests/custom/00_syntax/21_regex_literals | 29 |
1 files changed, 28 insertions, 1 deletions
diff --git a/tests/custom/00_syntax/21_regex_literals b/tests/custom/00_syntax/21_regex_literals
index 7466a2e..44f0079 100644
--- a/tests/custom/00_syntax/21_regex_literals
+++ b/tests/custom/00_syntax/21_regex_literals
@@ -4,7 +4,7 @@ within regular expression literals is subject of the underlying
 regular expression engine.
 
 -- Expect stdout --
-[ "/Hello world/", "/test/gis", "/test/g", "/test1 / test2/", "/1\n\\.\u0007\bc☀\\\\/" ]
+[ "/Hello world/", "/test/gis", "/test/g", "/test1 / test2/", "/1\n\\.\u0007\\bc☀\\\\/" ]
 -- End --
 
 -- Testcase --
@@ -117,3 +117,30 @@ literal delimitters.
 	]);
 %}
 -- End --
+
+
+Testing that regex extension macros are substituted only outside of
+bracket set expressions.
+
+-- Expect stdout --
+[
+	"/ \\b \\B [\b B] /",
+	"/ \\< \\> [< >] /",
+	"/ [[:digit:]] [^[:digit:]] [d D] /",
+	"/ [[:space:]] [^[:space:]] [s S] /",
+	"/ [[:alnum:]_] [^[:alnum:]_] [w W] /"
+]
+-- End --
+
+-- Testcase --
+{%
+	printf("%.J\n", [
+		/ \b \B [\b \B] /,	// \b outside brackets is a word boundary,
+					// \b within brackets is backspace
+		/ \< \> [\< \>] /,
+		/ \d \D [\d \D] /,
+		/ \s \S [\s \S] /,
+		/ \w \W [\w \W] /
+	]);
+%}
+-- End --
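
The reason for substituting bracket expressions is that POSIX character classes such as `[[:digit:]]` are part of the standard, whereas `\d` and friends are extensions. A minimal sketch of how one might check this against the local C library follows. The helper name `posix_matches` is hypothetical, and compiling with `REG_EXTENDED` is an assumption made for this example, not something stated in the commit.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if `pattern` (POSIX extended syntax) matches within `subject`. */
bool posix_matches(const char *pattern, const char *subject)
{
	regex_t re;
	bool matched;

	assert(pattern != NULL && subject != NULL);

	/* REG_NOSUB: only a yes/no answer is needed, no match offsets */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;

	matched = (regexec(&re, subject, 0, NULL, 0) == 0);
	regfree(&re);

	return matched;
}
```

For instance, `posix_matches("^[[:alnum:]_]+$", "foo_bar1")` should hold on any conforming implementation, while the same pattern should not match `"foo bar"` because of the space.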