Age | Commit message | Author |
|
This is a cosmetic change to bring the code in line with the common prefix
format of the other code in the tree.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The compiler will keep fetching tokens until hitting EOF, so ensure that
the lexer produces EOF after an unrecognized character error.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
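
The behaviour described above can be sketched as follows. This is a minimal illustration, not the actual lexer code: the names `lexer_t`, `next_token()` and `drain()` and the toy character set are assumptions made for the example. Once an unrecognized character has been reported, every further request yields EOF, so a consumer looping until EOF terminates.

```c
#include <stddef.h>

typedef enum { TK_CHAR, TK_ERROR, TK_EOF } token_type_t;

/* Hypothetical lexer state, reduced to what the example needs. */
typedef struct {
	const char *src;
	size_t pos;
	int failed; /* set once an unrecognized character was reported */
} lexer_t;

/* Return the next token type; after an error only TK_EOF is produced. */
token_type_t
next_token(lexer_t *lex)
{
	char c;

	if (lex->failed || lex->src[lex->pos] == '\0')
		return TK_EOF;

	c = lex->src[lex->pos++];

	/* toy grammar: lowercase letters and spaces are the only valid input */
	if ((c >= 'a' && c <= 'z') || c == ' ')
		return TK_CHAR;

	lex->failed = 1;

	return TK_ERROR;
}

/* Mimic a consumer that keeps fetching tokens until EOF is seen;
 * returns the number of non-EOF tokens it received. */
size_t
drain(lexer_t *lex)
{
	size_t n = 0;

	while (next_token(lex) != TK_EOF)
		n++;

	return n;
}
```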
|
|
Enabling raw code mode allows writing ucode scripts without any template
tag decorations (that is, without the need to provide an initial opening
'{%' tag).
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The token type split allows us to drop the token value union in the
reserved word list with a subsequent commit.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Turn the Infinity and NaN keywords into global properties.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Introduce support for declaring constant variables through the `const`
keyword. Variables declared with `const` follow the same scoping rules
as `let` declared ones.
In contrast to normal variables, `const` ones may not be assigned to
after their declaration. Any attempt to do so will result in a syntax
error during compilation.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Turn `delete` into a proper operator mimicking ECMAScript semantics.
Also transparently turn deprecated `delete(obj, propname)`
function calls into `delete obj.propname` expressions during compilation.
When strict mode is active, legacy delete() calls throw a syntax error
instead.
Finally drop the `delete()` function from the stdlib as it is shadowed
by the delete operator syntax now.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Skip interpreter lines in any source buffer and handle the skipping in the
lexer itself, to avoid reporting wrongly shifted token offsets to the
compiler, resulting in wrong error locations and source contexts.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
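
A rough sketch of the idea, assuming a hypothetical helper `skip_interpreter_line()` that is not taken from the source tree: the lexer determines how many leading bytes belong to a `#!` line and starts scanning there, while offsets continue to be counted from the start of the original buffer.

```c
#include <stddef.h>

/* Return the offset of the first byte after a leading "#!" line,
 * or 0 if the buffer does not start with one. */
size_t
skip_interpreter_line(const char *buf, size_t len)
{
	size_t off = 0;

	if (len < 2 || buf[0] != '#' || buf[1] != '!')
		return 0;

	while (off < len && buf[off] != '\n')
		off++;

	/* consume the newline itself if present */
	if (off < len)
		off++;

	return off;
}
```

Because the skipped bytes stay part of the buffer, a token found at the returned offset is reported at its true position in the file.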
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Instead of disambiguating division operator vs. regexp literal by looking
at the preceding token, raise a "no regexp" flag within the appropriate
parser states to tell the lexer how to treat a forward slash when parsing
the next token
- Introduce another "no keyword" flag which disables parsing labels into
keywords when reading the next token and set it in the appropriate parser
states. This allows using reserved names in object declarations and
property access expressions
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
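
The "no regexp" flag can be pictured roughly like this. The structure and function names are invented for illustration and the real lexer carries far more state; the point is only that the parser, not the previous token, decides how a forward slash is read.

```c
typedef enum { TK_DIV, TK_REGEXP, TK_OTHER } token_type_t;

/* Hypothetical lexer state holding only the flag under discussion. */
typedef struct {
	int no_regexp; /* set by the parser before requesting the next token */
} lexer_t;

/* Decide how to treat the character starting the next token. */
token_type_t
classify(const lexer_t *lex, char c)
{
	if (c != '/')
		return TK_OTHER;

	/* in operator position the parser has raised the flag */
	return lex->no_regexp ? TK_DIV : TK_REGEXP;
}
```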
|
|
- Shuffle typedefs to avoid need for non-compliant forward declarations
- Fix non-compliant empty struct initializers
- Remove use of braced expressions
- Remove use of anonymous unions
- Avoid `void *` pointer arithmetic
- Fix several warnings reported by gcc -pedantic mode and clang 11 compilation
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
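
As an illustration of the `void *` item in the list above, arithmetic on a generic pointer can be made compliant by going through `char *`. The helper name `ptr_advance()` is an assumption for this sketch and does not appear in the source.

```c
#include <stddef.h>

/* Advance a generic pointer by n bytes without relying on the GNU
 * extension that treats sizeof(void) as 1. */
void *
ptr_advance(void *base, size_t n)
{
	return (char *)base + n;
}
```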
|
|
Instead of relying on json_object values internally, use custom types to
represent the different ucode value types which brings a number of
advantages compared to the previous approach:
- Due to the use of tagged pointers, small integer, string and bool
values can be stored directly in the pointer addresses, vastly
reducing required heap memory
- Ability to create circular data structures such as
`let o; o = { test: o };`
- Ability to register a custom `tostring()` function through prototypes
- Initial mark/sweep GC implementation to tear down circular object
graphs on VM deinit
The change also paves the way for possible future extensions such as
constant variables and meta methods for custom resource types.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
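
The tagged pointer idea can be sketched as below. The tag layout, the `value_t` type and the helper names are assumptions chosen for the example and do not reflect the actual encoding used by the implementation. Heap pointers are aligned, so their low bits are free to mark values that are stored inline instead of being allocated.

```c
#include <stdint.h>

/* Hypothetical tag layout: the low two bits select the value kind. */
#define TAG_MASK ((uintptr_t)3)
#define TAG_PTR  ((uintptr_t)0) /* real heap pointer, 4-byte aligned or more */
#define TAG_INT  ((uintptr_t)1) /* small integer stored in the upper bits */
#define TAG_BOOL ((uintptr_t)2) /* boolean stored in the upper bits */

typedef uintptr_t value_t;

/* Store a small integer directly in the pointer-sized word. */
value_t
box_int(intptr_t n)
{
	return ((uintptr_t)n << 2) | TAG_INT;
}

/* Store a boolean directly in the pointer-sized word. */
value_t
box_bool(int b)
{
	return ((uintptr_t)(b != 0) << 2) | TAG_BOOL;
}

int
is_int(value_t v)
{
	return (v & TAG_MASK) == TAG_INT;
}

/* Recover the integer: clear the tag bits, then divide the payload by
 * four (division avoids right-shifting a negative signed value). */
intptr_t
unbox_int(value_t v)
{
	return (intptr_t)(v & ~TAG_MASK) / 4;
}
```

Integers that do not fit into the remaining bits would still need a heap allocated representation, which this sketch leaves out.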
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Fixes a bunch of the following warnings:
lexer.c:68:37: warning: missing field 'parse' initializer [-Wmissing-field-initializers]
lexer.c:138:34: warning: missing field '' initializer [-Wmissing-field-initializers]
Signed-off-by: Petr Štetiar <ynezz@true.cz>
|
|
A logic flaw in the lineinfo encoding function led to an infinite tight
loop when a buffer chunk with 128 bytes or more got consumed, which may
happen when parsing very long literals.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
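
The commit does not spell out the encoding, so the following is only a sketch under the assumption that each lineinfo entry can hold a byte count of at most 127 (seven bits). The function name `encode_count()` and the output format are hypothetical. The important property is that the remaining count shrinks on every iteration, so counts of 128 or more are split into several entries instead of looping forever.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_MAX 0x7f /* largest count that fits into seven bits */

/* Encode `count` consumed bytes as entries of at most CHUNK_MAX each.
 * Returns the number of entries written, or 0 if `count` is zero or
 * `out` is too small. */
size_t
encode_count(size_t count, uint8_t *out, size_t outlen)
{
	size_t n = 0;

	while (count > 0) {
		size_t step = (count > CHUNK_MAX) ? CHUNK_MAX : count;

		if (n == outlen)
			return 0;

		out[n++] = (uint8_t)step;
		count -= step; /* always shrinks, so the loop terminates */
	}

	return n;
}
```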
|
|
While parsing string literals, actually consume the backslash introducing an
escape sequence to prevent it from ending up in the produced string if the
scanner is at the end of the buffer and the remaining buffer contents are
flushed after the consumer loop.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Replace the former AST walking interpreter implementation with a single pass
bytecode compiler and a corresponding virtual machine.
The rewrite lays the groundwork for a couple of improvements which will be
subsequently implemented:
- Ability to precompile ucode sources into binary byte code
- Strippable debug information
- Reduced runtime memory usage
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Instead of obtaining and caching direct opcode pointers, use relative
references when dealing with opcodes since direct or indirect calls to
uc_execute_op() might lead to reallocations of the opcode array, shifting
memory addresses and invalidating pointers taken before the invocation.
Such stale pointer accesses could be commonly triggered when one part
of the processed expression was a require() or include() call loading
relatively large ucode sources.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
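
The underlying problem and its remedy can be illustrated with a small growable array. The types `op_t` and `oplist_t` and the helpers below are invented for this sketch; they are not the interpreter's actual data structures. Since `realloc()` may move the array, callers hold on to an index and resolve it against the current base address on every access.

```c
#include <stdlib.h>

typedef struct { int type; } op_t;

typedef struct {
	op_t *entries;
	size_t count;
} oplist_t;

/* Append an opcode and return its index, or (size_t)-1 on failure.
 * realloc() may move the array, so the index is the stable handle. */
size_t
op_append(oplist_t *list, int type)
{
	op_t *tmp = realloc(list->entries, (list->count + 1) * sizeof(*tmp));

	if (!tmp)
		return (size_t)-1;

	list->entries = tmp;
	list->entries[list->count].type = type;

	return list->count++;
}

/* Resolve an index against the current base address. */
op_t *
op_get(oplist_t *list, size_t idx)
{
	return (idx < list->count) ? &list->entries[idx] : NULL;
}
```

A pointer obtained from `op_get()` is only valid until the next `op_append()` call, which is the situation the commit avoids by not caching such pointers across nested executions.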
|
|
- Eliminate dead code left after regex literal parsing changes
- Properly handle short octal sequences at end of string
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Ensure that the single char escapes `\a`, `\b`, `\e`, `\f`, `\n`,
`\r`, `\t` and `\v` keep working. Since they're not part of the POSIX
extended regular expression spec, they're not handled by the RE engine
so we need to substitute them by their actual byte value while parsing
the literal.
Fixes: ac5cb87 ("syntax: fix string and regex literal parsing quirks")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Do not interpret escape sequences in regexp literals
- Do not improperly substitute control escape sequences such as
`\n` or `\a` after a backslash
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Optimize the strncmp() based token lookup with an integer comparison
approach which roughly cuts the time of the source code parsing phase
in half.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
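
One way to picture an integer comparison lookup is to pack short words into a single 64-bit value and compare those values instead of calling strncmp(). The helpers and the keyword list below are assumptions for illustration, not the code from this commit; a production table would store the packed keyword values precomputed rather than packing them on every lookup.

```c
#include <stdint.h>
#include <string.h>

/* Pack up to eight bytes of a word into one integer, zero padded.
 * Empty words and words longer than eight bytes yield 0. */
uint64_t
pack_word(const char *s, size_t len)
{
	uint64_t v = 0;

	if (len == 0 || len > sizeof(v))
		return 0;

	memcpy(&v, s, len);

	return v;
}

/* Return 1 if the given word matches an entry of the sample list. */
int
is_keyword(const char *s, size_t len)
{
	static const char *const words[] = { "if", "else", "function", "return" };
	uint64_t v = pack_word(s, len);
	size_t i;

	if (v == 0)
		return 0;

	for (i = 0; i < sizeof(words) / sizeof(words[0]); i++)
		if (v == pack_word(words[i], strlen(words[i])))
			return 1;

	return 0;
}
```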
|
|
This brings the utpl script syntax closer to ES5/ES6 and allows the use of
existing syntax highlighting in IDEs and editors.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
In the alternative `if` syntax mode, support a specific `elif` keyword instead
of requiring an `else` branch followed by a separate `if` statement.
The advantage is that templates do not require error-prone redundant `endif`
keywords in else-if ladders.
After this change, the following example:
{% if (...): %}
One condition
{% else if (...): %}
Another condition
{% else if (...): %}
A third condition
{% else %}
Final condition
{% endif; endif; endif %}
... can be simplified into:
{% if (...): %}
One condition
{% elif (...): %}
Another condition
{% elif (...): %}
A third condition
{% else %}
Final condition
{% endif %}
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Get rid of the distinction between lexer/parser errors and runtime
exceptions and use exceptions everywhere instead.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Keep an open FILE* reference to processed source files in order to
be able to rewind and extract error context later
- Build a proper call stack when invoking utpl functions
- Report call stack in exceptions
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Rewrite the lexer into a restartable state machine to support parsing from
file streams without the need to read the entire source text into memory
first.
As a side effect, the length of labels and strings is unlimited now.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
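
The restartable approach can be illustrated with a deliberately tiny state machine that only counts labels. All names here are hypothetical and the real lexer tracks much more. The state that matters survives between calls, so a label split across two chunks is still recognized as one.

```c
#include <stddef.h>

typedef struct {
	int in_label;  /* state carried across chunk boundaries */
	size_t labels; /* number of labels started so far */
} lexer_state_t;

/* Feed one chunk; may be called repeatedly with consecutive chunks. */
void
lexer_feed(lexer_state_t *st, const char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		int alpha = (buf[i] >= 'a' && buf[i] <= 'z') || buf[i] == '_';

		if (alpha && !st->in_label)
			st->labels++; /* a new label starts here */

		st->in_label = alpha;
	}
}
```

Feeding `"foo ba"` followed by `"r baz"` counts three labels, the same result as feeding the whole text at once.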
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Instead of propagating failures to the caller, print a generic error
message and terminate program execution through abort().
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Also treat "in" as relational operator.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
This allows number literals that exceed the range INT64_MIN..INT64_MAX
to be truncated to the respective min and max values in a defined manner.
It also makes it possible to have the expression `{{ -9223372036854775808 }}`
actually result in `-9223372036854775808`. Since negation and number
declaration are separate operations, the value would otherwise first be
truncated to `9223372036854775807` and then negated, making it impossible to
write a literal INT64_MIN value without tracking the overflow.
Also fix the number parsing logic to not truncate integers to 32 bits.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
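
The overflow tracking can be sketched as follows. The function names and the split into a parse step and a negate step are assumptions made for the example; the actual implementation is not shown in this log. The literal is clamped to INT64_MAX and a flag records that clamping happened, so a subsequent negation can produce INT64_MIN.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse an unsigned decimal literal, clamping to INT64_MAX.
 * *overflow records whether the value had to be clamped. */
int64_t
parse_literal(const char *s, int *overflow)
{
	unsigned long long u;

	errno = 0;
	u = strtoull(s, NULL, 10);

	*overflow = (errno == ERANGE || u > (unsigned long long)INT64_MAX);

	return *overflow ? INT64_MAX : (int64_t)u;
}

/* Negate a parsed literal; a clamped literal becomes INT64_MIN. */
int64_t
negate_literal(int64_t n, int overflow)
{
	return overflow ? INT64_MIN : -n;
}
```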
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- unify operand and value tag structures
- use a contiguous array for storing opcodes
- use relative offsets for next and children ops
- defer function creation to runtime
- rework "this" context handling by storing context pointer in scope tags
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Support a new keyword `this` which allows functions to access the context
they're called upon.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|