Age | Commit message (Collapse) | Author |
|
When compiling expressions followed by a unary operator, the compiler
triggered a segmentation fault due to invoking an unset infix parser
routine.
Explicitly handle this case and raise a syntax error if such an
invalid expression is encountered.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Implement support for ECMAScript 6 template literals which allow simple
interpolation of variable values into strings without resorting to
`sprintf()` or manual string concatenation.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Support ES2016 exponentiation (**) and exponentiation assignment (**=)
- Support ES2020 nullish coalescing (??) and logical nullish assignment (??=)
- Support ES2021 logical and assignment (&&=) and logical or assignment (||=)
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When compiling a switch statement with duplicate `default` cases or a switch
statement with syntax errors before the body block, two error handling cases
were hit in the code that prematurely returned from the function without
resetting the compiler's patchlist pointer away from the on-stack patchlist
that had been set up for the switch statement.
Upon processing a subsequent break or continue control statement, a realloc
was performed on the then invalid patchlist contents, triggering a
segmentation fault or libc assert.
Solve this issue by not returning from the function but breaking the switch
body parsing loop.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When patching jump targets for break statments while compiling for-loop
statments, we need jump beyond the instructions popping intermediate loop
variables off the stack but before the pop instructions removing local
loop body variables to prevent a stack position mismatch between compiler
and vm.
Before that change, local loop body variables remained on the stack,
breaking the expected stack layout.
Fixes: b3d758b compiler: ("fix for/break miscompilation")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Instead of treating individual program functions as managed ucode types,
demote uc_function_t values to pointers into a uc_program_t entity
- Promote uc_program_t to a managed type
- Let uc_closure_t claim references to the owning program of the enclosed
uc_function_t
- Redefine public APIs uc_compile() and uc_vm_execute() APIs to return and
expect an uc_program_t object respectively
- Remove vallist indirection for function loading and let the compiler
emit the function id directly when producing function construction code
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Extend source objects with a `runpath` field which contains the original
path of the source being executed by the VM.
When instantiating source objects from file paths, the `runpath` will be
set to the `filename`. When instantiating source buffers using
`uc_source_new_buffer()`, the runpath is initially unset.
A new function `uc_source_runpath_set()` can be used to adjust the runtime
path being associated with a source object.
Extend bytecode loading logic to set the source buffer runtime path to the
precompiled bytecode file path being loaded and executed. This is required
for `sourcepath()` and relative paths in `include()` to function correctly
when executing precompiled programs.
Finally rename `uc_program_from_file()` and `uc_program_to_file()` to
`uc_program_load()` and `uc_program_write()` respectively since the load
part now operates on an `uc_source_t` input buffer instead of a plain
`FILE *` handle.
Adjust users of these API functions accordingly.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Drop support for the `local` keyword and `delete` function calls.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Introduce a new default enable CMake option "COMPILE_SUPPORT" which
allows to disable source code compilation in the ucode interpreter.
Such an interpreter will only be able to load precompiled ucode files.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Introduce new command line flags `-o` and `-O` to write compiled program
code into the specified output file
- Add support for transparently executing precompiled files, the
lexical analyzing and com,pilation phase is skipped in this case
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Move source object pointer into program entity which is referenced by
each function
- Move lineinfo related routines into source.c and use them from lexer.c
since lineinfo encoding does not belong into the lexical analyzer.
- Implement initial infrastructure for detecting source file type,
this is required later to differentiate between plaintext and
precompiled bytecode files
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Instead of storing constant values per function, maintain a global program
wide list for all constant values within the current compilation unit.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Introduce a new "program" entity which holds the list of functions
created during compilation
- Instead of storing pointers to the in-memory function representation
in the constant list, store the index of the function within the
program's function list
- When loading functions from the constant list, retrieve the function
by index from the program entity
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Parse integer literals as unsigned numeric values in order to be able
to represent the entire unsigned 64bit value range
- Stop parsing minus-prefixed integer literals as negative numbers but
treat them as separate minus operator followed by a positive integer
instead
- Only store unsigned numeric constants in bytecode
- Rework numeric comparison logic to be able to handle full 64bit
unsigned integers
- If possible, yield unsigned 64 bit results for additions
- Simplify numeric value conversion API
- Compile code with -fwrapv for defined signed overflow semantics
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Introduce new operators `?.`, `?.[…]` and `?.(…)` to simplify looking up
deeply nested property chain in a secure manner.
The `?.` operator behaves like the `.` property access operator but yields
`null` if the left hand side is `null` or not an object.
Like `?.`, the `?.[…]` operator behaves like the `[…]` computed property
access but yields `null` if the left hand side is `null` or neither an
object or array.
Finally the `?.(…)` operator behaves like the function call operator `(…)`
but yields `null` if the left hand side is `null` or not a callable
function.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When compiling certain expressions as first statement of an ucode
program, e.g. a while loop in raw mode, a jump instruction to offset
zero is emitted which was incorrectly treated as placeholder by the
compiler.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Ensure that most functions follow the subject_verb naming schema
- Move type related function from value.c to types.c
- Rename value.c to vallist.c
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Ensure that all custom typedef and vector declaration type names end with
a "_t" suffix.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Instead of relying on a switch/case mapping of token values to corresponding
VM instructions, infer the instruction number arithmetically.
This shrinks the compiled size on x86/64 by about 250 bytes.
Also emit I_LE and I_GE instructions for `<=` and `>=` comparisons instead
of transforming these into I_GT and I_LT negations.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The token type split allows us to drop the token value union in the
reserved word list with a subsequent commit.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Introduce support for declaring constant variables through the `const`
keyword. Variables declared with `const` follow the same scoping rules
as `let` declared ones.
In contrast to normal variables, `const` ones may not be assigned to
after their declaration. Any attempt to do so will result in a syntax
error during compilation.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Turn `delete` into a proper operator mimicking ECMAScript semantics.
Also ensure to transparently turn deprecated `delete(obj, propname)`
function calls into `delete obj.propname` expressions during compilation.
When strict mode is active, legacy delete() calls throw a syntax error
instead.
Finally drop the `delete()` function from the stdlib as it is shadowed
by the delete operator syntax now.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
In a loop statement like `for (let x = 1, y = 2; ...)` the initialization
statement was incorrectly interpreted as `let x = 1; y = 2` instead of the
correct `let ..., y = 2`, triggering reference error exceptions in strict
mode.
Solve the issue by continue parsing the rest of the comma expression
seqence as declaration list expression when the initializer is compiled
in local mode.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Due to the special code path parsing the leading label portion of a
parenthesized expression, slashes following a label were improperly
treated as regular expression literal delimitters, emitting a syntax
error when an otherwise valid expression such as `a / 1` was being
parsed as first sub expression of a parenthesized expression.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When emitting byte code for break or continue statements, ensure that local
variables in all containing scopes up to the loop body scope are popped,
not just those in the same scope the statement is located in.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Due to the special code path parsing the leading label portion of a
parenthesized expression, keywords following a property access operator
(TK_DOT, `.`) weren't properly handled, emitting a syntax error when an
otherwise valid expression such as `value.default` was being parsed as
first sub expression of a parenthesized expression.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Support per-file and per-function `"use strict";` statement to opt into
strict variable handling from ucode source code.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Instead of disambiguating division operator vs. regexp literal by looking
at the preceeding token, raise a "no regexp" flag within the appropriate
parser states to tell the lexer how to treat a forward slash when parsing
the next token
- Introduce another "no keyword" flag which disables parsing labels into
keywords when reading the next token and set it in the appropriate parser
states. This allows using reserved names in object declarations and
property access expressions
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Shuffle typedefs to avoid need for non-compliant forward declarations
- Fix non-compliant empty struct initializers
- Remove use of braced expressions
- Remove use of anonymous unions
- Avoid `void *` pointer arithmetic
- Fix several warnings reported by gcc -pedantic mode and clang 11 compilation
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Instead of relying on json_object values internally, use custom types to
represent the different ucode value types which brings a number of
advantages compared to the previous approach:
- Due to the use of tagged pointers, small integer, string and bool
values can be stored directly in the pointer addresses, vastly
reducing required heap memory
- Ability to create circular data structures such as
`let o; o = { test: o };`
- Ability to register custom `tostring()` function through prototypes
- Initial mark/sweep GC implementation to tear down circular object
graphs on VM deinit
The change also paves the way for possible future extensions such as
constant variables and meta methods for custom ressource types.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Fixes: 97bf297 ("compiler: ensure that alternative if/for/while syntax has own block scope")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The `if ...: endif`, `for ...: endfor`, `while ...: endwhile` etc. syntax
statements are supposed to have their own lexical scope, like curly brace
blocks in normal statements.
Without this, local variable declarations within such blocks would
incorrectly shift stack offsets for the remainder of the program.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Initialize stack slots belonging to skipped local variable declarations
- Group switch case value tests after switch statement body
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When patching jump targets for break statments while compiling for-loop
statments, jump beyond the instructions popping intermediate loop variables
off the stack to fix a stack position mismatch between compiler and vm.
Before that change, local loop body variables got popped twice, breaking
the expected stack layout.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When skipping over the catch block of a try/catch statement, make sure to
emit the jump after the try scope variables have been popped off the stack
in order to prevent a stack position mismatch between compiler and vm.
Fixes: 9ad9afb ("compiler: fix try/catch miscompilation")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Ensure that an arrow function body expression is parsed with P_ASSIGN
precedence to not greedily consume comma expressions.
This ensures that an expression like
() => 1, 2
is parsed as function [() => 1], integer [2] and not as
function [() => 1, 2].
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Simplify handling of default case in switch statements. Instead of jumping
over the default block, simply record the start address of the block since
the initial switch jump is patched into the first non-default case already.
This also leads to slightly smaller bytecode.
Previously, when a case branch fell through into a default block, it did
hit the default skip jump which jumped back into the first case which then
fell through into the default skip jump, leading to an endless loop.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When skipping catch blocks with exception variables, jump beyond the
instruction popping the exception variable off the stack to fix a
stack position mismatch between compiler and vm.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Replace the former AST walking interpreter implementation with a single pass
bytecode compiler and a corresponding virtual machine.
The rewrite lays the groundwork for a couple of improvements with will be
subsequently implemented:
- Ability to precompile ucode sources into binary byte code
- Strippable debug information
- Reduced runtime memory usage
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|