Ensure that deleting object keys during iteration is safe by keeping a
global chain of per-object iterators which are advanced to the next key
when the entry that is about to be iterated is deleted.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
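
The iterator chain described in the entry above might look roughly like the following C sketch. It is not the ucode implementation: all type and function names are hypothetical, keys are plain integers, and the chain hangs off a single object rather than being global.

```c
#include <stdlib.h>

struct entry { int key; struct entry *next; };

/* A live iterator; `pos` is the entry that will be returned next. */
struct iter { struct entry *pos; struct iter *next; };

struct object {
    struct entry *head;  /* singly linked list of entries */
    struct iter *iters;  /* chain of iterators currently walking the list */
};

/* Prepend a new entry to the object. */
static void obj_add(struct object *o, int key)
{
    struct entry *e = malloc(sizeof(*e));

    if (!e)
        abort();

    e->key = key;
    e->next = o->head;
    o->head = e;
}

/* Register an iterator on the object's chain. */
static void iter_begin(struct object *o, struct iter *it)
{
    it->pos = o->head;
    it->next = o->iters;
    o->iters = it;
}

/* Return the current entry (or NULL at the end) and step forward. */
static struct entry *iter_next(struct iter *it)
{
    struct entry *e = it->pos;

    if (e)
        it->pos = e->next;

    return e;
}

/* Unregister an iterator from the object's chain. */
static void iter_end(struct object *o, struct iter *it)
{
    struct iter **p;

    for (p = &o->iters; *p; p = &(*p)->next) {
        if (*p == it) {
            *p = it->next;
            break;
        }
    }
}

/* Delete an entry; iterators about to visit it are advanced first. */
static void obj_delete(struct object *o, int key)
{
    struct entry **p;

    for (p = &o->head; *p; p = &(*p)->next) {
        if ((*p)->key == key) {
            struct entry *dead = *p;
            struct iter *it;

            for (it = o->iters; it; it = it->next)
                if (it->pos == dead)
                    it->pos = dead->next;

            *p = dead->next;
            free(dead);
            return;
        }
    }
}
```

The sketch only covers deleting the entry that is about to be visited; an entry already handed out by `iter_next()` must not be used after it has been deleted.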
|
|
When removing locals from all scopes, upvalues need to be considered like
in uc_compiler_leave_scope(). Closing them is required to avoid leaving
lingering references to stack values that went out of scope, which would
lead to invalid memory accesses in subsequent code when such upvalues are
used by closures.
Fixes: #187
Signed-off-by: Felix Fietkau <nbd@nbd.name>
[add testcase, reword commit message]
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
ECMAScript allows using `as` and `from` as identifiers, so follow suit
and don't treat them specially while parsing. Extend the compiler logic
instead to check for TK_LABEL tokens with the expected value to properly
parse import and export statements.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The `%g` printf format used for serializing double values into strings
will not include any decimal place if the value happens to be integral,
leading to an unwanted double to integer conversion when serializing
and subsequently deserializing an integral double value as JSON.
Solve this issue by checking the serialized string result for a decimal
point or exponential notation and appending `.0` if neither is found.
Ref: #173
Suggested-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
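
As an illustration of the `.0` fix described in the entry above, here is a minimal C sketch. It is not the actual ucode code: the helper name `format_double()` is hypothetical, and it assumes the default "C" locale, where the decimal separator is `.`.

```c
#include <math.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: serialize `d` with "%g" and append ".0" when the
 * result would otherwise read back as an integer.
 * Returns 0 on success, -1 if `buf` is too small. */
static int format_double(char *buf, size_t len, double d)
{
    int n = snprintf(buf, len, "%g", d);

    if (n < 0 || (size_t)n >= len)
        return -1;

    /* "inf" and "nan" carry no decimal point but must stay untouched */
    if (!isfinite(d))
        return 0;

    /* neither a decimal point nor an exponent: the value is integral */
    if (!strpbrk(buf, ".eE")) {
        if ((size_t)n + 2 >= len)
            return -1;

        memcpy(buf + n, ".0", 3); /* copies the terminating NUL as well */
    }

    return 0;
}
```

With this helper, `3.0` serializes as `3.0` instead of `3`, while `1.5` and `1e+20` are left as `%g` produced them.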
|
|
The existing `ucv_compare()` implementation utilized `strcmp()` to compare
two ucode string values, which may lead to incorrect results for strings
containing null bytes as the comparison prematurely aborts when encountering
the first null.
Rework the string comparison logic to use `memcmp()` for comparing both ucv
strings with each other in order to ensure that expressions such as
`"" == "\u0000"` lead to the expected `false` result.
Ref: https://github.com/openwrt/luci/issues/6530
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
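
The difference between the two comparison strategies in the entry above can be sketched in C as follows. The helper `bytes_compare()` is hypothetical and only stands in for the idea of a length-aware `memcmp()` comparison; it is not the `ucv_compare()` code.

```c
#include <string.h>

/* Hypothetical helper: three-way comparison of two byte strings with
 * explicit lengths, so embedded NUL bytes take part in the comparison. */
static int bytes_compare(const char *a, size_t alen,
                         const char *b, size_t blen)
{
    size_t min = alen < blen ? alen : blen;
    int rv = memcmp(a, b, min);

    if (rv != 0)
        return rv;

    /* common prefix is identical: the shorter string sorts first */
    return (alen > blen) - (alen < blen);
}
```

`strcmp("", "\0")` reports equality because it stops at the first NUL byte, whereas `bytes_compare("", 0, "\0", 1)` reports a difference.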
|
|
Enable signal dispatching by default for standalone ucode programs.
Also adjust the `gc()` testcase output as the default number of
allocations with enabled signal dispatching changes slightly.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- When skipping the interpreter line, don't count its newline twice
- Fix reporting byte offsets beyond the end of line
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Extend `uc_sort()` to utilize `ucv_object_sort()` in order to support
reordering object keys.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Fix `ucv_array_unshift()` improperly rejecting operation on empty arrays
- Fix `uc_unshift()` improperly reversing the argument order instead of
maintaining it
- Add missing test coverage for `push()`, `pop()`, `unshift()` and
`shift()` array operations.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Expose the `isatty(3)` libc function in the fs module to allow checking
whether a file descriptor refers to a terminal.
Signed-off-by: Petr Štetiar <ynezz@true.cz>
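
For reference, the underlying libc call behaves as in this small C sketch. The wrapper name `fd_is_terminal()` is hypothetical; how the fs module exposes the result to ucode scripts is not shown here.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical wrapper: isatty(3) returns 1 for a terminal and 0 otherwise
 * (including invalid descriptors, where it also sets errno). */
static bool fd_is_terminal(int fd)
{
    return isatty(fd) == 1;
}
```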
|
|
Adjust expected testcase outputs after double format change in the
previous commit.
Fixes: 4c654df ("types: adjust double printing format")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The compiler emitted incorrect bytecode for logical assignment operations
on property expressions. The generated instructions left the stack in an
unclean state when the assignment condition was not fulfilled, causing a
stack layout mismatch between compiler and vm, leading to undefined
variable accesses and other non-deterministic behavior.
Solve this issue by rewriting the bytecode generation to yield an
instruction sequence that does not leave garbage on the stack.
The implementation is not optimal yet, as an expression in the form
`obj.prop ||= val` will load `obj.prop` twice. This is acceptable for
now as the load operation has no side effect, but should be solved in
a better way by introducing new instructions that allow for swapping
stack slots, allowing the vm to operate on a copy of the loaded value.
Also rewrite the corresponding test case to trigger a runtime error
on code versions before this fix.
Fixes: fdc9b6a ("compiler: fix `??=`, `||=` and `&&=` logical assignment semantics")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Invoking `sleep(1000)` in the CI container often sleeps slightly longer
than exactly 1000ms, causing the test output to mismatch.
Relax the test requirement to simply ensure that t2 > t1.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Follow ES6 semantics and ensure that arrow functions with a block body
don't implicitly return the value of the last executed statement.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When compiling logical assignment expressions, ensure that the right hand
side of the assignment is not evaluated when the assignment condition is
unfulfilled.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Ensure that regexp extension escapes are consistently handled;
substitute `\d`, `\D`, `\s`, `\S`, `\w` and `\W` with `[[:digit:]]`,
`[^[:digit:]]`, `[[:space:]]`, `[^[:space:]]`, `[[:alnum:]_]` and
`[^[:alnum:]_]` character classes respectively since not all POSIX
regexp implementations implement all of those extensions
- Preserve `\b`, `\B`, `\<` and `\>` boundary matches
Fixes: a45f2a3 ("lexer: improve regex literal handling")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
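
The substitution listed in the entry above could be sketched in C like this. The function names are hypothetical and the sketch is deliberately simplified: it does not treat escapes inside bracket expressions specially, which a real lexer would have to do.

```c
#include <stddef.h>
#include <string.h>

/* Map an escape letter to its POSIX character class, or NULL if the
 * escape is not one of the substituted extensions. */
static const char *class_for_escape(char c)
{
    switch (c) {
    case 'd': return "[[:digit:]]";
    case 'D': return "[^[:digit:]]";
    case 's': return "[[:space:]]";
    case 'S': return "[^[:space:]]";
    case 'w': return "[[:alnum:]_]";
    case 'W': return "[^[:alnum:]_]";
    default:  return NULL;
    }
}

/* Hypothetical helper: copy `in` to `out`, expanding the escapes above.
 * Returns 0 on success, -1 if `out` is too small. */
static int expand_regex_escapes(const char *in, char *out, size_t outlen)
{
    size_t o = 0;

    if (outlen == 0)
        return -1;

    while (*in) {
        int is_escape = (in[0] == '\\' && in[1] != '\0');
        const char *cls = is_escape ? class_for_escape(in[1]) : NULL;

        if (cls) {
            size_t n = strlen(cls);

            if (o + n >= outlen)
                return -1;

            memcpy(out + o, cls, n);
            o += n;
            in += 2;
        }
        else if (is_escape) {
            /* any other escape (e.g. \b or \\) is preserved verbatim */
            if (o + 2 >= outlen)
                return -1;

            out[o++] = *in++;
            out[o++] = *in++;
        }
        else {
            if (o + 1 >= outlen)
                return -1;

            out[o++] = *in++;
        }
    }

    out[o] = '\0';

    return 0;
}
```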
|
|
Implement a new function `slice()` to complement the existing `splice()`
function and model its semantics after the ES6 `Array.slice()` version.
Fixes: #106
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Track last emitted statement type in compiled code and only generate final
`return null` opcodes if there is no preceding `return` statement.
Also use this statement tracking to avoid emitting invalid return opcodes
for arrow function bodies with trailing empty statements.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Do not treat slashes within bracket expressions as delimiters
- Do not escape slashes when stringifying regex sources
- Allow all escape sequence types in regex literals
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Instead of having one global export table per VM instance, maintain one
table per program instance. This is required to avoid clobbering the export
list in case code using `import` is loaded at runtime through `require()`,
`loadfile()` etc.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Extend the split() and replace() functions to accept an additional optional
`limit` argument which limits the number of split operations / substitutions
performed by these functions.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Extend the `render()` function to accept a function value as first argument,
which allows running arbitrary ucode functions and capturing their output.
This is especially useful in conjunction with `loadfile()` or `loadstring()`
to dynamically compile templates and render their output into a string.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- getenv(): Allow querying the entire environment by omitting the variable name
- split(): Properly handle null bytes in subject and separator strings
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Introduce new functions dealing with on-the-fly compilation of code and
execution of functions with different global scope.
The `loadstring()` and `loadfile()` functions will compile the given
ucode source string or ucode file path respectively and return the entry
function of the resulting program.
An optional dictionary specifying parse options may be given as second
argument.
Both functions return `null` on invalid arguments and throw an exception
in case of compilation errors.
The `call()` function allows invoking a given function value with a
different `this` context and/or a different global environment.
Finally refactor the existing `uc_require_ucode()` implementation to
reuse the new `uc_loadfile()` and `uc_call()` implementations and adjust
as well as simplify affected testcases.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Introduce a new stdlib function `gc()` which allows controlling the periodic
garbage collector from ucode.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
If a compile error is raised at offset 0, try to resolve line and
character position anyway.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Indent inner messages and prepend them with a vertical bar to increase
visual separation of messages. Also include file name in source context
output when the compiled program contains more than one source file.
Adjust affected testcase outputs accordingly.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The current implementation of the module export offset tracking was
inadequate and failed to properly handle larger module dependency
graphs. In order to properly support nested module imports/exports,
the following changes have been introduced:
- Gather export slots during module compilation and emit corresponding
export opcodes as one contiguous block at the end of the module
function body, right before the final return. This ensures that
interleaved imports of other modules do not place foreign exports
between our module exports.
- Track the number of program wide allocated export slots in order
to derive per-module-source offsets for the global VM export list.
- Derive import opcode source index from the module source export
offset and the index of the requested name within the module source
export name list.
- Improve error reporting for circular module imports.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
This commit introduces syntax level support for ES6 style module import
and export statements. Imports are resolved at compile time and the
corresponding module code is compiled into the main program.
Also add testcases to cover import and export statement semantics.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Replace all occurrences for the test file directory path with "." in stderr
and stdout results to ensure stable test outputs.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Use nested switches instead of lookup tables to detect tokens
- Simplify input buffer logic
- Reduce amount of intermediate states
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When a template was parsed with global block left stripping disabled, then
any text preceding an expression or statement block start tag was incorrectly
prepended to the first token value of the block, leading to syntax errors in
the compiler.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When compiling continue statements nested in switches, the compiler only
emitted pop statements for the local variables in the switch body scope,
but not for the locals in the scope(s) leading up to the containing loop
body.
Extend the compiler's internal patchlist structure to keep track of the
type of scope tied to the patchlist and extend `continue` statement
compilation logic to select the appropriate parent patch list in order
to determine the number of locals (stack slots) to clear before the
emitted jump instruction.
As a result, the `uc_compiler_backpatch()` implementation can be simplified
somewhat since we do not need to propagate entries to parent lists anymore.
Also add a further regression test case to cover this issue.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When a switch statement containing cases with local variable declarations
and no default case is evaluated and none of the cases matched, the
local variable slots were never initialized but got popped off the stack
when execution resumed after the switch scope, leading to a mismatch in
stack layout between compiler and runtime, causing local variables to
yield wrong values or a stack underflow triggering a segmentation fault.
Solve this issue by patching the last conditional case match jump to hop
beyond the local variable pop instructions when no default case is defined.
Also extend the regression test case dealing with other switch related
stack mismatch issues to cover this particular problem as well.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Recognize new number literal prefixes `0o` and `0O` for octal as well
as `0b` and `0B` for binary number literals
- Treat number literals with leading zeros as octal while parsing but
as decimal ones on implicit number conversions, meaning `012` will yield
`10` while `+"012"` or `"012" + 0` will yield `12`
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
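
The literal rules in the entry above can be illustrated with a small C sketch built on `strtoll()`. The helper `parse_number_literal()` is hypothetical, handles non-negative integer literals only, and does not validate trailing characters; it is not the ucode lexer.

```c
#include <stdlib.h>

/* Hypothetical helper: parse an integer literal using the `0o`/`0O` and
 * `0b`/`0B` prefixes, treating any other leading zero as octal. */
static long long parse_number_literal(const char *s)
{
    if (s[0] == '0' && s[1] != '\0') {
        switch (s[1]) {
        case 'o': case 'O': return strtoll(s + 2, NULL, 8);
        case 'b': case 'B': return strtoll(s + 2, NULL, 2);
        default:            return strtoll(s + 1, NULL, 8); /* leading zero */
        }
    }

    return strtoll(s, NULL, 10);
}
```

Parsing `012` as a literal this way gives 10, whereas a plain base 10 conversion of the string `"012"`, as in the implicit conversion case, gives 12.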
|
|
For string cases, turn `int()` into a thin `strtoll()` wrapper which
attempts to parse the initial portion of the string as a decimal integer
literal, optionally preceded by white space and a sign character.
Also introduce an optional `base` argument for string cases while we're
at it and adjust the existing stdlib test case accordingly.
The function now behaves mostly the same as ECMAScript `parseInt(val, 10)`
for string cases, meaning it will recognize `012` as `12` and not `10` and
it will accept trailing non-digit characters after the initial portion
of the input string.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
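
The `strtoll()` behaviour that the entry above relies on can be seen in this minimal C sketch. The wrapper `string_to_int()` and its error convention are hypothetical; what `int()` itself returns for unparsable input is not stated here.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical sketch of an int()-like conversion for strings. strtoll()
 * itself skips leading white space, accepts an optional sign and stops at
 * the first character that is not a digit in the given base.
 * Returns 0 on success, -1 if no digits were found or the value overflows. */
static int string_to_int(const char *s, int base, long long *out)
{
    char *end;

    errno = 0;
    *out = strtoll(s, &end, base);

    if (end == s || errno == ERANGE)
        return -1;

    return 0;
}
```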
|
|
- Fix segfault on passing string haystack with non-string needle argument
- Perform strict equality tests against array haystacks
- Make string searches binary safe
- Improve left index string search performance
- Improve right index array search performance
- Add missing test coverage for index() and rindex()
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When compiling expressions followed by a unary operator, the compiler
triggered a segmentation fault due to invoking an unset infix parser
routine.
Explicitly handle this case and raise a syntax error if such an
invalid expression is encountered.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Add two new functions to deal with encoding and decoding of hexadecimal
digit strings:
- hexenc() - convert the given input value into a lower case hex digit
string, implicitly converting the input argument to a string value
if needed
- hexdec() - decode the given input hex digit string into a byte string,
skipping whitespace or optionally specified characters in the input
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
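
A rough C sketch of the two conversions described in the entry above follows. The function names are hypothetical, only white space is skipped while decoding (the optional skip character set is left out), and rejecting an odd number of digits is an assumption of this sketch.

```c
#include <ctype.h>
#include <stddef.h>

/* Encode `len` bytes as lower case hex digits. `out` must have room for
 * 2 * len + 1 bytes. */
static void hex_encode(const unsigned char *in, size_t len, char *out)
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    for (i = 0; i < len; i++) {
        out[2 * i]     = digits[in[i] >> 4];
        out[2 * i + 1] = digits[in[i] & 0x0f];
    }

    out[2 * len] = '\0';
}

/* Value of a single hex digit, or -1 if `c` is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode hex digits into bytes, skipping white space. `out` must have room
 * for one byte per two input characters. Returns the number of bytes
 * written, or -1 on a non-hex character or an odd number of digits. */
static long hex_decode(const char *in, unsigned char *out)
{
    long n = 0;
    int hi = -1; /* pending high nibble, -1 if none */

    for (; *in; in++) {
        int v;

        if (isspace((unsigned char)*in))
            continue;

        v = hex_value(*in);

        if (v < 0)
            return -1;

        if (hi < 0) {
            hi = v;
        }
        else {
            out[n++] = (unsigned char)((hi << 4) | v);
            hi = -1;
        }
    }

    return hi < 0 ? n : -1;
}
```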
|
|
Implement support for ECMAScript 6 template literals which allow simple
interpolation of variable values into strings without resorting to
`sprintf()` or manual string concatenation.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When a managed function is indirectly invoked during bytecode execution,
e.g. when calling the tostring() method of an object prototype during
string concatenation, the invoked function must stop executing bytecode
upon return to hand control back to the caller.
Extend `uc_vm_execute_chunk()` to track the number of nested function
calls it performs and hand back control to the caller once the toplevel
callframe returns. Also bubble unhandled exceptions only as far as up
to the original caller.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When invoking a native function as toplevel VM call which indirectly
triggers an unhandled exception in managed code, the callframes are
completely reset before the C function returns, leading to invalid
memory accesses when `uc_vm_call_native()` subsequently popped its
own callframe again.
This issue did not surface by executing script code through the
interpreter since in this case the VM will always execute managed
code as toplevel call, but it could be triggered by invoking a native
function triggering an exception through the C API using `uc_vm_call()`
on a fresh `uc_vm_t` context or by utilizing the CLI interpreter's `-l`
flag to preload a native code library triggering an exception.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Extend the `uc_json()` implementation to accept readable objects in
addition to plain input strings. This allows parsing JSON input directly
from open file handles, sockets or other kinds of producer objects without
the need to store the entire JSON source string intermediately in memory.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Make sure fs.dirname() doesn't truncate the last character of the
returned path. Previously ucv_string_new_length was called with a
length which no longer included the last character (which had just
been tested not to be a '/' or '.' and hence broke the loop at that
point).
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
[testcase added]
Signed-off-by: Paul Spooren <mail@aparcar.org>
[testcase folded into this commit and fixed]
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Treat the char value as unsigned when testing its value to yield consistent
results on both platforms with signed chars and those with unsigned chars
by default (e.g. ARM ones). This also avoids encoding byte values > 127 as
\uXXXX escape sequences, potentially breaking the string contents.
Fixes: #62
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
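
The signedness problem described in the entry above can be reproduced with the following C sketch. The function names are hypothetical, and the `0x20` threshold is an assumption based on JSON requiring escapes only for control characters.

```c
#include <stdbool.h>

/* Hypothetical check: does this byte need a \uXXXX escape in the output?
 * Casting to unsigned char first keeps bytes above 127 from turning
 * negative on platforms where plain char is signed. */
static bool needs_unicode_escape(char c)
{
    return (unsigned char)c < 0x20;
}

/* For contrast: the same test without the cast. Where char is signed, the
 * byte 0xE4 becomes negative and is wrongly reported as needing an escape;
 * where char is unsigned, it is not. */
static bool needs_unicode_escape_naive(char c)
{
    return c < 0x20;
}
```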
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Add five new functions to deal with date calculation and timing:
- localtime(), gmtime() - return a broken down calendar date and time
specification from the given epoch (or now, if absent) in local and
UTC time respectively
- timelocal(), timegm() - the inverse operation for the former functions,
taking a date and time specification (interpreted as local or UTC time
respectively) and turning it into an epoch value
- clock() - return the second and nanosecond values of the system clock,
useful for time/performance measurements
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
lib: add argument position support (`%m$`) to `sprintf()` and `printf()`
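
The `%m$` syntax follows the positional argument notation of POSIX `printf()`. The C sketch below only demonstrates that libc notation, assuming a POSIX-compliant libc; the helper `format_greeting()` is hypothetical and says nothing about how the ucode functions are implemented.

```c
#include <stdio.h>

/* Hypothetical demo: print the second argument before the first one by
 * using positional conversions. Returns the snprintf() result. */
static int format_greeting(char *buf, size_t len,
                           const char *a, const char *b)
{
    return snprintf(buf, len, "%2$s %1$s", a, b);
}
```

Called with `"world"` and `"hello"`, the helper writes the arguments in swapped order.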
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|