Age | Commit message (Collapse) | Author |
|
Extend the `render()` function to accept a function value as first argument,
which allows running arbitrary ucode functions and capturing their output.
This is especially useful in conjunction with `loadfile()` or `loadstring()`
to dynamically compile templates and rendering their output into a string.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- getenv(): Allow querying the entire environment by omiting variable name
- split(): Properly handle null bytes in subject and separator strings
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Introduce new functions dealing with on-the-fly compilation of code and
execution of functions with different global scope.
The `loadstring()` and `loadfile()` functions will compile the given
ucode source string or ucode file path respectively and return the entry
function of the resulting program.
An optional dictionary specifying parse options may be given as second
argument.
Both functions return `null` on invalid arguments and throw an exception
in case of compilation errors.
The `call()` function allows invoking a given function value with a
different `this` context and/or a different global environment.
Finally refactor the existing `uc_require_ucode()` implementation to
reuse the new `uc_loadfile()` and `uc_call()` implementations and adjust
as well as simplify affected testcases.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Implement a new flag `-g` which takes an interval value and enables the
periodic GC with the given interval for cyclic object structures in the
VM if specified.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Introduce a new stdlib function `gc()` which allows controlling the periodic
garbage collector from ucode.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Utilize the new I_DYNLINK vm opcode to support import statements referring
to dynamic extension modules.
During compilation, the compiler will try to infer the type of the imported
module from the resolved file path; if it ends with `.so`, the module is
assumed to by a dynamic extension and loading/binding of the module is
deferred to runtime using I_DYNLINK opcodes.
Additionally, the `-c` cli option gained support for a new compiler flag
`dynlink=...` which allows forcing a particular module name expression
to be treated as dynamic extension. This is useful to e.g. force resolving
`import { x } from "foo"` to a dynamic extension `foo.so` loaded at runtime
even if a plain `foo.uc` exists in the search path during compilation or if
no such module is available at build time.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
If a compile error is raised at offset 0, try to resolve line and
character position anyway.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Indent inner messages and prepend them with a vertical bar to increase
visual separation of messages. Also include file name in source context
output when the compiled program contains more than one source file.
Adjust affected testcase outputs accordingly.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The current implementation of the module export offset tracking was
inadequate and failed to properly handle larger module dependency
graphs. In order to properly support nested module imports/exports,
the following changes have been introduced:
- Gather export slots during module compilation and emit corresponding
export opcodes as one contiguous block at the end of the module
function body, right before the final return. This ensures that
interleaved imports of other modules do not place foreign exports
between our module exports.
- Track the number of program wide allocated export slots in order
to derive per-module-source offsets for the global VM export list.
- Derive import opcode source index from the module source export
offset and the index of the requested name within the module source
export name list.
- Improve error reporting for circular module imports.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
This commit introduces syntax level support for ES6 style module import
and export statements. Imports are resolved at compile time and the
corresponding module code is compiled into the main program.
Also add testcases to cover import and export statement semantics.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Replace all occurrences for the test file directory path with "." in stderr
and stdout results to ensure stable test outputs.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Use nested switches instead of lookup tables to detect tokens
- Simplify input buffer logic
- Reduce amount of intermediate states
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When a template was parsed with global block left stripping disabled, then
any text preceding an expression or statement block start tag was incorrectly
prepended to the first token value of the block, leading to syntax errors in
the compiler.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When compiling continue statements nested in switches, the compiler only
emitted pop statements for the local variables in the switch body scope,
but not for the locals in the scope(s) leading up to the containing loop
body.
Extend the compilers internal patchlist structure to keep track of the
type of scope tied to the patchlist and extend `continue` statement
compilation logic to select the appropriate parent patch list in order
to determine the amount of locals (stack slots) to clear before the
emitted jump instruction.
As a result, the `uc_compiler_backpatch()` implementation can be simplified
somewhat since we do not need to propagate entries to parent lists anymore.
Also add a further regression test case to cover this issue.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When a switch statement containing cases with local variable declarations
and no default case is evalulated and none of the the cases matched, the
local variable slots were never initialized but got popped off the stack
when execution resumed after the switch scope, leading to a mismatch in
stack layout between compiler and runtime, causing local variables to
yield wrong values or a stack underflow triggering a segmentation fault.
Solve this issue by patching the last conditional case match jump to hop
beyond the local variable pop instructions when no default case is defined.
Also extend the regression test case dealing with other switch related
stack mismatch issues to cover this particular problem as well.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Recognize new number literal prefixes `0o` and `0O` for octal as well
as `0b` and `0B` for binary number literals
- Treat number literals with leading zeros as octal while parsing but
as decimal ones on implicit number conversions, means `012` will yield
`10` while `+"012"` or `"012" + 0` will yield `12`
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
For string cases, turn `int()` into a thin `strtoll()` wrapper which
attempts to parse the initial portion of the string as a decimal integer
literal, optionally preceded by white space and a sign character.
Also introduce an optional `base` argument for string cases while we're
at it and adjust the existing stdlib test case accordingly.
The function now behaves mostly the same as ECMAScript `parseInt(val, 10)`
for string cases, means it will recognize `012` as `12` and not `10` and
it will accept trailing non-digit characters after the initial portition
of the input string.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Fix segfault on passing string haystack with non-string needle argument
- Perform strict equality tests against array haystacks
- Make string searches binary safe
- Improve left index string search performance
- Improve right index array search performance
- Add missing test coverage for index() and rindex()
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When compiling expressions followed by a unary operator, the compiler
triggered a segmentation fault due to invoking an unset infix parser
routine.
Explicitly handle this case and raise a syntax error if such an
invalid expression is encountered.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Add two new functions to deal with encoding and decoding of hexadecimal
digit strings:
- hexenc() - convert the given input value into a lower case hex digit
string, implicitely converting the input argument to a string value
if needed
- hexdec() - decode the given input hex digit string into a byte string,
skipping whitespace or optionally specified characters in the input
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Implement support for ECMAScript 6 template literals which allow simple
interpolation of variable values into strings without resorting to
`sprintf()` or manual string concatenation.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When a managed function is indirectly invoked during bytecode execution,
e.g. when calling the tostring() method of an object prototype during
string concatenation, the invoked function must stop executing bytecode
upon return to hand control back to caller.
Extend `uc_vm_execute_chunk()` to track the amount of nested function
calls it performs and hand back control to the caller once the toplevel
callframe returns. Also bubble unhandled exceptions only as far as up
to the original caller.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
|
|
When invoking a native function as toplevel VM call which indirectly
triggers an unhandled exception in managed code, the callframes are
completely reset before the C function returns, leading to invalid
memory accesses when `uc_vm_call_native()` subsequently popped it's
own callframe again.
This issue did not surface by executing script code through the
interpreter since in this case the VM will always execute a managed
code as toplevel call, but it could be triggered by invoking a native
function triggering an exception through the C API using `uc_vm_call()`
on a fresh `uc_vm_t` context or by utilizing the CLI interpreters `-l`
flag to preload a native code library triggering an exception.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Do not continue loading other libraries or executing the main code if
loading one of the preload libraries fails.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Extend the `uc_json()` implementation to accept readable objects in
addition to plain input strings. This allows parsing JSON input directly
from open file handles, sockets or other kinds of producer objects without
the need to store the entire JSON source string intermediately in memory.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Make sure fs.dirname() doesn't truncate the last character of the
returned path. Previously ucv_string_new_length was called with a
length which no longer included the last character (which had just
been tested not to be a '/' or '.' and hence broke the loop at that
point).
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
[testcase added]
Signed-off-by: Paul Spooren <mail@aparcar.org>
[testcase folded into this commit and fixed]
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Treat the char value as unsigned when testing its value to yield consistent
results on both platforms with signed chars and those with unsigned chars
by default (e.g. ARM ones). This also avoids encoding byte values > 127 as
\uXXXX escape sequences, potentially breaking the strng contents.
Fixes: #62
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Add five new functions to deal with date calculation and timing:
- localtime(), gmtime() - return a broken down calendar date and time
specification from the given epoch (or now, if absent) in local and
UTC time respectively
- timelocal(), timegm() - the inverse operation for the former functions,
taking a date and time specification (interpreted as local or UTC time
respectively) and turning it into an epoch value
- clock() - return the second and nanosecond values of the system clock,
useful for time/performance measurements
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
lib: add argument position support (`%m$`) to `sprintf()` and `printf()`
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Different libc implementations produce different syntax error messages
on invalid regular expression patterns, so rework the test case to
produce stable output across all environments.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
A typo in the custom order function of the test case caused the test case
to yield differently sorted results on OS X, triggered by differences in
the libc's `qsort()` implementation.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Since OS X `getopt()` does not handle optional arguments, we need to
always pass a value to `-T`.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
This ensures that GNU readlink is preferred over OS X own readlink when
executing test cases. This is required due to lacking `-f` flag support
on OS X.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Let `require()` always evaluate the executed code in raw mode
- Let `render()` always evaluate the executed code in template mode
- Let `include()` inherit the raw mode semantics of the calling scope
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Change command line flags to be align better with those of other
interpreters and with the gcc compiler, e.g. `-D` and `-U` to
define and undefine globals, `-e` to execute script expression etc.
- Pass only excess CLI arguments as `ARGV` to scripts, e.g.
`ucode -e 'print("Hello world")' -- -x -y` would pass only
`[ "-x", "-y" ]` as ARGV contents
- Default to raw mode and introduce flag to enable template mode
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When executing an object literal declaration using non-string computed
property name values, the VM crashed caused by an attempt to use a NULL
pointer (result of ucv_string_get() on a non-string value) as hash table
key.
Fix this issue by using the `ucv_key_set()` infrastructure which deals
with the implicit stringification of non-string key values.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
- Support ES2016 exponentiation (**) and exponentiation assignment (**=)
- Support ES2020 nullish coalescing (??) and logical nullish assignment (??=)
- Support ES2021 logical and assignment (&&=) and logical or assignment (||=)
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Fixes: 4ce69a8 ("fs: implement access(), mkstemp(), file.flush() and proc.flush()")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When compiling a switch statement with duplicate `default` cases or a switch
statement with syntax errors before the body block, two error handling cases
were hit in the code that prematurely returned from the function without
resetting the compiler's patchlist pointer away from the on-stack patchlist
that had been set up for the switch statement.
Upon processing a subsequent break or continue control statement, a realloc
was performed on the then invalid patchlist contents, triggering a
segmentation fault or libc assert.
Solve this issue by not returning from the function but breaking the switch
body parsing loop.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
The most common usecase is extracting the value of a single byte at a
specific offset, e.g. to scan a string char-by-char to construct a hash.
Furthermore, constructing an array which contains the results of multiple
`ord()` invocations is trivial while efficiently extracting a single byte
value without the overhead of an intermediate array is not.
Due to that, change `ord()` to always return a single integer byte value
at the offset specified as second argument or at offset 0 in case no
argument was supplied.
That means that `ord("Abc", 0, 1, 2)` will now return `65` instead of the
former `[ 65, 98, 99 ]` result.
Code relying on the former behaviour should either perform multiple calls
to `ord()`, passing different offsets each time or switch to the `struct`
module which allows efficient unpacking of string data.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Due to using signed byte values when writing/reading short strings
to/from pointer addresses, 8 bit characters where incorrectly clamped
to `-1` (`255`).
Fix this issue by treating the input string as `uint8_t` array.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
When patching jump targets for break statments while compiling for-loop
statments, we need jump beyond the instructions popping intermediate loop
variables off the stack but before the pop instructions removing local
loop body variables to prevent a stack position mismatch between compiler
and vm.
Before that change, local loop body variables remained on the stack,
breaking the expected stack layout.
Fixes: b3d758b compiler: ("fix for/break miscompilation")
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Ensure that that the testcase files are executed within the temporary
testcase work directory to simplify testing relative path resolution.
Also fixup the duplicate resource regression test breaking due to that.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|
|
A performance shortcut in `ucv_is_equal()` incorrectly led to `NaN === NaN`
being true. Fix the issue by only comparing pointers when the involved
types are not doubles.
Due to fixing `NaN !== NaN`, the `uniq()` function now requires a special
case to treat multiple NaNs equal for the sake of generating an array of
unique values.
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
|