ucode - The ucode Scripting Language

Age	Commit message (Collapse)	Author
2022-08-29	lib: extend render() to support function values	Jo-Philipp Wich
	Extend the `render()` function to accept a function value as first argument, which allows running arbitrary ucode functions and capturing their output. This is especially useful in conjunction with `loadfile()` or `loadstring()` to dynamically compile templates and rendering their output into a string. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-29	lib: improve getenv() and split() implementations	Jo-Philipp Wich
	- getenv(): Allow querying the entire environment by omiting variable name - split(): Properly handle null bytes in subject and separator strings Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-24	lib: introduce three new functions call(), loadstring() and loadfile()	Jo-Philipp Wich
	Introduce new functions dealing with on-the-fly compilation of code and execution of functions with different global scope. The `loadstring()` and `loadfile()` functions will compile the given ucode source string or ucode file path respectively and return the entry function of the resulting program. An optional dictionary specifying parse options may be given as second argument. Both functions return `null` on invalid arguments and throw an exception in case of compilation errors. The `call()` function allows invoking a given function value with a different `this` context and/or a different global environment. Finally refactor the existing `uc_require_ucode()` implementation to reuse the new `uc_loadfile()` and `uc_call()` implementations and adjust as well as simplify affected testcases. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-12	main: introduce -g flag to allow enabling periodic gc from cli	Jo-Philipp Wich
	Implement a new flag `-g` which takes an interval value and enables the periodic GC with the given interval for cyclic object structures in the VM if specified. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-12	lib: implement gc()	Jo-Philipp Wich
	Introduce a new stdlib function `gc()` which allows controlling the periodic garbage collector from ucode. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-06	compiler: add import statement support for dynamic extensions	Jo-Philipp Wich
	Utilize the new I_DYNLINK vm opcode to support import statements referring to dynamic extension modules. During compilation, the compiler will try to infer the type of the imported module from the resolved file path; if it ends with `.so`, the module is assumed to by a dynamic extension and loading/binding of the module is deferred to runtime using I_DYNLINK opcodes. Additionally, the `-c` cli option gained support for a new compiler flag `dynlink=...` which allows forcing a particular module name expression to be treated as dynamic extension. This is useful to e.g. force resolving `import { x } from "foo"` to a dynamic extension `foo.so` loaded at runtime even if a plain `foo.uc` exists in the search path during compilation or if no such module is available at build time. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-06	compiler: don't treat offset 0 special at syntax errors	Jo-Philipp Wich
	If a compile error is raised at offset 0, try to resolve line and character position anyway. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-05	compiler: improve formatting of nested syntax error messages	Jo-Philipp Wich
	Indent inner messages and prepend them with a vertical bar to increase visual separation of messages. Also include file name in source context output when the compiled program contains more than one source file. Adjust affected testcase outputs accordingly. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-05	compiler: rework export index allocation	Jo-Philipp Wich
	The current implementation of the module export offset tracking was inadequate and failed to properly handle larger module dependency graphs. In order to properly support nested module imports/exports, the following changes have been introduced: - Gather export slots during module compilation and emit corresponding export opcodes as one contiguous block at the end of the module function body, right before the final return. This ensures that interleaved imports of other modules do not place foreign exports between our module exports. - Track the number of program wide allocated export slots in order to derive per-module-source offsets for the global VM export list. - Derive import opcode source index from the module source export offset and the index of the requested name within the module source export name list. - Improve error reporting for circular module imports. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-07-30	compiler: add support for import/export statements	Jo-Philipp Wich
	This commit introduces syntax level support for ES6 style module import and export statements. Imports are resolved at compile time and the corresponding module code is compiled into the main program. Also add testcases to cover import and export statement semantics. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-07-30	tests: run_tests.sh: substitute dynamic test directory path in output	Jo-Philipp Wich
	Replace all occurrences for the test file directory path with "." in stderr and stdout results to ensure stable test outputs. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-07-28	lexer: rewrite token scanner	Jo-Philipp Wich
	- Use nested switches instead of lookup tables to detect tokens - Simplify input buffer logic - Reduce amount of intermediate states Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-07-12	lexer: fix parsing with disabled block left stripping	Jo-Philipp Wich
	When a template was parsed with global block left stripping disabled, then any text preceding an expression or statement block start tag was incorrectly prepended to the first token value of the block, leading to syntax errors in the compiler. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-06-30	compiler: fix stack mismatch on continue statements nested in switches	Jo-Philipp Wich
	When compiling continue statements nested in switches, the compiler only emitted pop statements for the local variables in the switch body scope, but not for the locals in the scope(s) leading up to the containing loop body. Extend the compilers internal patchlist structure to keep track of the type of scope tied to the patchlist and extend `continue` statement compilation logic to select the appropriate parent patch list in order to determine the amount of locals (stack slots) to clear before the emitted jump instruction. As a result, the `uc_compiler_backpatch()` implementation can be simplified somewhat since we do not need to propagate entries to parent lists anymore. Also add a further regression test case to cover this issue. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-06-27	compiler: fix stack mismatch on nonmatching switch statements with locals	Jo-Philipp Wich
	When a switch statement containing cases with local variable declarations and no default case is evalulated and none of the the cases matched, the local variable slots were never initialized but got popped off the stack when execution resumed after the switch scope, leading to a mismatch in stack layout between compiler and runtime, causing local variables to yield wrong values or a stack underflow triggering a segmentation fault. Solve this issue by patching the last conditional case match jump to hop beyond the local variable pop instructions when no default case is defined. Also extend the regression test case dealing with other switch related stack mismatch issues to cover this particular problem as well. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-06-01	syntax: adjust number literal parsing and string to number conversion	Jo-Philipp Wich
	- Recognize new number literal prefixes `0o` and `0O` for octal as well as `0b` and `0B` for binary number literals - Treat number literals with leading zeros as octal while parsing but as decimal ones on implicit number conversions, means `012` will yield `10` while `+"012"` or `"012" + 0` will yield `12` Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-06-01	lib: refactor `uc_int()`	Jo-Philipp Wich
	For string cases, turn `int()` into a thin `strtoll()` wrapper which attempts to parse the initial portion of the string as a decimal integer literal, optionally preceded by white space and a sign character. Also introduce an optional `base` argument for string cases while we're at it and adjust the existing stdlib test case accordingly. The function now behaves mostly the same as ECMAScript `parseInt(val, 10)` for string cases, means it will recognize `012` as `12` and not `10` and it will accept trailing non-digit characters after the initial portition of the input string. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-05-30	lib: rework uc_index() implementation	Jo-Philipp Wich
	- Fix segfault on passing string haystack with non-string needle argument - Perform strict equality tests against array haystacks - Make string searches binary safe - Improve left index string search performance - Improve right index array search performance - Add missing test coverage for index() and rindex() Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-05-20	compiler: fix segmentation fault on compiling unexpected unary expressions	Jo-Philipp Wich
	When compiling expressions followed by a unary operator, the compiler triggered a segmentation fault due to invoking an unset infix parser routine. Explicitly handle this case and raise a syntax error if such an invalid expression is encountered. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-05-19	lib: introduce hexenc() and hexdec()	Jo-Philipp Wich
	Add two new functions to deal with encoding and decoding of hexadecimal digit strings: - hexenc() - convert the given input value into a lower case hex digit string, implicitely converting the input argument to a string value if needed - hexdec() - decode the given input hex digit string into a byte string, skipping whitespace or optionally specified characters in the input Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-04-13	syntax: implement support for ES6 template literals	Jo-Philipp Wich
	Implement support for ECMAScript 6 template literals which allow simple interpolation of variable values into strings without resorting to `sprintf()` or manual string concatenation. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-04-13	vm: stop executing bytecode on return of nested calls	Jo-Philipp Wich
	When a managed function is indirectly invoked during bytecode execution, e.g. when calling the tostring() method of an object prototype during string concatenation, the invoked function must stop executing bytecode upon return to hand control back to caller. Extend `uc_vm_execute_chunk()` to track the amount of nested function calls it performs and hand back control to the caller once the toplevel callframe returns. Also bubble unhandled exceptions only as far as up to the original caller. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-04-07	Merge pull request #68 from jow-/vm-callframe-double-free-fix	Jo-Philipp Wich

2022-04-07	vm: fix callframe double free on unhanded exceptions	Jo-Philipp Wich
	When invoking a native function as toplevel VM call which indirectly triggers an unhandled exception in managed code, the callframes are completely reset before the C function returns, leading to invalid memory accesses when `uc_vm_call_native()` subsequently popped it's own callframe again. This issue did not surface by executing script code through the interpreter since in this case the VM will always execute a managed code as toplevel call, but it could be triggered by invoking a native function triggering an exception through the C API using `uc_vm_call()` on a fresh `uc_vm_t` context or by utilizing the CLI interpreters `-l` flag to preload a native code library triggering an exception. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-04-07	main: abort when failing to load a preload library	Jo-Philipp Wich
	Do not continue loading other libraries or executing the main code if loading one of the preload libraries fails. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-04-07	lib: let `json()` accept input objects implementing `read()` method	Jo-Philipp Wich
	Extend the `uc_json()` implementation to accept readable objects in addition to plain input strings. This allows parsing JSON input directly from open file handles, sockets or other kinds of producer objects without the need to store the entire JSON source string intermediately in memory. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-31	fs: fix off-by-one in fs.dirname() function	Daniel Golle
	Make sure fs.dirname() doesn't truncate the last character of the returned path. Previously ucv_string_new_length was called with a length which no longer included the last character (which had just been tested not to be a '/' or '.' and hence broke the loop at that point). Signed-off-by: Daniel Golle <daniel@makrotopia.org> [testcase added] Signed-off-by: Paul Spooren <mail@aparcar.org> [testcase folded into this commit and fixed] Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-31	types: fix escape sequence encoding of high byte values in JSON strings	Jo-Philipp Wich
	Treat the char value as unsigned when testing its value to yield consistent results on both platforms with signed chars and those with unsigned chars by default (e.g. ARM ones). This also avoids encoding byte values > 127 as \uXXXX escape sequences, potentially breaking the strng contents. Fixes: #62 Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-23	treewide: replace some leftover "utpl" occurrences, update .gitignore	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-22	lib: add date and time related functions	Jo-Philipp Wich
	Add five new functions to deal with date calculation and timing: - localtime(), gmtime() - return a broken down calendar date and time specification from the given epoch (or now, if absent) in local and UTC time respectively - timelocal(), timegm() - the inverse operation for the former functions, taking a date and time specification (interpreted as local or UTC time respectively) and turning it into an epoch value - clock() - return the second and nanosecond values of the system clock, useful for time/performance measurements Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-21	Merge pull request #53 from jow-/string-format-argpos-support	Jo-Philipp Wich
	lib: add argument position support (`%m$`) to `sprintf()` and `printf()`
2022-03-20	lib: add argument position support (`%m$`) to `sprintf()` and `printf()`	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-15	tests: 21_regex_literals: generalize syntax error test case	Jo-Philipp Wich
	Different libc implementations produce different syntax error messages on invalid regular expression patterns, so rework the test case to produce stable output across all environments. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-15	tests: 16_sort: fix logic flaw exposed on OS X	Jo-Philipp Wich
	A typo in the custom order function of the test case caused the test case to yield differently sorted results on OS X, triggered by differences in the libc's `qsort()` implementation. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-15	tests: run_tests.sh: pass dummy value to `-T` flag	Jo-Philipp Wich
	Since OS X `getopt()` does not handle optional arguments, we need to always pass a value to `-T`. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-15	tests: run_tests.sh: use greadlink if available	Jo-Philipp Wich
	This ensures that GNU readlink is preferred over OS X own readlink when executing test cases. This is required due to lacking `-f` flag support on OS X. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-14	lib: adjust require(), render() and include() raw mode semantics	Jo-Philipp Wich
	- Let `require()` always evaluate the executed code in raw mode - Let `render()` always evaluate the executed code in template mode - Let `include()` inherit the raw mode semantics of the calling scope Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-14	main: rework CLI frontend	Jo-Philipp Wich
	- Change command line flags to be align better with those of other interpreters and with the gcc compiler, e.g. `-D` and `-U` to define and undefine globals, `-e` to execute script expression etc. - Pass only excess CLI arguments as `ARGV` to scripts, e.g. `ucode -e 'print("Hello world")' -- -x -y` would pass only `[ "-x", "-y" ]` as ARGV contents - Default to raw mode and introduce flag to enable template mode Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-14	vm: fix crash on object literals with non-string computed properties	Jo-Philipp Wich
	When executing an object literal declaration using non-string computed property name values, the VM crashed caused by an attempt to use a NULL pointer (result of ucv_string_get() on a non-string value) as hash table key. Fix this issue by using the `ucv_key_set()` infrastructure which deals with the implicit stringification of non-string key values. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-07	syntax: support add new operators	Jo-Philipp Wich
	- Support ES2016 exponentiation () and exponentiation assignment (=) - Support ES2020 nullish coalescing (??) and logical nullish assignment (??=) - Support ES2021 logical and assignment (&&=) and logical or assignment (\|\|=) Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-02-15	tests: fix proto() testcase	Jo-Philipp Wich
	Fixes: 4ce69a8 ("fs: implement access(), mkstemp(), file.flush() and proc.flush()") Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-02-11	compiler: fix patchlist corruption on switch statement syntax errors	Jo-Philipp Wich
	When compiling a switch statement with duplicate `default` cases or a switch statement with syntax errors before the body block, two error handling cases were hit in the code that prematurely returned from the function without resetting the compiler's patchlist pointer away from the on-stack patchlist that had been set up for the switch statement. Upon processing a subsequent break or continue control statement, a realloc was performed on the then invalid patchlist contents, triggering a segmentation fault or libc assert. Solve this issue by not returning from the function but breaking the switch body parsing loop. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-02-11	lib: change `ord()` to always return single byte value	Jo-Philipp Wich
	The most common usecase is extracting the value of a single byte at a specific offset, e.g. to scan a string char-by-char to construct a hash. Furthermore, constructing an array which contains the results of multiple `ord()` invocations is trivial while efficiently extracting a single byte value without the overhead of an intermediate array is not. Due to that, change `ord()` to always return a single integer byte value at the offset specified as second argument or at offset 0 in case no argument was supplied. That means that `ord("Abc", 0, 1, 2)` will now return `65` instead of the former `[ 65, 98, 99 ]` result. Code relying on the former behaviour should either perform multiple calls to `ord()`, passing different offsets each time or switch to the `struct` module which allows efficient unpacking of string data. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-02-11	vallist: fix storing/retrieving short strings with 8bit byte values	Jo-Philipp Wich
	Due to using signed byte values when writing/reading short strings to/from pointer addresses, 8 bit characters where incorrectly clamped to `-1` (`255`). Fix this issue by treating the input string as `uint8_t` array. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-02-08	compiler: fix incorrect loop break targets	Jo-Philipp Wich
	When patching jump targets for break statments while compiling for-loop statments, we need jump beyond the instructions popping intermediate loop variables off the stack but before the pop instructions removing local loop body variables to prevent a stack position mismatch between compiler and vm. Before that change, local loop body variables remained on the stack, breaking the expected stack layout. Fixes: b3d758b compiler: ("fix for/break miscompilation") Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-02-03	tests: add functional tests for builtin functions	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-02-03	run_tests.sh: change workdir to testcase directory during execution	Jo-Philipp Wich
	Ensure that that the testcase files are executed within the temporary testcase work directory to simplify testing relative path resolution. Also fixup the duplicate resource regression test breaking due to that. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-02-03	run_tests.sh: support placing supplemental testcase files	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-02-03	run_tests.sh: always treat outputs as text data	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-01-26	vm: fix NaN strict equality tests	Jo-Philipp Wich
	A performance shortcut in `ucv_is_equal()` incorrectly led to `NaN === NaN` being true. Fix the issue by only comparing pointers when the involved types are not doubles. Due to fixing `NaN !== NaN`, the `uniq()` function now requires a special case to treat multiple NaNs equal for the sake of generating an array of unique values. Signed-off-by: Jo-Philipp Wich <jo@mein.io>