ucode - The ucode Scripting Language

Age	Commit message	Author
2024-07-29	tests: replace test runner shell script with ucode implementation	Jo-Philipp Wich
	The ucode interpreter and libraries are mature enough to execute their own testcases now, so replace the existing shell script with an equivalent ucode implementation. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2024-06-17	fs: add lock() file method	Felix Fietkau
	Implements file based locking on a given file handle. Signed-off-by: Felix Fietkau <nbd@nbd.name>
2024-06-17	fs: add truncate() file method	Felix Fietkau
	Truncates the file referenced by a file handle. Signed-off-by: Felix Fietkau <nbd@nbd.name>
2024-03-13	vm: rework `in` operator semantics	Jo-Philipp Wich
	- Ensure that testing for array membership does strict equality tests - Ensure that `(NaN in [ NaN ]) == true` - Do not perform implicit value conversion when testing for object keys, to avoid nonsensical results such as `([] in { "[ ]": true }) == true` - Add test cases for the `in` operator Fixes: #193 Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2024-02-21	vm: rework object iteration	Jo-Philipp Wich
	Ensure that deleting object keys during iteration is safe by keeping a global chain of per-object iterators which are advanced to the next key when the entry that is about to be iterated is deleted. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2024-02-13	compiler: close upvalues on loop control statements	Felix Fietkau
	When removing locals from all scopes, upvalues need to be considered like in uc_compiler_leave_scope(). Closing them is required to avoid leaving lingering references to stack values that went out of scope, which would lead to invalid memory accesses in subsequent code when such upvalues are used by closures. Fixes: #187 Signed-off-by: Felix Fietkau <nbd@nbd.name> [add testcase, reword commit message] Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2023-11-06	syntax: don't treat `as` and `from` as reserved keywords	Jo-Philipp Wich
	ECMAScript allows using `as` and `from` as identifiers so follow suit and don't treat them specially while parsing. Extend the compiler logic instead to check for TK_LABEL tokens with the expected value to properly parse import and export statements. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2023-10-09	types: ensure double serialization with decimal places	Jo-Philipp Wich
	The `%g` printf format used for serializing double values into strings will not include any decimal place if the value happens to be integral, leading to an unwanted double to integer conversion when serializing and subsequently deserializing an integral double value as JSON. Solve this issue by checking the serialized string result for a decimal point or exponential notation and appending `.0` if neither is found. Ref: #173 Suggested-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Jo-Philipp Wich <jo@mein.io>
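The approach described above can be sketched in C as follows. This is a minimal illustration, not the code from the commit: `format_double()` is a hypothetical helper name, the plain `%g` format is taken from the message, and skipping non-finite values is an added assumption that the message does not cover.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the ucode implementation: format `d` with %g and
 * append ".0" when the result would otherwise read back as an integer. */
static void format_double(double d, char *buf, size_t len)
{
	/* %g output is short; 32 bytes leaves room for the ".0" suffix. */
	assert(buf != NULL && len >= 32);

	snprintf(buf, len, "%g", d);

	/* Assumption: leave inf/nan untouched. */
	if (!isfinite(d))
		return;

	/* A '.' or an exponent marker already identifies the value as a double. */
	if (strpbrk(buf, ".eE") == NULL)
		strcat(buf, ".0");
}
```

With this sketch, `3.0` is rendered as `3.0` rather than `3`, while `0.5` and `1e20` pass through unchanged because their `%g` output already contains a decimal point or an exponent.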
2023-08-22	types: improve comparison reliability of binary strings	Jo-Philipp Wich
	The existing `ucv_compare()` implementation utilized `strcmp()` to compare two ucode string values, which may lead to incorrect results for strings containing null bytes as the comparison prematurely aborts when encountering the first null. Rework the string comparison logic to use `memcmp()` for comparing both ucv strings with each other in order to ensure that expressions such as `"" == "\u0000"` lead to the expected `false` result. Ref: https://github.com/openwrt/luci/issues/6530 Signed-off-by: Jo-Philipp Wich <jo@mein.io>
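The difference between the two comparison strategies can be illustrated with a small C sketch. `bytes_compare()` is a hypothetical function written for this example. It is not the reworked `ucv_compare()`, and the tie-break on length is an assumption about how unequal-length strings would reasonably be ordered.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical length-aware comparison in the spirit of the fix: compare the
 * common prefix with memcmp(), then let the lengths break the tie. */
static int bytes_compare(const char *a, size_t alen, const char *b, size_t blen)
{
	size_t n = alen < blen ? alen : blen;
	int res;

	assert(a != NULL && b != NULL);

	/* memcmp() does not stop at null bytes, unlike strcmp(). */
	res = memcmp(a, b, n);

	if (res != 0)
		return res;

	/* Equal prefix: the shorter string sorts first. */
	return (alen > blen) - (alen < blen);
}
```

Here `strcmp("", "\0")` returns 0 because both arguments look empty to it, whereas `bytes_compare("", 0, "\0", 1)` returns a negative value and so distinguishes the empty string from a single null byte.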
2023-07-27	main: enable signal dispatching in the standalone cli interpreter	Jo-Philipp Wich
	Enable signal dispatching by default for standalone ucode programs. Also adjust the `gc()` testcase output as the default number of allocations with enabled signal dispatching changes slightly. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2023-07-12	source: fix source offset accounting	Jo-Philipp Wich
	- When skipping the interpreter line, don't count its newline twice - Fix reporting byte offsets beyond the end of line Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2023-05-30	lib: support object ordering in `uc_sort()`	Jo-Philipp Wich
	Extend `uc_sort()` to utilize `ucv_object_sort()` in order to support reordering object keys. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2023-01-23	types: fix array unshift operations and add test coverage	Jo-Philipp Wich
	- Fix `ucv_array_unshift()` improperly rejecting operation on empty arrays - Fix `uc_unshift()` improperly reversing the argument order - Add missing test coverage for `push()`, `pop()`, `unshift()` and `shift()` array operations. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2023-01-09	fs: add `isatty()` function	Petr Štetiar
	Expose the `isatty(3)` libc function in the fs module to allow checking whether a file descriptor refers to a terminal. Signed-off-by: Petr Štetiar <ynezz@true.cz>
2022-12-02	tests: fixup testcases	Jo-Philipp Wich
	Adjust expected testcase outputs after double format change in the previous commit. Fixes: 4c654df ("types: adjust double printing format") Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-11-29	compiler: fix bytecode for logical assignments of properties	Jo-Philipp Wich
	The compiler emitted incorrect bytecode for logical assignment operations on property expressions. The generated instructions left the stack in an unclean state when the assignment condition was not fulfilled, causing a stack layout mismatch between compiler and vm, leading to undefined variable accesses and other non-deterministic behavior. Solve this issue by rewriting the bytecode generation to yield an instruction sequence that does not leave garbage on the stack. The implementation is not optimal yet, as an expression in the form `obj.prop ||= val` will load `obj.prop` twice. This is acceptable for now as the load operation has no side effect, but should be solved in a better way by introducing new instructions that allow for swapping stack slots, allowing the vm to operate on a copy of the loaded value. Also rewrite the corresponding test case to trigger a runtime error on code versions before this fix. Fixes: fdc9b6a ("compiler: fix `??=`, `||=` and `&&=` logical assignment semantics") Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-11-29	tests: relax sleep() test	Jo-Philipp Wich
	Invoking `sleep(1000)` in the CI container often sleeps slightly longer than exactly 1000ms, causing the test output to mismatch. Relax the test requirement to simply ensure that t2 > t1. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-11-22	compiler: ensure that arrow functions with block bodies return no value	Jo-Philipp Wich
	Follow ES6 semantics and ensure that arrow functions with a block body don't implicitly return the value of the last executed statement. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-11-15	compiler: fix `??=`, `||=` and `&&=` logical assignment semantics	Jo-Philipp Wich
	When compiling logical assignment expressions, ensure that the right hand side of the assignment is not evaluated when the assignment condition is unfulfilled. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-10-05	lexer: fixes for regex literal parsing	Jo-Philipp Wich
	- Ensure that regexp extension escapes are consistently handled; substitute `\d`, `\D`, `\s`, `\S`, `\w` and `\W` with `[[:digit:]]`, `[^[:digit:]]`, `[[:space:]]`, `[^[:space:]]`, `[[:alnum:]_]` and `[^[:alnum:]_]` character classes respectively since not all POSIX regexp implementations implement all of those extensions - Preserve `\b`, `\B`, `\<` and `\>` boundary matches Fixes: a45f2a3 ("lexer: improve regex literal handling") Signed-off-by: Jo-Philipp Wich <jo@mein.io>
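The substitution table listed above can be sketched in C as follows. Both function names are hypothetical and the code is not taken from the ucode lexer. As a simplifying assumption, the sketch does not track bracket expressions, so an escape that appears inside `[...]` is expanded like any other.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical lookup: map a regex escape letter to the POSIX character
 * class named in the commit message, or NULL if it is not substituted. */
static const char *posix_class_for(char escape)
{
	switch (escape) {
	case 'd': return "[[:digit:]]";
	case 'D': return "[^[:digit:]]";
	case 's': return "[[:space:]]";
	case 'S': return "[^[:space:]]";
	case 'w': return "[[:alnum:]_]";
	case 'W': return "[^[:alnum:]_]";
	default:  return NULL; /* e.g. \b, \B, \< and \> are preserved as-is */
	}
}

/* Copy `src` to `dst`, expanding the escapes above. Returns the number of
 * bytes written, excluding the terminating null byte. */
static size_t expand_regex_escapes(const char *src, char *dst, size_t dstlen)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL && dstlen > 0);

	while (*src != '\0') {
		const char *cls = (src[0] == '\\') ? posix_class_for(src[1]) : NULL;
		/* Substituted class, preserved two-char escape, or a single literal. */
		const char *chunk = cls ? cls : src;
		size_t take = (src[0] == '\\' && src[1] != '\0') ? 2 : 1;
		size_t n = cls ? strlen(cls) : take;

		if (out + n >= dstlen)
			break; /* output buffer too small: stop and truncate */

		memcpy(dst + out, chunk, n);
		out += n;
		src += take;
	}

	dst[out] = '\0';
	return out;
}
```

Under these assumptions the pattern `\d+\.\w` becomes `[[:digit:]]+\.[[:alnum:]_]`, while `\bfoo\b` is copied through unchanged.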
2022-10-04	lib: implement slice() function	Jo-Philipp Wich
	Implement a new function `slice()` to complement the existing `splice()` function and model its semantics after the ES6 `Array.slice()` version. Fixes: #106 Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-10-04	compiler: optimize function return opcode generation	Jo-Philipp Wich
	Track last emitted statement type in compiled code and only generate final `return null` opcodes if there is no preceding `return` statement. Also use this statement tracking to avoid emitting invalid return opcodes for arrow function bodies with trailing empty statements. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-10-04	lexer: improve regex literal handling	Jo-Philipp Wich
	- Do not treat slashes within bracket expressions as delimiters - Do not escape slashes when stringifying regex sources - Allow all escape sequence types in regex literals Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-09-30	vm: maintain export symbol tables per program	Jo-Philipp Wich
	Instead of having one global export table per VM instance maintain one table per program instance. This is required to avoid clobbering the export list in case `import` using code is loaded at runtime through `require()`, `loadfile()` etc. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-09-05	lib: add limit support to split() and replace()	Jo-Philipp Wich
	Extend the split() and replace() functions to accept an additional optional `limit` argument which limits the amount of split operations / substitutions performed by these functions. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-29	lib: extend render() to support function values	Jo-Philipp Wich
	Extend the `render()` function to accept a function value as first argument, which allows running arbitrary ucode functions and capturing their output. This is especially useful in conjunction with `loadfile()` or `loadstring()` to dynamically compile templates and render their output into a string. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-29	lib: improve getenv() and split() implementations	Jo-Philipp Wich
	- getenv(): Allow querying the entire environment by omitting variable name - split(): Properly handle null bytes in subject and separator strings Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-24	lib: introduce three new functions call(), loadstring() and loadfile()	Jo-Philipp Wich
	Introduce new functions dealing with on-the-fly compilation of code and execution of functions with different global scope. The `loadstring()` and `loadfile()` functions will compile the given ucode source string or ucode file path respectively and return the entry function of the resulting program. An optional dictionary specifying parse options may be given as second argument. Both functions return `null` on invalid arguments and throw an exception in case of compilation errors. The `call()` function allows invoking a given function value with a different `this` context and/or a different global environment. Finally refactor the existing `uc_require_ucode()` implementation to reuse the new `uc_loadfile()` and `uc_call()` implementations and adjust as well as simplify affected testcases. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-12	lib: implement gc()	Jo-Philipp Wich
	Introduce a new stdlib function `gc()` which allows controlling the periodic garbage collector from ucode. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-06	compiler: don't treat offset 0 special at syntax errors	Jo-Philipp Wich
	If a compile error is raised at offset 0, try to resolve line and character position anyway. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-05	compiler: improve formatting of nested syntax error messages	Jo-Philipp Wich
	Indent inner messages and prepend them with a vertical bar to increase visual separation of messages. Also include file name in source context output when the compiled program contains more than one source file. Adjust affected testcase outputs accordingly. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-08-05	compiler: rework export index allocation	Jo-Philipp Wich
	The current implementation of the module export offset tracking was inadequate and failed to properly handle larger module dependency graphs. In order to properly support nested module imports/exports, the following changes have been introduced: - Gather export slots during module compilation and emit corresponding export opcodes as one contiguous block at the end of the module function body, right before the final return. This ensures that interleaved imports of other modules do not place foreign exports between our module exports. - Track the number of program wide allocated export slots in order to derive per-module-source offsets for the global VM export list. - Derive import opcode source index from the module source export offset and the index of the requested name within the module source export name list. - Improve error reporting for circular module imports. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-07-30	compiler: add support for import/export statements	Jo-Philipp Wich
	This commit introduces syntax level support for ES6 style module import and export statements. Imports are resolved at compile time and the corresponding module code is compiled into the main program. Also add testcases to cover import and export statement semantics. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-07-30	tests: run_tests.sh: substitute dynamic test directory path in output	Jo-Philipp Wich
	Replace all occurrences of the test file directory path with "." in stderr and stdout results to ensure stable test outputs. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-07-28	lexer: rewrite token scanner	Jo-Philipp Wich
	- Use nested switches instead of lookup tables to detect tokens - Simplify input buffer logic - Reduce amount of intermediate states Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-07-12	lexer: fix parsing with disabled block left stripping	Jo-Philipp Wich
	When a template was parsed with global block left stripping disabled, then any text preceding an expression or statement block start tag was incorrectly prepended to the first token value of the block, leading to syntax errors in the compiler. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-06-30	compiler: fix stack mismatch on continue statements nested in switches	Jo-Philipp Wich
	When compiling continue statements nested in switches, the compiler only emitted pop statements for the local variables in the switch body scope, but not for the locals in the scope(s) leading up to the containing loop body. Extend the compiler's internal patchlist structure to keep track of the type of scope tied to the patchlist and extend `continue` statement compilation logic to select the appropriate parent patch list in order to determine the number of locals (stack slots) to clear before the emitted jump instruction. As a result, the `uc_compiler_backpatch()` implementation can be simplified somewhat since we do not need to propagate entries to parent lists anymore. Also add a further regression test case to cover this issue. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-06-27	compiler: fix stack mismatch on nonmatching switch statements with locals	Jo-Philipp Wich
	When a switch statement containing cases with local variable declarations and no default case is evaluated and none of the cases matched, the local variable slots were never initialized but got popped off the stack when execution resumed after the switch scope, leading to a mismatch in stack layout between compiler and runtime, causing local variables to yield wrong values or a stack underflow triggering a segmentation fault. Solve this issue by patching the last conditional case match jump to hop beyond the local variable pop instructions when no default case is defined. Also extend the regression test case dealing with other switch related stack mismatch issues to cover this particular problem as well. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-06-01	syntax: adjust number literal parsing and string to number conversion	Jo-Philipp Wich
	- Recognize new number literal prefixes `0o` and `0O` for octal as well as `0b` and `0B` for binary number literals - Treat number literals with leading zeros as octal while parsing but as decimal ones on implicit number conversions, meaning that `012` will yield `10` while `+"012"` or `"012" + 0` will yield `12` Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-06-01	lib: refactor `uc_int()`	Jo-Philipp Wich
	For string cases, turn `int()` into a thin `strtoll()` wrapper which attempts to parse the initial portion of the string as a decimal integer literal, optionally preceded by white space and a sign character. Also introduce an optional `base` argument for string cases while we're at it and adjust the existing stdlib test case accordingly. The function now behaves mostly the same as ECMAScript `parseInt(val, 10)` for string cases, meaning it will recognize `012` as `12` and not `10` and it will accept trailing non-digit characters after the initial portion of the input string. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
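The `strtoll()` behavior that this message relies on can be shown with a short C sketch. `parse_int_prefix()` is a hypothetical wrapper used only for illustration. It is not `uc_int()`, and it omits error handling such as detecting input without digits or out-of-range values.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper: parse the leading integer of `s` in the given base
 * and ignore whatever follows. strtoll() itself skips leading white space
 * and accepts an optional sign character. */
static long long parse_int_prefix(const char *s, int base)
{
	assert(s != NULL);

	/* Passing NULL as end pointer discards the position of the first
	 * unparsed character, so trailing non-digits are silently accepted. */
	return strtoll(s, NULL, base);
}
```

With base 10, `"012"` parses as 12 and `"  -42abc"` as -42. Passing base 8 instead makes `"012"` parse as 10, which is the distinction the message draws between decimal and octal interpretation.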
2022-05-30	lib: rework uc_index() implementation	Jo-Philipp Wich
	- Fix segfault on passing string haystack with non-string needle argument - Perform strict equality tests against array haystacks - Make string searches binary safe - Improve left index string search performance - Improve right index array search performance - Add missing test coverage for index() and rindex() Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-05-20	compiler: fix segmentation fault on compiling unexpected unary expressions	Jo-Philipp Wich
	When compiling expressions followed by a unary operator, the compiler triggered a segmentation fault due to invoking an unset infix parser routine. Explicitly handle this case and raise a syntax error if such an invalid expression is encountered. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-05-19	lib: introduce hexenc() and hexdec()	Jo-Philipp Wich
	Add two new functions to deal with encoding and decoding of hexadecimal digit strings: - hexenc() - convert the given input value into a lower case hex digit string, implicitly converting the input argument to a string value if needed - hexdec() - decode the given input hex digit string into a byte string, skipping whitespace or optionally specified characters in the input Signed-off-by: Jo-Philipp Wich <jo@mein.io>
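The encoding direction can be sketched in C as follows. `hex_encode()` is a hypothetical helper, not the library code behind `hexenc()`. It covers only the conversion from bytes to lower case hex digits and assumes that the caller supplies an output buffer of at least `2 * len + 1` bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: write the lower case hex representation of `len`
 * input bytes into `out`, which must hold at least 2 * len + 1 bytes. */
static void hex_encode(const unsigned char *in, size_t len, char *out)
{
	static const char digits[] = "0123456789abcdef";
	size_t i;

	assert(in != NULL && out != NULL);

	for (i = 0; i < len; i++) {
		out[i * 2]     = digits[in[i] >> 4];   /* high nibble */
		out[i * 2 + 1] = digits[in[i] & 0x0f]; /* low nibble */
	}

	out[len * 2] = '\0';
}
```

For the three input bytes `Hi!` this yields the string `486921`.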
2022-04-13	syntax: implement support for ES6 template literals	Jo-Philipp Wich
	Implement support for ECMAScript 6 template literals which allow simple interpolation of variable values into strings without resorting to `sprintf()` or manual string concatenation. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-04-13	vm: stop executing bytecode on return of nested calls	Jo-Philipp Wich
	When a managed function is indirectly invoked during bytecode execution, e.g. when calling the tostring() method of an object prototype during string concatenation, the invoked function must stop executing bytecode upon return to hand control back to caller. Extend `uc_vm_execute_chunk()` to track the amount of nested function calls it performs and hand back control to the caller once the toplevel callframe returns. Also bubble unhandled exceptions only as far as up to the original caller. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-04-07	Merge pull request #68 from jow-/vm-callframe-double-free-fix	Jo-Philipp Wich

2022-04-07	vm: fix callframe double free on unhandled exceptions	Jo-Philipp Wich
	When invoking a native function as toplevel VM call which indirectly triggers an unhandled exception in managed code, the callframes are completely reset before the C function returns, leading to invalid memory accesses when `uc_vm_call_native()` subsequently popped its own callframe again. This issue did not surface by executing script code through the interpreter since in this case the VM will always execute managed code as toplevel call, but it could be triggered by invoking a native function triggering an exception through the C API using `uc_vm_call()` on a fresh `uc_vm_t` context or by utilizing the CLI interpreter's `-l` flag to preload a native code library triggering an exception. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-04-07	lib: let `json()` accept input objects implementing `read()` method	Jo-Philipp Wich
	Extend the `uc_json()` implementation to accept readable objects in addition to plain input strings. This allows parsing JSON input directly from open file handles, sockets or other kinds of producer objects without the need to store the entire JSON source string intermediately in memory. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-31	fs: fix off-by-one in fs.dirname() function	Daniel Golle
	Make sure fs.dirname() doesn't truncate the last character of the returned path. Previously ucv_string_new_length was called with a length which no longer included the last character (which had just been tested not to be a '/' or '.' and hence broke the loop at that point). Signed-off-by: Daniel Golle <daniel@makrotopia.org> [testcase added] Signed-off-by: Paul Spooren <mail@aparcar.org> [testcase folded into this commit and fixed] Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-31	types: fix escape sequence encoding of high byte values in JSON strings	Jo-Philipp Wich
	Treat the char value as unsigned when testing its value to yield consistent results on both platforms with signed chars and those with unsigned chars by default (e.g. ARM ones). This also avoids encoding byte values > 127 as \uXXXX escape sequences, potentially breaking the string contents. Fixes: #62 Signed-off-by: Jo-Philipp Wich <jo@mein.io>
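The signedness issue can be illustrated with a small C sketch. `json_escape_byte()` is a hypothetical function, not the ucode serializer. It handles only the control-byte case discussed in the message and leaves out the other escapes a real JSON encoder needs, such as quotes and backslashes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write one byte of a JSON string into `out`, using a
 * \u00XX escape for control bytes only. Returns the number of characters
 * written, excluding the terminating null byte. */
static int json_escape_byte(char c, char *out, size_t len)
{
	/* Casting to unsigned char keeps bytes >= 0x80 from comparing as
	 * negative on platforms where plain char is signed. Without the cast,
	 * a byte such as 0xE4 would satisfy "c < 0x20" there and be escaped. */
	unsigned char u = (unsigned char)c;

	assert(out != NULL && len >= 7);

	if (u < 0x20)
		return snprintf(out, len, "\\u%04x", (unsigned int)u);

	/* Everything else, including high bytes, is passed through verbatim. */
	out[0] = c;
	out[1] = '\0';

	return 1;
}
```

In this sketch the byte `0x01` is written as the six characters `\u0001`, whereas `0xE4` is copied through as a single byte regardless of whether the platform's `char` is signed.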