ucode - The ucode Scripting Language

Age	Commit message	Author
2022-03-14	lib: adjust require(), render() and include() raw mode semantics	Jo-Philipp Wich
	- Let `require()` always evaluate the executed code in raw mode
	- Let `render()` always evaluate the executed code in template mode
	- Let `include()` inherit the raw mode semantics of the calling scope
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-03-14	lib: fix potential integer underflow on empty render output	Jo-Philipp Wich
	The current `uc_render()` implementation uses a `fseek()` call on the `open_memstream()` provided `FILE *` stream to reserve headroom for the `uc_string_t` header. The `fseek()` call alone does not guarantee that the underlying buffer length is updated on all libc implementations though. This may lead to an integer underflow later on when the `uc_string_t` header length is subtracted from the buffer length after invoking a template that did not produce any output write operations. In such a case, a very large value is assigned to `ustr->length` leading to uninitialized or out-of-bounds memory accesses later on. Solve this issue by writing the header structure as data using `fwrite()` which should yield the expected behaviour on all libc environments. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
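The approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual `uc_render()` code: `demo_header_t`, `demo_capture()` and `demo_emit_hello()` are hypothetical names, and the real `uc_string_t` layout differs.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the uc_string_t header (assumption). */
typedef struct {
    size_t length;
} demo_header_t;

/* Example emitter producing five bytes of output. */
void demo_emit_hello(FILE *fp)
{
    fputs("hello", fp);
}

/*
 * Capture whatever `emit` writes (possibly nothing) into a heap buffer
 * that starts with a demo_header_t. Returns NULL on failure; the caller
 * releases the result with free().
 */
demo_header_t *demo_capture(void (*emit)(FILE *))
{
    demo_header_t hdr = { 0 };
    char *buf = NULL;
    size_t len = 0;
    FILE *fp = open_memstream(&buf, &len);

    if (!fp)
        return NULL;

    /* Write the header as real data so that `len` always covers it,
     * instead of seeking past it and relying on the libc to update
     * the reported buffer length. */
    if (fwrite(&hdr, 1, sizeof(hdr), fp) != sizeof(hdr)) {
        fclose(fp);
        free(buf);
        return NULL;
    }

    if (emit)
        emit(fp);

    /* fclose() flushes the stream and finalizes `buf` and `len`. */
    if (fclose(fp) != 0 || len < sizeof(hdr)) {
        free(buf);
        return NULL;
    }

    /* len >= sizeof(hdr) holds here, so the subtraction cannot wrap. */
    ((demo_header_t *)buf)->length = len - sizeof(hdr);

    return (demo_header_t *)buf;
}
```

Calling `demo_capture(NULL)` models a template that writes nothing; the explicit `len < sizeof(hdr)` check is an extra guard added for this sketch.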
2022-02-11	lib: change `ord()` to always return single byte value	Jo-Philipp Wich
	The most common use case is extracting the value of a single byte at a specific offset, e.g. to scan a string char-by-char to construct a hash. Furthermore, constructing an array which contains the results of multiple `ord()` invocations is trivial while efficiently extracting a single byte value without the overhead of an intermediate array is not. Due to that, change `ord()` to always return a single integer byte value at the offset specified as second argument or at offset 0 in case no argument was supplied. That means that `ord("Abc", 0, 1, 2)` will now return `65` instead of the former `[ 65, 98, 99 ]` result. Code relying on the former behaviour should either perform multiple calls to `ord()`, passing different offsets each time or switch to the `struct` module which allows efficient unpacking of string data. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-02-07	treewide: rework function memory model	Jo-Philipp Wich
	- Instead of treating individual program functions as managed ucode types, demote uc_function_t values to pointers into a uc_program_t entity
	- Promote uc_program_t to a managed type
	- Let uc_closure_t claim references to the owning program of the enclosed uc_function_t
	- Redefine the public uc_compile() and uc_vm_execute() APIs to return and expect a uc_program_t object respectively
	- Remove vallist indirection for function loading and let the compiler emit the function id directly when producing function construction code
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-02-03	lib: fix leaking tokener in uc_json() on parse exception	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-02-03	lib: fix infinite loop on empty regexp matches in uc_replace()	Jo-Philipp Wich
	The regular expression `/()/` will match the empty string, causing the match loop to never advance. Add extra logic to deal with this case, similar to the empty separator string logic. Apply a similar exception to replacements of empty search strings; those should yield the same result as empty regexp matches. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-02-03	lib: fix infinite loop on empty regexp matches in uc_match()	Jo-Philipp Wich
	The regular expression `/()/` will match the empty string, causing the match loop to never advance. Add extra logic to deal with this case. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-02-03	lib: fix infinite loop on empty regexp matches in uc_split()	Jo-Philipp Wich
	The regular expression `/()/` will match the empty string, causing the match loop to never advance. Add extra logic to deal with this case, similar to the empty separator string logic. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
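The general shape of such a guard, as it applies to the three fixes above, might look like the sketch below. It uses POSIX `regcomp()`/`regexec()` and a hypothetical helper `demo_count_matches()`; it is not the code from `uc_split()`, `uc_match()` or `uc_replace()`, only an illustration of forcing progress after an empty match.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Count successive matches of the extended regular expression `pattern`
 * in `subject`. Returns -1 if the pattern fails to compile.
 */
int demo_count_matches(const char *pattern, const char *subject)
{
    regex_t re;
    regmatch_t m;
    const char *p = subject;
    int count = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* After the first round `p` no longer points at the line start. */
    while (regexec(&re, p, 1, &m, p == subject ? 0 : REG_NOTBOL) == 0) {
        count++;

        /* Continue searching right after the end of this match. */
        p += m.rm_eo;

        if (m.rm_so == m.rm_eo) {
            /* Empty match: without this branch `p` would never move
             * and the loop would spin forever. */
            if (*p == '\0')
                break;

            p++;
        }
    }

    regfree(&re);

    return count;
}
```

With this loop, a pattern such as `x*` applied to `abx` yields four matches (three of them empty) and then terminates instead of hanging at offset 0.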
2022-01-29	program: rename bytecode load/write functions, track path of executed file	Jo-Philipp Wich
	Extend source objects with a `runpath` field which contains the original path of the source being executed by the VM. When instantiating source objects from file paths, the `runpath` will be set to the `filename`. When instantiating source buffers using `uc_source_new_buffer()`, the runpath is initially unset. A new function `uc_source_runpath_set()` can be used to adjust the runtime path being associated with a source object. Extend bytecode loading logic to set the source buffer runtime path to the precompiled bytecode file path being loaded and executed. This is required for `sourcepath()` and relative paths in `include()` to function correctly when executing precompiled programs. Finally rename `uc_program_from_file()` and `uc_program_to_file()` to `uc_program_load()` and `uc_program_write()` respectively since the load part now operates on an `uc_source_t` input buffer instead of a plain `FILE *` handle. Adjust users of these API functions accordingly. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-01-29	lib: fix memory leak in uc_require_ucode()	Jo-Philipp Wich
	We need to release the compiled module function after we executed it in our VM context. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-01-26	vm: fix NaN strict equality tests	Jo-Philipp Wich
	A performance shortcut in `ucv_is_equal()` incorrectly led to `NaN === NaN` being true. Fix the issue by only comparing pointers when the involved types are not doubles. Due to fixing `NaN !== NaN`, the `uniq()` function now requires a special case to treat multiple NaNs equal for the sake of generating an array of unique values. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
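A reduced model of the described shortcut and its fix is sketched below. The types and functions (`demo_value_t`, `demo_is_equal()`, `demo_is_same_for_uniq()`) are hypothetical and far simpler than the real `ucv_is_equal()`; they only illustrate why pointer identity must not be used for doubles.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified value representation (assumption). */
typedef enum { DEMO_INT, DEMO_DOUBLE } demo_type_t;

typedef struct {
    demo_type_t type;
    union {
        long long i;
        double d;
    } u;
} demo_value_t;

/* Strict equality: NaN is never equal to anything, including itself. */
bool demo_is_equal(const demo_value_t *a, const demo_value_t *b)
{
    if (a->type != b->type)
        return false;

    /* Pointer identity is only a valid shortcut for non-double values. */
    if (a->type != DEMO_DOUBLE && a == b)
        return true;

    switch (a->type) {
    case DEMO_INT:
        return a->u.i == b->u.i;

    case DEMO_DOUBLE:
        return a->u.d == b->u.d; /* false if either operand is NaN */
    }

    return false;
}

/* Deduplication variant: additionally treats two NaN values as the same. */
bool demo_is_same_for_uniq(const demo_value_t *a, const demo_value_t *b)
{
    if (a->type == DEMO_DOUBLE && b->type == DEMO_DOUBLE &&
        isnan(a->u.d) && isnan(b->u.d))
        return true;

    return demo_is_equal(a, b);
}
```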
2022-01-26	lib: fix exists() error return value	Jo-Philipp Wich
	The current implementation incorrectly returned `false` which got treated as `NULL` instead of a boolean `false` value. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-01-23	lib: rework format string handling	Jo-Philipp Wich
	Instead of extracting and forwarding recognized conversion directives from the user supplied format string, properly parse the format string into its components and reassemble a canonical representation of the conversion directive internally before handing it to the libc's sprintf() implementation. Also take care of selecting the proper conversion specifiers for signed and unsigned 64bit integer values to fix broken `%d`, `%i`, `%u`, `%o`, `%x` and `%X` formats on 32bit systems. While reworking the format logic, also slightly improve `%s` argument handling by not duplicating the given value if it already is a string, which reduces the amount of required heap memory. Ref: https://bugs.openwrt.org/index.php?do=details&task_id=4234 Ref: https://git.openwrt.org/3d3d03479d5b4a976cf1320d29f4bd4937d5a4ba Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-01-20	lib: fix %J string formats with precision specifier	Jo-Philipp Wich
	Previous refactoring of the code led to an invalid internal format pattern being used to output the formatted JSON data. Fixes: 9041e24 ("lib: fix uninitialized memory access on handling %J string formats") Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-01-18	source: refactor source file handling	Jo-Philipp Wich
	- Move source object pointer into program entity which is referenced by each function
	- Move lineinfo related routines into source.c and use them from lexer.c since lineinfo encoding does not belong in the lexical analyzer
	- Implement initial infrastructure for detecting source file type, this is required later to differentiate between plaintext and precompiled bytecode files
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-01-07	lib: implement uniq() function	Jo-Philipp Wich
	The uniq() function allows extracting all unique values from a given input array in an efficient manner. It is roughly equivalent to the following ucode idiom:
	    let seen = {};
	    let unique = filter(array, item => !seen[item]++);
	In contrast to the code above, `uniq()` does not rely on implicit stringification of item values but performs strict equality tests internally. If equivalence of stringified results is desired, the following code can be used:
	    let unique = uniq(map(array, item => "" + item));
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2022-01-04	treewide: rework numeric value handling	Jo-Philipp Wich
	- Parse integer literals as unsigned numeric values in order to be able to represent the entire unsigned 64bit value range
	- Stop parsing minus-prefixed integer literals as negative numbers but treat them as separate minus operator followed by a positive integer instead
	- Only store unsigned numeric constants in bytecode
	- Rework numeric comparison logic to be able to handle full 64bit unsigned integers
	- If possible, yield unsigned 64 bit results for additions
	- Simplify numeric value conversion API
	- Compile code with -fwrapv for defined signed overflow semantics
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-12-07	treewide: fix "resource" misspellings	Jo-Philipp Wich
	Fix various misspellings of "resource". This commit changes the exported libucode ABI. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-10-23	lib: increase refcount when returning cached module instance	Jo-Philipp Wich
	Subsequent requires of the same module returned the cached module instance without increasing the refcount, leading to use-after-free on VM tear down or garbage collection cycles. Solve this issue by properly incrementing the refcount before returning the cached module instance. Fixes: #25 Fixes: 96f140b ("lib, vm: ensure that require() compiles modules only once") Signed-off-by: Jo-Philipp Wich <jo@mein.io>
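The ownership rule behind this fix can be illustrated with a small reference counting sketch. All names here (`demo_module_t`, `demo_get()`, `demo_put()`, `demo_require()`) are hypothetical, and the single-slot cache is a deliberate simplification of the real module registry.

```c
#include <stddef.h>

/* Hypothetical reference counted module object (assumption). */
typedef struct {
    unsigned refcount;
} demo_module_t;

/* Single-slot cache; the cache itself owns one reference. */
static demo_module_t *demo_cache;

/* Acquire a reference. */
demo_module_t *demo_get(demo_module_t *m)
{
    if (m)
        m->refcount++;

    return m;
}

/* Release a reference (a real implementation would free at zero). */
void demo_put(demo_module_t *m)
{
    if (m && m->refcount > 0)
        m->refcount--;
}

/*
 * Return the cached module, caching `fresh` on first use. Every caller
 * receives its own reference, so releasing the returned value later
 * never drops the reference held by the cache.
 */
demo_module_t *demo_require(demo_module_t *fresh)
{
    if (!demo_cache)
        demo_cache = demo_get(fresh);

    return demo_get(demo_cache);
}
```

Returning `demo_cache` without the extra `demo_get()` would hand out a reference that the caller later releases, leaving the cache with a dangling pointer once the count reaches zero.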
2021-10-22	lib: fix uninitialized memory access on handling %J string formats	Jo-Philipp Wich
	When parsing the padding size specification of a `J` format, e.g. `%.4J`, the internally called `atoi()` function might read beyond the end of the initialized memory within the format buffer, leading to non-deterministic results. Avoid overreading the initialized memory by parsing the padding length manually digit-by-digit. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
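A bounded, digit-by-digit parser in the spirit of this fix could look like the following sketch. `demo_parse_precision()` is a hypothetical helper, not the function used in the library; the point is that it never reads beyond the `len` bytes known to be initialized, whereas `atoi()` keeps reading until it meets a non-digit byte.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/*
 * Parse leading decimal digits from buf[0..len) without touching any
 * byte at or beyond buf[len]. The number of consumed bytes is stored in
 * *used if `used` is not NULL. Values too large for int saturate at
 * INT_MAX.
 */
int demo_parse_precision(const char *buf, size_t len, size_t *used)
{
    int value = 0;
    size_t i = 0;

    while (i < len && isdigit((unsigned char)buf[i])) {
        int digit = buf[i] - '0';

        if (value > (INT_MAX - digit) / 10)
            value = INT_MAX;              /* saturate instead of overflowing */
        else
            value = value * 10 + digit;

        i++;
    }

    if (used)
        *used = i;

    return value;
}
```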
2021-07-11	treewide: harmonize function naming	Jo-Philipp Wich
	- Ensure that most functions follow the subject_verb naming schema
	- Move type related functions from value.c to types.c
	- Rename value.c to vallist.c
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-07-11	treewide: move header files into dedicated directory	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-07-11	treewide: consolidate typedef naming	Jo-Philipp Wich
	Ensure that all custom typedef and vector declaration type names end with a "_t" suffix. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-07-11	lib, vm: reimplement exit() as exception type	Jo-Philipp Wich
	Instead of invoking exit(3) from uc_exit(), use a new EXCEPTION_EXIT exception type to instruct the VM to shut down cleanly. This is required to not terminate the host program in case libucode is embedded and loaded scripts invoke the exit() function. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-07-11	vm: move global scope allocation into uc_vm_init()	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-07-11	vm: add getter and setter for vm globals scope	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-07-11	lib: rename uc_add_proto_functions() to uc_add_functions()	Jo-Philipp Wich
	The naming is an artifact from before the introduction of the new type system. In the current code, there is nothing special about prototypes, they're simple object values. Also introduce a new singular uc_add_function() convenience macro while we're at it. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-07-11	lib: expose stdlib function array	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-07-11	treewide: move ressource type registry into vm instance	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-07-07	lib: fix refcount imbalance in uc_require_path()	Jo-Philipp Wich
	Ensure to increase the refcount of the module scope value when caching it in the global module registry to avoid a double free on VM teardown. Fixes: 96f140b ("lib, vm: ensure that require() compiles modules only once") Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-07-07	lib, vm: ensure that require() compiles modules only once	Jo-Philipp Wich
	Cache the result of a successful require operation in a global.modules object and return the cached value for subsequent require calls on the same module name. This ensures that a given module is only compiled and executed once, regardless of how many times it is used. A reload of the module can be forced by deleting the corresponding key in the global module table. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-06-08	treewide: let uc_cmp() use instruction instead of token numbers	Jo-Philipp Wich
	This allows us to drop some token->instruction mapping case switches in the VM. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-06-08	lib: implement b64enc() and b64dec() functions	Jo-Philipp Wich
	The new functions allow encoding and decoding base64 values respectively. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-06-07	lib: only consider context of calling function for callbacks	Jo-Philipp Wich
	Do not walk up the entire call stack but specifically use the context of the callframe calling the C function. Fixes: 3e893e6 ("lib: pass-through "this" context to library function callbacks") Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-06-07	lib: implement min() and max() functions	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-06-07	lib: pass-through "this" context to library function callbacks	Jo-Philipp Wich
	Ensure that callbacks invoked by filter(), map(), sort() and replace() inherit the "this" context that invoked the respective C function. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-06-04	lib: implement `sourcepath()` function	Jo-Philipp Wich
	The sourcepath() function allows querying the filesystem path of the source file currently being executed by ucode. The optional depth argument can be used to walk up the include stack to determine the path of the file that included the current file, the path of the parent file of the parent file and so on. By specifying a truthy value as second argument, only the directory portion of the source file path is returned. This is useful to e.g. discover resources relative to the current source file directory. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-06-02	lib: fix negative uc_index() return value on 32bit systems	Jo-Philipp Wich
	The numeric return value was incorrectly stored in an unsigned size_t variable which was later wrapped in a ucode signed 64bit integer value. This worked by accident on 64bit systems since (int64_t)(size_t)(-1) == -1, but it failed on 32bit ones where (int64_t)(size_t)(-1) yields 4294967295 due to the different sizes of the size_t and int64_t types. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
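The width dependence can be reproduced on any host by substituting fixed-width types for `size_t`, as in the sketch below. The `demo_*` helpers are hypothetical: `uint32_t` models a 32bit `size_t` and `uint64_t` a 64bit one.

```c
#include <stdint.h>

/* Models a 32bit size_t: the sign is lost before widening to 64 bits. */
int64_t demo_wrap_via_u32(int n)
{
    return (int64_t)(uint32_t)n;
}

/* Models a 64bit size_t: the bit pattern survives the round trip.
 * The unsigned to signed conversion is implementation-defined in C,
 * but yields the two's complement result on common compilers. */
int64_t demo_wrap_via_u64(int n)
{
    return (int64_t)(uint64_t)n;
}

/* Keeping the value in a signed type avoids the problem entirely. */
int64_t demo_wrap_signed(int n)
{
    return (int64_t)n;
}
```

Passing `-1` to the first helper gives 4294967295, while the other two return -1, which mirrors the 32bit and 64bit behaviour described above.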
2021-05-25	syntax: drop Infinity and NaN keywords	Jo-Philipp Wich
	Turn the Infinity and NaN keywords into global properties. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-25	lib: rename uc_lib_init() to uc_load_stdlib()	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-25	main, lib: move allocation of globals object into lib function	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-18	syntax: implement `delete` as proper operator	Jo-Philipp Wich
	Turn `delete` into a proper operator mimicking ECMAScript semantics. Also ensure to transparently turn deprecated `delete(obj, propname)` function calls into `delete obj.propname` expressions during compilation. When strict mode is active, legacy delete() calls throw a syntax error instead. Finally drop the `delete()` function from the stdlib as it is shadowed by the delete operator syntax now. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-18	lib: implement wildcard() function	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-11	lib: implement regexp(), a function to construct regexp instances at runtime	Jo-Philipp Wich
	Provide a new ucode function regexp() which allows constructing regular expression instances from separate source and flag strings. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-10	lib: implement render(), an include variant capturing output in a string	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-10	vm: implement mechanism to change output file descriptor	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-07	lib: fix uc_sort()	Jo-Philipp Wich
	The internally used qsort(3) expects [-n, 0, n] return values from the comparator function instead of a true/false value to denote lower than or equal results. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
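For reference, a comparator that satisfies the qsort(3) contract is sketched below for plain integers. The names are hypothetical, and the real `uc_sort()` compares ucode values and may invoke a user supplied callback, which this example does not attempt to model.

```c
#include <stdlib.h>

/*
 * Three-way comparator: returns a negative value, zero or a positive
 * value when x is less than, equal to or greater than y. A comparator
 * returning only 0 or 1 (a boolean "x > y") cannot express "less than",
 * so qsort() may leave the array in an unspecified order.
 */
static int demo_cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    /* Avoids the overflow that a plain "x - y" could cause. */
    return (x > y) - (x < y);
}

/* Sort `n` integers in ascending order. */
void demo_sort_ints(int *values, size_t n)
{
    qsort(values, n, sizeof(values[0]), demo_cmp_int);
}
```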
2021-05-04	lib: implement assert()	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-04	lib: add support for pretty printing JSON to printf() and sprintf()	Jo-Philipp Wich
	Honour precision specifiers when parsing `J` format strings to enable or disable JSON pretty printing. Signed-off-by: Jo-Philipp Wich <jo@mein.io>
2021-05-04	lib: gracefully handle truncated format strings in uc_printf_common()	Jo-Philipp Wich
	Signed-off-by: Jo-Philipp Wich <jo@mein.io>